Commit Graph

43 Commits

Author SHA1 Message Date
mpikulski
50c78179dd
[TRTLLM-8425][doc] document Torch Sampler details (#10606)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2026-01-13 12:01:20 +01:00
Patrice Castonguay
e8cceb06b2
[None][doc] Adding parallelism types in feature combination matrix (#9849)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2026-01-07 12:52:05 -05:00
Venky
aa1fe931de
[None][docs] Add --config preference over --extra_llm_api_options in CODING_GUIDELINES.md (#10426)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2026-01-05 22:05:47 -05:00
Mike Iovine
77712ed4ab
[None][chore] Update SWA + spec dec support matrix (#10421)
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2026-01-05 20:26:23 -05:00
Cheng Hang
656c705ff1
[None][feat] sm100 weight-only kernel (#10190)
Signed-off-by: Cheng Hang <chang@nvidia.com>
2026-01-05 09:44:36 +08:00
Lucas Liebenwein
937f8f78a1
[None][doc] promote AutoDeploy to beta feature in docs (#10372)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-02 18:46:31 -05:00
Jatin Gangani
97b38ac403
[None] [doc] Update IFB performance guide & GPTOSS deployment guide (#10283)
Signed-off-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
Co-authored-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
2025-12-25 05:52:04 -05:00
heyuhhh
7395ca93b6
[None][doc] Add Sparse Attention feature doc (#9648)
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-25 00:26:18 -05:00
zackyoray
f6c3bc16b9
[None][docs] Add NIXL-Libfabric Usage to Documentation (#10205)
Signed-off-by: Yoray Zack <62789610+zackyoray@users.noreply.github.com>
2025-12-23 23:05:40 -05:00
Venky
dfa11d810e
[TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005)
2025-12-19 13:48:43 -05:00
Aurelien Chartier
7175d89b48
[None][fix] Fix iteration stats for spec-dec (#9855)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-12-16 14:11:38 -08:00
QI JUN
3daca4fea3
[https://nvbugs/5729847][doc] fix broken links to modelopt (#9868)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
xiweny
2756a0da60
[TRTLLM-4629][doc] Add B300 & GB300 in documents (#9663)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
William Zhang
28b02b4f5a
[None][docs] Add README for Nemotron Nano v3 (#10017)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Co-authored-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-15 22:17:24 -08:00
Tian Zheng
ece3a8748f
[None][doc] Update doc for NVFP4 KV cache (#9475)
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2025-12-10 06:20:12 -08:00
Chenjie Luo
d252101a76
[OMNIML-3036][doc] Re-branding TensorRT-Model-Optimizer as Nvidia Model-Optimizer (#9679)
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
2025-12-07 07:14:05 -08:00
Enwei Zhu
b46e78e263
[TRTLLM-9157][doc] Guided decoding doc improvement (#9359)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Pengyun Lin
c6dc68a28e
[None][doc] VDR 1.0 trtllm-serve doc enhancement (#9443)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
jthomson04
6332bf27e6
[TRTLLM-9199][docs] KV Connector Docs (#9325)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Robin Kobus
faf682b8bc
[TRTLLM-7136][feat] Update load_weights method to include mapping parameter in checkpoint loaders (#9583)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-05 16:07:20 +01:00
brb-nv
5d6edc3944
[None][doc] Add feature docs for helix parallelism (#9684)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-04 18:08:40 -08:00
dominicshanshan
6345074686
[None][chore] Weekly mass integration of release/1.1 -- rebase (#9522)
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: qgai <qgai@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: sunnyqgg <159101675+sunnyqgg@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Vincent Zhang <vcheungyi@163.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Leslie Fang <leslief@nvidia.com>
Co-authored-by: Shunkangz <182541032+Shunkangz@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-11-29 21:48:48 +08:00
Lucas Liebenwein
2f8bd6fb36
[#9150][feat] AutoDeploy Nemotron-Flash support (#9504)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-11-27 18:03:57 +01:00
mpikulski
1944fb15af
[None][fix] add missing CLI option in multimodal example (#8977)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-11-07 09:06:08 +01:00
Guoming Zhang
65b793c77e
[None][doc] Add the missing content for model support section and fix valid links for long_sequence.md (#8869)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-11-03 02:06:04 -08:00
Yi Zhang
496b419791
[None][doc] Add doc for torch.compile & piecewise cuda graph (#8527)
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
2025-10-29 21:15:46 -07:00
Sharan Chetlur
a2e964d9a8
[None][doc] Minor doc update to disagg-serving (#8768)
Signed-off-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
2025-10-29 17:38:06 -07:00
Robin Kobus
990b0c0c47
[TRTLLM-7159][docs] Add documentation for additional outputs (#8325)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-10-27 09:52:04 +01:00
Yueh-Ting (eop) Chen
85088dce05
[None][chore] Update feature combination matrix for SWA kv cache reuse (#8529)
Signed-off-by: eopXD <yuehtingc@nvidia.com>
2025-10-21 04:41:44 -04:00
Bo Deng
dd25595ae8
[TRTLLM-7964][infra] Set nixl to default cache transceiver backend (#7926)
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-10-19 19:24:43 +08:00
h-guo18
55fed1873c
[None][chore] AutoDeploy: cleanup old inference optimizer configs (#8039)
Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-17 15:55:57 -04:00
Leslie Fang
023e515d33
[None][chore] Combine two documents of feature combination matrix (#8442)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
2025-10-17 14:31:33 +08:00
Erin
f4e7738f65
[None][doc] Ray orchestrator initial doc (#8373)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-10-14 21:17:57 -07:00
Chuang Zhu
f98fa0cf8b
[None][feat] Optimize kv cache transfer TEP (#7613)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-09-25 20:20:04 -07:00
Guoming Zhang
9f0f52249e
[None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-25 21:02:35 +08:00
Guoming Zhang
ab915fb333
[None][doc] Use hash id for external link (#7641)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-22 14:28:38 +08:00
Guoming Zhang
5c54173054
[None][doc] Fix a invalid link and a typo. (#7634)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-22 14:28:38 +08:00
QI JUN
39248320d4
[None][feat] add an example of KV cache host offloading (#7767)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-09-17 13:51:15 +08:00
Chang Liu
98f533453a
[TRTLLM-7398][doc] Add doc for KV cache salting support (#7772)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-09-16 14:49:14 -07:00
Shi Xiaowei
809c4d20c0
[None][doc] Fix the link in the doc (#7713)
Signed-off-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
2025-09-16 09:50:25 +08:00
Guoming Zhang
7f3f658d5f
[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-09 12:16:03 +08:00
Guoming Zhang
35dac55716
[None][doc] Update kvcache part (#7549)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-09 12:16:03 +08:00
Guoming Zhang
f53fb4c803
[TRTLLM-5930][doc] 1.0 Documentation. (#6696)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-09 12:16:03 +08:00