Guoming Zhang
0371cbfd88
[None][doc] Update Qwen3-Next doc by adding known issues section ( #10582 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2026-01-11 14:47:47 +08:00
Fanrong Li
4632a8642d
[None][doc] blog: Optimizing DeepSeek-V3.2 on NVIDIA Blackwell GPUs ( #10565 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2026-01-09 05:16:00 -05:00
dongfengy
8d4b09dac6
[None][doc] Update GPTOSS Doc ( #10536 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2026-01-08 02:30:53 -05:00
Patrice Castonguay
e8cceb06b2
[None][doc] Adding parallelism types in feature combination matrix ( #9849 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2026-01-07 12:52:05 -05:00
Venky
aa1fe931de
[None][docs] Add --config preference over --extra_llm_api_options in CODING_GUIDELINES.md ( #10426 )
...
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2026-01-05 22:05:47 -05:00
Mike Iovine
77712ed4ab
[None][chore] Update SWA + spec dec support matrix ( #10421 )
...
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2026-01-05 20:26:23 -05:00
Pengyun Lin
c04cf4334e
[TRTLLM-8242][feat] Add stability tags for serve subcommand ( #10012 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2026-01-05 14:16:15 +08:00
Cheng Hang
656c705ff1
[None][feat] sm100 weight-only kernel ( #10190 )
...
Signed-off-by: Cheng Hang <chang@nvidia.com>
2026-01-05 09:44:36 +08:00
Lucas Liebenwein
937f8f78a1
[None][doc] promote AutoDeploy to beta feature in docs ( #10372 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-02 18:46:31 -05:00
Jatin Gangani
4a5ef84dc2
[None] [doc] Document perfect MoE router feature for perf analysis ( #10303 )
...
Signed-off-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
Co-authored-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
2025-12-26 04:27:40 -05:00
Jatin Gangani
97b38ac403
[None] [doc] Update IFB performance guide & GPTOSS deployment guide ( #10283 )
...
Signed-off-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
Co-authored-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
2025-12-25 05:52:04 -05:00
heyuhhh
7395ca93b6
[None][doc] Add Sparse Attention feature doc ( #9648 )
...
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-25 00:26:18 -05:00
Venky
c059e6caa1
[TRTC-121] [feat] Add recipe selector UI to complement the recipe database ( #10125 )
...
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-12-24 23:56:54 -05:00
zackyoray
f6c3bc16b9
[None][docs] Add NIXL-Libfabric Usage to Documentation ( #10205 )
...
Signed-off-by: Yoray Zack <62789610+zackyoray@users.noreply.github.com>
2025-12-23 23:05:40 -05:00
Harshini Komali
d691371eaf
[TRTLLM-9091] [feat] Replace GenAI-Perf with AIPerf ( #9310 )
...
Signed-off-by: lkomali <lkomali@nvidia.com>
Signed-off-by: Harshini Komali <157742537+lkomali@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-23 13:25:55 +08:00
Venky
dfa11d810e
[TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests ( #10005 )
2025-12-19 13:48:43 -05:00
Anish Shanbhag
91a9ae42d2
[TRTC-71][feat] Add regression testing for config database ( #9832 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-12-18 16:15:38 -08:00
Aurelien Chartier
7175d89b48
[None][fix] Fix iteration stats for spec-dec ( #9855 )
...
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-12-16 14:11:38 -08:00
QI JUN
dba9036072
[None][doc] remove nano-vl-v2 model support in release notes ( #9887 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
QI JUN
3daca4fea3
[https://nvbugs/5729847][doc] fix broken links to modelopt ( #9868 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
QI JUN
e6ab864066
[None][doc] Update release notes ( #9739 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>
Co-authored-by: Laikh Tewari <laikhtewari1@gmail.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Zac Patel
1ffa2c8937
[IB-1920][doc] Update Perf_Overview.md with Benchmarking Results for Release 1.1 ( #9723 )
...
Signed-off-by: Zachary Patel <22306219+zbpatel@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
xiweny
2756a0da60
[TRTLLM-4629][doc] Add B300 & GB300 in documents ( #9663 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Iman Tabrizian
1fc8bd3cd8
[TRTLLM-9082][doc] Address Dynamo Example feedback ( #9619 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Kaiyu Xie
e41b060fe6
[TRTLLM-9090] [doc] Update online benchmarking docs ( #9611 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
William Zhang
28b02b4f5a
[None][docs] Add README for Nemotron Nano v3 ( #10017 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Co-authored-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-15 22:17:24 -08:00
Grzegorz Kwasniewski
83885c69e7
[TRTLLM-9136][feat] 2D parallel EP TP support ( #9459 )
...
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
2025-12-15 09:52:29 +01:00
JunyiXu-nv
af899d2fe7
[TRTLLM-9860][doc] Add docs and examples for Responses API ( #9946 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-14 21:46:13 -08:00
Kaiyu Xie
0788635d6c
[TRTLLM-9762] [doc] Update documents for GB300 NVL72 ( #9987 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-14 19:30:28 -08:00
Venky
fd1270b9ab
[TRTC-43] [feat] Add config db and docs ( #9420 )
...
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-12-12 04:00:03 +08:00
Fanrong Li
af2849cc7a
[None][doc] Add DeepSeek-V3.2 to the supported models ( #9893 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-11 18:04:48 +08:00
Tian Zheng
ece3a8748f
[None][doc] Update doc for NVFP4 KV cache ( #9475 )
...
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2025-12-10 06:20:12 -08:00
Frank
f6df9eb2a6
[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench ( #9250 )
2025-12-08 10:37:40 -08:00
Kaiyu Xie
069b05cf3d
[TRTLLM-9706] [doc] Update wide EP documents ( #9724 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-08 11:21:11 +08:00
Chenjie Luo
d252101a76
[OMNIML-3036][doc] Re-branding TensorRT-Model-Optimizer as Nvidia Model-Optimizer ( #9679 )
...
Signed-off-by: Chenjie Luo <chenjiel@nvidia.com>
2025-12-07 07:14:05 -08:00
QI JUN
d4f68195c3
[TRTLLM-9092][doc] link to modelopt checkpoints in quick start guide ( #9571 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
QI JUN
0406949f32
[TRTLLM-9093][doc] update hyper links in overview ( #9568 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Yiqing Yan
6ebdf1c304
[None][infra] Updated Linux installation guide ( #9485 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Enwei Zhu
b46e78e263
[TRTLLM-9157][doc] Guided decoding doc improvement ( #9359 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
QI JUN
0915c4e3a1
[TRTLLM-9086][doc] Clean up TODOs in documentation ( #9292 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Pengyun Lin
c6dc68a28e
[None][doc] VDR 1.0 trtllm-serve doc enhancement ( #9443 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
jthomson04
6332bf27e6
[TRTLLM-9199][docs] KV Connector Docs ( #9325 )
...
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Robin Kobus
faf682b8bc
[TRTLLM-7136][feat] Update load_weights method to include mapping parameter in checkpoint loaders ( #9583 )
...
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-05 16:07:20 +01:00
Kaiyu Xie
cb87c44912
[TRTLLM-9562] [doc] Add Deployment Guide for Kimi K2 Thinking on TensorRT LLM - Blackwell ( #9711 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-04 19:20:06 -08:00
Thor Johnsen
33224560b8
[None][doc] Added line about partial reuse ( #7846 )
...
Signed-off-by: thorjohnsen <41591019+thorjohnsen@users.noreply.github.com>
2025-12-04 18:19:32 -08:00
brb-nv
5d6edc3944
[None][doc] Add feature docs for helix parallelism ( #9684 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-04 18:08:40 -08:00
QI JUN
d11acee22d
[TRTLLM-9085][doc] fix math formula rendering issues in github ( #9605 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-02 10:18:16 +08:00
dominicshanshan
6345074686
[None][chore] Weekly mass integration of release/1.1 -- rebase ( #9522 )
...
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: qgai <qgai@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: sunnyqgg <159101675+sunnyqgg@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Vincent Zhang <vcheungyi@163.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Leslie Fang <leslief@nvidia.com>
Co-authored-by: Shunkangz <182541032+Shunkangz@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-11-29 21:48:48 +08:00
Grzegorz Kwasniewski
cff54fcae3
[#8948][feat] Support custom sharding config ( #9143 )
...
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
2025-11-29 05:28:05 +08:00
Lucas Liebenwein
2f8bd6fb36
[#9150][feat] AutoDeploy Nemotron-Flash support ( #9504 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-11-27 18:03:57 +01:00