Simeng Liu
|
12085536df
|
[TRTLLM-10487][feat] Add user-provided UUID support for multimodal KV cache identification. (#11075)
Signed-off-by: SimengLiu-nv <simengl@nvidia.com>
|
2026-02-12 00:48:47 -05:00 |
|
Bo Li
|
18c992efb1
|
[None][doc] Update Skip Softmax attention blog. (#11443)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
|
2026-02-11 16:08:16 +08:00 |
|
Lizhi Zhou
|
c233692485
|
[None][doc] add multiple-instances section in disaggregated serving doc (#11412)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2026-02-10 02:31:45 -05:00 |
|
Bo Li
|
66caa67357
|
[None][doc] Add sparse attention docs to index. (#11342)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
|
2026-02-06 17:53:41 +08:00 |
|
Bo Li
|
639051e98b
|
[TRTLLM-10021][docs] Skip Softmax Attention blog and docs. (#10592)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
|
2026-02-06 12:11:21 +08:00 |
|
Jiayu Chang
|
e483c7263d
|
[None][docs] Add CUDA Graph + LoRA in Feature Combination Matrix (#11187)
Signed-off-by: Jiayu Chang <jiayuc@nvidia.com>
|
2026-02-05 15:01:59 +01:00 |
|
Lucas Liebenwein
|
925d911fc0
|
[#10966][feat] AutoDeploy: kv cache manager integration [2/2] (#11149)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2026-02-04 09:44:27 -05:00 |
|
Yiqing Yan
|
13420178fc
|
[TRTLLM-10561][infra] Fix jaraco-context and wheel vulnerability (#10901)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2026-02-03 09:54:11 +08:00 |
|
Venky
|
897eb0df2b
|
[None][doc] Fix GLM4-MoE Eagle support documentation (#11198)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
|
2026-02-02 13:36:09 -08:00 |
|
Lizhi Zhou
|
b00e8338ec
|
[https://nvbugs/5834212][fix] prevent routing ctx and gen requests to the same worker; update doc for unique disagg ID (#11095)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2026-02-02 09:54:33 +08:00 |
|
Balaram Buddharaju
|
531f85dc9b
|
[None][feat] Perfect routing for Deepseek models (#11127)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-30 23:46:35 -05:00 |
|
Venky
|
492ed27cdf
|
[None][doc] Add Glm4MoeForCausalLM to model support matrix (#11156)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
|
2026-01-31 10:20:53 +08:00 |
|
nvyocox
|
4af47208d8
|
[None][feat] Export ONNX for DriveOS LLM (#10117)
Signed-off-by: yocox <yocox@nvidia.com>
|
2026-01-30 15:43:11 -05:00 |
|
Yechan Kim
|
a669a163ff
|
[None][doc] Update Qwen2/3-VL's model on supported_models.md (#10797)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2026-01-30 19:40:23 +09:00 |
|
dongfengy
|
4f0c1b2489
|
[TRTLLM-10733][feat] Make TRTLLM MOE the default one for GPTOSS on Blackwell (#11074)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2026-01-29 23:59:19 -08:00 |
|
Lucas Liebenwein
|
ff3a494f5c
|
[#10013][feat] AutoDeploy: native cache manager integration (#10635)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2026-01-27 11:23:22 -05:00 |
|
Linda
|
ce556290c9
|
[None][chore] Removing pybind11 bindings and references (#10550)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
|
2026-01-26 08:19:12 -05:00 |
|
Faraz
|
aa410c57bc
|
[TRTLLM-5366][chore] Add dgx-spark beta notes (#10766)
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2026-01-25 18:12:21 +08:00 |
|
Patrice Castonguay
|
93e7ae73ea
|
[None][doc] 1.2 Release Notes Headers (#10722)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2026-01-25 18:12:21 +08:00 |
|
Venky
|
b3146d095d
|
[TRTC-122][feat] Eagle3 Specdec UX improvements (#10124)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
|
2026-01-22 07:24:11 -08:00 |
|
Yechan Kim
|
70caa779a4
|
[None][feat] K-EXAONE MTP support (#10796)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2026-01-22 13:43:00 +09:00 |
|
Yanchao Lu
|
0096b50ba0
|
[None][infra] Update upgrade related docs for release 1.2 (#10760) (#10773)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Emma Qiao <qqiao@nvidia.com>
|
2026-01-18 00:14:27 +08:00 |
|
Stefan Niebler
|
c4db030b88
|
[TRTLLM-8425][doc] Update sampling documentation (#10083)
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
|
2026-01-16 16:58:49 +08:00 |
|
jmydurant
|
b163e66182
|
[None][doc] update doc (add minimax model) (#10746)
Signed-off-by: Mingyang Jiang <13463932+jmydurant@users.noreply.github.com>
|
2026-01-16 14:54:52 +08:00 |
|
Anish Shanbhag
|
faa80e73fd
|
[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias (#10099)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2026-01-14 21:06:07 -08:00 |
|
mpikulski
|
052c36ddd2
|
[TRTLLM-9522][feat] support image_embeds in OpenAI API (#9715)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-14 10:31:03 +01:00 |
|
mpikulski
|
50c78179dd
|
[TRTLLM-8425][doc] document Torch Sampler details (#10606)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-13 12:01:20 +01:00 |
|
Guoming Zhang
|
0371cbfd88
|
[None][doc] Update Qwen3-Next doc by adding known issues section (#10582)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2026-01-11 14:47:47 +08:00 |
|
Fanrong Li
|
4632a8642d
|
[None][doc] blog: Optimizing DeepSeek-V3.2 on NVIDIA Blackwell GPUs (#10565)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2026-01-09 05:16:00 -05:00 |
|
dongfengy
|
8d4b09dac6
|
[None][doc] Update GPTOSS Doc (#10536)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2026-01-08 02:30:53 -05:00 |
|
Patrice Castonguay
|
e8cceb06b2
|
[None][doc] Adding parallelism types in feature combination matrix (#9849)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2026-01-07 12:52:05 -05:00 |
|
Venky
|
aa1fe931de
|
[None][docs] Add --config preference over --extra_llm_api_options in CODING_GUIDELINES.md (#10426)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
|
2026-01-05 22:05:47 -05:00 |
|
Mike Iovine
|
77712ed4ab
|
[None][chore] Update SWA + spec dec support matrix (#10421)
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2026-01-05 20:26:23 -05:00 |
|
Pengyun Lin
|
c04cf4334e
|
[TRTLLM-8242][feat] Add stability tags for serve subcommand (#10012)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
|
2026-01-05 14:16:15 +08:00 |
|
Cheng Hang
|
656c705ff1
|
[None][feat] sm100 weight-only kernel (#10190)
Signed-off-by: Cheng Hang <chang@nvidia.com>
|
2026-01-05 09:44:36 +08:00 |
|
Lucas Liebenwein
|
937f8f78a1
|
[None][doc] promote AutoDeploy to beta feature in docs (#10372)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2026-01-02 18:46:31 -05:00 |
|
Jatin Gangani
|
4a5ef84dc2
|
[None] [doc] Document perfect MoE router feature for perf analysis (#10303)
Signed-off-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
Co-authored-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
|
2025-12-26 04:27:40 -05:00 |
|
Jatin Gangani
|
97b38ac403
|
[None] [doc] Update IFB performance guide & GPTOSS deployment guide (#10283)
Signed-off-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
Co-authored-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>
|
2025-12-25 05:52:04 -05:00 |
|
heyuhhh
|
7395ca93b6
|
[None][doc] Add Sparse Attention feature doc (#9648)
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-12-25 00:26:18 -05:00 |
|
Venky
|
c059e6caa1
|
[TRTC-121] [feat] Add recipe selector UI to complement the recipe database (#10125)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
|
2025-12-24 23:56:54 -05:00 |
|
zackyoray
|
f6c3bc16b9
|
[None][docs] Add NIXL-Libfabric Usage to Documentation (#10205)
Signed-off-by: Yoray Zack <62789610+zackyoray@users.noreply.github.com>
|
2025-12-23 23:05:40 -05:00 |
|
Harshini Komali
|
d691371eaf
|
[TRTLLM-9091] [feat] Replace GenAI-Perf with AIPerf (#9310)
Signed-off-by: lkomali <lkomali@nvidia.com>
Signed-off-by: Harshini Komali <157742537+lkomali@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-12-23 13:25:55 +08:00 |
|
Venky
|
dfa11d810e
|
[TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005)
|
2025-12-19 13:48:43 -05:00 |
|
Anish Shanbhag
|
91a9ae42d2
|
[TRTC-71][feat] Add regression testing for config database (#9832)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2025-12-18 16:15:38 -08:00 |
|
Aurelien Chartier
|
7175d89b48
|
[None][fix] Fix iteration stats for spec-dec (#9855)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-12-16 14:11:38 -08:00 |
|
QI JUN
|
dba9036072
|
[None][doc] remove nano-vl-v2 model support in release notes (#9887)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-12-16 13:33:20 -05:00 |
|
QI JUN
|
3daca4fea3
|
[https://nvbugs/5729847][doc] fix broken links to modelopt (#9868)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-12-16 13:33:20 -05:00 |
|
QI JUN
|
e6ab864066
|
[None][doc] Update release notes (#9739)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>
Co-authored-by: Laikh Tewari <laikhtewari1@gmail.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-12-16 13:33:20 -05:00 |
|
Zac Patel
|
1ffa2c8937
|
[IB-1920][doc] Update Perf_Overview.md with Benchmarking Results for Release 1.1 (#9723)
Signed-off-by: Zachary Patel <22306219+zbpatel@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-12-16 13:33:20 -05:00 |
|
xiweny
|
2756a0da60
|
[TRTLLM-4629][doc] Add B300 & GB300 in documents (#9663)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-12-16 13:33:20 -05:00 |
|