Chang Liu
26901e4aa0
[TRTLLM-10612][feat] Initial support of AIGV models in TRTLLM ( #11462 )
...
Signed-off-by: Chang Liu (Enterprise Products) <liuc@nvidia.com>
Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Signed-off-by: Zhenhua Wang <zhenhuaw@nvidia.com>
Co-authored-by: Freddy Qi <junq@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Zhenhua Wang <zhenhuaw@nvidia.com>
2026-02-14 06:11:11 +08:00
tburt-nv
07cd3d4ff2
[None][chore] Bump version to 1.3.0rc4 ( #11485 )
...
Signed-off-by: Tyler Burt <tburt@nvidia.com>
2026-02-12 16:55:23 -05:00
Eran Geva
31314b9fed
[None][chore] added AutoDeploy nano_v3_scale.yaml ( #10845 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-02-12 01:37:42 -05:00
Mandar Deshpande
936220e746
[None][fix] glm engine build dtype ( #11246 )
...
Signed-off-by: Mandar Deshpande <razzormandar@gmail.com>
2026-02-12 13:27:04 +08:00
Lucas Liebenwein
a2fb5afecf
[ #11032 ][feat] MLA revisited and GLM 4.7 Flash support ( #11324 )
2026-02-09 23:26:51 -05:00
tcherckez-nvidia
ea81a03dd1
[None][chore] update model list ( #11364 )
...
Signed-off-by: Tal Cherckez <127761168+tcherckez-nvidia@users.noreply.github.com>
2026-02-09 21:27:39 +02:00
Lizhi Zhou
e719721a60
[TRTLLM-10866][feat] implement disaggregated harmony chat ( #11336 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2026-02-09 12:09:03 -05:00
Yechan Kim
36cb5f8c93
[ https://nvbugs/5747920 ][fix] Fix multimodal serve test ( #11296 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2026-02-05 15:12:53 +09:00
Gal Hubara-Agam
767b8dcab3
[None][chore] AutoDeploy: Set nanov3 and superv3 configs to use flashinfer ssm ( #11183 )
...
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
2026-02-04 09:46:15 -08:00
Lucas Liebenwein
925d911fc0
[ #10966 ][feat] AutoDeploy: kv cache manager integration [2/2] ( #11149 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-02-04 09:44:27 -05:00
Xianjie Qiao
e2bd9cce1e
[None][feat] Support disagg slurm jobs rescheduling ( #11218 )
2026-02-04 22:10:36 +08:00
Zhenhuan Chen
3d8c1a51bd
[None][feat] move some disagg script's env configs from bash to submit.py ( #10223 )
...
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2026-02-04 04:32:04 -05:00
tburt-nv
588db0ed64
[None][chore] bump version to 1.3.0rc3 ( #11238 )
...
Signed-off-by: Tyler Burt <tburt@nvidia.com>
2026-02-04 09:30:45 +08:00
Yi Zhang
0306c0f12c
[TRTLLM-9766][feat] Integration of the KVCacheManager V2 to TRTLLM Runtime ( #10659 )
...
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
2026-02-02 14:29:02 +08:00
Frida Hou
7910d4d2a9
[ #8242 ][feat] Add int4 GPTQ support for AutoDeploy ( #8248 )
...
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
2026-01-30 23:07:24 -08:00
nvyocox
4af47208d8
[None][feat] Export ONNX for DriveOS LLM ( #10117 )
...
Signed-off-by: yocox <yocox@nvidia.com>
2026-01-30 15:43:11 -05:00
yuanjingx87
f42a6cbae0
[None][infra] Add source code pulse scan to PLC nightly pipeline ( #10961 )
...
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2026-01-30 11:06:48 -08:00
dongfengy
4f0c1b2489
[TRTLLM-10733][feat] Make TRTLLM MOE the default one for GPTOSS on Blackwell ( #11074 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2026-01-29 23:59:19 -08:00
Tailing Yuan
4345636b04
[None][chore] Clean up layer-wise benchmarks code ( #11092 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2026-01-29 14:29:37 -05:00
Tailing Yuan
91528365a9
[None][feat] Add performance alignment to layer-wise benchmarks ( #11018 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2026-01-29 14:01:51 +08:00
Enwei Zhu
34a730aaf7
[None][fix] Fix enable_alltoall passed to CutlassFusedMoE ( #11016 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2026-01-29 12:11:07 +08:00
Anish Shanbhag
24ac86c485
[ https://nvbugs/5761391 ][fix] Include triton-kernels as a packaged dependency ( #10471 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2026-01-28 19:56:32 -08:00
Lucas Liebenwein
ff3a494f5c
[ #10013 ][feat] AutoDeploy: native cache manager integration ( #10635 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-27 11:23:22 -05:00
Lizhi Zhou
93ae8a14ab
[ #10889 ][fix] fix pydantic deepcopy bug ( #11004 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2026-01-27 02:40:13 -05:00
Yiqing Yan
ea5d811aec
[None][chore] Bump version to 1.3.0rc2 ( #11021 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2026-01-27 15:26:03 +08:00
Wanli Jiang
4a206351bb
[TRTLLM-10453][feat] Update mamba decode kernel to flashinfer ( #10757 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2026-01-27 13:04:40 +08:00
tcherckez-nvidia
43b8a5561c
[None][chore] update AD model list ( #10981 )
...
Signed-off-by: Tal Cherckez <127761168+tcherckez-nvidia@users.noreply.github.com>
2026-01-26 16:49:50 +02:00
William Zhang
2146c23786
[ #9306 ][refactor] Refactor AutoDeployConfig into LlmArgs ( #10613 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-01-22 16:02:49 -05:00
Venky
b3146d095d
[TRTC-122][feat] Eagle3 Specdec UX improvements ( #10124 )
...
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2026-01-22 07:24:11 -08:00
Yiqing Yan
0243abee22
[None][chore] Bump version to 1.3.0rc1 ( #10923 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2026-01-22 18:45:40 +08:00
Yechan Kim
70caa779a4
[None][feat] K-EXAONE MTP support ( #10796 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2026-01-22 13:43:00 +09:00
Xianjie Qiao
87073d1ce4
[None][fix] Fix copy start_logs in disagg slurm scripts ( #10840 )
...
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
2026-01-21 13:31:25 +08:00
Zhenhuan Chen
066fa4cd93
[None][chore] update config.yaml of slurm scripts to align with submit.py change ( #10802 )
...
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2026-01-19 14:46:23 -05:00
Xianjie Qiao
cc0bbde745
[None][feat] Update disagg slurm scripts ( #10712 )
...
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
2026-01-19 15:53:48 +08:00
Zhanrui Sun
df845a028b
[TRTLLM-9581][infra] Use /home/scratch.trt_llm_data_ci in computelab ( #10616 )
...
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2026-01-19 00:40:40 -05:00
Kaiyu Xie
4f86c5f5ce
[None] [feat] Support multiple accuracy tasks for slurm scripts ( #10500 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
Co-authored-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2026-01-16 15:50:32 +08:00
heyuhhh
e3f27e06c7
[None][chore] Waive star attention unittests ( #10439 )
...
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2026-01-16 10:12:32 +08:00
Yiqing Yan
f4ace99218
[None][chore] Bump version to 1.3.0rc0 ( #10681 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2026-01-15 13:55:44 +08:00
Anish Shanbhag
faa80e73fd
[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias ( #10099 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2026-01-14 21:06:07 -08:00
Yuxian Qiu
39cefd6125
[None][refactor] Unify the usage of MPIDist and TorchDist. ( #10380 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-14 14:05:47 +08:00
Tailing Yuan
38296a472b
[None][feat] Layer-wise benchmarks: make model init more general and support weights loading ( #10562 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2026-01-13 19:17:03 +08:00
Wanli Jiang
11da7e3605
[None][fix] Solve pillow version conflict ( #10537 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2026-01-12 04:05:54 -05:00
Yechan Kim
8e0d20d901
[TRTLLM-10195][feat] K-EXAONE support ( #10355 )
...
Signed-off-by: Jaedeok Kim <jaedeokk@nvidia.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Jaedeok Kim <jaedeokk@nvidia.com>
2026-01-12 00:29:51 +09:00
tcherckez-nvidia
f6c4dd885f
[None][chore] Update AutoDeploy model list ( #10505 )
...
Signed-off-by: Tal Cherckez <127761168+tcherckez-nvidia@users.noreply.github.com>
2026-01-10 08:47:37 +02:00
Yukun He
c5331e6dbb
[None][fix] Setup dist for AutoTuner in Layerwise benchmarking. ( #10534 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2026-01-09 14:16:39 +08:00
bhsueh_NV
bea61bb17d
[None][fix] Mistral large 3 few code refine ( #10405 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2026-01-08 06:38:49 -05:00
Yiqing Yan
dc6b743fb6
[None][chore] Bump version to 1.2.0rc8 ( #10542 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2026-01-08 04:51:44 -05:00
Kaiyu Xie
810249c304
[ https://nvbugs/5769926 ] [fix] Add no container mount home WAR ( #10431 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2026-01-06 13:09:25 +08:00
Venky
aa1fe931de
[None][docs] Add --config preference over --extra_llm_api_options in CODING_GUIDELINES.md ( #10426 )
...
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2026-01-05 22:05:47 -05:00
Gal Hubara-Agam
e98c27ee4f
[TRTLLM-10053][feat] AutoDeploy: Add Super v3 config file, improve test runtime ( #10397 )
...
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
2026-01-05 18:17:27 +02:00