Bo Deng
be88fe33be
[None][fix] fix tinygemm accuracy ( #11411 )
Signed-off-by: Bo Deng <deemod@nvidia.com>
2026-02-10 05:09:30 -05:00
mpikulski
adc0d82500
[ https://nvbugs/5791242 ][chore] remove obsolete code ( #11388 )
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2026-02-10 10:55:29 +01:00
Yiqing Yan
21cdc39e83
[TRTLLM-10331][infra] Upload unittest sub results in slurm ( #10834 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2026-02-10 17:53:35 +08:00
dominicshanshan
2a4e70b4a9
[None][chore] Unwaive tests after last MI ( #11400 )
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-10 17:12:39 +08:00
Emma Qiao
8a74ccc57e
[None][infra] Waive failed cases for main branch on 02/10 ( #11413 )
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-02-10 03:21:59 -05:00
Yuxian Qiu
5f4df89109
[None][feat] Fully non-blocking pipeline parallelism executor loop. ( #10349 )
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-02-10 15:43:28 +08:00
Lizhi Zhou
c233692485
[None][doc] add multiple-instances section in disaggregated serving doc ( #11412 )
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2026-02-10 02:31:45 -05:00
Emma Qiao
17cc1c13d6
[None][infra] Enable sparck ci since spark cloud migration is done ( #11407 )
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-02-10 01:47:22 -05:00
shuyixiong
c3cdc93211
[TRTLLM-9771][feat] Make update_weights compatible with CUDA Graph ( #11267 )
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
2026-02-10 01:12:49 -05:00
Jonas Li
8b2dc57823
[None][chore] Mass merge commits from release/1.2.0rc6.post1 branch ( #11384 )
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Co-authored-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
2026-02-10 14:00:42 +08:00
Venky
0c8b5221b4
[TRTC-264][doc] Add CLAUDE.md and AGENTS.md ( #11358 )
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2026-02-09 20:29:58 -08:00
Lucas Liebenwein
a2fb5afecf
[ #11032 ][feat] MLA revisited and GLM 4.7 Flash support ( #11324 )
2026-02-09 23:26:51 -05:00
Venky
d50f010fa9
[TRTC-265][chore] Add CODEOWNERS coverage for serve/ and commands/ directories ( #11359 )
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2026-02-09 22:52:09 -05:00
Emma Qiao
85919d9517
[None][infra] Disable spark stages due to migration of spark cloud ( #11401 )
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-02-09 22:31:09 -05:00
Yuan Tong
4fc3644705
[None][fix] Avoid reserved filename on Windows ( #11382 )
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2026-02-10 11:22:59 +08:00
JennyLiu
b5508ed75b
[None][test] Add DGX-Spark multinode perf cases including eagle3 ( #11184 )
Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
2026-02-10 10:44:41 +08:00
Mike Iovine
f33086914f
[ https://nvbugs/5843112 ][chore] Unwaive ngram test ( #11320 )
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2026-02-09 21:31:29 -05:00
Yuxian Qiu
af68c29d3d
[None][chore] Reduce attention module repeated warnings. ( #11335 )
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-02-10 08:58:21 +08:00
Lucas Liebenwein
fe4c690b6c
[ https://nvbugs/5855540 ][fix] AutoDeploy: thread cleanup of eagle test ( #11289 )
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-02-09 18:01:12 -05:00
Ziyi Xiong
e76b634251
[TRTLLM-10321][feat] Support different KV cache layout for one-model spec dec ( #10502 )
Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>
2026-02-10 05:16:02 +08:00
Mike Iovine
092f4ce774
[ https://nvbugs/5853997 ][chore] Unwaive gpt-oss test ( #11287 )
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2026-02-09 16:04:41 -05:00
Patrice Castonguay
c68d916b6f
[None][chore] Unit test for disagg gen cancellation ( #11108 )
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2026-02-09 14:39:02 -05:00
tcherckez-nvidia
ea81a03dd1
[None][chore] update model list ( #11364 )
Signed-off-by: Tal Cherckez <127761168+tcherckez-nvidia@users.noreply.github.com>
2026-02-09 21:27:39 +02:00
Bala Marimuthu
4a743338c3
[None][infra] AutoDeploy: Dump graph IR after every transform ( #11045 )
Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>
2026-02-09 10:43:44 -08:00
Lizhi Zhou
e719721a60
[TRTLLM-10866][feat] implement disaggregated harmony chat ( #11336 )
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2026-02-09 12:09:03 -05:00
Harris Nover
100bfdc516
[None][fix] Respect CUDA_LAUNCH_BLOCKING by fixing doCheckError ( #11261 )
Signed-off-by: Harris Nover <249353502+hnover-nv@users.noreply.github.com>
2026-02-09 11:49:56 -05:00
Guiju Zhang
c37531c3f7
[TRTLLM-10669][fix] Fix Eagle3 draft model weight loading for throughput checkpoint ( #11010 )
Signed-off-by: Guiju Zhang <7135567+cascade812@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Ivy Zhang
9384cf8458
[ https://nvbugs/5839569 ][test] update test constraint ( #11054 )
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Emma Qiao
03b635bb08
[None][infra] Waive failed case for release on 1/28 ( #11055 )
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Lizhi Zhou
1524c172a4
[ https://nvbugs/5821433 ][fix] WAR for popen in QA env ( #10989 )
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Balaram Buddharaju
5f8b1b8cbb
[ https://nvbugs/5811087 ][chore] Unwaive Gemma3 27B multimodal test ( #11049 )
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Enwei Zhu
1ba039f044
[ https://nvbugs/5819452 ][ci] Unwaive LLaMA2 7B FP8 case ( #10997 )
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
William Zhang
abb8106c01
[ https://nvbugs/5835925 ][fix] Add EPD disagg support for Qwen3 VL MoE ( #10962 )
* Why?
Trying to instantiate a `MultimodalEncoder` for a Qwen3 VL MoE model
would fail during weight loading.
* What?
This commit fixes the bug, alongside:
- explicit, intentional support for EPD for Qwen3 VL MoE.
- extends EPD unit tests for Qwen3 VL MoE, albeit with dummy weights.
- unit tests for the weight mapper fixes.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Jin Li
0ead17bb85
[ https://nvbugs/5800646 ][fix] Fix hang issue by avoid exposing UB buf… ( #10842 )
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
yingguo-trt
d348dd95a7
[None][feat] support Lyris GB200 and increase disagg test timeout ( #11019 )
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
yufeiwu-nv
fd4e6132e5
[None][test] Fix missing test cases ( #10881 )
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Stefan Niebler
d50010cd1f
[ https://nvbugs/5769815 ][fix] Fix offset calculation in _are_stop_words when using speculative decoding ( #10854 )
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Lizhi Zhou
6c4e0c3dbe
[ https://nvbugs/5826689 ][fix] replace etcd3 with etcd-sdk-python ( #10886 )
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Emma Qiao
c659280445
[None][infra] Waive failed cases for release branch on 01/26 ( #10999 )
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
Pengbo Wang
59f59efb83
[ https://nvbugs/5779536 ][fix] Unwaive DeepSeekR1 nvfp4 pp4 mtp test case ( #10902 )
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
JunyiXu-nv
90ea6c1e09
[ https://nvbugs/5804146 ][fix] Enable responses tests and remove ds to… ( #10925 )
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-09 23:53:40 +08:00
mpikulski
196d94a419
[TRTLLM-10030][perf] avoid syncs in beam search + other improvements ( #11349 )
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2026-02-09 16:13:58 +01:00
Gal Hubara-Agam
2b60cc181c
[ #10780 ][feat] AutoDeploy: Support per-expert scales in FP8 and NVFP4 MoE ( #11322 )
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
Signed-off-by: Gal Hubara-Agam <96368689+galagam@users.noreply.github.com>
2026-02-09 10:07:37 -05:00
Lizhi Zhou
540fb0f29e
[ https://nvbugs/5834212 ][chore] unwaive test_disaggregated_mixed ( #11372 )
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2026-02-09 09:16:25 -05:00
Robin Kobus
b3e4ddc953
[None][test] Enhance multi-GPU tests for IFB stats ( #11239 )
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2026-02-09 17:25:32 +08:00
Robin Kobus
31db399042
[ https://nvbugs/5829097 ][fix] Disaggregated serving: Only send finished context requests to the KV cache transceiver ( #11354 )
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2026-02-09 17:11:45 +08:00
Bo Li
ab73f6ebc6
[None][chore] Add microbench for MoE Comm methods. ( #10317 )
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-02-09 02:57:01 -05:00
Yihan Wang
635d65f9fe
[None][chore] Move test_trtllm_flashinfer_symbol_collision.py to tests/unittest/_torch ( #11168 )
Signed-off-by: Yihan Wang <yihwang@nvidia.com>
2026-02-09 13:57:57 +08:00
Emma Qiao
ad8f6748a3
[None][infra] Waive failed case for main branch on 02/09 ( #11369 )
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-02-08 23:05:33 -05:00
TensorRT LLM
fe9192f120
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2026-02-09 03:16:42 +00:00