Yanchao Lu
cd7762a2fa
[None][test] Fix an invalid test name ( #11195 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2026-02-02 23:25:51 +08:00
Rundong Li
f1b85fea4c
[None][feat] Integrate cuda.tile RMS norm kernels ( #9725 )
...
Signed-off-by: Rundong (David) Li <davidli@nvidia.com>
Co-authored-by: Jinman Xie <jinmanx@nvidia.com>
Co-authored-by: Alexey Bylinkin <abylinkin@nvidia.com>
Co-authored-by: Qiqi Xiao <qiqix@nvidia.com>
Co-authored-by: Biao Wang <biaow@nvidia.com>
Co-authored-by: Thomas Schmid <thschmid@nvidia.com>
2026-02-02 19:44:27 +08:00
Ivy Zhang
fa5c3ead05
[None][test] Update test list ( #10883 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
Zheyu Fu
d31482686c
[ https://nvbugs/5680911 ][fix] Remove @cache decorator to enhance CI stability for unit tests using single process mode ( #10730 )
...
Signed-off-by: Zheyu Fu <zheyuf@NVIDIA.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
Enwei Zhu
7e5e5b90b9
[ https://nvbugs/5748600 ][ci] Update guided decoding waive list ( #10904 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
Yuxian Qiu
dd0a5491ba
[ https://nvbugs/5701445 ][chore] unwaive tests. ( #10913 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
Yuxian Qiu
40d6f23dad
[ https://nvbugs/5784543 ][chore] unwaive test. ( #10906 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
Lucas Liebenwein
68a18f7a3a
[ https://nvbugs/5814247 ][fix] AutoDeploy: skip mxfp4_moe test unless on Hopper ( #10729 ) ( #10850 )
...
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
Michal Guzek
fafc22e3d4
[ https://nvbugs/5691730 ][fix] Have LoRa bf16 ckpts work with Llama 3.3-70B-fp8 ( #9808 )
...
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
William Zhang
bc2487bc2c
[ https://nvbugs/5826962 ][fix] Fix PD disaggregation for VLMs that use mrope ( #10865 )
...
* Why?
Commit a6a8898 enabled EPD disaggregation for VLMs that use mrope (e.g.
qwen). However, this broke PD disaggregation for these same models.
* What?
This commit restores PD disaggregation for those models and adds a unit test that guards against a regression.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
Lizhi Zhou
4d282bd7c1
[ https://nvbugs/5821433 ][fix] fix test_auto_scaling for 2 GPUs ( #10866 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
HuiGao-NV
8fd22ac72d
[ https://nvbugs/5740377 ][fix] Prevent out-of-bounds read ( #10868 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
Co-authored-by: Thor Johnsen <41591019+thorjohnsen@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
JunyiXu-nv
2a5b8800e1
[ https://nvbugs/5754977 ][fix] Use free port for serve test ( #10878 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-02 16:26:46 +08:00
Yi Zhang
0306c0f12c
[TRTLLM-9766][feat] Integration of the KVCacheManager V2 to TRTLLM Runtime ( #10659 )
...
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
2026-02-02 14:29:02 +08:00
Emma Qiao
d3df3f6feb
[None][infra] Waive failed cases and disable a stage on 02/02 ( #11177 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-02-02 13:28:53 +08:00
Jin Li
77afcbddae
[ https://nvbugs/5823284 ][fix] Unwaive no repro hang issue ( #11138 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2026-02-01 23:02:27 -05:00
Liao Lanyu
fef0e4b17d
[TRTLLM-10666][chore] Refactor request fetching logic for better separation of concerns ( #10988 )
...
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Signed-off-by: Lance Liao <108499334+lancelly@users.noreply.github.com>
Signed-off-by: Liao Lanyu <108499334+lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
2026-02-02 10:36:08 +08:00
Lizhi Zhou
b00e8338ec
[ https://nvbugs/5834212 ][fix] prevent routing ctx and gen requests to the same worker; update doc for unique disagg ID ( #11095 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2026-02-02 09:54:33 +08:00
Emma Qiao
1c8f8bed00
[None][infra] Waive failed cases for main on 1/30 ( #11142 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-02-01 22:38:24 +08:00
Yanchao Lu
2e757e8151
[None][ci] Waive a flaky test on A10 ( #11163 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2026-02-01 00:07:23 +08:00
shuyixiong
278ced972b
[TRTLLM-9771][feat] Allow overriding quantization configs ( #11062 )
...
Signed-off-by: shuyixiong <219646547+shuyixiong@users.noreply.github.com>
2026-01-31 10:48:51 -05:00
bhsueh_NV
d1e4527c06
[ https://nvbugs/5804683 ][infra] unwaive Mistral Large3 test ( #10680 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2026-01-31 17:50:34 +08:00
Frida Hou
7910d4d2a9
[ #8242 ][feat] Add int4 GPTQ support for AutoDeploy ( #8248 )
...
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
2026-01-30 23:07:24 -08:00
Guoming Zhang
6bace84167
[TRTLLM-10398][feat] Enable TRTLLM moe backend for Nemotron Super ( #10791 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2026-01-31 13:48:25 +08:00
Karthik
5a97374f3c
[ #9525 ][feat] add L2 norm pattern matcher and fusion transform ( #10767 )
...
Signed-off-by: Karthik Vetrivel <kvetrivel@nvidia.com>
2026-01-30 16:05:53 -05:00
nvyocox
4af47208d8
[None][feat] Export ONNX for DriveOS LLM ( #10117 )
...
Signed-off-by: yocox <yocox@nvidia.com>
2026-01-30 15:43:11 -05:00
dominicshanshan
5d7411e131
[ https://nvbugs/5853997 ][chore] Waive test ( #11132 )
...
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-01-30 23:39:27 +08:00
Yao Yao
53cb762ee5
[None][feat] New KVCacheManagerV2 APIs for Transceiver ( #11003 )
...
Signed-off-by: Yao Yao <lowsfer@users.noreply.github.com>
2026-01-30 18:09:53 +08:00
Enwei Zhu
5ff244ce54
[ https://nvbugs/5837281 ][fix] Fix trtllm-serve guided decoding test ( #11101 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2026-01-30 16:59:55 +08:00
JennyLiu
6506d63466
[None][test] Add DGX-Spark VLM gemm3-12b bfp16/fp4/fp8 accuracy and perf cases ( #11096 )
...
Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
2026-01-30 00:38:19 -05:00
Yueh-Ting (eop) Chen
e1e3bb8592
[ https://nvbugs/5775544 ][fix] Unwaive test ( #11023 )
...
Signed-off-by: eopXD <yuehtingc@nvidia.com>
2026-01-30 09:39:08 +08:00
Chang Su
dbad94715b
[None][feat] Add gRPC server for high-performance external router integration ( #11037 )
...
Signed-off-by: Chang Su <chang.s.su@oracle.com>
2026-01-30 07:48:27 +08:00
Chenghao Zhang
e033929221
[None][feat] AutoDeploy: Flashinfer kernels bringup ( #10867 )
...
Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
2026-01-29 14:59:29 -08:00
Mike Iovine
0ad87895f5
[ https://nvbugs/5836592 ][fix] Fix qwen3 eagle test ( #11030 )
...
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2026-01-29 14:49:08 -08:00
Lucas Liebenwein
a4880ffdbb
[None][fix] AutoDeploy: remove mem check for a log unit test ( #11120 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-29 15:41:51 -05:00
Stefan Niebler
7d31532850
[TRTLLM-10312][perf] Improve performance of _write_finish_reasons in TorchSampler ( #10459 )
...
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
2026-01-29 11:06:09 -05:00
WeiHaocheng
80dd6e70c6
[TRTLLM-10415][feat] Dump thread stacks for hanging tests before time… ( #10708 )
...
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
2026-01-29 20:43:34 +08:00
Balaram Buddharaju
c7a86f89de
[TRTLLM-10264][feat] Support attention DP + Helix CP ( #10477 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2026-01-29 02:57:13 -05:00
Zhanrui Sun
21d475a391
[None][infra] Waived flaky tests ( #11091 )
...
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2026-01-29 02:18:30 -05:00
Tailing Yuan
91528365a9
[None][feat] Add performance alignment to layer-wise benchmarks ( #11018 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2026-01-29 14:01:51 +08:00
Anish Shanbhag
24ac86c485
[ https://nvbugs/5761391 ][fix] Include triton-kernels as a packaged dependency ( #10471 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2026-01-28 19:56:32 -08:00
Bala Marimuthu
393c3d259e
[ #10245 ][feat] AutoDeploy: Add Minimax M2 support ( #10525 )
...
Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>
2026-01-28 17:22:32 -05:00
gramnarayan
744a955cbb
[None][chore] AutoDeploy: Eagle One-Model [1/n]: PyTorch impl for Eagle3 Llama checkpoint ( #10674 )
...
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
2026-01-28 12:10:49 -08:00
Emma Qiao
0ffa77af51
[None][infra] Waive failed cases for main on 1/28 ( #11053 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-28 06:11:06 -05:00
yingguo-trt
e70a55bd94
[None][feat] support multi_acc and Lyris GB200 test ( #11024 )
...
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
2026-01-28 06:01:48 -05:00
Grzegorz Kwasniewski
38bcee189c
[TRTLLM-10362][feat] Added Mamba and MLA layers to the sharding tests ( #10364 )
...
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
Signed-off-by: Grzegorz Kwasniewski <213329731+greg-kwasniewski1@users.noreply.github.com>
2026-01-28 10:34:10 +01:00
Pengbo Wang
d008494232
[ https://nvbugs/5779536 ][fix] Cherry-pick #10902 : Unwaive DeepSeekR1 nvfp4 pp4 mtp test case ( #10902 ) ( #11000 )
...
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
2026-01-28 14:18:53 +08:00
xinhe-nv
dc5eda546b
[None][fix] unwaive tests ( #11047 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-27 23:49:07 -05:00
dongfengy
1c2e415b3a
[ https://nvbugs/5756804 ][fix] Re-enable passing test ( #10986 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2026-01-28 11:23:43 +08:00
Simeng Liu
bae2fac834
[ https://nvbugs/5721661 ][chore] Unwaive fixed bug. ( #11009 )
...
Signed-off-by: SimengLiu-nv <simengl@nvidia.com>
2026-01-27 11:41:48 -08:00