heyuhhh
a08eb81cce
[None][feat] Add RocketKV usage doc and e2e accuracy test on LongBenchV2 ( #9572 )
...
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2025-12-03 11:33:46 +08:00
yufeiwu-nv
21f2ba74e8
[None][test] Remove duplicate test cases ( #9623 )
...
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-03 10:35:26 +08:00
brb-nv
55c7023c92
[None][chore] Waive test failing on pre-merge ( #9638 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-03 07:31:10 +08:00
Patrice Castonguay
3991aa9c72
[ https://nvbugs/5688388 ][fix] fix: Reducing num request in disagg test to speed up ( #9598 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-02 12:48:53 -05:00
Shi Xiaowei
227d42e492
[ https://nvbugs/5651854 ][fix] Fix dist-serving perf by clearing CPU affinity ( #9549 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-03 01:17:03 +08:00
Mike Iovine
d5b7f0c8ad
[TRTLLM-8980][test] Clean up spec dec tests in test_llm_api_pytorch ( #8889 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-02 10:32:02 -05:00
Yan Chunwei
b86256eb54
[TRTLLM-9144][fix] enhance RPC robustness ( #8711 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-12-02 21:37:59 +08:00
brb-nv
be48cdf1d1
[TRTLLM-9466][test] Evaluate helix parallelism with DSV3 Lite ( #9597 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-02 20:10:07 +08:00
Emma Qiao
4a8766c11d
[None][infra] Remove an invalid test name in waives.txt ( #9620 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-02 18:05:17 +08:00
Emma Qiao
3e4f2388a9
[None][infra] Waive failed cases for main branch ( #9615 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-02 15:48:27 +08:00
shuyixiong
1a2118b8fe
[ https://nvbugs/5702793 ][fix] Fix uncontiguous tensor view ( #9576 )
...
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
2025-12-02 15:41:32 +08:00
xinhe-nv
ad46d19027
[None][chore] Add failed cases into waives.txt ( #9588 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-02 14:24:11 +08:00
ruodil
4586b5f42f
[ https://nvbugs/5582091 ][test] increase warmup times in testing for multi-gpu cases ( #9578 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-12-02 14:22:49 +08:00
Wanli Jiang
5657a00ec0
[FMDL-1328][feat] Add support for nano-v3 and super-v3 with pytorch backend ( #9261 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-02 13:40:20 +08:00
xinhe-nv
3911d0496e
[None][fix] Waive gb200 ( #9580 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-02 12:09:21 +08:00
JunyiXu-nv
9a6df980cd
[ https://nvbugs/5703953 ][fix] Use random port for disagg tests ( #9582 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-02 11:40:14 +08:00
Iman Tabrizian
356a52edf5
[None][feat] Add support for KVCache reuse for DSv32 ( #9383 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-12-02 11:14:30 +08:00
Venky
639c939a4f
[TRTC-1943][feat] Env vars override support in LLM API ( #9104 )
...
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-12-01 10:04:49 -08:00
Yanchao Lu
7127c4407a
[None][test] [None][test] Waive main branch test failures 12/1 ( #9566 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-01 21:54:53 +08:00
Shi Xiaowei
48b1d31895
[ https://nvbugs/5651854 ][infra] Enable perf metrics during accuracy testing ( #9140 )
2025-12-01 20:15:32 +08:00
JadoTu
a92af27411
[None][chore] remove qwen3-next accuracy tests ( #9534 )
...
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
2025-12-01 11:49:37 +08:00
Pengbo Wang
aa3310f64f
[ https://nvbugs/5503479 ][fix] Temporarily lower reference accuracy to stabilize CI ( #9398 )
...
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
2025-12-01 11:49:14 +08:00
Enwei Zhu
2e3ac3c48f
[ https://nvbugs/5684703 ][fix] Unwaive disagg guided decoding test ( #9466 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-01 11:39:40 +08:00
JunyiXu-nv
3f588198dc
[None][fix] Fix port conflict in disagg tests ( #9474 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-30 17:33:22 +08:00
Emma Qiao
c927ccf510
[None][infra] Wiave failed tests for main branch on 11/30 ( #9555 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-30 16:13:20 +08:00
brb-nv
b77f4ffe54
[TRTLLM-5971][feat] Integrate helix parallelism ( #9342 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-29 15:17:30 -08:00
dominicshanshan
6345074686
[None][chore] Weekly mass integration of release/1.1 -- rebase ( #9522 )
...
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: qgai <qgai@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: sunnyqgg <159101675+sunnyqgg@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Vincent Zhang <vcheungyi@163.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Leslie Fang <leslief@nvidia.com>
Co-authored-by: Shunkangz <182541032+Shunkangz@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-11-29 21:48:48 +08:00
dominicshanshan
70efa3ac43
[None][infra] Waive failed case in pre-merge on 11/28 ( #9537 )
...
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-11-28 20:53:45 +08:00
Emma Qiao
2d7421b314
[None][infra] Waive failed cases for main branch on 11/28 ( #9539 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-28 17:19:55 +08:00
yufeiwu-nv
08755a809d
[ https://nvbugs/5689658 ][test] Fix gpu lock issue running on cluster ( #9441 )
...
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-28 13:59:22 +08:00
JunyiXu-nv
c87e81c1d8
[ https://nvbugs/5685015 ][fix] Update invalid max_token test ( #9435 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-28 11:41:16 +08:00
Bo Li
19f3f4e520
[ https://nvbugs/5637037 ][chore] Update waive lists. ( #9386 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-28 10:45:22 +08:00
Yueh-Ting (eop) Chen
4cbfc10b28
[ https://nvbugs/5674665 ][chore] Add test coverage for https://nvbugspro.nvidia.com/bug/5674665 ( #9518 )
...
Signed-off-by: eopXD <yuehtingc@nvidia.com>
2025-11-27 21:40:34 +08:00
Fanrong Li
2d5eadf65f
[None][fix] fix TP support for DeepSeek-V3.2 on hopper ( #9484 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-27 21:02:25 +08:00
JadoTu
51bf7164d3
[None][feat] add qwen3-next CI test of accuracy on BF16 and NVFP4 ( #9330 )
...
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
2025-11-27 18:05:00 +08:00
Lizhi Zhou
8104a78931
[None][chore] revert batch_size=1 to prevent timeout and lower accuracy reference by 0.12% as a WAR ( #9447 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
2025-11-27 14:25:44 +08:00
Emma Qiao
0442510304
[None][infra] Waive failed case in pre-merge on 11/27 ( #9507 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-27 13:53:33 +08:00
HuiGao-NV
03331bc43d
[ https://nvbugs/5547414 ][fix] enable case after using local cache model ( #9473 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-11-27 12:18:20 +08:00
Patrice Castonguay
1b2da426cd
[ https://nvbugs/5680310 ][fix] Fix ctx only timed out test ( #9410 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-11-27 11:21:21 +08:00
Shi Xiaowei
e76e149861
[ https://nvbugs/5608930 ][fix] Fix a typo ( #9487 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-11-27 09:05:17 +08:00
Chang Liu
b10137fdd5
[None][feat] Support MLA chunked prefill for DeepSeek V3.2 model ( #9376 )
...
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-11-26 16:38:25 +08:00
JunyiXu-nv
b7308a4000
[ https://nvbugs/5580099 ][fix] Cherry pick IMA issue fix from release/1.1 ( #9032 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-26 13:09:06 +08:00
Wanli Jiang
d100599ea7
[TRTLLM-9264][fix] Add accuracy/unit tests/doc for phi4mm ( #9246 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-26 11:12:35 +08:00
QI JUN
5972119e1c
[None][ci] move some slow test cases of DGX-B200 to post merge ( #9467 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-26 10:48:53 +08:00
fredricz-20070104
6a64cb4c71
[TRTLLM-8936][test] Add disagg and wideep multi-node multi-gpu test cases ( #9356 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-11-26 10:34:49 +08:00
Chuang Zhu
0e9c7f8c07
[ https://nvbugs/5685143 ][fix] avoid cudaFree overlap with cuda graph ( #9438 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-11-25 16:20:29 -08:00
Suyog Gupta
e484bec82f
[None][chore] AutoDeploy add multi stream moe pass to default.yaml ( #9430 )
...
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-11-25 14:16:13 -08:00
Fanrong Li
8da59103d6
[ https://nvbugs/5680905 ][fix] Relax the MMLU accuracy requirement for DS-v3.2 ( #9439 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-26 00:32:20 +08:00
Yan Chunwei
1f43dc8174
[None][ci] waive a test ( #9458 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-11-25 07:04:20 -08:00
YueWeng
cc336c4abd
[TRTLLM-8160][feat] Add draft token tree runtime on CDL ( #8586 )
...
Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
2025-11-25 09:40:55 -05:00
Shi Xiaowei
60786574db
[None][fix] Mitigate test timeout issues ( #9445 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-11-25 20:17:54 +08:00
Chao Ni
a2d9e6250a
[ https://nvbugs/5667922 ][fix] Update long context evaluation config ( #9426 )
...
Signed-off-by: mni <125171826+baize97@users.noreply.github.com>
2025-11-25 19:33:38 +08:00
Yanchao Lu
ff02e0f05c
[None][ci] Move more test stages to use OCI machines ( #9395 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Matt Lefebvre <matthewelefebvre@gmail.com>
2025-11-25 15:59:13 +08:00
Eran Geva
6af01dc664
[ #8391 ][chore] test_perf.py to lock clocks read from gpu_configs.yml instead of max freq ( #9409 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-11-25 09:20:33 +02:00
Emma Qiao
15616e3ee5
[None][infra] Waive failed cases for main branch on 11/25 ( #9429 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-24 23:18:15 -08:00
Suyog Gupta
efd503751f
[ #9271 ][perf] Enable multi-stream MOE optimization in AutoDeploy ( #9322 )
...
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-11-24 19:50:10 -08:00
kris1025
d1c724958d
[None][chore] unwaive ampere kernels test ( #9389 )
...
Signed-off-by: linquanh <linquanh@nvidia.com>
2025-11-25 11:28:43 +08:00
xinhe-nv
0a9ae2e3e6
[None][chore] Remove closed bugs ( #9381 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-24 18:49:57 -08:00
QI JUN
786d308b88
[ https://nvbugs/5685428 ][fix] fix test_openai_chat_multimodal.py ( #9406 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-24 16:56:33 -08:00
Yibin Li
1ce483c999
[TRTLLM-7967][feat] Adding Starcoder2 PyTorch Backend Support ( #8923 )
...
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
2025-11-24 11:23:22 -08:00
Emma Qiao
2c869f2bda
[None][infra] Waive failed cases for main ( #9400 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-24 17:42:19 +08:00
Emma Qiao
af72d93fa9
[None][infra] Waive failed cases on main branch ( #9384 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-23 22:53:02 -08:00
brb-nv
c045e359a7
[ https://nvbugs/5637012 ][fix] Fix helix unit tests ( #9369 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-23 19:34:22 -08:00
QI JUN
34a6d2d28f
[TRTLLM-9302][chore] Move build config from BaseLlmArgs to TrtLlmArgs ( #9249 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-24 10:54:41 +08:00
Chenghao Zhang
e1c9aa7d6a
[None][chore] AutoDeploy: Add the Nemotron MOE to CI ( #9328 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-11-23 12:12:12 -08:00
Yan Chunwei
1ef69ecbb1
[None][ci] waive two ray tests ( #9375 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-11-23 15:39:01 +08:00
dongfengy
268ea9bb8a
[None][test] Add one-model and overlap-scheduling to eagle tests for GPTOSS ( #9312 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-11-21 22:52:53 -08:00
Enwei Zhu
13fbd4366a
[TRTLLM-9370][feat] Integration of CuteDSL NVFP4 grouped GEMM (Part 2: SwiGLU Fusion and Finalize Fusion) ( #9288 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-21 14:03:38 -08:00
Emma Qiao
041564188c
[None][infra] Waive failed cases in main post-merge on 11/21 ( #9360 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-21 18:01:53 +08:00
QI JUN
b6483ef3e7
[None][ci] waive a test case of test_ad_build_small_multi.py ( #9355 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-21 16:25:04 +08:00
Ivy Zhang
28e9bf6167
[None][chore] add periodic junit xml path in conftest ( #9337 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-20 22:46:25 -08:00
QI JUN
e2a372a3b1
[None][ci] waive test_llm_context_only_timed_out_kv_cache_exhausted ( #9351 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-20 20:20:57 -08:00
Barry Kang
a3433dd54e
[ https://nvbugs/5325296 ][fix] Enable relaxed acceptance test on Blackwell ( #8709 )
...
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Jin Li
6185225501
[ https://nvbugs/5488118 ][fix] Unwaive passed tests ( #8758 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
xiweny
05aabfbc1e
[ https://nvbugs/5601203 ] [fix]Restrict fp8 blockscale moe case ( #8583 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Eran Geva
3d66e56adb
[ https://nvbugs/5572320 ][fix] Ported test_ad_trtllm_bench.py from main ( #8671 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Yukun He
9a79f32f7a
[ https://nvbugs/5608489 ][fix] Fix output unpack issues for Llama3/4 NVFP4 models. ( #8679 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Ivy Zhang
25c0624750
[None][test] Clean cache for certain easily hang cases ( #8619 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Jie Li
36e244f35e
[ https://nvbugs/5587456 ][fix] Remove multimodal test cases using TRT backend ( #8611 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Lizhi Zhou
348668e3ae
[ https://nvbugs/5575902 ][fix] set max_batch_size=1 to stabilize accuracy test result ( #8609 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Lizhi Zhou
33b0b945c7
[ https://nvbugs/5582277 ][fix] rework DisaggPPTerminationHandler to fix hang issue ( #8519 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Pengyun Lin
81fd9be87d
[ https://nvbugs/5575829 ][fix] Unwaive gpt-oss test ( #8576 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Bo Deng
4ca6fe83d8
[ https://nvbugs/5565549 ][fix] unwaive test_disaggregated_spec_dec_bat… ( #8500 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
JunyiXu-nv
ee6944bfa2
[ https://nvbugs/5569713 ][fix] Disable fp8 deep gemm for EXAONE-4.0-32B-FP8 ( #8429 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
yufeiwu-nv
0e746fad45
[ https://nvbugs/5667454 ][test] Fix Test Case as Chunked Attention not Supported on sm_120 ( #9260 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-20 00:58:42 -08:00
Liao Lanyu
04ad9f96fa
[ https://nvbugs/5667687 ][fix] Set correct lm_head_tp_size_upper_bound ( #9300 )
...
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
2025-11-20 00:41:00 -08:00
Emma Qiao
b018b2698d
[TRTLLM-9164][infra] Enable checking duplicate items in waives.txt in pre-commit ( #9265 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-20 15:47:23 +08:00
QI JUN
1bdd3ba173
[None][ci] waive test_disagg_server_restart ( #9326 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-19 22:34:03 -08:00
Yechan Kim
d5622b2689
[None][fix] Multimodal InputProcessor dummy builder fix ( #8916 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-19 22:32:21 -08:00
Chenghao Zhang
cd44f80abd
[ #9316 ][feat] AutoDeploy: Add the accuracy test for Nemotron MOE models ( #9317 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-11-19 21:48:50 -08:00
Bo Deng
2128f73d58
[TRTLLM-9247][infra] Upgrade NIXL to 0.7.1 ( #9055 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: jthomson04 <jwillthomson19@gmail.com>
2025-11-20 11:01:02 +08:00
brb-nv
f6ec6e2222
[None][chore] Waive tests timing out on main ( #9315 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-19 13:10:06 -08:00
mpikulski
46dd9886bb
[ https://nvbugs/5661877 ][fix] fix test regression in TestBatchedSampling::test_samples ( #9215 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-11-19 01:44:44 -08:00
xinhe-nv
0f77fec932
[None][chore] Add failed cases into waives.txt ( #9289 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-19 17:03:43 +08:00
nvxuanyuc
a79c0dfb43
[None][fix] Update GLM model accuracy test ( #9286 )
...
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
2025-11-18 21:59:01 -08:00
Emma Qiao
67d3eb26af
[None][infra] Waive failed cases for main branch on 11/17 ( #9266 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-18 20:07:03 -08:00
xinhe-nv
286ace22ed
[None][chore] Add failed cases into waives.txt ( #9242 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-18 19:27:55 -08:00
Ivy Zhang
782dfca7e8
[TRTLLM-9050][test] add llama4 disagg case to cover kv cache overflow error ( #9172 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 18:26:32 -08:00
xinhe-nv
35658eab55
[None][chore] Add failed cases into waives.txt ( #9193 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-18 17:47:55 -08:00
Enwei Zhu
7c4777a571
[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM ( #8880 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-18 17:40:12 -08:00
Lizhi Zhou
c789000a62
[ https://nvbugs/5649010 ][fix] increase status-checking interval to avoid instability ( #9203 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-11-19 08:55:42 +08:00
Bo Deng
34f845bf69
[TRTLLM-9287][infra] Use NIXL backend for accuracy tests ( #9247 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-11-18 14:46:20 -08:00
Ajinkya Rasane
8d7cda2318
[None][chore] Update the Flux autodeploy example ( #8434 )
...
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Co-authored-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
2025-11-18 14:16:04 -08:00
Kaiyu Xie
d076aa44d3
[None] [tests] Unwaive wide ep related tests ( #9204 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-11-18 08:54:46 -08:00
Ivy Zhang
160b361588
[TRTLLM-8949][test] Add rcca test case for eagle3 consistency check ( #9088 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 05:55:00 -08:00
Ivy Zhang
ca41a71f92
[TRTLLM-8948][test] Add long bench case ( #9165 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 04:41:48 -08:00
Tri Dao
fc088e642c
[None][feat] Support Glm4MoeForCausalLM ( #8256 )
...
Signed-off-by: Tri Dao <daominhtri0503@gmail.com>
Co-authored-by: Xuanyu Chen <xuanyuc@nvidia.com>
2025-11-18 09:43:21 +08:00
QI JUN
c3376fa114
[None][ci] split speculative test case into several small cases ( #9209 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-17 17:02:25 -08:00
Robin Kobus
df41f220a2
[TRTLLM-8831][feat] Enable early exit with overlap scheduler ( #8587 )
...
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-11-17 18:07:13 +01:00
Emma Qiao
d16b1a84c5
[None][infra] Waive a failed case in pre-merge stage 11/16 ( #9192 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-17 09:36:56 +08:00
Emma Qiao
2854f0cf3d
[None][infra] Waive failed tests for main branch 11/15 ( #9187 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-11-16 01:48:25 -08:00
brb-nv
63237494db
[None][chore] Waive failing tests blocking pre-merge ( #9189 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-16 01:06:03 -08:00
Chang Liu
bed4e95e9f
[ https://nvbugs/5629887 ][fix] Add missing device count guard for DSv32 multiGPU tests ( #9159 )
2025-11-14 07:52:23 -08:00
xinhe-nv
49b7e6301a
[None][chore] Add failed cases into waives.txt ( #9156 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-14 06:28:22 -08:00
yuanjingx87
d72321a32e
[None][ci] Waive unittest/_torch/sampler/test_torch_sampler.py::TestBatchedSampling ( #9161 )
...
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-11-14 01:49:26 -08:00
QI JUN
3c950910a0
[None][ci] waive test_disaggregated.py::test_disaggregated_mixed[TinyLlama-1.1B-Chat-v1.0] ( #9162 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-13 18:56:37 -08:00
Tailing Yuan
cc4c980e03
[None][feat] Add Qwen3-Next to layer-wise benchmarks ( #9065 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-11-14 10:03:00 +08:00
Erin
44d1c75701
[TRTLLM-8988][feat] Unify MPI & Ray's req/response handling with RPC Client/Server ( #8765 )
...
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-11-13 17:21:24 -08:00
William Zhang
121140cfec
[None][fixes] Add tool call parsing fixes and Qwen3 coder parser ( #8817 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-11-13 04:34:38 -08:00
Lizhi Zhou
48a27c7bef
[ https://nvbugs/5633340 ][chore] unwaive test_auto_scaling.py::test_disagg_server_restart ( #9131 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-11-13 01:45:36 -08:00
Emma Qiao
d0ea417ec8
[None][infra] Waive failed tests for main 11/13 ( #9132 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-13 01:00:40 -08:00
xinhe-nv
548f5ce4bc
[None][fix] waive failed tests ( #9090 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-12 23:40:00 -08:00
xinhe-nv
8fa3c55c76
[None][chore] Remove closed bugs ( #9114 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-12 22:49:37 -08:00
ruodil
c86e36fe38
[None][test] add deepseek and qwen cases for rtx series ( #8839 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-11-12 22:28:02 -08:00
HuiGao-NV
cde18c12da
[ https://nvbugs/5640873 ][fix] Move thop tests to pre-merge ( #9094 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-11-13 13:08:13 +08:00
Yan Chunwei
4fd93bdc2c
[None][ci] Waive test_llm_rpc and test_llm_rpc_streaming ( #9118 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-11-12 19:55:09 -08:00
Zhenhuan Chen
943b05e2d3
[TRTLLM-9179][feat] add pp_partition to customize each rank's layer number ( #9003 )
...
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2025-11-13 10:34:17 +08:00
QI JUN
3416efbc29
[None][ci] waive test_disaggregated_serving.py::TestQwen3_8B::test_chunked_prefill ( #9111 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-13 10:06:32 +08:00
dongxuy04
9241ccaf27
[None][feat] Enable EPLB for trtllm-gen and cutlass backend ( #8886 )
...
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
2025-11-12 12:30:27 -08:00
Chenghao Zhang
5f26c31954
[ https://nvbugs/5636912 ][fix] AutoDeploy: Unwaive the test ( #9018 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-11-12 12:26:38 -08:00
Fanrong Li
780d4f9dc5
[None][feat] Add MTP>1 support for DS-v3.2 ( #9045 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-12 09:56:12 -08:00
Iman Tabrizian
cdde15b275
[TRTLLM-8540][feat] Add support for disagg in DSv3.2 ( #8735 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-11-12 08:21:11 -08:00
yufeiwu-nv
b7a2574c60
[ https://nvbugs/5568991 ][test] Remove Phi-3 models ( #9066 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-12 03:16:36 -08:00
QI JUN
4003dc7574
[None][ci] waive some test cases of disaggregated serving ( #9085 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-12 15:06:21 +08:00
Emma Qiao
bb6eb9510d
[None][infra] Waive a failed case of disaggregated/test_disaggregated.py ( #9074 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-11 19:38:32 -08:00
QI JUN
fd703fbb7b
[None][ci] run speculative unit tests serially ( #9080 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-11 19:06:44 -08:00
Lucas Liebenwein
aca56097cb
[None][fix] AutoDeploy: update nano3 accuracy test ( #9061 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-11-11 12:26:31 -08:00
Wanli Jiang
ebdd1cc8e0
[TRTLLM-8119][feat] Update doc/tests/chat_template for nano-v2-vlm ( #8840 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-11 07:48:23 -08:00
QI JUN
0ce22ce928
[None][ci] waive test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[False] ( #9069 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-11 02:11:15 -08:00
Yiqing Yan
b7d51c5549
[None][chore] Remove duplicated waive test ( #9067 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-11-11 16:49:49 +08:00
Emma Qiao
da1f0e2465
[None][infra] Waive failed tests on main 11/11 ( #9058 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-11 13:19:30 +08:00
xinhe-nv
fac522056c
[None][chore] Add failed cases into waives.txt ( #8998 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
2025-11-11 12:40:59 +08:00
Yechan Kim
0938a3ad2a
[ https://nvbugs/5644187 ][fix] Llava-Next MMMU bugfix and Phi4 test bugfix ( #9034 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-11 10:24:31 +09:00
xiweny
50c486367a
[ https://nvbugs/5619396 ][fix] Add sm103 to CutlassFP8RowwiseGemm ( #9042 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-11-10 08:12:14 -08:00
xinhe-nv
f848d844d9
[None][chore] Add failed cases into waives.txt ( #9030 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-09 23:36:05 -08:00
Fanrong Li
a7033a9193
[TRTLLM-9001][feat] add TP support for DeepSeek-V3.2 ( #8943 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-10 12:16:01 +08:00
Bo Li
67af7c15a5
[ https://nvbugs/5637037 ][fix] Update unwaive list. ( #9001 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-11-10 08:53:07 +08:00
Emma Qiao
183778d58a
[None][infra] Waive failed tests for main 11/07 ( #9008 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-08 08:51:35 -08:00
Emma Qiao
2af6a537ad
[TRTLLM-8999][infra] Reduce gb200 multi-node test stages ( #8778 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-11-08 06:34:24 -08:00
Yuxian Qiu
7b82ba90da
[ https://nvbugs/5629790 ][chore] unwaive test. ( #8967 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-11-07 18:41:32 +08:00