heyuhhh
a08eb81cce
[None][feat] Add RocketKV usage doc and e2e accuracy test on LongBenchV2 ( #9572 )
...
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2025-12-03 11:33:46 +08:00
yufeiwu-nv
21f2ba74e8
[None][test] Remove duplicate test cases ( #9623 )
...
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-03 10:35:26 +08:00
brb-nv
55c7023c92
[None][chore] Waive test failing on pre-merge ( #9638 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-03 07:31:10 +08:00
Patrice Castonguay
3991aa9c72
[ https://nvbugs/5688388 ][fix] fix: Reducing num request in disagg test to speed up ( #9598 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-02 12:48:53 -05:00
Shi Xiaowei
227d42e492
[ https://nvbugs/5651854 ][fix] Fix dist-serving perf by clearing CPU affinity ( #9549 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-03 01:17:03 +08:00
Mike Iovine
d5b7f0c8ad
[TRTLLM-8980][test] Clean up spec dec tests in test_llm_api_pytorch ( #8889 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-02 10:32:02 -05:00
Yan Chunwei
b86256eb54
[TRTLLM-9144][fix] enhance RPC robustness ( #8711 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-12-02 21:37:59 +08:00
brb-nv
be48cdf1d1
[TRTLLM-9466][test] Evaluate helix parallelism with DSV3 Lite ( #9597 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-02 20:10:07 +08:00
Emma Qiao
4a8766c11d
[None][infra] Remove an invalid test name in waives.txt ( #9620 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-02 18:05:17 +08:00
Emma Qiao
3e4f2388a9
[None][infra] Waive failed cases for main branch ( #9615 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-02 15:48:27 +08:00
shuyixiong
1a2118b8fe
[ https://nvbugs/5702793 ][fix] Fix uncontiguous tensor view ( #9576 )
...
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
2025-12-02 15:41:32 +08:00
xinhe-nv
ad46d19027
[None][chore] Add failed cases into waives.txt ( #9588 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-02 14:24:11 +08:00
ruodil
4586b5f42f
[ https://nvbugs/5582091 ][test] increase warmup times in testing for multi-gpu cases ( #9578 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-12-02 14:22:49 +08:00
Wanli Jiang
5657a00ec0
[FMDL-1328][feat] Add support for nano-v3 and super-v3 with pytorch backend ( #9261 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-02 13:40:20 +08:00
xinhe-nv
3911d0496e
[None][fix] Waive gb200 ( #9580 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-02 12:09:21 +08:00
JunyiXu-nv
9a6df980cd
[ https://nvbugs/5703953 ][fix] Use random port for disagg tests ( #9582 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-02 11:40:14 +08:00
Iman Tabrizian
356a52edf5
[None][feat] Add support for KVCache reuse for DSv32 ( #9383 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-12-02 11:14:30 +08:00
Venky
639c939a4f
[TRTC-1943][feat] Env vars override support in LLM API ( #9104 )
...
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-12-01 10:04:49 -08:00
Yanchao Lu
7127c4407a
[None][test] Waive main branch test failures 12/1 ( #9566 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-01 21:54:53 +08:00
Shi Xiaowei
48b1d31895
[ https://nvbugs/5651854 ][infra] Enable perf metrics during accuracy testing ( #9140 )
2025-12-01 20:15:32 +08:00
JadoTu
a92af27411
[None][chore] remove qwen3-next accuracy tests ( #9534 )
...
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
2025-12-01 11:49:37 +08:00
Pengbo Wang
aa3310f64f
[ https://nvbugs/5503479 ][fix] Temporarily lower reference accuracy to stabilize CI ( #9398 )
...
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
2025-12-01 11:49:14 +08:00
Enwei Zhu
2e3ac3c48f
[ https://nvbugs/5684703 ][fix] Unwaive disagg guided decoding test ( #9466 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-01 11:39:40 +08:00
JunyiXu-nv
3f588198dc
[None][fix] Fix port conflict in disagg tests ( #9474 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-30 17:33:22 +08:00
Emma Qiao
c927ccf510
[None][infra] Waive failed tests for main branch on 11/30 ( #9555 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-30 16:13:20 +08:00
brb-nv
b77f4ffe54
[TRTLLM-5971][feat] Integrate helix parallelism ( #9342 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-29 15:17:30 -08:00
dominicshanshan
6345074686
[None][chore] Weekly mass integration of release/1.1 -- rebase ( #9522 )
...
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: qgai <qgai@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: sunnyqgg <159101675+sunnyqgg@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Vincent Zhang <vcheungyi@163.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Leslie Fang <leslief@nvidia.com>
Co-authored-by: Shunkangz <182541032+Shunkangz@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-11-29 21:48:48 +08:00
dominicshanshan
70efa3ac43
[None][infra] Waive failed case in pre-merge on 11/28 ( #9537 )
...
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-11-28 20:53:45 +08:00
Emma Qiao
2d7421b314
[None][infra] Waive failed cases for main branch on 11/28 ( #9539 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-28 17:19:55 +08:00
yufeiwu-nv
08755a809d
[ https://nvbugs/5689658 ][test] Fix gpu lock issue running on cluster ( #9441 )
...
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-28 13:59:22 +08:00
JunyiXu-nv
c87e81c1d8
[ https://nvbugs/5685015 ][fix] Update invalid max_token test ( #9435 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-28 11:41:16 +08:00
Bo Li
19f3f4e520
[ https://nvbugs/5637037 ][chore] Update waive lists. ( #9386 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-28 10:45:22 +08:00
Yueh-Ting (eop) Chen
4cbfc10b28
[ https://nvbugs/5674665 ][chore] Add test coverage for https://nvbugspro.nvidia.com/bug/5674665 ( #9518 )
...
Signed-off-by: eopXD <yuehtingc@nvidia.com>
2025-11-27 21:40:34 +08:00
Fanrong Li
2d5eadf65f
[None][fix] fix TP support for DeepSeek-V3.2 on hopper ( #9484 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-27 21:02:25 +08:00
JadoTu
51bf7164d3
[None][feat] add qwen3-next CI test of accuracy on BF16 and NVFP4 ( #9330 )
...
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
2025-11-27 18:05:00 +08:00
Lizhi Zhou
8104a78931
[None][chore] revert batch_size=1 to prevent timeout and lower accuracy reference by 0.12% as a WAR ( #9447 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
2025-11-27 14:25:44 +08:00
Emma Qiao
0442510304
[None][infra] Waive failed case in pre-merge on 11/27 ( #9507 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-27 13:53:33 +08:00
HuiGao-NV
03331bc43d
[ https://nvbugs/5547414 ][fix] enable case after using local cache model ( #9473 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-11-27 12:18:20 +08:00
Patrice Castonguay
1b2da426cd
[ https://nvbugs/5680310 ][fix] Fix ctx only timed out test ( #9410 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-11-27 11:21:21 +08:00
Shi Xiaowei
e76e149861
[ https://nvbugs/5608930 ][fix] Fix a typo ( #9487 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-11-27 09:05:17 +08:00
Chang Liu
b10137fdd5
[None][feat] Support MLA chunked prefill for DeepSeek V3.2 model ( #9376 )
...
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-11-26 16:38:25 +08:00
JunyiXu-nv
b7308a4000
[ https://nvbugs/5580099 ][fix] Cherry pick IMA issue fix from release/1.1 ( #9032 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-26 13:09:06 +08:00
Wanli Jiang
d100599ea7
[TRTLLM-9264][fix] Add accuracy/unit tests/doc for phi4mm ( #9246 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-26 11:12:35 +08:00
QI JUN
5972119e1c
[None][ci] move some slow test cases of DGX-B200 to post merge ( #9467 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-26 10:48:53 +08:00
fredricz-20070104
6a64cb4c71
[TRTLLM-8936][test] Add disagg and wideep multi-node multi-gpu test cases ( #9356 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-11-26 10:34:49 +08:00
Chuang Zhu
0e9c7f8c07
[ https://nvbugs/5685143 ][fix] avoid cudaFree overlap with cuda graph ( #9438 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-11-25 16:20:29 -08:00
Suyog Gupta
e484bec82f
[None][chore] AutoDeploy add multi stream moe pass to default.yaml ( #9430 )
...
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-11-25 14:16:13 -08:00
Fanrong Li
8da59103d6
[ https://nvbugs/5680905 ][fix] Relax the MMLU accuracy requirement for DS-v3.2 ( #9439 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-26 00:32:20 +08:00
Yan Chunwei
1f43dc8174
[None][ci] waive a test ( #9458 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-11-25 07:04:20 -08:00
YueWeng
cc336c4abd
[TRTLLM-8160][feat] Add draft token tree runtime on CDL ( #8586 )
...
Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
2025-11-25 09:40:55 -05:00
Shi Xiaowei
60786574db
[None][fix] Mitigate test timeout issues ( #9445 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-11-25 20:17:54 +08:00
Chao Ni
a2d9e6250a
[ https://nvbugs/5667922 ][fix] Update long context evaluation config ( #9426 )
...
Signed-off-by: mni <125171826+baize97@users.noreply.github.com>
2025-11-25 19:33:38 +08:00
Yanchao Lu
ff02e0f05c
[None][ci] Move more test stages to use OCI machines ( #9395 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Matt Lefebvre <matthewelefebvre@gmail.com>
2025-11-25 15:59:13 +08:00
Eran Geva
6af01dc664
[ #8391 ][chore] test_perf.py to lock clocks read from gpu_configs.yml instead of max freq ( #9409 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-11-25 09:20:33 +02:00
Emma Qiao
15616e3ee5
[None][infra] Waive failed cases for main branch on 11/25 ( #9429 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-24 23:18:15 -08:00
Suyog Gupta
efd503751f
[ #9271 ][perf] Enable multi-stream MOE optimization in AutoDeploy ( #9322 )
...
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-11-24 19:50:10 -08:00
kris1025
d1c724958d
[None][chore] unwaive ampere kernels test ( #9389 )
...
Signed-off-by: linquanh <linquanh@nvidia.com>
2025-11-25 11:28:43 +08:00
xinhe-nv
0a9ae2e3e6
[None][chore] Remove closed bugs ( #9381 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-24 18:49:57 -08:00
QI JUN
786d308b88
[ https://nvbugs/5685428 ][fix] fix test_openai_chat_multimodal.py ( #9406 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-24 16:56:33 -08:00
Yibin Li
1ce483c999
[TRTLLM-7967][feat] Adding Starcoder2 PyTorch Backend Support ( #8923 )
...
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
2025-11-24 11:23:22 -08:00
Emma Qiao
2c869f2bda
[None][infra] Waive failed cases for main ( #9400 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-24 17:42:19 +08:00
Emma Qiao
af72d93fa9
[None][infra] Waive failed cases on main branch ( #9384 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-23 22:53:02 -08:00
brb-nv
c045e359a7
[ https://nvbugs/5637012 ][fix] Fix helix unit tests ( #9369 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-23 19:34:22 -08:00
QI JUN
34a6d2d28f
[TRTLLM-9302][chore] Move build config from BaseLlmArgs to TrtLlmArgs ( #9249 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-24 10:54:41 +08:00
Chenghao Zhang
e1c9aa7d6a
[None][chore] AutoDeploy: Add the Nemotron MOE to CI ( #9328 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-11-23 12:12:12 -08:00
Yan Chunwei
1ef69ecbb1
[None][ci] waive two ray tests ( #9375 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-11-23 15:39:01 +08:00
dongfengy
268ea9bb8a
[None][test] Add one-model and overlap-scheduling to eagle tests for GPTOSS ( #9312 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-11-21 22:52:53 -08:00
Enwei Zhu
13fbd4366a
[TRTLLM-9370][feat] Integration of CuteDSL NVFP4 grouped GEMM (Part 2: SwiGLU Fusion and Finalize Fusion) ( #9288 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-21 14:03:38 -08:00
Emma Qiao
041564188c
[None][infra] Waive failed cases in main post-merge on 11/21 ( #9360 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-21 18:01:53 +08:00
QI JUN
b6483ef3e7
[None][ci] waive a test case of test_ad_build_small_multi.py ( #9355 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-21 16:25:04 +08:00
Ivy Zhang
28e9bf6167
[None][chore] add periodic junit xml path in conftest ( #9337 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-20 22:46:25 -08:00
QI JUN
e2a372a3b1
[None][ci] waive test_llm_context_only_timed_out_kv_cache_exhausted ( #9351 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-20 20:20:57 -08:00
Barry Kang
a3433dd54e
[ https://nvbugs/5325296 ][fix] Enable relaxed acceptance test on Blackwell ( #8709 )
...
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Jin Li
6185225501
[ https://nvbugs/5488118 ][fix] Unwaive passed tests ( #8758 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
xiweny
05aabfbc1e
[ https://nvbugs/5601203 ][fix] Restrict fp8 blockscale moe case ( #8583 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Eran Geva
3d66e56adb
[ https://nvbugs/5572320 ][fix] Ported test_ad_trtllm_bench.py from main ( #8671 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Yukun He
9a79f32f7a
[ https://nvbugs/5608489 ][fix] Fix output unpack issues for Llama3/4 NVFP4 models. ( #8679 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Ivy Zhang
25c0624750
[None][test] Clean cache for certain easily hang cases ( #8619 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Jie Li
36e244f35e
[ https://nvbugs/5587456 ][fix] Remove multimodal test cases using TRT backend ( #8611 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Lizhi Zhou
348668e3ae
[ https://nvbugs/5575902 ][fix] set max_batch_size=1 to stabilize accuracy test result ( #8609 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Lizhi Zhou
33b0b945c7
[ https://nvbugs/5582277 ][fix] rework DisaggPPTerminationHandler to fix hang issue ( #8519 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Pengyun Lin
81fd9be87d
[ https://nvbugs/5575829 ][fix] Unwaive gpt-oss test ( #8576 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Bo Deng
4ca6fe83d8
[ https://nvbugs/5565549 ][fix] unwaive test_disaggregated_spec_dec_bat… ( #8500 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
JunyiXu-nv
ee6944bfa2
[ https://nvbugs/5569713 ][fix] Disable fp8 deep gemm for EXAONE-4.0-32B-FP8 ( #8429 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
yufeiwu-nv
0e746fad45
[ https://nvbugs/5667454 ][test] Fix Test Case as Chunked Attention not Supported on sm_120 ( #9260 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-20 00:58:42 -08:00
Liao Lanyu
04ad9f96fa
[ https://nvbugs/5667687 ][fix] Set correct lm_head_tp_size_upper_bound ( #9300 )
...
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
2025-11-20 00:41:00 -08:00
Emma Qiao
b018b2698d
[TRTLLM-9164][infra] Enable checking duplicate items in waives.txt in pre-commit ( #9265 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-20 15:47:23 +08:00
QI JUN
1bdd3ba173
[None][ci] waive test_disagg_server_restart ( #9326 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-19 22:34:03 -08:00
Yechan Kim
d5622b2689
[None][fix] Multimodal InputProcessor dummy builder fix ( #8916 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-19 22:32:21 -08:00
Chenghao Zhang
cd44f80abd
[ #9316 ][feat] AutoDeploy: Add the accuracy test for Nemotron MOE models ( #9317 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-11-19 21:48:50 -08:00
Bo Deng
2128f73d58
[TRTLLM-9247][infra] Upgrade NIXL to 0.7.1 ( #9055 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: jthomson04 <jwillthomson19@gmail.com>
2025-11-20 11:01:02 +08:00
brb-nv
f6ec6e2222
[None][chore] Waive tests timing out on main ( #9315 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-19 13:10:06 -08:00
mpikulski
46dd9886bb
[ https://nvbugs/5661877 ][fix] fix test regression in TestBatchedSampling::test_samples ( #9215 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-11-19 01:44:44 -08:00
xinhe-nv
0f77fec932
[None][chore] Add failed cases into waives.txt ( #9289 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-19 17:03:43 +08:00
nvxuanyuc
a79c0dfb43
[None][fix] Update GLM model accuracy test ( #9286 )
...
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
2025-11-18 21:59:01 -08:00
Emma Qiao
67d3eb26af
[None][infra] Waive failed cases for main branch on 11/17 ( #9266 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-18 20:07:03 -08:00
xinhe-nv
286ace22ed
[None][chore] Add failed cases into waives.txt ( #9242 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-18 19:27:55 -08:00
Ivy Zhang
782dfca7e8
[TRTLLM-9050][test] add llama4 disagg case to cover kv cache overflow error ( #9172 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 18:26:32 -08:00
xinhe-nv
35658eab55
[None][chore] Add failed cases into waives.txt ( #9193 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-18 17:47:55 -08:00
Enwei Zhu
7c4777a571
[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM ( #8880 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-18 17:40:12 -08:00
Lizhi Zhou
c789000a62
[ https://nvbugs/5649010 ][fix] increase status-checking interval to avoid instability ( #9203 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-11-19 08:55:42 +08:00
Bo Deng
34f845bf69
[TRTLLM-9287][infra] Use NIXL backend for accuracy tests ( #9247 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-11-18 14:46:20 -08:00
Ajinkya Rasane
8d7cda2318
[None][chore] Update the Flux autodeploy example ( #8434 )
...
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Co-authored-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
2025-11-18 14:16:04 -08:00
Kaiyu Xie
d076aa44d3
[None] [tests] Unwaive wide ep related tests ( #9204 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-11-18 08:54:46 -08:00
Ivy Zhang
160b361588
[TRTLLM-8949][test] Add rcca test case for eagle3 consistency check ( #9088 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 05:55:00 -08:00
Ivy Zhang
ca41a71f92
[TRTLLM-8948][test] Add long bench case ( #9165 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 04:41:48 -08:00
Tri Dao
fc088e642c
[None][feat] Support Glm4MoeForCausalLM ( #8256 )
...
Signed-off-by: Tri Dao <daominhtri0503@gmail.com>
Co-authored-by: Xuanyu Chen <xuanyuc@nvidia.com>
2025-11-18 09:43:21 +08:00
QI JUN
c3376fa114
[None][ci] split speculative test case into several small cases ( #9209 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-17 17:02:25 -08:00
Robin Kobus
df41f220a2
[TRTLLM-8831][feat] Enable early exit with overlap scheduler ( #8587 )
...
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-11-17 18:07:13 +01:00
Emma Qiao
d16b1a84c5
[None][infra] Waive a failed case in pre-merge stage 11/16 ( #9192 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-17 09:36:56 +08:00
Emma Qiao
2854f0cf3d
[None][infra] Waive failed tests for main branch 11/15 ( #9187 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-11-16 01:48:25 -08:00
brb-nv
63237494db
[None][chore] Waive failing tests blocking pre-merge ( #9189 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-16 01:06:03 -08:00
Chang Liu
bed4e95e9f
[ https://nvbugs/5629887 ][fix] Add missing device count guard for DSv32 multiGPU tests ( #9159 )
2025-11-14 07:52:23 -08:00
xinhe-nv
49b7e6301a
[None][chore] Add failed cases into waives.txt ( #9156 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-14 06:28:22 -08:00
yuanjingx87
d72321a32e
[None][ci] Waive unittest/_torch/sampler/test_torch_sampler.py::TestBatchedSampling ( #9161 )
...
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-11-14 01:49:26 -08:00
QI JUN
3c950910a0
[None][ci] waive test_disaggregated.py::test_disaggregated_mixed[TinyLlama-1.1B-Chat-v1.0] ( #9162 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-13 18:56:37 -08:00
Tailing Yuan
cc4c980e03
[None][feat] Add Qwen3-Next to layer-wise benchmarks ( #9065 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-11-14 10:03:00 +08:00
Erin
44d1c75701
[TRTLLM-8988][feat] Unify MPI & Ray's req/response handling with RPC Client/Server ( #8765 )
...
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-11-13 17:21:24 -08:00
William Zhang
121140cfec
[None][fixes] Add tool call parsing fixes and Qwen3 coder parser ( #8817 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-11-13 04:34:38 -08:00
Lizhi Zhou
48a27c7bef
[ https://nvbugs/5633340 ][chore] unwaive test_auto_scaling.py::test_disagg_server_restart ( #9131 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-11-13 01:45:36 -08:00
Emma Qiao
d0ea417ec8
[None][infra] Waive failed tests for main 11/13 ( #9132 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-13 01:00:40 -08:00
xinhe-nv
548f5ce4bc
[None][fix] waive failed tests ( #9090 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-12 23:40:00 -08:00
xinhe-nv
8fa3c55c76
[None][chore] Remove closed bugs ( #9114 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-12 22:49:37 -08:00
ruodil
c86e36fe38
[None][test] add deepseek and qwen cases for rtx series ( #8839 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-11-12 22:28:02 -08:00
HuiGao-NV
cde18c12da
[ https://nvbugs/5640873 ][fix] Move thop tests to pre-merge ( #9094 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-11-13 13:08:13 +08:00
Yan Chunwei
4fd93bdc2c
[None][ci] Waive test_llm_rpc and test_llm_rpc_streaming ( #9118 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-11-12 19:55:09 -08:00
Zhenhuan Chen
943b05e2d3
[TRTLLM-9179][feat] add pp_partition to customize each rank's layer number ( #9003 )
...
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2025-11-13 10:34:17 +08:00
QI JUN
3416efbc29
[None][ci] waive test_disaggregated_serving.py::TestQwen3_8B::test_chunked_prefill ( #9111 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-13 10:06:32 +08:00
dongxuy04
9241ccaf27
[None][feat] Enable EPLB for trtllm-gen and cutlass backend ( #8886 )
...
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
2025-11-12 12:30:27 -08:00
Chenghao Zhang
5f26c31954
[ https://nvbugs/5636912 ][fix] AutoDeploy: Unwaive the test ( #9018 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-11-12 12:26:38 -08:00
Fanrong Li
780d4f9dc5
[None][feat] Add MTP>1 support for DS-v3.2 ( #9045 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-12 09:56:12 -08:00
Iman Tabrizian
cdde15b275
[TRTLLM-8540][feat] Add support for disagg in DSv3.2 ( #8735 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-11-12 08:21:11 -08:00
yufeiwu-nv
b7a2574c60
[ https://nvbugs/5568991 ][test] Remove Phi-3 models ( #9066 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-12 03:16:36 -08:00
QI JUN
4003dc7574
[None][ci] waive some test cases of disaggregated serving ( #9085 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-12 15:06:21 +08:00
Emma Qiao
bb6eb9510d
[None][infra] Waive a failed case of disaggregated/test_disaggregated.py ( #9074 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-11 19:38:32 -08:00
QI JUN
fd703fbb7b
[None][ci] run speculative unit tests serially ( #9080 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-11 19:06:44 -08:00
Lucas Liebenwein
aca56097cb
[None][fix] AutoDeploy: update nano3 accuracy test ( #9061 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-11-11 12:26:31 -08:00
Wanli Jiang
ebdd1cc8e0
[TRTLLM-8119][feat] Update doc/tests/chat_template for nano-v2-vlm ( #8840 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-11 07:48:23 -08:00
QI JUN
0ce22ce928
[None][ci] waive test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[False] ( #9069 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-11 02:11:15 -08:00
Yiqing Yan
b7d51c5549
[None][chore] Remove duplicated waive test ( #9067 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-11-11 16:49:49 +08:00
Emma Qiao
da1f0e2465
[None][infra] Waive failed tests on main 11/11 ( #9058 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-11 13:19:30 +08:00
xinhe-nv
fac522056c
[None][chore] Add failed cases into waives.txt ( #8998 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
2025-11-11 12:40:59 +08:00
Yechan Kim
0938a3ad2a
[ https://nvbugs/5644187 ][fix] Llava-Next MMMU bugfix and Phi4 test bugfix ( #9034 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-11 10:24:31 +09:00
xiweny
50c486367a
[ https://nvbugs/5619396 ][fix] Add sm103 to CutlassFP8RowwiseGemm ( #9042 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-11-10 08:12:14 -08:00
xinhe-nv
f848d844d9
[None][chore] Add failed cases into waives.txt ( #9030 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-09 23:36:05 -08:00
Fanrong Li
a7033a9193
[TRTLLM-9001][feat] add TP support for DeepSeek-V3.2 ( #8943 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-10 12:16:01 +08:00
Bo Li
67af7c15a5
[ https://nvbugs/5637037 ][fix] Update unwaive list. ( #9001 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-11-10 08:53:07 +08:00
Emma Qiao
183778d58a
[None][infra] Waive failed tests for main 11/07 ( #9008 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-08 08:51:35 -08:00
Emma Qiao
2af6a537ad
[TRTLLM-8999][infra] Reduce gb200 multi-node test stages ( #8778 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-11-08 06:34:24 -08:00
Yuxian Qiu
7b82ba90da
[ https://nvbugs/5629790 ][chore] unwaive test. ( #8967 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-11-07 18:41:32 +08:00
QI JUN
1c6e490894
[TRTLLM-9065][chore] remove PyTorchConfig completely ( #8856 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-06 22:37:03 -08:00
Lizhi Zhou
b26e1617f2
[ https://nvbugs/5633340 ][fix] kill processes properly after test ( #8970 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-11-06 21:45:38 -08:00
xiweny
ee20e679a9
[ https://nvbugs/5636986 ][fix] Fix DeepGemmMoe get_buffer calls ( #8939 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: xiweny <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-11-06 19:57:19 -08:00
Simeng Liu
9f8d93f89a
[ https://nvbugs/5606136 ][ci] Remove tests for deprecating triton multimodal models. ( #8926 )
...
Signed-off-by: Simeng Liu <simengl@nvidia.com>
2025-11-06 17:58:42 -08:00
jthomson04
fcae852cef
[None][fix] Fix KV cache clearing with KV Connector API ( #8750 )
...
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2025-11-06 14:28:27 -08:00
shuyixiong
c73efe12e7
[None][chore] Use cached model in all ray tests ( #8962 )
...
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
2025-11-06 15:14:15 +01:00
Fanrong Li
d246f62868
[ https://nvbugs/5630345 ] [chore] skip deepseek-v3.2 fp8 kv tests on pre-Blackwell architectures ( #8973 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-06 03:41:37 -08:00
xinhe-nv
e822184cd7
[None][feat] add waive by sm version ( #8928 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-05 19:20:43 -08:00
Lucas Liebenwein
7a552c450a
[ https://nvbugs/5606166 ][fix] AutoDeploy: unwaive test for use tuples for cudagraph shape lookup ( #8957 )
...
also updated test waive for another nvbug
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-11-05 16:27:00 -08:00
Fanrong Li
c2feed798a
[ https://nvbugs/5630345 ][chore] unwaive DS-v32 nvfp4 and fp8 tests ( #8887 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-05 03:49:23 -08:00
Chuang Zhu
595f78078c
[ https://nvbugs/5624367 ][fix] Fix disagg GPT-OSS test ( #8870 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-11-05 01:47:09 -08:00
Emma Qiao
31116825b3
[None][infra] Waive failed cases on main 11/05 ( #8936 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-04 22:54:45 -08:00
xinhe-nv
cc4aa29523
[None][chore] Add failed cases into waives.txt ( #8865 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-04 19:26:50 -08:00
Yechan Kim
ed81173c55
[None][ci] Add test on waives ( #8915 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-05 08:42:08 +08:00
Patrice Castonguay
782824533e
[ https://nvbugs/5587574 ][fix] Increase server timeout to wait for weight loading ( #8806 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-11-04 12:11:08 -08:00
Yanchao Lu
e2b2675120
[None][fix] Remove duplicated test waives ( #8914 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-04 23:04:33 +08:00
Robin Kobus
7e4b87b17c
[None][ci] Remove outdated test entries ( #8909 )
...
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-11-04 05:32:46 -08:00
xiweny
cae468cc8e
[ https://nvbugs/5596343 ] [test] Waive flaky GPT-OSS cases ( #8904 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-11-04 03:00:00 -08:00
Zhanrui Sun
4de31bece2
[TRTLLM-8994][infra] upgrade to DLFW 25.10 and pytorch 2.9.0 / triton 3.5.0 ( #8838 )
...
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-04 18:59:34 +08:00
Ivy Zhang
23717cdb3f
[TRTLLM-8580][test] save runtime report periodically ( #8312 ) ( #8455 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Yukun He
6c8ba3be27
[None][chore] Remove duplicate log outputs in test_perf.py ( #8418 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
ruodil
102e556863
[None][test] cherry-pick: add test-model-suites in integration conftest.py ( #8388 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Patrice Castonguay
65c138108e
[ https://nvbugs/5552889 ][fix] fix: Prevent empty batch when using attention DP with disagg ( #8372 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Ivy Zhang
9bcd2e6c0a
[None][chore] Update nim test list ( #8356 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Stanley Sun
def9c0004d
[TRTLLM-8113][test] Add pytorch workflow e2e tests with pp enabled ( #8357 )
...
Signed-off-by: Stanley Sun <stsun@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
xiweny
fcac2022e2
[ https://nvbugs/5565565 ] [fix] fp8 wideep support sm103 ( #8228 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Yueh-Ting (eop) Chen
bd1c9c0af4
[ https://nvbugs/5625990 ][chore] Add test coverage for current incapability of the KV cache manager ( #8829 )
...
Signed-off-by: eopXD <yuehtingc@nvidia.com>
2025-11-04 16:35:45 +08:00
Emma Qiao
4fe47faf47
[None][infra] Waive failed tests for main branch ( #8897 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-03 22:21:28 -08:00
Zhanrui Sun
9ec6a6b68f
[None][infra] waive failed test on main 11/4 ( #8896 )
...
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-11-03 21:37:09 -08:00
Mike Iovine
5e6f1bcd24
[TRTLLM-8979][test] Improve qwen3 spec dec test coverage ( #8767 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-03 10:12:10 -08:00
Yechan Kim
f48968b6cc
[TRTLLM-6928][fix] Refactor multimodal unittest ( #8453 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-03 06:01:07 -08:00
Emma Qiao
14bc8571ae
[TRTLLM-8435][infra] Test existing rtxpro6000 stages on rtxpro6000d ( #8319 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-03 05:26:17 -08:00
Emma Qiao
d7176768cd
[None][infra] Waive the failed test for main on 11/3 ( #8875 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-11-03 02:52:52 -08:00
Tailing Yuan
8303cfa477
[None][fix] Fix import issues in layer-wise benchmarks ( #8827 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-11-03 02:32:48 -08:00
xinhe-nv
4873ca04cc
[ https://nvbugs/5521799 ][fix] add harmony channel validation ( #8837 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-03 02:31:54 -08:00
xinhe-nv
64540451e7
[None][chore] Add failed cases into waives.txt ( #8872 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-03 01:19:04 -08:00
Fanrong Li
e9f78c687a
[ https://nvbugs/5625962 ][chore] unwaive DS-v32-fp4 tests ( #8853 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-03 00:34:52 -08:00
Yechan Kim
00c0e6c440
[ https://nvbugs/5523315 ][fix] Fix serve benchmark test ( #8255 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-03 00:30:13 -08:00
chenfeiz0326
cc4ab8d9d1
[TRTLLM-8825][feat] Support Pytest Perf Results uploading to Database ( #8653 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-11-03 16:23:13 +08:00
yufeiwu-nv
b4d17d1a4c
[TRTLLM-8991][test] Add Llama 3.3 70B model with different performance config ( #8753 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-11-03 13:34:06 +08:00
dongfengy
6d6797c792
[None][test] Enhance GPT-OSS CI with GPQA Diamond and additional Spec Decoding Test ( #8661 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2025-11-02 16:44:02 -08:00
Yan Chunwei
1551ed8e5f
[ https://nvbugs/5437384 ][test] CHERRY-PICK: fix trtllm-llmapi-launch multi tests ( #8567 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-11-01 06:49:33 -07:00
dongxuy04
bba2519726
[TRTLLM-7008][fix] Enable GDRCopy and unwaive online eplb tests ( #8720 )
...
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-10-31 16:39:51 -07:00
Fanrong Li
f0dc746738
[TRTLLM-8541][feat] Add trtllm-gen sparse MLA kernels to support per-Tensor FP8 KV Cache ( #8692 )
...
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Co-authored-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-10-31 14:38:31 -07:00
Tailing Yuan
98453d2bb7
[None][fix] Waive layer-wise benchmark tests ( #8823 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 22:51:31 -07:00
Emma Qiao
aecc9655a0
[None][info] Waive failed case for main ( #8826 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-30 20:43:59 -07:00
Yuxian Qiu
025d2926df
[ https://nvbugs/5599515 ][fix] Fix PP bubbles. ( #8687 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-10-31 10:13:56 +08:00
Mike Iovine
b87448b009
[TRTLLM-8978][test] Remove llama 4 spec dec tests ( #8766 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-30 15:47:04 -04:00
Tailing Yuan
ec31363a86
[None][fix] Layer wise benchmarks: use local models, lint ( #8799 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 09:47:46 -07:00
Emma Qiao
9112cffaf3
[None][infra] Waive failed case for main branch ( #8797 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-30 07:57:35 -07:00
Tailing Yuan
f9c7786dc8
[None][feat] Add layer wise benchmarks ( #8777 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 20:29:34 +08:00
Emma Qiao
a5cc9fe0aa
[TRTLLM-5453][infra] Check all steps for test name and also check the test in waives.txt also exists in l0 or qa test list. ( #6256 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-10-30 01:56:04 -07:00
xinhe-nv
a4f75399b9
[ https://nvbugs/5481206 ][fix] update waives ( #8774 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-30 00:43:38 -07:00
Emma Qiao
7d3cebf34e
[None][infra] Unwaive the tests passed in latest CI and disable a perf stage ( #8775 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-30 12:48:23 +08:00
Emma Qiao
db99a936b0
[TRTLLM-8971][infra] Update gpu key for B300/GB300 ( #8724 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-29 20:36:44 -07:00
Yuxian Qiu
3176bd3815
[None][fix] Fix UnboundLocalError. ( #8756 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-10-29 19:41:37 -07:00
HuiGao-NV
ae57738bae
[ https://nvbugs/5547414 ][fix] Use cached models ( #8755 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-29 19:10:10 -07:00
Iman Tabrizian
ae6875fe10
[TRTLLM-8976][feat] Move indexer-k-cache to KVCacheManager ( #8699 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-10-29 08:04:26 -07:00
Emma Qiao
579e1067bf
[None][infra] Waive failed tests on main ( #8759 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-29 21:32:23 +08:00
Yan Chunwei
fc3b6f5331
[None][ci] waive test_rpc.py ( #8745 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-10-29 05:17:40 -07:00
Chang Liu
81eb861df0
[None][chore] Enable GPQA in CI for DeepSeek V3.2 ( #8712 )
...
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-10-29 04:22:22 -07:00
Zheng Duan
d626d13d37
[ https://nvbugs/5607238 ][test] fix working dir in disagg worker test ( #8648 )
...
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
2025-10-29 16:13:52 +08:00
Pengyun Lin
2aade46d18
[TRTLLM-8214][feat] Support Qwen3 tool parser ( #8216 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2025-10-29 15:48:29 +08:00
xinhe-nv
7ba98a6b20
[None][chore] Add failed cases into waives.txt ( #8684 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-28 20:30:01 -07:00
Yan Chunwei
f2faf2809f
[None][ci] waive test_rpc.py temporarily ( #8743 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-10-28 19:20:27 -07:00
Zheng Duan
fea5bfbda7
[None][feat] add detailed KV cache transfer time breakdown ( #8521 )
...
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
2025-10-29 10:11:09 +08:00
ruodil
f444fe2deb
[None][test] fix a typo in perf test sampler config ( #8726 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-10-29 09:53:53 +08:00
Lizhi Zhou
24167d00eb
[TRTLLM-8431][doc] update public doc and example, add etcd auto-scaling tests ( #8602 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-10-28 17:04:53 -07:00
dongfengy
083f3637f1
[ https://nvbugs/5596343 ][test] Update test waive to get back some coverage ( #8702 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2025-10-28 14:05:48 -07:00
Anish Shanbhag
a09b38a862
[TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum ( #8330 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-10-28 09:17:26 -07:00
dongfengy
5a01f382c1
[ https://nvbugs/5575913 ][fix] Use separate thresholds for 120b/20b gptoss ( #8664 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2025-10-28 10:35:07 -04:00
Robin Kobus
e8e2b0697a
[None][chore] Revert "[TRTLLM-7835][test] add default sample config for perf test ( #8523 )" ( #8725 )
...
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-10-28 14:23:38 +01:00
ruodil
6b9b73ee27
[ https://nvbugs/5564465 ][test] ensure deepseek_v3_lite isl + osl < max_seq_len ( #8565 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-10-28 15:25:52 +08:00
ruodil
bf72eb045e
[TRTLLM-7835][test] add default sample config for perf test ( #8523 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-10-28 02:22:47 -04:00
yufeiwu-nv
0e36484fba
[None][test] Add gpt_oss_20b Model to Sanity Perf Test ( #8265 )
2025-10-28 13:36:28 +08:00
Aurelien Chartier
0a02f5f25d
[None][chore] Use a cached model path for Ray integration test ( #8660 )
...
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-10-27 19:16:06 -07:00
HuiGao-NV
49974eed75
[None][chore] ISOLATE some cases ( #8690 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-27 22:10:44 -04:00
chenfeiz0326
f5265a087b
[None][infra] Minor Update on Perf Sanity Testdb Files ( #8607 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-10-28 09:54:48 +08:00
gramnarayan
88b0fbc8ff
[ #8245 ][feat] Autodeploy: Guided Decoding Support ( #8551 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Co-authored-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-28 09:29:57 +08:00
Yechan Kim
a6017f6266
[ https://nvbugs/5608723 ][fix] Use local data on multimodal tests and unwaive tests ( #8673 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-10-28 09:20:02 +09:00
Emma Qiao
73a5479b26
[None][infra] Skip failed tests for main 10/27 ( #8686 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-28 08:04:30 +08:00
Bo Li
9c4432f8a4
[TRTLLM-7318][feat] MnnvlThroughput AlltoAll implementation. ( #7499 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-10-27 13:23:06 -04:00
mpikulski
7c8ba71b49
[TRTLLM-8832][feat] fully async _select_generated_logits with tests ( #8628 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-10-27 16:15:32 +01:00
Kaiyu Xie
c9b08790c2
[None] [test] Add MNNVL AlltoAll tests to pre-merge ( #8601 )
2025-10-27 21:39:44 +08:00
Jie Li
ce0d76135d
[ https://nvbugs/5546507 ][fix] skip TRT-Flow test case due to CMake Error in building ( #8677 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
2025-10-27 05:11:47 -04:00
xinhe-nv
8090c9641c
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #8672 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-27 03:20:46 -04:00
xinhe-nv
0ac5cbcac4
[None][chore] Add failed cases into waives.txt ( #8669 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-27 02:36:28 -04:00
QI JUN
cc5b8b6d28
[None][ci] move some time-consuming benchmark test cases to post merge ( #8641 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-10-26 22:47:17 -04:00
Emma Qiao
e0728ba8a7
[None][infra] Waive failed case on main 10/26 ( #8668 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-26 22:02:32 +08:00
Chenghao Zhang
a6d20f6f9b
[None][feat] AutoDeploy: Add FP8 MOE for Nemotron ( #8599 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Co-authored-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
2025-10-25 15:26:45 -04:00
Simeng Liu
2b27810198
[ https://nvbugs/5494718 ][fix] Fix Single GPU Multi-node issue and OOM on DGX Spark ( #8514 )
...
Signed-off-by: Simeng Liu <simengl@nvidia.com>
2025-10-24 19:09:07 -07:00
jthomson04
02081e2390
[None][feat] Support KV Connector with Disagg Prefill Worker ( #8246 )
...
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2025-10-24 11:09:06 -07:00
Chang Liu
e47c787dd7
[TRTLLM-8535][feat] Support DeepSeek V3.2 with FP8 + BF16 KV cache/NVFP4 + BF16 KV cache ( #8405 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-10-24 13:40:41 -04:00
Chuang Zhu
2420918e5b
[TRTLLM-7078][chore] optimal kvcache transfer for VWSA ( #7952 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-10-24 08:58:16 -04:00
Emma Qiao
35e35db422
[None][infra] Waive tests on main and remove lines which missed in MI ( #8639 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-10-24 02:49:23 -04:00
xinhe-nv
2aaedd08cd
[TRTLLM-8638][fix] fix test issues ( #8557 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-24 02:16:55 -04:00
xinhe-nv
9a9d647292
[None][chore] Add failed cases into waives.txt ( #8630 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-24 02:11:03 -04:00
ruodil
07a957e5cb
[None][test] remove redundant runtime backend in perf test ( #8358 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-10-24 02:01:34 -04:00
Stanley Sun
6b793d5c3d
[TRTLLM-8738][test] Add end-to-end trtllm-serve negative tests ( #8580 )
...
Signed-off-by: Stanley Sun <stsun@nvidia.com>
2025-10-24 13:23:47 +08:00
xinhe-nv
59375e8bed
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #8590 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-24 00:02:42 -04:00
xinhe-nv
95d39e6e76
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #8588 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-23 23:08:52 -04:00
xinhe-nv
04e2b2752a
[None][feat] add Nemotron-Ultra multi nodes eval tests ( #8577 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-23 02:44:26 -04:00
Anthony Chang
8a3b870e09
[None][feat] Update TRTLLM MoE MxFP4 cubins; autotune tileN ( #8156 )
...
Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>
2025-10-23 09:14:18 +08:00
Anish Shanbhag
15de45d782
[TRTLLM-8682][chore] Remove auto_parallel module ( #8329 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-10-22 20:53:08 -04:00
xinhe-nv
b8b2c9efb4
[None][chore] add precommit hook to remove redundant tab and white space ( #8534 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-22 09:21:54 -04:00
Eran Geva
d4b3bae5af
[ #8391 ][fix] check perf by device subtype ( #8428 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-10-22 12:38:05 +03:00
Ivy Zhang
912cf4f603
[TRTLLM-8785][fix] fix conflicts between periodic-junit and store-durations ( #8518 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-10-22 04:36:47 -04:00
Emma Qiao
92e99b6545
[None][infra] Waive failed cases for main branch 10/22 ( #8573 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-22 04:21:56 -04:00
Shi Xiaowei
77940635bb
[ https://nvbugs/5451272 ][fix] unwaive the test ( #8537 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-10-22 14:28:42 +08:00
xinhe-nv
187cf12d8f
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #8554 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-22 01:26:15 -04:00
Emma Qiao
2b4e812aea
[None][infra] Let CI continue running other isolation tests when an isolation test hangs ( #8471 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-10-22 00:07:35 -04:00
chenfeiz0326
6cf1c3fba4
[TRTLLM-8260][feat] Add Server-Client Perf Test in pytest for B200 and B300 ( #7985 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-10-22 10:17:22 +08:00
sunnyqgg
90080e0e09
[ https://nvbugs/5556020 ][fix] test_disaggregated_serving.py::TestLlama3_1_8BInstruct::test_eagle3 dimension mismatch ( #8517 )
...
Signed-off-by: qgai <qgai@nvidia.com>
2025-10-22 09:58:22 +08:00
Chenghao Zhang
bac9e8c2ad
[None][feat] AutoDeploy: Add Nemotron MOE support for AutoDeploy ( #8469 )
2025-10-21 15:32:01 -07:00
Lizhi Zhou
23d5280a90
[TRTLLM-7843][feat] implement disagg cluster auto-scaling ( #8215 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-10-21 17:25:07 -04:00
Emma Qiao
653aa6b6dc
[None][infra] Waive failed tests for main 10/21 ( #8524 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-21 06:24:15 -04:00
xinhe-nv
c566890624
[TRTLLM-8638][fix] Remove closed bugs ( #8478 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-21 03:48:58 -04:00
xinhe-nv
3264d605fb
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #8486 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-21 01:20:29 -04:00
ruodil
ab4b9966b2
[TRTLLM-7287][test] add multimodal chunked_prefill cases ( #8011 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-10-20 22:43:47 -04:00
Suyog Gupta
7050b1ea49
[ #8272 ][feat] Enable chunked prefill for SSMs in AutoDeploy ( #8477 )
...
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-10-20 15:31:52 -07:00
dongfengy
9b289d5230
[ https://nvbugs/5568676 ][fix] Remove test waive ( #8437 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-10-20 12:03:50 -07:00
HuiGao-NV
d0663e16e0
[ https://nvbugs/5492250 ][fix] Remove isolated cases and unwaive cases ( #8492 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-20 07:40:07 -04:00
Pamela Peng
b818a912d7
[ https://nvbugs/5540752 ][fix] Support quantized Phi4 MM models ( #8190 )
...
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
2025-10-20 06:36:09 -04:00
QI JUN
d05079ba4b
[None][ci] move some test cases from H100 to A10 ( #8449 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-10-20 01:58:34 -04:00
xiweny
f7722e2b65
[TRTLLM-4866] [test] Support waiving unit tests by waives.txt ( #8359 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-10-20 09:52:51 +08:00
xinhe-nv
9aa086d3bb
[None][chore] update test duration ( #8377 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-19 20:45:51 -04:00
Bo Deng
dd25595ae8
[TRTLLM-7964][infra] Set nixl to default cache transceiver backend ( #7926 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-10-19 19:24:43 +08:00
Emma Qiao
e185173240
[None][infra] Waive test for main branch on 10/18 ( #8472 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-19 04:36:42 -04:00
brb-nv
7cc65a6296
[None][chore] Waive failing transceiver test ( #8473 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-10-18 17:22:10 -04:00
Lucas Liebenwein
41169fb20c
[None][feat] AutoDeploy: chunked prefill support ( #8158 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-18 00:47:35 -07:00
h-guo18
55fed1873c
[None][chore] AutoDeploy: cleanup old inference optimizer configs ( #8039 )
...
Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-17 15:55:57 -04:00
xinhe-nv
bc833d3de3
[TRTLLM-8638][fix] add waives tests ( #8445 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-17 03:37:53 -07:00
zhhuang-nv
7a2bab93f0
[None][test] Add post merge test for Seed-OSS-36B-Instruct ( #8321 )
...
Signed-off-by: Zhen Huang <145532724+zhhuang-nv@users.noreply.github.com>
2025-10-17 02:30:33 -07:00
yufeiwu-nv
1e1f430163
[None][test] Filter out all fp8 test case for A100. ( #8420 )
...
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-10-16 20:42:50 -07:00
Ivy Zhang
70a0f5beb6
[TRTLLM-8580][test] save runtime report periodically ( #8312 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-10-17 10:47:26 +08:00
John Calderon
46ee7acb33
[TRTLLM-6780][fix] Add multimodal data to dummy requests during memory profiling ( #7539 )
...
Signed-off-by: John Calderon <johncalesp@gmail.com>
Signed-off-by: John Calderon <jcalderon@nvidia.com>
Signed-off-by: john calderon <jcalderon@nvidia.com>
Signed-off-by: John Calderon <jcalderon@nvidia>
2025-10-16 17:49:22 +02:00
Yiqing Yan
05dd437084
[ https://nvbugs/5565541 ][fix] Add timeout threshold for H100 FHMA test ( #8354 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
bhsueh_NV
69325e1aa3
[ https://nvbugs/5574556 ][fix] fix bug of Qwen3_235B_A22B::test_fp8 CI ( #8351 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Lizhi Zhou
982d4b65e8
[ https://nvbugs/5550671 ][fix] fix disagg-serving multinodes test failure ( #8307 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Chuang Zhu
18a534d2b4
[ https://nvbugs/5465642 ][fix] Increase server timeout to wait weight loading ( #8297 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Enwei Zhu
526cad37d7
[ https://nvbugs/5568951 ][fix] Fix guided decoding disagg tests ( #8311 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Ivy Zhang
1b559ba91d
[None][chore] Update test configs for release ( #8224 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Ivy Zhang
4789c1e588
[TRTLLM-8246][test] add multimodal kvcache+chunked_prefill cases into QA test list ( #8212 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Ivy Zhang
be2ab98233
[None][chore] Update constraint for release ( #8211 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Yukun He
179c7dc501
[ https://nvbugs/5536131 ][fix] Fix illegal access issue when scale is not provided in Llama3/4. ( #7960 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
xinhe-nv
f70eff30b3
[TRTLLM-8638][fix] waive llama4 tests on H20 ( #8416 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-16 03:14:56 -07:00
HuiGao-NV
4e6a492aa3
[None][chore] Isolate several intermittent cases ( #8408 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-15 23:48:31 -07:00
xiweny
4143887370
[ https://nvbugs/5541494 ] [fix] Remove waivers ( #8353 )
...
Signed-off-by: xiweny <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-10-15 19:10:35 -07:00
Chuang Zhu
40d129a415
[None][fix] Fix cache buffer size for window ( #8320 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-10-16 09:01:11 +08:00
dongfengy
7a0aa64973
[None][fix] Refactor triton paddings ( #6980 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
Co-authored-by: hlu1 <14827759+hlu1@users.noreply.github.com>
2025-10-15 12:59:01 -07:00
mpikulski
0510b34588
[TRTLLM-8551][feat] add cache_salt in LLM.generate and refactor test_return_logits.py ( #8317 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-10-15 02:53:57 -07:00
QI JUN
1a1c9a29ab
[None][ci] move all llama4 test cases to post merge ( #8387 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-10-15 16:36:37 +08:00
mpikulski
93a4b7f1b6
[None][chore] update torch_dtype -> dtype in 'transformers' ( #8263 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-10-15 17:09:30 +09:00
Jin Li
206a9930df
[ https://nvbugs/5547435 ][fix] Fix a merge conflict ( #8365 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-10-15 10:43:10 +08:00
Emma Qiao
493da020c1
[TRTLLM-7351][infra] Add isolate marker for L0 ( #7497 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-10-14 16:58:14 -07:00
dongfengy
9d855f47ad
[None][fix] Remove outdated test waives for GPTOSS ( #8183 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-10-14 16:20:38 -07:00
Michal Guzek
1cdb0b62c3
[ https://nvbugs/5563469 ][fix] Temporarily disable test_nemotron_nano_8b_lora_torch in L0 due to Torch non-determinism ( #8206 )
...
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
2025-10-14 17:55:28 +02:00
William Zhang
72d65d079a
[ https://nvbugs/5542878 ][fix] Unwaive test ( #8027 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-10-14 07:58:07 +02:00
xinhe-nv
371fcb0338
[TRTLLM-8366][feat] add kimi multi nodes case ( #8025 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-13 21:36:03 -07:00
Yuxian Qiu
3450fe9944
[None][fix] Fix dummy load format for key models. ( #7993 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-10-14 11:18:39 +08:00
Robin Kobus
db8c63b9b1
[TRTLLM-4517] [feat] Additional model outputs ( #7206 )
...
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-10-13 15:33:18 +02:00
xinhe-nv
9fe63dd8db
[None][chore] Add failed cases into waives.txt ( #8290 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-13 00:07:00 -07:00
xinhe-nv
72fcff1044
[None][fix] add timeout for llama4 ( #8254 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-12 21:04:20 -07:00
Guoming Zhang
989c25fcba
[None][doc] Add qwen3-next doc into deployment guide and test case into L0. ( #8288 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Faradawn Yang <faradawny@gmail.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-10-13 10:25:45 +08:00
Emma Qiao
fdbeea51d3
[None][infra] Skip failed cases for main branch ( #8293 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-12 08:04:09 -07:00
brb-nv
56a539cd37
[None][chore] Waive failing pre-merge test on main ( #8282 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-10-10 23:52:05 -07:00
Yilin Fan
2695d70d42
[None][feat] Add request timing breakdown option in benchmark_serving ( #8128 )
...
Signed-off-by: nv-yilinf <206948969+nv-yilinf@users.noreply.github.com>
2025-10-10 09:24:54 -07:00
xinhe-nv
2655995a09
[None][fix] add gc for test fixture ( #8220 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-10 02:50:25 -07:00
bhsueh_NV
d3059dbd8a
[ https://nvbugs/5547416 ][fix] unwaive no_cache test ( #8213 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-10-10 01:50:13 -07:00
xinhe-nv
b555f1ff98
[None][chore] Add failed cases into waives.txt ( #8229 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-09 23:45:28 -07:00
xinhe-nv
e8c9bae37e
[None][chore] Remove closed bugs ( #8151 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-10 16:39:40 +11:00
Emma Qiao
ccd949ea5b
[None][infra] Waive failed tests on main 10/09 ( #8230 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-09 22:46:07 +08:00
bhsueh_NV
27677a36f5
[ https://nvbugs/5516666 ][fix] unwaive some Qwen3 CI tests ( #8130 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-10-09 09:44:58 +08:00
Lizhi Zhou
fdf29ab8fa
[TRTLLM-7846][feat] Http disagg-cluster management implementation ( #7869 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-10-09 09:44:01 +08:00
QI JUN
6884d06aed
[None][ci] move some llama4 test cases to pre merge ( #8189 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-10-08 18:34:08 -07:00
Liao Lanyu
ed8e00ad4a
[ https://nvbugs/5522746 ][fix] unwaive tests caused by node issues after rebooting ( #8193 )
...
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
2025-10-09 08:45:56 +08:00
Mike Iovine
c88913dc03
[ https://nvbugs/5541545 ][fix] Remove test_llama4 ( #8031 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-08 15:20:15 -07:00
brb-nv
80517b7812
[None][chore] Waive some tests failing on main post merge ( #8186 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-10-08 06:52:30 -07:00
mpikulski
8298e93bd8
[TRTLLM-8414][chore] BREAKING CHANGE: refine sampling strategy selection ( #8132 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-10-08 15:46:50 +02:00
Liao Lanyu
d57b8f0951
[ https://nvbugs/5455140 ][fix] unwaive tests related to GB200 OOM ( #8159 )
...
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
2025-10-08 13:14:12 +08:00
ruodil
971610e3ff
[None][test] add test-model-suites option in integration conftest.py ( #8016 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-10-08 10:38:31 +08:00
Mike Iovine
7facac077b
[None][fix] Fix MTP illegal memory access ( #8161 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-07 14:02:55 -04:00
Emma Qiao
ca9da1f1c2
[None][infra] Skip failed cases for main ( #8176 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-07 06:37:51 -07:00
xiweny
9298f1bdcc
[None] [test] Add B300 cases to CI ( #8056 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-10-06 19:23:31 -07:00
Faraz
27a5091fcb
[None][feat] GPT-OSS Sm120/Sm121 Support ( #7937 )
...
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: list <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: Vincent Huang <vincenth@nvidia.com>
Co-authored-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Vincent Huang <vincenth@nvidia.com>
2025-10-06 16:59:06 -04:00
Lucas Liebenwein
3492391feb
[None][chore] AutoDeploy: clean up accuracy test configs ( #8134 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-06 12:51:01 -07:00
Yan Chunwei
fb51de6c2e
[TRTLLM-8189][chore] enhance GenerationExecutor with RPC (part1) ( #5543 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: chunweiy <chunweiy@nvidia.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: chunweiy <328693+Superjomn@users.noreply.github.com>
2025-10-05 17:28:20 +08:00
Jonas Yang CN
88ea2c4ee9
[TRTLLM-7349][feat] Adding new orchestrator type -- ray ( #7520 )
...
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-10-04 08:12:24 +08:00
Lucas Liebenwein
2c454e8003
[None][feat] AutoDeploy: Nemotron-H accuracy test ( #8133 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-03 15:39:03 -07:00
Michal Guzek
38da871db3
[TRTLLM-6496][feat] Add LoRa Torch tests for the latest NIM model list ( #6806 )
...
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
2025-10-03 12:10:48 -07:00
Mike Iovine
ca8291133a
[None][fix] Fix MTP 2-model ( #8115 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-10-03 10:13:50 -07:00
Lucas Liebenwein
5faa5e9dd8
[None][feat] AutoDeploy: dive deeper into token generation bugs + enable_block_reuse ( #8108 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-03 04:57:26 -07:00
Yilin Fan
01423ac183
[None][feat] perf_metrics endpoint functionality improvement ( #8005 )
...
Signed-off-by: Yilin Fan <206948969+nv-yilinf@users.noreply.github.com>
Signed-off-by: nv-yilinf <206948969+nv-yilinf@users.noreply.github.com>
2025-10-02 17:43:25 -07:00
Eran Geva
4136942436
[ #7588 ][fix] fixed the kv cache size parsing in test_perf.py AD backend ( #8092 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-10-02 15:55:31 -04:00
Erin
293637e0a1
[ https://nvbugs/5556020 ][chore] waive test_eagle3 ( #8119 )
...
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-10-02 05:33:21 -04:00
mpikulski
fc7f78c400
[TRTLLM-8269][test] do not explicitly pass temperature=0 to select greedy sampling ( #8110 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-10-02 10:20:32 +02:00
Eran Geva
32c7f8c36f
[ #7588 ][feat] lock gpu clocks in test_perf.py to reliably detect perf regressions ( #8099 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-10-02 11:18:10 +03:00
Patrice Castonguay
b77f19f4ff
[ https://nvbugs/5434320 ][fix] fix: Unwaiving disagg pp tests ( #8069 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-10-01 00:33:59 -04:00
Emma Qiao
b1e3fef8aa
[None][infra] Skip failed tests in post-merge for main ( #8102 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-01 10:12:10 +08:00
brb-nv
84aa3c981e
[None][chore] Waive failing MNNVL alltoall multi-gpu test ( #8106 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-09-30 20:05:42 -04:00