Commit Graph

1715 Commits

Author SHA1 Message Date
Emma Qiao
75bc386b65
[None][infra] Waive failed cases for main branch on 12/09 (#9839)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-09 19:39:29 +08:00
QI JUN
58c29957d9
[TRTLLM-9794][ci] move qwen3-next test cases to gb200 (#9827)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-09 01:58:25 -08:00
Robin Kobus
76f49c903b
[None][fix] Additional model outputs for pipeline parallelism (#9794)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-09 10:41:22 +01:00
yufeiwu-nv
fbcf03040f
[None][test] Refactor qa/llm_perf_nim.yml test list (#9700)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-08 22:00:43 -08:00
QI JUN
252769c930
[TRTLLM-9794][ci] remove duplicated test cases in DGX B200 (#9817)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-08 21:51:30 -08:00
Shi Xiaowei
b050804b63
[TRTLLM-6537][infra] extend multi-gpu tests related file list (#9614)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-09 12:54:53 +08:00
JunyiXu-nv
90890785eb
[https://nvbugs/5722653][fix] Fix config file used by disagg_client (#9783)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-08 20:34:55 -08:00
Balaram Buddharaju
bafb60c1bc
[None][chore] Fix tests failing on pre-merge 12/08 (#9819)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-08 20:08:52 -08:00
Bo Li
f2006a1f74
[https://nvbugs/5726066][infra] Waive timeout disaggregated/test_auto_scaling tests. (#9815)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-12-08 19:51:43 -08:00
Jiagan Cheng
4a3a66b124
[https://nvbugs/5677746][fix] Use first PP rank's schedule result in other PP ranks to fix PP hang (#9659)
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
2025-12-08 18:43:52 -08:00
yuanjingx87
390391ebf1
[None][infra] Correct the waived test names due to a merge conflict (#9803)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-12-09 09:48:21 +08:00
Chenghao Zhang
75f5446d67
[#9753][feat] AutoDeploy: Implement add rms_norm fusion (#9754)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-12-08 14:24:27 -08:00
Yibin Li
faabc1a387
[TRTLLM-7967][chore] Add more tests (#9415)
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
2025-12-08 11:57:32 -08:00
Jhao-Ting Chen
0a09465089
[https://nvbugs/5567586][feat] Ampere xqa swa specdec for GPT-OSS Eagle3-one-model (#8383)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-08 11:16:05 -08:00
Frank
f6df9eb2a6
[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench (#9250) 2025-12-08 10:37:40 -08:00
Lizhi Zhou
52f78e4000
[http://nvbugs/5649010][fix] fix test_auto_scaling.py::test_worker_restart timeout (#9775)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-08 03:26:01 -08:00
fredricz-20070104
96d9b67d65
[https://nvbugs/5527655][test] Add test case for RCCA 5527655 (#9511)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-08 01:27:13 -08:00
fredricz-20070104
ededeecb0f
[None][test] Add Kimi k2 WIDEEP perf and accuracy cases (#9686)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-08 01:25:07 -08:00
xinhe-nv
3f55c07223
[None][chore] Remove closed bugs (#9770)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-07 22:51:55 -08:00
Fanrong Li
2f526583fb
[None][chore] Move the rocketkv e2e test to post-merge (#9768)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-08 13:22:16 +08:00
Emma Qiao
137713a869
[None][infra] Waive failed cases for main on 12/08 (#9773)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-07 20:18:29 -08:00
ruodil
d232709568
[https://nvbugs/5666804][test] only adding sampler config for limited models (#9512)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-12-07 19:40:29 -08:00
fredricz-20070104
9bfb6179ec
[https://nvbugs/5422621][test] Add GB 200 WIDEEP test case for RCCA 5422621 (#9506)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-08 10:41:40 +08:00
xxi
8e27ce7084
[TRTLLM-9603][feat] Enable ConfigurableMoE test in the CI (#9645) 2025-12-08 10:19:40 +08:00
Zheng Duan
4da0e1473c
[None][test] add ntp tolerance in time metrics verification (#9741)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
2025-12-08 09:51:10 +08:00
chenfeiz0326
383178c00a
[TRTLLM-9000][feat] Add multi-node Perf Tests into CI (#8800)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-12-08 09:00:44 +08:00
Ludwig Schneider
41ce14ab04
[None][feat] Enable NCCL_SYMMETRIC as default fallback for AllReduce (#9314)
Signed-off-by: Ludwig Schneider <lschneider@nvidia.com>
2025-12-07 09:43:26 -08:00
Emma Qiao
7c6c493993
[None][infra] Waive failed cases for main branch on 12/07 (#9769)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-07 06:26:47 -08:00
JunyiXu-nv
b210f22c7e
[https://nvbugs/5703953][fix] Preserving ip:port for trtllm-serve before initializing llm (#9646)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-06 20:13:48 -08:00
Mike Iovine
31ab367576
[None][chore] Waive flakey disagg tests (#9749)
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 13:07:05 -08:00
jthomson04
299601aebf
[https://nvbugs/5670672][fix] Fix flaky KV connector tests (#9676)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2025-12-05 10:04:54 -08:00
Robin Kobus
faf682b8bc
[TRTLLM-7136][feat] Update load_weights method to include mapping parameter in checkpoint loaders (#9583)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-05 16:07:20 +01:00
yufeiwu-nv
68253d9d29
[https://nvbugs/5518713][test] Refactor core test lists by merging with llm_perf_cluster.yml (#9714)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-05 01:15:37 -08:00
Kaiyu Xie
e06c582648
[None] [tests] Unwaive EPLB tests (#9625)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-05 00:13:24 -08:00
Lizhi Zhou
dc766fc126
[https://nvbugs/5633340][fix] start disagg workers and servers on free ports (#9694)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-05 10:51:29 +08:00
Lizhi Zhou
0d0a16fff4
[TRTLLM-8920][feat] decouple disagg service from fastapi (#8714)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-05 10:44:16 +08:00
xinhe-nv
530af1a98e
[None][chore] Add failed cases into waives.txt (#9662)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-04 22:33:22 +08:00
ruodil
8a392af28f
[None][test] rename wide ep and disagg metric name in perf test (#9704)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-12-04 18:16:06 +08:00
Yan Chunwei
05058f5e2a
[None][ci] unwaive tests (#9651)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-12-04 15:06:07 +08:00
JunyiXu-nv
6d2daec5d0
[TRTLLM-8274][feat] Check if executor is shutdown in /health entrypoint (#9057)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-04 13:49:40 +08:00
Jin Li
87e0c8a749
[TRTLLM-7073][feat] Support torch compile for PP for Llama and DeepSeekV3 (#7838)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-12-04 13:32:11 +08:00
mpikulski
744f0eff1b
[TRTLLM-9522][fix] restore trtllm-serve mm_embedding_serve (#9669) 2025-12-03 19:27:11 -08:00
Yiqing Yan
e31142202e
[TRTLLM-7181][infra] Generate test results when pytest timeout happens (#9396)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-04 10:05:38 +08:00
gramnarayan
098b9ff226
[#9147][feat] AutoDeploy: Draft Target Speculative Decoding (#9275)
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
2025-12-04 05:13:49 +08:00
Michal Guzek
4e5b10da48
[https://nvbugs/5552132][fix] Enable LoRa for GPT OSS Torch (#8253)
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
2025-12-03 15:42:15 +01:00
Patrice Castonguay
ae8d8a266a
[https://nvbugs/5705197][chore] Unwaive timeout disagg tests (#9637)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-03 22:18:36 +08:00
Guoming Zhang
79e872de31
[None][test] Update Qwen3-next accuracy testing by setting the cuda … (#9613)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-12-03 20:52:53 +08:00
xinhe-nv
3a748b166b
[None][chore] Add failed cases into waives.txt (#9593)
Signed-off-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
2025-12-03 16:26:06 +08:00
fredricz-20070104
80ff9015ce
[https://nvbugs/5561153][test] Fix log error for perf test (#9622)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-03 15:27:13 +08:00
brb-nv
43f6ad7813
[https://nvbugs/5708475][fix] Fix e2e eval accuracy for helix parallelism (#9647)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-03 15:13:59 +08:00
heyuhhh
a08eb81cce
[None][feat] Add RocketKV usage doc and e2e accuracy test on LongBenchV2 (#9572)
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2025-12-03 11:33:46 +08:00
yufeiwu-nv
21f2ba74e8
[None][test] Remove duplicate test cases (#9623)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-03 10:35:26 +08:00
brb-nv
55c7023c92
[None][chore] Waive test failing on pre-merge (#9638)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-03 07:31:10 +08:00
Patrice Castonguay
3991aa9c72
[https://nvbugs/5688388][fix] fix: Reducing num request in disagg test to speed up (#9598)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-02 12:48:53 -05:00
Shi Xiaowei
227d42e492
[https://nvbugs/5651854][fix] Fix dist-serving perf by clearing CPU affinity (#9549)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-03 01:17:03 +08:00
Mike Iovine
d5b7f0c8ad
[TRTLLM-8980][test] Clean up spec dec tests in test_llm_api_pytorch (#8889)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-02 10:32:02 -05:00
Yan Chunwei
b86256eb54
[TRTLLM-9144][fix] enhance RPC robustness (#8711)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-12-02 21:37:59 +08:00
brb-nv
be48cdf1d1
[TRTLLM-9466][test] Evaluate helix parallelism with DSV3 Lite (#9597)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-02 20:10:07 +08:00
Emma Qiao
4a8766c11d
[None][infra] Remove an invalid test name in waives.txt (#9620)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-02 18:05:17 +08:00
Emma Qiao
3e4f2388a9
[None][infra] Waive failed cases for main branch (#9615)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-02 15:48:27 +08:00
shuyixiong
1a2118b8fe
[https://nvbugs/5702793][fix] Fix uncontiguous tensor view (#9576)
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
2025-12-02 15:41:32 +08:00
xinhe-nv
ad46d19027
[None][chore] Add failed cases into waives.txt (#9588)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-02 14:24:11 +08:00
ruodil
4586b5f42f
[https://nvbugs/5582091][test] increase warmup times in testing for multi-gpu cases (#9578)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-12-02 14:22:49 +08:00
Wanli Jiang
5657a00ec0
[FMDL-1328][feat] Add support for nano-v3 and super-v3 with pytorch backend (#9261)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-02 13:40:20 +08:00
xinhe-nv
3911d0496e
[None][fix] Waive gb200 (#9580)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-02 12:09:21 +08:00
JunyiXu-nv
9a6df980cd
[https://nvbugs/5703953][fix] Use random port for disagg tests (#9582)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-02 11:40:14 +08:00
Iman Tabrizian
356a52edf5
[None][feat] Add support for KVCache reuse for DSv32 (#9383)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-12-02 11:14:30 +08:00
Venky
639c939a4f
[TRTC-1943][feat] Env vars override support in LLM API (#9104)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-12-01 10:04:49 -08:00
Yanchao Lu
7127c4407a
[None][test] [None][test] Waive main branch test failures 12/1 (#9566)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-01 21:54:53 +08:00
Shi Xiaowei
48b1d31895
[https://nvbugs/5651854][infra] Enable perf metrics during accuracy testing (#9140) 2025-12-01 20:15:32 +08:00
JadoTu
a92af27411
[None][chore] remove qwen3-next accuracy tests (#9534)
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
2025-12-01 11:49:37 +08:00
Pengbo Wang
aa3310f64f
[https://nvbugs/5503479][fix] Temporarily lower reference accuracy to stabilize CI (#9398)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
2025-12-01 11:49:14 +08:00
Enwei Zhu
2e3ac3c48f
[https://nvbugs/5684703][fix] Unwaive disagg guided decoding test (#9466)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-01 11:39:40 +08:00
JunyiXu-nv
3f588198dc
[None][fix] Fix port conflict in disagg tests (#9474)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-30 17:33:22 +08:00
Emma Qiao
c927ccf510
[None][infra] Wiave failed tests for main branch on 11/30 (#9555)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-30 16:13:20 +08:00
brb-nv
b77f4ffe54
[TRTLLM-5971][feat] Integrate helix parallelism (#9342)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-29 15:17:30 -08:00
dominicshanshan
6345074686
[None][chore] Weekly mass integration of release/1.1 -- rebase (#9522)
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: qgai <qgai@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: sunnyqgg <159101675+sunnyqgg@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Vincent Zhang <vcheungyi@163.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Leslie Fang <leslief@nvidia.com>
Co-authored-by: Shunkangz <182541032+Shunkangz@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-11-29 21:48:48 +08:00
dominicshanshan
70efa3ac43
[None][infra] Waive failed case in pre-merge on 11/28 (#9537)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-11-28 20:53:45 +08:00
Emma Qiao
2d7421b314
[None][infra] Waive failed cases for main branch on 11/28 (#9539)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-28 17:19:55 +08:00
yufeiwu-nv
08755a809d
[https://nvbugs/5689658][test] Fix gpu lock issue running on cluster (#9441)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-28 13:59:22 +08:00
JunyiXu-nv
c87e81c1d8
[https://nvbugs/5685015][fix] Update invalid max_token test (#9435)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-28 11:41:16 +08:00
Bo Li
19f3f4e520
[https://nvbugs/5637037][chore] Update waive lists. (#9386)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-28 10:45:22 +08:00
Yueh-Ting (eop) Chen
4cbfc10b28
[https://nvbugs/5674665][chore] Add test coverage for https://nvbugspro.nvidia.com/bug/5674665 (#9518)
Signed-off-by: eopXD <yuehtingc@nvidia.com>
2025-11-27 21:40:34 +08:00
Fanrong Li
2d5eadf65f
[None][fix] fix TP support for DeepSeek-V3.2 on hopper (#9484)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-27 21:02:25 +08:00
JadoTu
51bf7164d3
[None][feat] add qwen3-next CI test of accuracy on BF16 and NVFP4 (#9330)
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
2025-11-27 18:05:00 +08:00
Lizhi Zhou
8104a78931
[None][chore] revert batch_size=1 to prevent timeout and lower accuracy reference by 0.12% as a WAR (#9447)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
2025-11-27 14:25:44 +08:00
Emma Qiao
0442510304
[None][infra] Waive failed case in pre-merge on 11/27 (#9507)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-27 13:53:33 +08:00
HuiGao-NV
03331bc43d
[https://nvbugs/5547414][fix] enable case after using local cache model (#9473)
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-11-27 12:18:20 +08:00
Patrice Castonguay
1b2da426cd
[https://nvbugs/5680310][fix] Fix ctx only timed out test (#9410)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-11-27 11:21:21 +08:00
Shi Xiaowei
e76e149861
[https://nvbugs/5608930][fix] Fix a typo (#9487)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-11-27 09:05:17 +08:00
Chang Liu
b10137fdd5
[None][feat] Support MLA chunked prefill for DeepSeek V3.2 model (#9376)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-11-26 16:38:25 +08:00
JunyiXu-nv
b7308a4000
[https://nvbugs/5580099][fix] Cherry pick IMA issue fix from release/1.1 (#9032)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-26 13:09:06 +08:00
Wanli Jiang
d100599ea7
[TRTLLM-9264][fix] Add accuracy/unit tests/doc for phi4mm (#9246)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-26 11:12:35 +08:00
QI JUN
5972119e1c
[None][ci] move some slow test cases of DGX-B200 to post merge (#9467)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-26 10:48:53 +08:00
fredricz-20070104
6a64cb4c71
[TRTLLM-8936][test] Add disagg and wideep multi-node multi-gpu test cases (#9356)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-11-26 10:34:49 +08:00
Chuang Zhu
0e9c7f8c07
[https://nvbugs/5685143][fix] avoid cudaFree overlap with cuda graph (#9438)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-11-25 16:20:29 -08:00
Suyog Gupta
e484bec82f
[None][chore] AutoDeploy add multi stream moe pass to default.yaml (#9430)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-11-25 14:16:13 -08:00
Fanrong Li
8da59103d6
[https://nvbugs/5680905][fix] Relax the MMLU accuracy requirement for DS-v3.2 (#9439)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-26 00:32:20 +08:00
Yan Chunwei
1f43dc8174
[None][ci] waive a test (#9458)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-11-25 07:04:20 -08:00
YueWeng
cc336c4abd
[TRTLLM-8160][feat] Add draft token tree runtime on CDL (#8586)
Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
2025-11-25 09:40:55 -05:00
Shi Xiaowei
60786574db
[None][fix] Mitigate test timeout issues (#9445)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-11-25 20:17:54 +08:00
Chao Ni
a2d9e6250a
[https://nvbugs/5667922][fix] Update long context evaluation config (#9426)
Signed-off-by: mni <125171826+baize97@users.noreply.github.com>
2025-11-25 19:33:38 +08:00
Yanchao Lu
ff02e0f05c
[None][ci] Move more test stages to use OCI machines (#9395)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Matt Lefebvre <matthewelefebvre@gmail.com>
2025-11-25 15:59:13 +08:00
Eran Geva
6af01dc664
[#8391][chore] test_perf.py to lock clocks read from gpu_configs.yml instead of max freq (#9409)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-11-25 09:20:33 +02:00
Emma Qiao
15616e3ee5
[None][infra] Waive failed cases for main branch on 11/25 (#9429)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-24 23:18:15 -08:00
Suyog Gupta
efd503751f
[#9271][perf] Enable multi-stream MOE optimization in AutoDeploy (#9322)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-11-24 19:50:10 -08:00
kris1025
d1c724958d
[None][chore] unwaive ampere kernels test (#9389)
Signed-off-by: linquanh <linquanh@nvidia.com>
2025-11-25 11:28:43 +08:00
xinhe-nv
0a9ae2e3e6
[None][chore] Remove closed bugs (#9381)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-24 18:49:57 -08:00
QI JUN
786d308b88
[https://nvbugs/5685428][fix] fix test_openai_chat_multimodal.py (#9406)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-24 16:56:33 -08:00
Yibin Li
1ce483c999
[TRTLLM-7967][feat] Adding Starcoder2 PyTorch Backend Support (#8923)
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
2025-11-24 11:23:22 -08:00
Emma Qiao
2c869f2bda
[None][infra] Waive failed cases for main (#9400)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-24 17:42:19 +08:00
Emma Qiao
af72d93fa9
[None][infra] Waive failed cases on main branch (#9384)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-23 22:53:02 -08:00
brb-nv
c045e359a7
[https://nvbugs/5637012][fix] Fix helix unit tests (#9369)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-23 19:34:22 -08:00
QI JUN
34a6d2d28f
[TRTLLM-9302][chore] Move build config from BaseLlmArgs to TrtLlmArgs (#9249)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-24 10:54:41 +08:00
Chenghao Zhang
e1c9aa7d6a
[None][chore] AutoDeploy: Add the Nemotron MOE to CI (#9328)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-11-23 12:12:12 -08:00
Yan Chunwei
1ef69ecbb1
[None][ci] waive two ray tests (#9375)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-11-23 15:39:01 +08:00
dongfengy
268ea9bb8a
[None][test] Add one-model and overlap-scheduling to eagle tests for GPTOSS (#9312)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-11-21 22:52:53 -08:00
Enwei Zhu
13fbd4366a
[TRTLLM-9370][feat] Integration of CuteDSL NVFP4 grouped GEMM (Part 2: SwiGLU Fusion and Finalize Fusion) (#9288)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-21 14:03:38 -08:00
Emma Qiao
041564188c
[None][infra] Waive failed cases in main post-merge on 11/21 (#9360)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-21 18:01:53 +08:00
QI JUN
b6483ef3e7
[None][ci] waive a test case of test_ad_build_small_multi.py (#9355)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-21 16:25:04 +08:00
Ivy Zhang
28e9bf6167
[None][chore] add periodic junit xml path in conftest (#9337)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-20 22:46:25 -08:00
QI JUN
e2a372a3b1
[None][ci] waive test_llm_context_only_timed_out_kv_cache_exhausted (#9351)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-20 20:20:57 -08:00
Barry Kang
a3433dd54e [https://nvbugs/5325296][fix] Enable relaxed acceptance test on Blackwell (#8709)
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Jin Li
6185225501 [https://nvbugs/5488118][fix] Unwaive passed tests (#8758)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
xiweny
05aabfbc1e [https://nvbugs/5601203] [fix]Restrict fp8 blockscale moe case (#8583)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Eran Geva
3d66e56adb [https://nvbugs/5572320][fix] Ported test_ad_trtllm_bench.py from main (#8671)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Yukun He
9a79f32f7a [https://nvbugs/5608489][fix] Fix output unpack issues for Llama3/4 NVFP4 models. (#8679)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Ivy Zhang
25c0624750 [None][test] Clean cache for certain easily hang cases (#8619)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Jie Li
36e244f35e [https://nvbugs/5587456][fix] Remove multimodal test cases using TRT backend (#8611)
Signed-off-by: Jie Li <lijie@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Lizhi Zhou
348668e3ae [https://nvbugs/5575902][fix] set max_batch_size=1 to stabilize accuracy test result (#8609)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Lizhi Zhou
33b0b945c7 [https://nvbugs/5582277][fix] rework DisaggPPTerminationHandler to fix hang issue (#8519)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Pengyun Lin
81fd9be87d [https://nvbugs/5575829][fix] Unwaive gpt-oss test (#8576)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Bo Deng
4ca6fe83d8 [https://nvbugs/5565549][fix] unwaive test_disaggregated_spec_dec_bat… (#8500)
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
JunyiXu-nv
ee6944bfa2 [https://nvbugs/5569713][fix] Disable fp8 deep gemm for EXAONE-4.0-32B-FP8 (#8429)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
yufeiwu-nv
0e746fad45
[https://nvbugs/5667454][test] Fix Test Case as Chunked Attention not Supported on sm_120 (#9260)
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-20 00:58:42 -08:00
Liao Lanyu
04ad9f96fa
[https://nvbugs/5667687][fix] Set correct lm_head_tp_size_upper_bound (#9300)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
2025-11-20 00:41:00 -08:00
Emma Qiao
b018b2698d
[TRTLLM-9164][infra] Enable checking duplicate items in waives.txt in pre-commit (#9265)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-20 15:47:23 +08:00
QI JUN
1bdd3ba173
[None][ci] waive test_disagg_server_restart (#9326)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-19 22:34:03 -08:00
Yechan Kim
d5622b2689
[None][fix] Multimodal InputProcessor dummy builder fix (#8916)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-19 22:32:21 -08:00
Chenghao Zhang
cd44f80abd
[#9316][feat] AutoDeploy: Add the accuracy test for Nemotron MOE models (#9317)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-11-19 21:48:50 -08:00
Bo Deng
2128f73d58
[TRTLLM-9247][infra] Upgrade NIXL to 0.7.1 (#9055)
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: jthomson04 <jwillthomson19@gmail.com>
2025-11-20 11:01:02 +08:00
brb-nv
f6ec6e2222
[None][chore] Waive tests timing out on main (#9315)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-19 13:10:06 -08:00
mpikulski
46dd9886bb
[https://nvbugs/5661877][fix] fix test regression in TestBatchedSampling::test_samples (#9215)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-11-19 01:44:44 -08:00
xinhe-nv
0f77fec932
[None][chore] Add failed cases into waives.txt (#9289)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-19 17:03:43 +08:00
nvxuanyuc
a79c0dfb43
[None][fix] Update GLM model accuracy test (#9286)
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
2025-11-18 21:59:01 -08:00
Emma Qiao
67d3eb26af
[None][infra] Waive failed cases for main branch on 11/17 (#9266)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-18 20:07:03 -08:00
xinhe-nv
286ace22ed
[None][chore] Add failed cases into waives.txt (#9242)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-18 19:27:55 -08:00
Ivy Zhang
782dfca7e8
[TRTLLM-9050][test] add llama4 disagg case to cover kv cache overflow error (#9172)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 18:26:32 -08:00
xinhe-nv
35658eab55
[None][chore] Add failed cases into waives.txt (#9193)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-18 17:47:55 -08:00
Enwei Zhu
7c4777a571
[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM (#8880)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-18 17:40:12 -08:00
Lizhi Zhou
c789000a62
[https://nvbugs/5649010][fix] increase status-checking interval to avoid instability (#9203)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-11-19 08:55:42 +08:00
Bo Deng
34f845bf69
[TRTLLM-9287][infra] Use NIXL backend for accuracy tests (#9247)
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-11-18 14:46:20 -08:00
Ajinkya Rasane
8d7cda2318
[None][chore] Update the Flux autodeploy example (#8434)
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Co-authored-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
2025-11-18 14:16:04 -08:00
Kaiyu Xie
d076aa44d3
[None] [tests] Unwaive wide ep related tests (#9204)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-11-18 08:54:46 -08:00
Ivy Zhang
160b361588
[TRTLLM-8949][test] Add rcca test case for eagle3 consistency check (#9088)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 05:55:00 -08:00
Ivy Zhang
ca41a71f92
[TRTLLM-8948][test] Add long bench case (#9165)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 04:41:48 -08:00
Tri Dao
fc088e642c
[None][feat] Support Glm4MoeForCausalLM (#8256)
Signed-off-by: Tri Dao <daominhtri0503@gmail.com>
Co-authored-by: Xuanyu Chen <xuanyuc@nvidia.com>
2025-11-18 09:43:21 +08:00
QI JUN
c3376fa114
[None][ci] split speculative test case into several small cases (#9209)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-17 17:02:25 -08:00
Robin Kobus
df41f220a2
[TRTLLM-8831][feat] Enable early exit with overlap scheduler (#8587)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-11-17 18:07:13 +01:00
Emma Qiao
d16b1a84c5
[None][infra] Waive a failed case in pre-merge stage 11/16 (#9192)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-17 09:36:56 +08:00
Emma Qiao
2854f0cf3d
[None][infra] Waive failed tests for main branch 11/15 (#9187)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-11-16 01:48:25 -08:00
brb-nv
63237494db
[None][chore] Waive failing tests blocking pre-merge (#9189)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-16 01:06:03 -08:00
Chang Liu
bed4e95e9f
[https://nvbugs/5629887][fix] Add missing device count guard for DSv32 multiGPU tests (#9159) 2025-11-14 07:52:23 -08:00
xinhe-nv
49b7e6301a
[None][chore] Add failed cases into waives.txt (#9156)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-14 06:28:22 -08:00
yuanjingx87
d72321a32e
[None][ci] Waive unittest/_torch/sampler/test_torch_sampler.py::TestBatchedSampling (#9161)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-11-14 01:49:26 -08:00
QI JUN
3c950910a0
[None][ci] waive test_disaggregated.py::test_disaggregated_mixed[TinyLlama-1.1B-Chat-v1.0] (#9162)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-13 18:56:37 -08:00
Tailing Yuan
cc4c980e03
[None][feat] Add Qwen3-Next to layer-wise benchmarks (#9065)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-11-14 10:03:00 +08:00
Erin
44d1c75701
[TRTLLM-8988][feat] Unify MPI & Ray's req/response handling with RPC Client/Server (#8765)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-11-13 17:21:24 -08:00
William Zhang
121140cfec
[None][fixes] Add tool call parsing fixes and Qwen3 coder parser (#8817)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-11-13 04:34:38 -08:00
Lizhi Zhou
48a27c7bef
[https://nvbugs/5633340][chore] unwaive test_auto_scaling.py::test_disagg_server_restart (#9131)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-11-13 01:45:36 -08:00
Emma Qiao
d0ea417ec8
[None][infra] Waive failed tests for main 11/13 (#9132)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-13 01:00:40 -08:00
xinhe-nv
548f5ce4bc
[None][fix] waive failed tests (#9090)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-12 23:40:00 -08:00
xinhe-nv
8fa3c55c76
[None][chore] Remove closed bugs (#9114)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-12 22:49:37 -08:00
ruodil
c86e36fe38
[None][test] add deepseek and qwen cases for rtx series (#8839)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-11-12 22:28:02 -08:00
HuiGao-NV
cde18c12da
[https://nvbugs/5640873][fix] Move thop tests to pre-merge (#9094)
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-11-13 13:08:13 +08:00
Yan Chunwei
4fd93bdc2c
[None][ci] Waive test_llm_rpc and test_llm_rpc_streaming (#9118)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-11-12 19:55:09 -08:00
Zhenhuan Chen
943b05e2d3
[TRTLLM-9179][feat] add pp_partition to customize each rank's layer number (#9003)
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2025-11-13 10:34:17 +08:00
QI JUN
3416efbc29
[None][ci] waive test_disaggregated_serving.py::TestQwen3_8B::test_chunked_prefill (#9111)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-13 10:06:32 +08:00
dongxuy04
9241ccaf27
[None][feat] Enable EPLB for trtllm-gen and cutlass backend (#8886)
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
2025-11-12 12:30:27 -08:00
Chenghao Zhang
5f26c31954
[https://nvbugs/5636912][fix] AutoDeploy: Unwaive the test (#9018)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-11-12 12:26:38 -08:00
Fanrong Li
780d4f9dc5
[None][feat] Add MTP>1 support for DS-v3.2 (#9045)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-12 09:56:12 -08:00
Iman Tabrizian
cdde15b275
[TRTLLM-8540][feat] Add support for disagg in DSv3.2 (#8735)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-11-12 08:21:11 -08:00
yufeiwu-nv
b7a2574c60
[https://nvbugs/5568991][test] Remove Phi-3 models (#9066)
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-12 03:16:36 -08:00
QI JUN
4003dc7574
[None][ci] waive some test cases of disaggregated serving (#9085)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-12 15:06:21 +08:00
Emma Qiao
bb6eb9510d
[None][infra] Waive a failed case of disaggregated/test_disaggregated.py (#9074)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-11 19:38:32 -08:00
QI JUN
fd703fbb7b
[None][ci] run speculative unit tests serially (#9080)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-11 19:06:44 -08:00
Lucas Liebenwein
aca56097cb
[None][fix] AutoDeploy: update nano3 accuracy test (#9061)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-11-11 12:26:31 -08:00
Wanli Jiang
ebdd1cc8e0
[TRTLLM-8119][feat] Update doc/tests/chat_template for nano-v2-vlm (#8840)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-11 07:48:23 -08:00
QI JUN
0ce22ce928
[None][ci] waive test_disaggregated_serving.py::TestQwen3_8B::test_auto_dtype[False] (#9069)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-11 02:11:15 -08:00
Yiqing Yan
b7d51c5549
[None][chore] Remove duplicated waive test (#9067)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-11-11 16:49:49 +08:00
Emma Qiao
da1f0e2465
[None][infra] Waive failed tests on main 11/11 (#9058)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-11 13:19:30 +08:00
xinhe-nv
fac522056c
[None][chore] Add failed cases into waives.txt (#8998)
Signed-off-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
2025-11-11 12:40:59 +08:00
Yechan Kim
0938a3ad2a
[https://nvbugs/5644187][fix] Llava-Next MMMU bugfix and Phi4 test bugfix (#9034)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-11 10:24:31 +09:00
xiweny
50c486367a
[https://nvbugs/5619396][fix] Add sm103 to CutlassFP8RowwiseGemm (#9042)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-11-10 08:12:14 -08:00
xinhe-nv
f848d844d9
[None][chore] Add failed cases into waives.txt (#9030)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-09 23:36:05 -08:00
Fanrong Li
a7033a9193
[TRTLLM-9001][feat] add TP support for DeepSeek-V3.2 (#8943)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-10 12:16:01 +08:00
Bo Li
67af7c15a5
[https://nvbugs/5637037][fix] Update unwaive list. (#9001)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-11-10 08:53:07 +08:00
Emma Qiao
183778d58a
[None][infra] Waive failed tests for main 11/07 (#9008)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-08 08:51:35 -08:00
Emma Qiao
2af6a537ad
[TRTLLM-8999][infra] Reduce gb200 multi-node test stages (#8778)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-11-08 06:34:24 -08:00
Yuxian Qiu
7b82ba90da
[https://nvbugs/5629790][chore] unwaive test. (#8967)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-11-07 18:41:32 +08:00
QI JUN
1c6e490894
[TRTLLM-9065][chore] remove PyTorchConfig completely (#8856)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-06 22:37:03 -08:00
Lizhi Zhou
b26e1617f2
[https://nvbugs/5633340][fix] kill processes properly after test (#8970)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-11-06 21:45:38 -08:00
xiweny
ee20e679a9
[https://nvbugs/5636986][fix] Fix DeepGemmMoe get_buffer calls (#8939)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: xiweny <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-11-06 19:57:19 -08:00
Simeng Liu
9f8d93f89a
[https://nvbugs/5606136][ci] Remove tests for deprecating triton multimodal models. (#8926)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
2025-11-06 17:58:42 -08:00
jthomson04
fcae852cef
[None][fix] Fix KV cache clearing with KV Connector API (#8750)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2025-11-06 14:28:27 -08:00
shuyixiong
c73efe12e7
[None][chore] Use cached model in all ray tests (#8962)
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
2025-11-06 15:14:15 +01:00
Fanrong Li
d246f62868
[https://nvbugs/5630345] [chore] skip deepseek-v3.2 fp8 kv tests on pre-Blackwell architectures (#8973)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-06 03:41:37 -08:00
xinhe-nv
e822184cd7
[None][feat] add waive by sm version (#8928)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-05 19:20:43 -08:00
Lucas Liebenwein
7a552c450a
[https://nvbugs/5606166][fix] AutoDeploy: unwaive test for use tuples for cudagraph shape lookup (#8957)
also updated test waive for another nvbug

Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-11-05 16:27:00 -08:00
Fanrong Li
c2feed798a
[https://nvbugs/5630345][chore] unwaive DS-v32 nvfp4 and fp8 tests (#8887)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-05 03:49:23 -08:00
Chuang Zhu
595f78078c
[https://nvbugs/5624367][fix] Fix disagg GPT-OSS test (#8870)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-11-05 01:47:09 -08:00
Emma Qiao
31116825b3
[None][infra] Waive failed cases on main 11/05 (#8936)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-04 22:54:45 -08:00
xinhe-nv
cc4aa29523
[None][chore] Add failed cases into waives.txt (#8865)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-04 19:26:50 -08:00
Yechan Kim
ed81173c55
[None][ci] Add test on waives (#8915)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-05 08:42:08 +08:00
Patrice Castonguay
782824533e
[https://nvbugs/5587574][fix] Increase server timeout to wait for weight loading (#8806)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-11-04 12:11:08 -08:00
Yanchao Lu
e2b2675120
[None][fix] Remove duplicated test waives (#8914)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-04 23:04:33 +08:00
Robin Kobus
7e4b87b17c
[None][ci] Remove outdated test entries (#8909)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-11-04 05:32:46 -08:00
xiweny
cae468cc8e
[https://nvbugs/5596343] [test] Waive flaky GPT-OSS cases (#8904)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-11-04 03:00:00 -08:00
Zhanrui Sun
4de31bece2
[TRTLLM-8994][infra] upgrade to DLFW 25.10 and pytorch 2.9.0 / triton 3.5.0 (#8838)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-04 18:59:34 +08:00
Ivy Zhang
23717cdb3f [TRTLLM-8580][test] save runtime report periodically (#8312) (#8455)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Yukun He
6c8ba3be27 [None][chore] Remove duplicate log outputs in test_perf.py (#8418)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
ruodil
102e556863 [None][test] cherry-pick: add test-model-suites in integration conftest.py (#8388)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Patrice Castonguay
65c138108e [https://nvbugs/5552889][fix] fix: Prevent empty batch when using attention DP with disagg (#8372)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Ivy Zhang
9bcd2e6c0a [None][chore] Update nim test list (#8356)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Stanley Sun
def9c0004d [TRTLLM-8113][test] Add pytorch workflow e2e tests with pp enabled (#8357)
Signed-off-by: Stanley Sun <stsun@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
xiweny
fcac2022e2 [https://nvbugs/5565565] [fix] fp8 wideep support sm103 (#8228)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Yueh-Ting (eop) Chen
bd1c9c0af4
[https://nvbugs/5625990][chore] Add test coverage for current incapability of the KV cache manager (#8829)
Signed-off-by: eopXD <yuehtingc@nvidia.com>
2025-11-04 16:35:45 +08:00
Emma Qiao
4fe47faf47
[None][infra] Waive failed tests for main branch (#8897)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-03 22:21:28 -08:00
Zhanrui Sun
9ec6a6b68f
[None][infra] waive failed test on main 11/4 (#8896)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-11-03 21:37:09 -08:00
Mike Iovine
5e6f1bcd24
[TRTLLM-8979][test] Improve qwen3 spec dec test coverage (#8767)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-03 10:12:10 -08:00
Yechan Kim
f48968b6cc
[TRTLLM-6928][fix] Refactor multimodal unittest (#8453)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-03 06:01:07 -08:00
Emma Qiao
14bc8571ae
[TRTLLM-8435][infra] Test existing rtxpro6000 stages on rtxpro6000d (#8319)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-03 05:26:17 -08:00
Emma Qiao
d7176768cd
[None][infra] Waive the failed test for main on 11/3 (#8875)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-11-03 02:52:52 -08:00
Tailing Yuan
8303cfa477
[None][fix] Fix import issues in layer-wise benchmarks (#8827)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-11-03 02:32:48 -08:00
xinhe-nv
4873ca04cc
[https://nvbugs/5521799][fix] add harmony channel validation (#8837)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-03 02:31:54 -08:00
xinhe-nv
64540451e7
[None][chore] Add failed cases into waives.txt (#8872)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-03 01:19:04 -08:00
Fanrong Li
e9f78c687a
[https://nvbugs/5625962][chore] unwaive DS-v32-fp4 tests (#8853)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-03 00:34:52 -08:00
Yechan Kim
00c0e6c440
[https://nvbugs/5523315][fix] Fix serve benchmark test (#8255)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-03 00:30:13 -08:00
chenfeiz0326
cc4ab8d9d1
[TRTLLM-8825][feat] Support Pytest Perf Results uploading to Database (#8653)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-11-03 16:23:13 +08:00
yufeiwu-nv
b4d17d1a4c
[TRTLLM-8991][test] Add Llama 3.3 70B model with different performance config (#8753)
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-11-03 13:34:06 +08:00
dongfengy
6d6797c792
[None][test] Enhance GPT-OSS CI with GPQA Diamond and additional Spec Decoding Test (#8661)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2025-11-02 16:44:02 -08:00
Yan Chunwei
1551ed8e5f
[https://nvbugs/5437384][test] CHERRY-PICK: fix trtllm-llmapi-launch multi tests (#8567)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-11-01 06:49:33 -07:00
dongxuy04
bba2519726
[TRTLLM-7008][fix] Enable GDRCopy and unwaive online eplb tests (#8720)
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-10-31 16:39:51 -07:00
Fanrong Li
f0dc746738
[TRTLLM-8541][feat] Add trtllm-gen sparse MLA kernels to support per-Tensor FP8 KV Cache (#8692)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Co-authored-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-10-31 14:38:31 -07:00
Tailing Yuan
98453d2bb7
[None][fix] Waive layer-wise benchmark tests (#8823)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 22:51:31 -07:00
Emma Qiao
aecc9655a0
[None][info] Waive failed case for main (#8826)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-30 20:43:59 -07:00
Yuxian Qiu
025d2926df
[https://nvbugs/5599515][fix] Fix PP bubbles. (#8687)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-10-31 10:13:56 +08:00
Mike Iovine
b87448b009
[TRTLLM-8978][test] Remove llama 4 spec dec tests (#8766)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-30 15:47:04 -04:00
Tailing Yuan
ec31363a86
[None][fix] Layer wise benchmarks: use local models, lint (#8799)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 09:47:46 -07:00
Emma Qiao
9112cffaf3
[None][infra] Waive failed case for main branch (#8797)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-30 07:57:35 -07:00
Tailing Yuan
f9c7786dc8
[None][feat] Add layer wise benchmarks (#8777)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 20:29:34 +08:00
Emma Qiao
a5cc9fe0aa
[TRTLLM-5453][infra] Check all steps for test name and also check the test in waives.txt also exists in l0 or qa test list. (#6256)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-10-30 01:56:04 -07:00
xinhe-nv
a4f75399b9
[https://nvbugs/5481206][fix] update waives (#8774)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-30 00:43:38 -07:00
Emma Qiao
7d3cebf34e
[None][infra] Unwaive the tests passed in latest CI and disable a perf stage (#8775)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-30 12:48:23 +08:00
Emma Qiao
db99a936b0
[TRTLLM-8971][infra] Update gpu key for B300/GB300 (#8724)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-29 20:36:44 -07:00
Yuxian Qiu
3176bd3815
[None][fix] Fix UnboundLocalError. (#8756)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-10-29 19:41:37 -07:00
HuiGao-NV
ae57738bae
[https://nvbugs/5547414][fix] Use cached models (#8755)
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-29 19:10:10 -07:00
Iman Tabrizian
ae6875fe10
[TRTLLM-8976][feat] Move indexer-k-cache to KVCacheManager (#8699)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-10-29 08:04:26 -07:00
Emma Qiao
579e1067bf
[None][infra] Waive failed tests on main (#8759)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-29 21:32:23 +08:00
Yan Chunwei
fc3b6f5331
[None][ci] waive test_rpc.py (#8745)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-10-29 05:17:40 -07:00
Chang Liu
81eb861df0
[None][chore] Enable GPQA in CI for DeepSeek V3.2 (#8712)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-10-29 04:22:22 -07:00
Zheng Duan
d626d13d37
[https://nvbugs/5607238][test] fix working dir in disagg worker test (#8648)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
2025-10-29 16:13:52 +08:00
Pengyun Lin
2aade46d18
[TRTLLM-8214][feat] Support Qwen3 tool parser (#8216)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2025-10-29 15:48:29 +08:00
xinhe-nv
7ba98a6b20
[None][chore] Add failed cases into waives.txt (#8684)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-28 20:30:01 -07:00
Yan Chunwei
f2faf2809f
[None][ci] waive test_rpc.py temporarily (#8743)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-10-28 19:20:27 -07:00
Zheng Duan
fea5bfbda7
[None][feat] add detailed KV cache transfer time breakdown (#8521)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
2025-10-29 10:11:09 +08:00
ruodil
f444fe2deb
[None][test] fix a typo in perf test sampler config (#8726)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-10-29 09:53:53 +08:00
Lizhi Zhou
24167d00eb
[TRTLLM-8431][doc] update public doc and example, add etcd auto-scaling tests (#8602)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-10-28 17:04:53 -07:00
dongfengy
083f3637f1
[https://nvbugs/5596343][test] Update test waive to get back some coverage (#8702)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2025-10-28 14:05:48 -07:00
Anish Shanbhag
a09b38a862
[TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-10-28 09:17:26 -07:00
dongfengy
5a01f382c1
[https://nvbugs/5575913][fix] Use separate thresholds for 120b/20b gptoss (#8664)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2025-10-28 10:35:07 -04:00
Robin Kobus
e8e2b0697a
[None][chore] Revert "[TRTLLM-7835][test] add default sample config for perf test (#8523) (#8725)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-10-28 14:23:38 +01:00
ruodil
6b9b73ee27
[https://nvbugs/5564465][test] ensure deepseek_v3_lite isl + osl < max_seq_len (#8565)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-10-28 15:25:52 +08:00
ruodil
bf72eb045e
[TRTLLM-7835][test] add default sample config for perf test (#8523)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-10-28 02:22:47 -04:00
yufeiwu-nv
0e36484fba
[None][test] Add gpt_oss_20b Model to Sanity Perf Test (#8265) 2025-10-28 13:36:28 +08:00
Aurelien Chartier
0a02f5f25d
[None][chore] Use a cached model path for Ray integration test (#8660)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-10-27 19:16:06 -07:00
HuiGao-NV
49974eed75
[None][chore] ISOLATE some cases (#8690)
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-27 22:10:44 -04:00
chenfeiz0326
f5265a087b
[None][infra] Minor Update on Perf Sanity Testdb Files (#8607)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-10-28 09:54:48 +08:00
gramnarayan
88b0fbc8ff
[#8245][feat] Autodeploy: Guided Decoding Support (#8551)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Co-authored-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-28 09:29:57 +08:00
Yechan Kim
a6017f6266
[https://nvbugs/5608723][fix] Use local data on multimodal tests and unwaive tests (#8673)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-10-28 09:20:02 +09:00
Emma Qiao
73a5479b26
[None][infra] Skip failed tests for main 10/27 (#8686)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-28 08:04:30 +08:00
Bo Li
9c4432f8a4
[TRTLLM-7318][feat] MnnvlThroughput AlltoAll implementation. (#7499)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-10-27 13:23:06 -04:00
mpikulski
7c8ba71b49
[TRTLLM-8832][feat] fully async _select_generated_logits with tests (#8628)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-10-27 16:15:32 +01:00
Kaiyu Xie
c9b08790c2
[None] [test] Add MNNVL AlltoAll tests to pre-merge (#8601) 2025-10-27 21:39:44 +08:00
Jie Li
ce0d76135d
[https://nvbugs/5546507][fix] skip TRT-Flow test case due to CMake Error in building (#8677)
Signed-off-by: Jie Li <lijie@nvidia.com>
2025-10-27 05:11:47 -04:00
xinhe-nv
8090c9641c
[TRTLLM-8638][fix] Add failed cases into waives.txt (#8672)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-27 03:20:46 -04:00
xinhe-nv
0ac5cbcac4
[None][chore] Add failed cases into waives.txt (#8669)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-27 02:36:28 -04:00
QI JUN
cc5b8b6d28
[None][ci] move some time-consuming benchmark test cases to post merge (#8641)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-10-26 22:47:17 -04:00
Emma Qiao
e0728ba8a7
[None][infra] Waive failed case on main 10/26 (#8668)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-26 22:02:32 +08:00
Chenghao Zhang
a6d20f6f9b
[None][feat] AutoDeploy: Add FP8 MOE for Nemotron (#8599)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
Co-authored-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
2025-10-25 15:26:45 -04:00
Simeng Liu
2b27810198
[https://nvbugs/5494718][fix] Fix Single GPU Multi-node issue and OOM on DGX Spark (#8514)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
2025-10-24 19:09:07 -07:00
jthomson04
02081e2390
[None][feat] Support KV Connector with Disagg Prefill Worker (#8246)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2025-10-24 11:09:06 -07:00
Chang Liu
e47c787dd7
[TRTLLM-8535][feat] Support DeepSeek V3.2 with FP8 + BF16 KV cache/NVFP4 + BF16 KV cache (#8405)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-10-24 13:40:41 -04:00
Chuang Zhu
2420918e5b
[TRTLLM-7078][chore] optimal kvcache transfer for VWSA (#7952)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-10-24 08:58:16 -04:00
Emma Qiao
35e35db422
[None][infra] Waive tests on main and remove lines which missed in MI (#8639)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-10-24 02:49:23 -04:00
xinhe-nv
2aaedd08cd
[TRTLLM-8638][fix] fix test issues (#8557)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-24 02:16:55 -04:00
xinhe-nv
9a9d647292
[None][chore] Add failed cases into waives.txt (#8630)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-24 02:11:03 -04:00
ruodil
07a957e5cb
[None][test] remove redunctant runtime backend in perf test (#8358)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-10-24 02:01:34 -04:00
Stanley Sun
6b793d5c3d
[TRTLLM-8738][test] Add end-to-end trtllm-serve negative tests (#8580)
Signed-off-by: Stanley Sun <stsun@nvidia.com>
2025-10-24 13:23:47 +08:00
xinhe-nv
59375e8bed
[TRTLLM-8638][fix] Add failed cases into waives.txt (#8590)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-24 00:02:42 -04:00