Commit Graph

1125 Commits

Author SHA1 Message Date
Yan Chunwei
04a39a4e2b
[None][chore] enable test_ipc.py (#9865)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-12-11 17:47:14 +08:00
Bo Deng
c1d53ee43d
[https://nvbugs/5582258][fix] unwaive (#9650)
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-12-10 19:18:30 -08:00
Patrice Castonguay
2c0293c612
[https://nvbugs/5601682][fix] Unwaiving disagg test (#9627)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-10 13:42:26 -05:00
cheshirekow
2f030312a8
[TRTLLM-9228][infra] Verify thirdparty C++ process (#9367)
Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
2025-12-10 21:01:19 +08:00
dominicshanshan
0e78a4b244
[https://nvbugs/5702791][fix] Unwaive fixed test (#9844)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-12-10 14:01:44 +08:00
QI JUN
2c46126a93
[TRTLLM-9794][ci] move some deepseek test cases to gb200 (#9841)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-09 19:54:51 -08:00
zhanghaotong
36c9e7cfe6
[None][chore] Add unittest for otlp tracing (#8716)
Signed-off-by: zhanghaotong <zhanghaotong.zht@antgroup.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
2025-12-09 18:34:08 -08:00
dhansen-nvidia
2d33ae94d5
[https://nvbugs/5508301][feat] Move D->H copies to a worker thread whe… (#8463)
Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com>
Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
2025-12-09 18:51:31 -05:00
Patrice Castonguay
414448bb37
[https://nvbugs/5719561][chore] Unwaive tests for nvbug 5719561 (#9801)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 18:21:50 -05:00
Patrice Castonguay
ff0ef19ee9
[https://nvbugs/5688388][chore] Unwaiving fixed disagg test (#9800)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 16:51:46 -05:00
Patrice Castonguay
7d7d05d8db
[None][chore] Adding flaky auto scaling test to waives (#9851)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 15:05:19 -05:00
Emma Qiao
75bc386b65
[None][infra] Waive failed cases for main branch on 12/09 (#9839)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-09 19:39:29 +08:00
QI JUN
58c29957d9
[TRTLLM-9794][ci] move qwen3-next test cases to gb200 (#9827)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-09 01:58:25 -08:00
Robin Kobus
76f49c903b
[None][fix] Additional model outputs for pipeline parallelism (#9794)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-09 10:41:22 +01:00
yufeiwu-nv
fbcf03040f
[None][test] Refactor qa/llm_perf_nim.yml test list (#9700)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-08 22:00:43 -08:00
QI JUN
252769c930
[TRTLLM-9794][ci] remove duplicated test cases in DGX B200 (#9817)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-08 21:51:30 -08:00
Shi Xiaowei
b050804b63
[TRTLLM-6537][infra] extend multi-gpu tests related file list (#9614)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-09 12:54:53 +08:00
JunyiXu-nv
90890785eb
[https://nvbugs/5722653][fix] Fix config file used by disagg_client (#9783)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-08 20:34:55 -08:00
Balaram Buddharaju
bafb60c1bc
[None][chore] Fix tests failing on pre-merge 12/08 (#9819)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-08 20:08:52 -08:00
Bo Li
f2006a1f74
[https://nvbugs/5726066][infra] Waive timeout disaggregated/test_auto_scaling tests. (#9815)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-12-08 19:51:43 -08:00
Jiagan Cheng
4a3a66b124
[https://nvbugs/5677746][fix] Use first PP rank's schedule result in other PP ranks to fix PP hang (#9659)
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
2025-12-08 18:43:52 -08:00
yuanjingx87
390391ebf1
[None][infra] Correct the waived test names due to a merge conflict (#9803)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-12-09 09:48:21 +08:00
Yibin Li
faabc1a387
[TRTLLM-7967][chore] Add more tests (#9415)
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
2025-12-08 11:57:32 -08:00
Jhao-Ting Chen
0a09465089
[https://nvbugs/5567586][feat] Ampere xqa swa specdec for GPT-OSS Eagle3-one-model (#8383)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-08 11:16:05 -08:00
Lizhi Zhou
52f78e4000
[http://nvbugs/5649010][fix] fix test_auto_scaling.py::test_worker_restart timeout (#9775)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-08 03:26:01 -08:00
fredricz-20070104
96d9b67d65
[https://nvbugs/5527655][test] Add test case for RCCA 5527655 (#9511)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-08 01:27:13 -08:00
xinhe-nv
3f55c07223
[None][chore] Remove closed bugs (#9770)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-07 22:51:55 -08:00
Fanrong Li
2f526583fb
[None][chore] Move the rocketkv e2e test to post-merge (#9768)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-08 13:22:16 +08:00
Emma Qiao
137713a869
[None][infra] Waive failed cases for main on 12/08 (#9773)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-07 20:18:29 -08:00
xxi
8e27ce7084
[TRTLLM-9603][feat] Enable ConfigurableMoE test in the CI (#9645) 2025-12-08 10:19:40 +08:00
chenfeiz0326
383178c00a
[TRTLLM-9000][feat] Add multi-node Perf Tests into CI (#8800)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-12-08 09:00:44 +08:00
Emma Qiao
7c6c493993
[None][infra] Waive failed cases for main branch on 12/07 (#9769)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-07 06:26:47 -08:00
Mike Iovine
31ab367576
[None][chore] Waive flakey disagg tests (#9749)
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 13:07:05 -08:00
jthomson04
299601aebf
[https://nvbugs/5670672][fix] Fix flaky KV connector tests (#9676)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2025-12-05 10:04:54 -08:00
Robin Kobus
faf682b8bc
[TRTLLM-7136][feat] Update load_weights method to include mapping parameter in checkpoint loaders (#9583)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-05 16:07:20 +01:00
yufeiwu-nv
68253d9d29
[https://nvbugs/5518713][test] Refactor core test lists by merging with llm_perf_cluster.yml (#9714)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-05 01:15:37 -08:00
Kaiyu Xie
e06c582648
[None] [tests] Unwaive EPLB tests (#9625)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-05 00:13:24 -08:00
Lizhi Zhou
dc766fc126
[https://nvbugs/5633340][fix] start disagg workers and servers on free ports (#9694)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-05 10:51:29 +08:00
Lizhi Zhou
0d0a16fff4
[TRTLLM-8920][feat] decouple disagg service from fastapi (#8714)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-05 10:44:16 +08:00
xinhe-nv
530af1a98e
[None][chore] Add failed cases into waives.txt (#9662)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-04 22:33:22 +08:00
Yan Chunwei
05058f5e2a
[None][ci] unwaive tests (#9651)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-12-04 15:06:07 +08:00
JunyiXu-nv
6d2daec5d0
[TRTLLM-8274][feat] Check if executor is shutdown in /health entrypoint (#9057)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-04 13:49:40 +08:00
mpikulski
744f0eff1b
[TRTLLM-9522][fix] restore trtllm-serve mm_embedding_serve (#9669) 2025-12-03 19:27:11 -08:00
gramnarayan
098b9ff226
[#9147][feat] AutoDeploy: Draft Target Speculative Decoding (#9275)
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
2025-12-04 05:13:49 +08:00
Michal Guzek
4e5b10da48
[https://nvbugs/5552132][fix] Enable LoRa for GPT OSS Torch (#8253)
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
2025-12-03 15:42:15 +01:00
Patrice Castonguay
ae8d8a266a
[https://nvbugs/5705197][chore] Unwaive timeout disagg tests (#9637)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-03 22:18:36 +08:00
xinhe-nv
3a748b166b
[None][chore] Add failed cases into waives.txt (#9593)
Signed-off-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
2025-12-03 16:26:06 +08:00
heyuhhh
a08eb81cce
[None][feat] Add RocketKV usage doc and e2e accuracy test on LongBenchV2 (#9572)
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2025-12-03 11:33:46 +08:00
yufeiwu-nv
21f2ba74e8
[None][test] Remove duplicate test cases (#9623)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-03 10:35:26 +08:00
brb-nv
55c7023c92
[None][chore] Waive test failing on pre-merge (#9638)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-03 07:31:10 +08:00