dongfengy
bfc591994c
[ https://nvbugs/5745152 ][fix] Fix some GPTOSS test setups ( #10085 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-12-26 17:52:40 +08:00
bhsueh_NV
db3430f589
[None][feat] Support VLM part for Mistral Large 3 ( #10188 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-12-25 11:20:58 -05:00
ZhichenJiang
46e4af5688
[TRTLLM-9831][perf] Enable 2CTA with autotune for CuteDSL MoE and Grouped GEMM optimizations ( #10201 )
...
Signed-off-by: zhichen jiang <zhichenj@NVIDIA.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-25 09:04:20 -05:00
Lizhi Zhou
fe12faef81
[ https://nvbugs/5752516 ][chore] unwaive test; fix port conflicts in CI ( #10152 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-25 08:16:09 -05:00
Emma Qiao
0ecdb69b93
[None][infra] Waive failed tests for main on 12/25 ( #10298 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-25 05:22:39 -05:00
Jie Li
83e02ee335
[None][chore] Remove NIM TRT-Backend Test Lists ( #10232 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
2025-12-25 04:01:51 -05:00
Enwei Zhu
182b3eb633
[None][ci] Waive TestLlama3_1_8B::test_auto_dtype[False-2] for timeout ( #10293 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-25 02:35:18 -05:00
xinhe-nv
4ae6f6a46c
[None][chore] Add failed cases into waives.txt ( #10249 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-25 01:26:21 -05:00
gramnarayan
a9eb5afc9f
[ #9241 ][feat] AutoDeploy: Support Eagle3 Speculative Decoding ( #9869 )
...
Support two model flow with no overlap scheduler or chain drafter. Drafting model is in PyTorch backend.
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
2025-12-24 23:30:42 -05:00
Emma Qiao
16fd781e42
[TRTLLM-9862][infra] Move single-gpu tests on rtxpro6000d to pre-merge ( #9897 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-24 21:45:33 -05:00
Stanley Sun
ddac4d7379
[None][test] Add disagg-serving auto scaling qa test ( #10262 )
...
Signed-off-by: Stanley Sun <stsun@nvidia.com>
2025-12-24 08:43:47 -05:00
shuyixiong
f4f0fe85e9
[TRTLLM-9737][chore] Add rl perf reproduce script and enhance the robustness of Ray tests ( #9939 )
...
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
2025-12-24 15:27:01 +08:00
xinhe-nv
534700ecd9
[None][chore] Add failed cases into waives.txt ( #10240 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-24 02:21:50 -05:00
Emma Qiao
7b84e48e0f
[None][infra] Waive failed cases on 12/24 ( #10257 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-23 22:49:57 -05:00
xinhe-nv
fc1f77eafc
[None][chore] Add failed cases into waives.txt ( #10204 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
2025-12-24 10:37:23 +08:00
Balaram Buddharaju
8c1cfc872b
[TRTLLM-9493][feat] Custom AllToAll for helix parallelism ( #9986 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-23 18:14:30 -08:00
Jhao-Ting Chen
92d90fa29a
[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS ( #10018 )
...
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-23 11:41:31 -06:00
Grzegorz Kwasniewski
0027a01ad5
[ https://nvbugs/5680312 ][fix] Updated test waiving ( #9630 )
...
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
2025-12-23 09:38:12 -08:00
Emma Qiao
984c20e0b2
[None][infra] Waive failed cases on 12/23 ( #10236 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-23 08:48:54 -05:00
dongfengy
e284d0bf80
[None][infra] Waive flaky unittest/executor/test_rpc_proxy.py and unittest/executor/test_rpc_worker.py tests ( #10209 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-23 07:43:13 -05:00
Yukun He
522f1d2bc3
[ https://nvbugs/5764627 ][chore] waive the time-out test ( #10222 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-23 16:36:06 +08:00
Balaram Buddharaju
f2e00a75de
[None][chore] Remove helix test from rtx test list ( #10224 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-23 03:07:37 -05:00
Chuang Zhu
53db3b2612
[ https://nvbugs/5741884 ][fix] unwaive disagg sampler ( #10189 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-12-23 14:38:07 +08:00
xinhe-nv
77b591f73b
[None][chore] Add failed cases into waives.txt ( #10177 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Jie Li <lijie@nvidia.com>
Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-12-23 13:43:50 +08:00
Pamela Peng
5bc7ffe379
[None][test] Add qa tests for RTX 6K ( #10210 )
...
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
2025-12-22 22:47:09 -05:00
Emma Qiao
ba14a9308e
[None][infra] Waive failed cases on 12/22 ( #10200 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-23 00:05:45 +08:00
xinhe-nv
d30ee8101e
[None][chore] Remove closed bugs ( #10182 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-22 01:58:17 -05:00
Yuxian Qiu
237fd0eae4
[ https://nvbugs/5666821 ][chore] unwaive tests. ( #9958 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-22 11:39:45 +08:00
Jin Li
066b653940
[TRTLLM-9880][feat] Include torch compile tests in QA test list ( #10149 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-12-22 10:37:09 +08:00
Yuxian Qiu
2f139ee07e
[ https://nvbugs/5701445 ][chore] unwaive test. ( #9949 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-22 10:12:54 +08:00
Chuang Zhu
914dd39127
[None][fix] disable cuda ipc on device without nvlink (L40s) for disagg test ( #9735 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-12-22 09:29:24 +08:00
dominicshanshan
d274a4c5d3
[ https://nvbugs/5701457 ][fix] Unwaive ray test. ( #10175 )
...
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-12-22 09:25:58 +08:00
Enwei Zhu
5549067966
[None][ci] Waive GPTOSS test case ( #10155 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-22 08:50:44 +08:00
Balaram Buddharaju
5266475014
[None][feat] Cudagraph updates for helix parallelism ( #10141 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-21 15:21:52 -05:00
shuyixiong
4fc6036276
[ https://nvbugs/5702793 ][fix] Fix view operation on uncontiguous tensor ( #10147 )
...
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
2025-12-21 11:47:20 -05:00
bhsueh_NV
cd4b4f43fa
[None][feat] Support Eagle3 on Mistral Large3 ( #9971 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-12-21 10:25:45 -05:00
Emma Qiao
aa5dbb7ca5
[None][infra] Waive failed tests for main branch on 12/21 ( #10184 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-21 22:23:46 +08:00
Eran Geva
b15f987972
[None][chore] removed duplicated test from l0_b200.yml ( #10090 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-12-21 11:34:01 +02:00
Bo Li
a66eeab537
[TRTLLM-9805][feat] Skip Softmax Attention. ( #9821 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
Co-authored-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2025-12-21 02:52:42 -05:00
Balaram Buddharaju
dcd3f7b5ea
[ https://nvbugs/5744427 ][fix] Fix accuracy test OOM ( #10173 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-21 02:03:38 -05:00
Enwei Zhu
2ce785f39a
[ https://nvbugs/5643631 ][fix] Fix hostfunc seg fault ( #10028 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-20 07:58:43 -05:00
Yuxian Qiu
3b3069b390
[ https://nvbugs/5747930 ][fix] Use offline tokenizer for whisper models. ( #10121 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-20 09:42:07 +08:00
Balaram Buddharaju
bee9051484
[None][chore] Waive timing out pre-merge test ( #10167 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-19 17:56:33 -05:00
JunyiXu-nv
7b71ff6b8a
[ https://nvbugs/5722653 ][fix] Unwaive fixed test ( #10157 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-19 11:19:20 -05:00
xxi
27e49e2904
[None][fix] waive the failed test test_service_discovery[etcd-load_ba… ( #10161 )
...
Signed-off-by: xxi <xxi@nvidia.com>
2025-12-19 06:14:26 -08:00
xinhe-nv
7b51e3cedb
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #10129 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-19 17:55:17 +08:00
Emma Qiao
dd8ce68c94
[None][infra] Update waive and waive failed tests for main branch on 12/19 ( #10151 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-19 01:20:42 -08:00
yufeiwu-nv
52cee573ad
[TRTLLM-8830][test] Overlap scheduler enhancement perf test: Add qwen3_0,8b and llama3.1 test cases ( #10114 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-19 17:01:52 +08:00
xinhe-nv
cb0444b1b5
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #10132 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-12-19 16:07:56 +08:00
JunyiXu-nv
356ad4fe3a
[ https://nvbugs/5722653 ][fix] Address port conflict by assigning different port section in the same node. ( #10035 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-19 15:34:04 +08:00
William Zhang
478b6b20a1
[ #9230 ][refactor] Replace nemotron patches with custom model implementation ( #9751 )
...
* Why?
Patching for nemotron H models was growing out of hand, and made certain
optimizations more complex than they needed to be.
* What?
This commit finally gets rid of them, and replaces them with the custom
model implementation in `modeling_nemotron_h.py`.
Closes #9230
Closes NvBug 5747867
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-12-18 19:36:27 -08:00
Balaram Buddharaju
72c5480dfb
[None][chore] Waive test blocking pre-merge 12/18 ( #10145 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-18 19:12:05 -08:00
Wangjue Yao
9f283f330b
[None][feat] Support Mooncake transfer engine as a cache transceiver backend ( #8309 )
...
Signed-off-by: wjueyao <wyao123@terpmail.umd.edu>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
2025-12-19 10:09:51 +08:00
Chuang Zhu
e0b2a94309
[None][fix] Fix ready signal in NIXL backend ( #10000 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-12-19 09:43:40 +08:00
Yukun He
bd5b3c2ac0
[ https://nvbugs/5721912 ][chore] Unwaive the test ( #10108 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-19 09:12:25 +08:00
Anish Shanbhag
91a9ae42d2
[TRTC-71][feat] Add regression testing for config database ( #9832 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-12-18 16:15:38 -08:00
Balaram Buddharaju
799a2ae311
[ https://nvbugs/5741331 ][fix] Fix helix accuracy test ( #10021 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-18 15:27:53 -08:00
Chang Liu
a97e411b44
[ https://nvbugs/5747911 ][fix] Use offline data path for the unit test of mmencoder server ( #10135 )
...
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-12-18 15:19:23 -08:00
Lizhi Zhou
f02782a6f2
[ https://nvbugs/5726066 ][fix] fix auto-scaling related failures ( #9845 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Emma Qiao <qqiao@nvidia.com>
2025-12-18 16:37:48 -05:00
Bo Li
9d7e038bcb
[ https://nvbugs/5753250 ][infra] Waive _test_openai_responses. ( #10110 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-12-18 00:15:06 -08:00
Emma Qiao
33a90f2dd2
[None][infra] Waive failed cases for main branch on 12/18 ( #10105 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-17 21:35:45 -08:00
Wanli Jiang
601c29ca73
[ https://nvbugs/5721644 ][fix] Update tests for nemotron_h ( #9993 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-18 12:38:02 +08:00
xinhe-nv
4a98f190a8
[None][chore] Add failed cases into waives.txt ( #10025 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-17 19:13:52 -08:00
xinhe-nv
c1cfb61b1b
[TRTLLM-9381][feat] Add kimi k2 fp4 tests ( #9906 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-17 18:15:27 -08:00
yufeiwu-nv
5d71f662c3
[ https://nvbugs/5698434 ][test] Add Qwen3-4B-Eagle3 One-model perf test ( #10041 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-17 13:37:25 +08:00
Emma Qiao
0dbf3948cc
[None][infra] Waive failed tests due to llm model files ( #10068 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-16 20:12:57 -08:00
JunyiXu-nv
6649c3743c
[ https://nvbugs/5635153 ][chore] Remove responses tests from waive list ( #10026 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-17 11:22:02 +08:00
shuyixiong
26fb063076
[ https://nvbugs/5741060 ][fix] Fix pg op test ( #9989 )
...
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
2025-12-17 09:44:25 +08:00
Lizhi Zhou
bd13957e70
[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic ( #9726 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-16 05:16:32 -08:00
Emma Qiao
12727ebd7f
[None][infra] Waive failed test for main branch on 12/16 ( #10029 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-16 02:54:32 -08:00
Yechan Kim
8ba8699f66
[TRTLLM-8310][feat] Add Qwen3-VL-MoE ( #9689 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-12-15 20:05:20 -08:00
xinhe-nv
cdf56c278f
[TRTLLM-8638][fix] Add failed cases into waives.txt New activity. ( #9979 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-15 18:59:13 -08:00
Patrice Castonguay
9ba14263db
[ https://nvbugs/5673559 ][fix] Unwaiving disagg test for nvbug 5673559 ( #9957 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-15 12:32:15 -05:00
Emma Qiao
d5d15c06df
[None][infra] Waive failed tests for main branch on 12/15 ( #10001 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-16 01:29:43 +08:00
Bo Li
9eb5a229dd
[None][infra] Fully waive test_worker_restart test_disagg_server_restart. ( #9988 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-12-15 01:26:18 -08:00
xinhe-nv
3c98b25005
[None][chore] Add failed cases into waives.txt ( #9941 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-14 23:14:24 -08:00
shuyixiong
25db9e7b3e
[ https://nvbugs/5741060 ][chore] Waive all pg operator tests ( #9991 )
...
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
2025-12-14 21:24:43 -08:00
Balaram Buddharaju
dfc8799352
[ https://nvbugs/5669114 ][fix] Switch to MMMU benchmark for Gemma3 27B ( #9966 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-14 21:23:59 -08:00
Fanrong Li
8f144d9282
[TRTLLM-9416][feat] Skip DS-v3.2 indexer MQA and Top-K for short sequences. ( #9524 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-15 12:42:25 +08:00
QI JUN
b57650f1e6
[TRTLLM-9794][ci] move test cases of gpt-oss to gb200 ( #9934 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-14 19:21:54 -08:00
xxi
f5696df285
[TRTLLM-8961][feat] ConfigurableMoE support DeepGemm ( #9858 )
2025-12-15 10:47:15 +08:00
Simeng Liu
f21e2b3329
[TRTLLM-9601][feat] Expose mmKeys for multimodal to integrate with dynamo. ( #9604 )
...
Signed-off-by: SimengLiu-nv <simengl@nvidia.com>
2025-12-15 08:42:30 +08:00
Emma Qiao
e0a4b72279
[None][infra] Waive failed tests for main branch on 12/14 ( #9982 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-14 22:48:34 +08:00
Mike Iovine
96d654029d
[ https://nvbugs/5666816 ][fix] Unwaive llama3 eagle3 test ( #9964 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-12-14 15:07:35 +08:00
nvxuanyuc
a5a37227d6
[None][feat] Fused kernels (qknormrope + moe routing) and two-model MTP support for glm4moe ( #9852 )
...
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
2025-12-14 10:47:24 +08:00
Yan Chunwei
85406f9dda
[ https://nvbugs/5720482 ][fix] Fix test rpc streaming ( #9902 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-12-13 01:14:43 -08:00
bhsueh_NV
e49c70f6df
[None][feat] Support Mistral Large3 LLM part ( #9820 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-12-13 11:44:27 +08:00
Chuang Zhu
9c59c9f920
[ https://nvbugs/5643787 ][fix] remove the war path for notify to itself ( #9834 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-12-12 11:10:05 -05:00
Balaram Buddharaju
af315d8ef1
[TRTLLM-5972][chore] Load balance decode token KV cache with helix parallelism ( #9757 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-12 22:29:05 +08:00
chenfeiz0326
61745f034a
[ https://nvbugs/5727481 ][ci] Fix Port Conflict in Perf-Sanity CI Test ( #9896 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-12-12 17:16:50 +08:00
kris1025
2fc94e5dd7
[None][chore] unwaive qwen3 accuracy test ( #9895 )
...
Signed-off-by: linquanh <linquanh@nvidia.com>
2025-12-12 16:30:09 +08:00
Yihan Wang
711016c799
[ https://nvbugs/5736923 ][infra] Waive timeout disaggregated/test_auto_scaling[http-round_robin] test ( #9942 )
...
Signed-off-by: Yihan Wang <yihwang@nvidia.com>
2025-12-12 15:15:13 +08:00
Ivy Zhang
fded6c393d
[TRTLLM-9262][test] add groupgemm ada case for rcca ( #9833 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-12-12 13:23:33 +08:00
dominicshanshan
093465ed29
[ https://nvbugs/5599176 ][fix] Unwaive fixed test for Ray ( #9861 )
...
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-12-12 11:24:05 +08:00
xinhe-nv
e8efeb765d
[TRTLLM-9717][fix] fix multi nodes tests cases ( #9736 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-12 10:14:23 +08:00
Erin
89dabf5aa1
[TRTLLM-9736][feat] AsyncLLM and verl integ ( #9353 )
...
Signed-off-by: Liwei Ma <liweim@nvidia.com>
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Liwei Ma <liweim@nvidia.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-12-11 09:33:25 -08:00
xxi
488d38f88d
[TRTLLM-8959][feat] ConfigurableMoE support CUTLASS ( #9772 )
2025-12-12 00:22:13 +08:00
Yan Chunwei
04a39a4e2b
[None][chore] enable test_ipc.py ( #9865 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-12-11 17:47:14 +08:00
Bo Deng
c1d53ee43d
[ https://nvbugs/5582258 ][fix] unwaive ( #9650 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-12-10 19:18:30 -08:00
Patrice Castonguay
2c0293c612
[ https://nvbugs/5601682 ][fix] Unwaiving disagg test ( #9627 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-10 13:42:26 -05:00
cheshirekow
2f030312a8
[TRTLLM-9228][infra] Verify thirdparty C++ process ( #9367 )
...
Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
2025-12-10 21:01:19 +08:00
dominicshanshan
0e78a4b244
[ https://nvbugs/5702791 ][fix] Unwaive fixed test ( #9844 )
...
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-12-10 14:01:44 +08:00
QI JUN
2c46126a93
[TRTLLM-9794][ci] move some deepseek test cases to gb200 ( #9841 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-09 19:54:51 -08:00
zhanghaotong
36c9e7cfe6
[None][chore] Add unittest for otlp tracing ( #8716 )
...
Signed-off-by: zhanghaotong <zhanghaotong.zht@antgroup.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
2025-12-09 18:34:08 -08:00
dhansen-nvidia
2d33ae94d5
[ https://nvbugs/5508301 ][feat] Move D->H copies to a worker thread whe… ( #8463 )
...
Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com>
Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
2025-12-09 18:51:31 -05:00
Patrice Castonguay
414448bb37
[ https://nvbugs/5719561 ][chore] Unwaive tests for nvbug 5719561 ( #9801 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 18:21:50 -05:00
Patrice Castonguay
ff0ef19ee9
[ https://nvbugs/5688388 ][chore] Unwaiving fixed disagg test ( #9800 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 16:51:46 -05:00
Patrice Castonguay
7d7d05d8db
[None][chore] Adding flaky auto scaling test to waives ( #9851 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 15:05:19 -05:00
Emma Qiao
75bc386b65
[None][infra] Waive failed cases for main branch on 12/09 ( #9839 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-09 19:39:29 +08:00
QI JUN
58c29957d9
[TRTLLM-9794][ci] move qwen3-next test cases to gb200 ( #9827 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-09 01:58:25 -08:00
Robin Kobus
76f49c903b
[None][fix] Additional model outputs for pipeline parallelism ( #9794 )
...
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-09 10:41:22 +01:00
yufeiwu-nv
fbcf03040f
[None][test] Refactor qa/llm_perf_nim.yml test list ( #9700 )
...
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-08 22:00:43 -08:00
QI JUN
252769c930
[TRTLLM-9794][ci] remove duplicated test cases in DGX B200 ( #9817 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-08 21:51:30 -08:00
Shi Xiaowei
b050804b63
[TRTLLM-6537][infra] extend multi-gpu tests related file list ( #9614 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-09 12:54:53 +08:00
JunyiXu-nv
90890785eb
[ https://nvbugs/5722653 ][fix] Fix config file used by disagg_client ( #9783 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-08 20:34:55 -08:00
Balaram Buddharaju
bafb60c1bc
[None][chore] Fix tests failing on pre-merge 12/08 ( #9819 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-08 20:08:52 -08:00
Bo Li
f2006a1f74
[ https://nvbugs/5726066 ][infra] Waive timeout disaggregated/test_auto_scaling tests. ( #9815 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-12-08 19:51:43 -08:00
Jiagan Cheng
4a3a66b124
[ https://nvbugs/5677746 ][fix] Use first PP rank's schedule result in other PP ranks to fix PP hang ( #9659 )
...
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
2025-12-08 18:43:52 -08:00
yuanjingx87
390391ebf1
[None][infra] Correct the waived test names due to a merge conflict ( #9803 )
...
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-12-09 09:48:21 +08:00
Yibin Li
faabc1a387
[TRTLLM-7967][chore] Add more tests ( #9415 )
...
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
2025-12-08 11:57:32 -08:00
Jhao-Ting Chen
0a09465089
[ https://nvbugs/5567586 ][feat] Ampere xqa swa specdec for GPT-OSS Eagle3-one-model ( #8383 )
...
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-08 11:16:05 -08:00
Lizhi Zhou
52f78e4000
[ http://nvbugs/5649010 ][fix] fix test_auto_scaling.py::test_worker_restart timeout ( #9775 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-08 03:26:01 -08:00
fredricz-20070104
96d9b67d65
[ https://nvbugs/5527655 ][test] Add test case for RCCA 5527655 ( #9511 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-08 01:27:13 -08:00
xinhe-nv
3f55c07223
[None][chore] Remove closed bugs ( #9770 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-07 22:51:55 -08:00
Fanrong Li
2f526583fb
[None][chore] Move the rocketkv e2e test to post-merge ( #9768 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-08 13:22:16 +08:00
Emma Qiao
137713a869
[None][infra] Waive failed cases for main on 12/08 ( #9773 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-07 20:18:29 -08:00
xxi
8e27ce7084
[TRTLLM-9603][feat] Enable ConfigurableMoE test in the CI ( #9645 )
2025-12-08 10:19:40 +08:00
chenfeiz0326
383178c00a
[TRTLLM-9000][feat] Add multi-node Perf Tests into CI ( #8800 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-12-08 09:00:44 +08:00
Emma Qiao
7c6c493993
[None][infra] Waive failed cases for main branch on 12/07 ( #9769 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-07 06:26:47 -08:00
Mike Iovine
31ab367576
[None][chore] Waive flakey disagg tests ( #9749 )
...
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 13:07:05 -08:00
jthomson04
299601aebf
[ https://nvbugs/5670672 ][fix] Fix flaky KV connector tests ( #9676 )
...
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2025-12-05 10:04:54 -08:00
Robin Kobus
faf682b8bc
[TRTLLM-7136][feat] Update load_weights method to include mapping parameter in checkpoint loaders ( #9583 )
...
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-05 16:07:20 +01:00
yufeiwu-nv
68253d9d29
[ https://nvbugs/5518713 ][test] Refactor core test lists by merging with llm_perf_cluster.yml ( #9714 )
...
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-05 01:15:37 -08:00
Kaiyu Xie
e06c582648
[None] [tests] Unwaive EPLB tests ( #9625 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-05 00:13:24 -08:00
Lizhi Zhou
dc766fc126
[ https://nvbugs/5633340 ][fix] start disagg workers and servers on free ports ( #9694 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-05 10:51:29 +08:00
Lizhi Zhou
0d0a16fff4
[TRTLLM-8920][feat] decouple disagg service from fastapi ( #8714 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-05 10:44:16 +08:00
xinhe-nv
530af1a98e
[None][chore] Add failed cases into waives.txt ( #9662 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-04 22:33:22 +08:00
Yan Chunwei
05058f5e2a
[None][ci] unwaive tests ( #9651 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-12-04 15:06:07 +08:00
JunyiXu-nv
6d2daec5d0
[TRTLLM-8274][feat] Check if executor is shutdown in /health entrypoint ( #9057 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-04 13:49:40 +08:00
mpikulski
744f0eff1b
[TRTLLM-9522][fix] restore trtllm-serve mm_embedding_serve ( #9669 )
2025-12-03 19:27:11 -08:00
gramnarayan
098b9ff226
[ #9147 ][feat] AutoDeploy: Draft Target Speculative Decoding ( #9275 )
...
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
2025-12-04 05:13:49 +08:00
Michal Guzek
4e5b10da48
[ https://nvbugs/5552132 ][fix] Enable LoRA for GPT OSS Torch ( #8253 )
...
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
2025-12-03 15:42:15 +01:00
Patrice Castonguay
ae8d8a266a
[ https://nvbugs/5705197 ][chore] Unwaive timeout disagg tests ( #9637 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-03 22:18:36 +08:00
xinhe-nv
3a748b166b
[None][chore] Add failed cases into waives.txt ( #9593 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
2025-12-03 16:26:06 +08:00
heyuhhh
a08eb81cce
[None][feat] Add RocketKV usage doc and e2e accuracy test on LongBenchV2 ( #9572 )
...
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2025-12-03 11:33:46 +08:00
yufeiwu-nv
21f2ba74e8
[None][test] Remove duplicate test cases ( #9623 )
...
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-03 10:35:26 +08:00
brb-nv
55c7023c92
[None][chore] Waive test failing on pre-merge ( #9638 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-03 07:31:10 +08:00
Shi Xiaowei
227d42e492
[ https://nvbugs/5651854 ][fix] Fix dist-serving perf by clearing CPU affinity ( #9549 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-03 01:17:03 +08:00
Mike Iovine
d5b7f0c8ad
[TRTLLM-8980][test] Clean up spec dec tests in test_llm_api_pytorch ( #8889 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-02 10:32:02 -05:00
Yan Chunwei
b86256eb54
[TRTLLM-9144][fix] enhance RPC robustness ( #8711 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-12-02 21:37:59 +08:00