Jin Li
c04563657e
[TRTLLM-7735][feat] Attention NVFP4 out support for torch compile ( #9740 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-12-27 00:07:20 +08:00
chenfeiz0326
d70aeddc7f
[TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI ( #9138 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-12-26 22:50:53 +08:00
Pengyun Lin
c5b0f9e436
[ https://nvbugs/5633700 ][fix] Cache tiktoken vocab for gpt-oss ( #10219 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2025-12-26 18:39:03 +08:00
dongfengy
bfc591994c
[ https://nvbugs/5745152 ][fix] Fix some GPTOSS test setups ( #10085 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-12-26 17:52:40 +08:00
bhsueh_NV
db3430f589
[None][feat] Support VLM part for Mistral Large 3 ( #10188 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-12-25 11:20:58 -05:00
ZhichenJiang
46e4af5688
[TRTLLM-9831][perf] Enable 2CTA with autotune for CuteDSL MoE and Grouped GEMM optimizations ( #10201 )
...
Signed-off-by: zhichen jiang <zhichenj@NVIDIA.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-25 09:04:20 -05:00
Lizhi Zhou
fe12faef81
[ https://nvbugs/5752516 ][chore] unwaive test; fix port conflicts in CI ( #10152 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-25 08:16:09 -05:00
Emma Qiao
0ecdb69b93
[None][infra] Waive failed tests for main on 12/25 ( #10298 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-25 05:22:39 -05:00
Jie Li
83e02ee335
[None][chore] Remove NIM TRT-Backend Test Lists ( #10232 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
2025-12-25 04:01:51 -05:00
Enwei Zhu
182b3eb633
[None][ci] Waive TestLlama3_1_8B::test_auto_dtype[False-2] for timeout ( #10293 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-25 02:35:18 -05:00
xinhe-nv
4ae6f6a46c
[None][chore] Add failed cases into waives.txt ( #10249 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-25 01:26:21 -05:00
gramnarayan
a9eb5afc9f
[ #9241 ][feat] AutoDeploy: Support Eagle3 Speculative Decoding ( #9869 )
...
Supports the two-model flow without the overlap scheduler or chain drafter. The drafting model runs in the PyTorch backend.
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
2025-12-24 23:30:42 -05:00
Emma Qiao
16fd781e42
[TRTLLM-9862][infra] Move single-gpu tests on rtxpro6000d to pre-merge ( #9897 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-24 21:45:33 -05:00
Stanley Sun
ddac4d7379
[None][test] Add disag-serving auto scaling qa test ( #10262 )
...
Signed-off-by: Stanley Sun <stsun@nvidia.com>
2025-12-24 08:43:47 -05:00
shuyixiong
f4f0fe85e9
[TRTLLM-9737][chore] Add rl perf reproduce script and enhance the robustness of Ray tests ( #9939 )
...
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
2025-12-24 15:27:01 +08:00
xinhe-nv
534700ecd9
[None][chore] Add failed cases into waives.txt ( #10240 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-24 02:21:50 -05:00
Emma Qiao
7b84e48e0f
[None][infra] Waive failed cases on 12/24 ( #10257 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-23 22:49:57 -05:00
xinhe-nv
fc1f77eafc
[None][chore] Add failed cases into waives.txt ( #10204 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
2025-12-24 10:37:23 +08:00
Balaram Buddharaju
8c1cfc872b
[TRTLLM-9493][feat] Custom AllToAll for helix parallelism ( #9986 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-23 18:14:30 -08:00
Jhao-Ting Chen
92d90fa29a
[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS ( #10018 )
...
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-23 11:41:31 -06:00
Grzegorz Kwasniewski
0027a01ad5
[ https://nvbugs/5680312 ][fix] Updated test waiving ( #9630 )
...
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
2025-12-23 09:38:12 -08:00
Emma Qiao
984c20e0b2
[None][infra] Waive failed cases on 12/23 ( #10236 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-23 08:48:54 -05:00
dongfengy
e284d0bf80
[None][infra] Waive flaky unittest/executor/test_rpc_proxy.py and unittest/executor/test_rpc_worker.py tests ( #10209 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-23 07:43:13 -05:00
Yukun He
522f1d2bc3
[ https://nvbugs/5764627 ][chore] waive the time-out test ( #10222 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-23 16:36:06 +08:00
Balaram Buddharaju
f2e00a75de
[None][chore] Remove helix test from rtx test list ( #10224 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-23 03:07:37 -05:00
chenfeiz0326
48c875f8ea
[None][fix] Add OpenSearch URL in slurm_launch.sh for Multinode Perf Sanity Test ( #9990 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-12-23 16:02:38 +08:00
Chuang Zhu
53db3b2612
[ https://nvbugs/5741884 ][fix] unwaive disagg sampler ( #10189 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-12-23 14:38:07 +08:00
xinhe-nv
77b591f73b
[None][chore] Add failed cases into waives.txt ( #10177 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Jie Li <lijie@nvidia.com>
Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-12-23 13:43:50 +08:00
Harshini Komali
d691371eaf
[TRTLLM-9091][feat] Replace GenAI-Perf with AIPerf ( #9310 )
...
Signed-off-by: lkomali <lkomali@nvidia.com>
Signed-off-by: Harshini Komali <157742537+lkomali@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-23 13:25:55 +08:00
Pamela Peng
5bc7ffe379
[None][test] Add qa tests for RTX 6K ( #10210 )
...
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
2025-12-22 22:47:09 -05:00
fredricz-20070104
621156ad44
[None][chore] Fix GB300 support issues ( #10196 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: fredricz-20070104 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-23 10:42:41 +08:00
Emma Qiao
ba14a9308e
[None][infra] Waive failed cases on 12/22 ( #10200 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-23 00:05:45 +08:00
Perkz Zheng
c87f1a6b39
[ https://nvbugs/5503479 ][fix] update trtllm-gen kernels to address a few bugs ( #10089 )
...
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
2025-12-22 04:45:33 -05:00
xinhe-nv
d30ee8101e
[None][chore] Remove closed bugs ( #10182 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-22 01:58:17 -05:00
Yuxian Qiu
237fd0eae4
[ https://nvbugs/5666821 ][chore] unwaive tests. ( #9958 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-22 11:39:45 +08:00
Jin Li
066b653940
[TRTLLM-9880][feat] Include torch compile tests in QA test list ( #10149 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-12-22 10:37:09 +08:00
Yuxian Qiu
2f139ee07e
[ https://nvbugs/5701445 ][chore] unwaive test. ( #9949 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-22 10:12:54 +08:00
Chuang Zhu
914dd39127
[None][fix] disable cuda ipc on device without nvlink (L40s) for disagg test ( #9735 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-12-22 09:29:24 +08:00
dominicshanshan
d274a4c5d3
[ https://nvbugs/5701457 ][fix] Unwaive ray test. ( #10175 )
...
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-12-22 09:25:58 +08:00
Enwei Zhu
5549067966
[None][ci] Waive GPTOSS test case ( #10155 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-22 08:50:44 +08:00
Balaram Buddharaju
5266475014
[None][feat] Cudagraph updates for helix parallelism ( #10141 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-21 15:21:52 -05:00
shuyixiong
4fc6036276
[ https://nvbugs/5702793 ][fix] Fix view operation on uncontiguous tensor ( #10147 )
...
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
2025-12-21 11:47:20 -05:00
bhsueh_NV
cd4b4f43fa
[None][feat] Support Eagle3 on Mistral Large3 ( #9971 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-12-21 10:25:45 -05:00
Emma Qiao
aa5dbb7ca5
[None][infra] Waive failed tests for main branch on 12/21 ( #10184 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-21 22:23:46 +08:00
Eran Geva
b15f987972
[None][chore] removed duplicated test from l0_b200.yml ( #10090 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-12-21 11:34:01 +02:00
Bo Li
a66eeab537
[TRTLLM-9805][feat] Skip Softmax Attention. ( #9821 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
Co-authored-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2025-12-21 02:52:42 -05:00
Balaram Buddharaju
dcd3f7b5ea
[ https://nvbugs/5744427 ][fix] Fix accuracy test OOM ( #10173 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-21 02:03:38 -05:00
Enwei Zhu
2ce785f39a
[ https://nvbugs/5643631 ][fix] Fix hostfunc seg fault ( #10028 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-20 07:58:43 -05:00
Yuxian Qiu
3b3069b390
[ https://nvbugs/5747930 ][fix] Use offline tokenizer for whisper models. ( #10121 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-20 09:42:07 +08:00
Balaram Buddharaju
bee9051484
[None][chore] Waive timing out pre-merge test ( #10167 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-19 17:56:33 -05:00