Tian Zheng
cfebfbb505
[https://nvbugs/5783509][fix] Fix a hang issue when enabling skip softmax on Blackwell (#10490)
...
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2026-01-16 18:59:54 +08:00
xinhe-nv
cc43edc8f4
[None][fix] waive tests on sm89 (#10753)
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-16 17:35:42 +08:00
Kaiyu Xie
4f86c5f5ce
[None] [feat] Support multiple accuracy tasks for slurm scripts (#10500)
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
Co-authored-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2026-01-16 15:50:32 +08:00
xinhe-nv
0256c7234f
[None][chore] Remove closed bugs (#10586)
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-16 15:04:11 +08:00
Emma Qiao
e2c3373749
[None][infra] Waive failed cases for main branch on 01/16 (#10738)
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-16 12:46:35 +08:00
Bo Li
7686fbbcbe
[https://nvbugs/5810940][chore] Update waive lists for nvbugs/5810940. (#10737)
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-16 12:08:14 +08:00
Enwei Zhu
9f741fb254
[https://nvbugs/5800521][ci] Move test_openai_chat_guided_decoding to H100 stage (to avoid potential OOM) (#10703)
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2026-01-16 10:42:52 +08:00
xxi
ce561b6a8e
[TRTLLM-9111][feat] MoE test refactor: Extend MoE quantization test utilities with comprehensive quant algorithm support (#10691)
...
Signed-off-by: xxi <xxi@nvidia.com>
2026-01-16 10:26:33 +08:00
Chuang Zhu
7e2cbc0756
[https://nvbugs/5598674][fix] enable partial reuse in gemma and gpt oss test (#10559)
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2026-01-16 10:26:15 +08:00
heyuhhh
e3f27e06c7
[None][chore] Waive star attention unittests (#10439)
...
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2026-01-16 10:12:32 +08:00
Yuxian Qiu
ef838cc852
[https://nvbugs/5701445][chore] isolate test. (#10444)
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-16 10:04:12 +08:00
Iman Tabrizian
5ad8cf6d5e
[https://nvbugs/5738168][fix] unwaive test accuracy/test_disaggregated_serving.py::TestDeepSeekV32Exp::test_auto_dtype[False] (#10584)
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2026-01-16 06:04:45 +08:00
yufeiwu-nv
cd55fb4551
[None][test] Remove NIM test (#10657)
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2026-01-15 16:30:47 +08:00
Perkz Zheng
71ccc07d2b
[None][feat] update trtllm-gen to support groupsTokensHeadsQ (#10261)
...
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-15 02:24:25 -05:00
Ludwig Schneider
e12a7119cf
[https://nvbugs/5741392][fix] [chore] Remove test exemptions from waivers tile (#10517)
...
Signed-off-by: Ludwig Schneider <lschneider@nvidia.com>
2026-01-14 22:07:52 -08:00
ruodil
22240e43eb
[None][test] store per user output and per gpu output metric in csv file (#10658)
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2026-01-15 00:51:08 -05:00
Emma Qiao
7b3b6f1161
[None][infra] Waive failed tests on main 01/15 (#10683)
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-15 13:40:37 +08:00
Anish Shanbhag
faa80e73fd
[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias (#10099)
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2026-01-14 21:06:07 -08:00
Lucas Liebenwein
62050b2381
[None][infra] separate AutoDeploy tests into own stages (#10634)
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-14 23:05:26 -05:00
Lucas Liebenwein
15b43e8a14
[https://nvbugs/5777041][fix] fix AutoDeploy ep sharding test (#10460)
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-14 21:53:56 -05:00
Dom Brown
94c7b69048
[https://nvbugs/5630196] [fix] Prevent flaky failures in C++ test_e2e.py by using local cached datasets for benchmarking (#10638)
...
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
2026-01-14 21:39:55 -05:00
Wanli Jiang
73d1840c12
[TRTLLM-10245][feat] Add accuracy tests for super v3 fp8 model (#10482)
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2026-01-15 10:07:02 +08:00
dominicshanshan
0f2d61b8c6
[https://nvbugs/5766952][fix] Fix AIPerf issue. (#10666)
...
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-01-15 09:54:34 +08:00
bhsueh_NV
5f9fc50233
[https://nvbugs/5800725][infra] Update waives.txt (#10625)
2026-01-15 09:08:07 +08:00
彭晋韬(jtao peng)
211c44b951
[None][feat] Adding torch ext API for FusedAddRMSNormQuant kernel (#9905)
...
Signed-off-by: jintaop <jintaop@nvidia.com>
2026-01-15 07:29:15 +08:00
Tzu-Ling Kan
c99faaed06
[#9760][fix] Use RequestError for validation errors to prevent engine shutdown (#9761)
...
Signed-off-by: tzulingk@nvidia.com <tzulingk@nvidia.com>
2026-01-14 10:22:36 -05:00
Emma Qiao
01083b56bf
[TRTLLM-9849][infra] Update dependencies to 25.12 (#9818)
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Signed-off-by: xxi <xxi@nvidia.com>
Signed-off-by: xxi <95731198+xxi-nv@users.noreply.github.com>
Co-authored-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: xxi <xxi@nvidia.com>
Co-authored-by: xxi <95731198+xxi-nv@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2026-01-14 21:54:04 +08:00
Emma Qiao
35c24424f6
[None][infra] Waive failed cases in post-merge on 01/14 (#10668)
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-14 21:39:32 +08:00
HuiGao-NV
b10704428d
[https://nvbugs/5787566][fix] Only keep a limited number of performance statistic data (#10569)
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2026-01-14 07:53:01 -05:00
Bo Li
582dec5bb5
[https://nvbugs/5774869][infra] Use 2 GPUs to test skip softmax attention on H100. (#10420)
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-14 07:03:01 -05:00
shuyixiong
babd5ecacc
[https://nvbugs/5760740][fix] Enable ray tests (#10272)
...
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
2026-01-14 19:25:46 +08:00
xinhe-nv
272688c663
[None][fix] fix L0 issues (#10670)
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-14 18:09:40 +08:00
jmydurant
e7882d5c74
[None][feat] MiniMax M2 support (#10532)
...
Signed-off-by: Mingyang Jiang <13463932+jmydurant@users.noreply.github.com>
2026-01-14 17:38:58 +08:00
mpikulski
052c36ddd2
[TRTLLM-9522][feat] support image_embeds in OpenAI API (#9715)
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2026-01-14 10:31:03 +01:00
Bo Li
487287a412
[None][chore] Update test name MNNVL->NVLinkTwoSided. (#9672)
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-14 04:29:57 -05:00
QI JUN
c4da4fd462
[https://nvbugs/5637220][ci] unwaive TestQwen3_235B_A22B::test_nvfp4[latency_moe_trtllm_attention_dp] (#9870)
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2026-01-14 15:41:14 +08:00
Yuxian Qiu
39cefd6125
[None][refactor] Unify the usage of MPIDist and TorchDist. (#10380)
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-14 14:05:47 +08:00
xxi
f841b43cde
[None][chore] waive the CI failure (#10655)
...
Signed-off-by: xxi <xxi@nvidia.com>
2026-01-14 13:59:15 +08:00
JennyLiu
92ae490410
[None][test] Spark - Change testlist name and perf yml format (#10626)
...
Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
2026-01-13 23:07:11 -05:00
xinhe-nv
07d9390e9b
[None][test] add test into qa test list (#10627)
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-13 22:43:00 -05:00
xinhe-nv
7305c61fc9
[TRTLLM-8638][fix] Add failed cases into waives.txt (#10589)
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-13 22:00:20 -05:00
Leslie Fang
bc119f5644
[None][chore] Add test configurable moe module (#10575)
...
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
2026-01-14 07:25:57 +08:00
Balaram Buddharaju
ccdfa43a6e
[https://nvbugs/5791900][fix] Fix HelixCpMnnvlMemory init with PP (#10533)
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2026-01-13 15:48:42 -05:00
Frida Hou
bf16fbd86c
[#9283][feat] AutoDeploy: separate rms pattern detection from fusion (#9969)
...
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
2026-01-13 14:57:27 -05:00
dongfengy
6ee8dbfe0b
[https://nvbugs/5772396][fix] WAR: Disable TinyGEMM PDL due to accuracy issues (#10619)
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2026-01-13 12:40:11 -05:00
benzh-2025
6df2c8a074
[None][feat] add fp4 gemm + allreduce (#9729)
...
Signed-off-by: benzh
Signed-off-by: benzh-2025
2026-01-13 21:11:13 +08:00
Guoming Zhang
c1b0b7350f
[None][test] Unwaive qwen3 next test case. (#9877)
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2026-01-13 20:42:31 +08:00
Tailing Yuan
38296a472b
[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562)
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2026-01-13 19:17:03 +08:00
Erin
55580f8ec1
[NVBUG-5670458][chore] Unwaive lp tests (#10524)
...
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Erin <14718778+hchings@users.noreply.github.com>
2026-01-13 04:31:27 -05:00
Guoming Zhang
bdaee87895
[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2026-01-13 17:13:55 +08:00