Commit Graph

2606 Commits

Author SHA1 Message Date
Emma Qiao
935c174283
[None][infra] Waive failed cases for main on 01/19 (#10794)
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-19 00:55:26 -05:00
Zhanrui Sun
df845a028b
[TRTLLM-9581][infra] Use /home/scratch.trt_llm_data_ci in computelab (#10616)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2026-01-19 00:40:40 -05:00
chenfeiz0326
e97af45556
[TRTLLM-10300][feat] Upload regression info to artifactory (#10599)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2026-01-19 10:16:31 +08:00
Lucas Liebenwein
a6a63f5a36
[https://nvbugs/5814247][fix] unwaive AutoDeploy multi-gpu unit tests (#10769)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-19 10:00:54 +08:00
Chuang Zhu
4f04532ce7
[https://nvbugs/5769890][fix] enable system memory to transfer active message in NIXL ucx (#10602)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2026-01-19 09:20:12 +08:00
Lucas Liebenwein
9879400479
[#10642][feat] AutoDeploy: optimized canonicalize_graph utilities [1/2] (#10675)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-18 13:42:30 -05:00
Eran Geva
4d2916d683
[#10688][fix] AutoDeploy Fix CUDA graph batch sizes exceeding max_batch_size (#10687)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-01-18 13:31:01 -05:00
Lucas Liebenwein
b64052539d
[https://nvbugs/5769712][fix] fix timeout in AutoDeploy llama accuracy test (#10461)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-18 13:20:55 -05:00
Eran Geva
a11f0dbd61
[#10696][fix] AutoDeploy prevent torch.export from specializing batch dimension when max_batch_size=1 (#10697)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-01-18 10:42:49 +02:00
Yanchao Lu
0af1a0e478
[None][test] Waive main post-merge test failures 1/18 (#10777)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2026-01-18 15:34:48 +08:00
Grzegorz Kwasniewski
7bf4dd9f63
[TRTLLM-10318][feat] Fixing Nemotron sharding: support for sharding buffers (#10319)
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
Signed-off-by: Lucas <11156568+lucaslie@users.noreply.github.com>
Signed-off-by: Grzegorz Kwasniewski <213329731+greg-kwasniewski1@users.noreply.github.com>
Co-authored-by: Lucas <11156568+lucaslie@users.noreply.github.com>
2026-01-17 04:02:06 -05:00
Yuxian Qiu
b65560fc32
[https://nvbugs/5794313][chore] unwaive tests. (#10660)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-17 14:15:15 +08:00
Yukun He
3d16daf696
[None][fix] Fix tmp dir being deleted too early in unit test. (#10740)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2026-01-17 13:49:10 +08:00
chenfeiz0326
56073f501a
[TRTLLM-8263][feat] Add Aggregated Perf Tests (#10598)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2026-01-17 13:16:36 +08:00
Frida Hou
069ad68d3c
[None][fix] AutoDeploy: skip mxfp4_moe test unless on Hopper (#10729)
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
2026-01-16 16:24:37 -05:00
Chenghao Zhang
0b748d5bba
[None][chore] update flashinfer to 0.6.0 (#10522)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2026-01-16 16:22:06 -05:00
Chenghao Zhang
b6acd96616
[None][fix] AutoDeploy: Fix the nvfp4 fused_moe (#10727)
Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
2026-01-16 12:04:40 -08:00
Stefan Niebler
0cfd08745c
[TRTLLM-9735][feat] Add processed logprobs functionality to TorchSampler (#9675)
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2026-01-16 10:52:41 -08:00
Tian Zheng
cfebfbb505
[https://nvbugs/5783509][fix] Fix a hang issue when enabling skip softmax on Blackwell (#10490)
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2026-01-16 18:59:54 +08:00
xinhe-nv
cc43edc8f4
[None][fix] waive tests on sm89 (#10753)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-16 17:35:42 +08:00
Kaiyu Xie
4f86c5f5ce
[None] [feat] Support multiple accuracy tasks for slurm scripts (#10500)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
Co-authored-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2026-01-16 15:50:32 +08:00
xinhe-nv
0256c7234f
[None][chore] Remove closed bugs (#10586)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-16 15:04:11 +08:00
Emma Qiao
e2c3373749
[None][infra] Waive failed cases for main branch on 01/16 (#10738)
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-16 12:46:35 +08:00
Bo Li
7686fbbcbe
[https://nvbugs/5810940][chore] Update waive lists for nvbugs/5810940. (#10737)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-16 12:08:14 +08:00
Enwei Zhu
9f741fb254
[https://nvbugs/5800521][ci] Move test_openai_chat_guided_decoding to H100 stage (to avoid potential OOM) (#10703)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2026-01-16 10:42:52 +08:00
xxi
ce561b6a8e
[TRTLLM-9111][feat] MoE test refactor: Extend MoE quantization test utilities with comprehensive quant algorithm support (#10691)
Signed-off-by: xxi <xxi@nvidia.com>
2026-01-16 10:26:33 +08:00
Chuang Zhu
7e2cbc0756
[https://nvbugs/5598674][fix] enable partial reuse in gemma and gpt oss test (#10559)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2026-01-16 10:26:15 +08:00
heyuhhh
e3f27e06c7
[None][chore] Waive star attention unittests (#10439)
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2026-01-16 10:12:32 +08:00
Yuxian Qiu
ef838cc852
[https://nvbugs/5701445][chore] isolate test. (#10444)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-16 10:04:12 +08:00
Iman Tabrizian
5ad8cf6d5e
[https://nvbugs/5738168][fix] unwaive test accuracy/test_disaggregated_serving.py::TestDeepSeekV32Exp::test_auto_dtype[False] (#10584)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2026-01-16 06:04:45 +08:00
yufeiwu-nv
cd55fb4551
[None][test] Remove NIM test (#10657)
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2026-01-15 16:30:47 +08:00
Perkz Zheng
71ccc07d2b
[None][feat] update trtllm-gen to support groupsTokensHeadsQ (#10261)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-15 02:24:25 -05:00
Ludwig Schneider
e12a7119cf
[https://nvbugs/5741392][fix] [chore] Remove test exemptions from waivers tile (#10517)
Signed-off-by: Ludwig Schneider <lschneider@nvidia.com>
2026-01-14 22:07:52 -08:00
ruodil
22240e43eb
[None][test] store per user output and per gpu output metric in csv file (#10658)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2026-01-15 00:51:08 -05:00
Emma Qiao
7b3b6f1161
[None][infra] Waive failed tests on main 01/15 (#10683)
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-15 13:40:37 +08:00
Anish Shanbhag
faa80e73fd
[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias (#10099)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2026-01-14 21:06:07 -08:00
Lucas Liebenwein
62050b2381
[None][infra] separate AutoDeploy tests into own stages (#10634)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-14 23:05:26 -05:00
Lucas Liebenwein
15b43e8a14
[https://nvbugs/5777041][fix] fix AutoDeploy ep sharding test (#10460)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-14 21:53:56 -05:00
Dom Brown
94c7b69048
[https://nvbugs/5630196] [fix] Prevent flaky failures in C++ test_e2e.py by using local cached datasets for benchmarking (#10638)
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
2026-01-14 21:39:55 -05:00
Wanli Jiang
73d1840c12
[TRTLLM-10245][feat] Add accuracy tests for super v3 fp8 model (#10482)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2026-01-15 10:07:02 +08:00
dominicshanshan
0f2d61b8c6
[https://nvbugs/5766952][fix] Fix AIPerf issue. (#10666)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-01-15 09:54:34 +08:00
bhsueh_NV
5f9fc50233
[https://nvbugs/5800725][infra] Update waives.txt (#10625) 2026-01-15 09:08:07 +08:00
彭晋韬(jtao peng)
211c44b951
[None][feat] Adding torch ext API for FusedAddRMSNormQuant kernel (#9905)
Signed-off-by: jintaop <jintaop@nvidia.com>
2026-01-15 07:29:15 +08:00
Tzu-Ling Kan
c99faaed06
[#9760][fix] Use RequestError for validation errors to prevent engine shutdown (#9761)
Signed-off-by: tzulingk@nvidia.com <tzulingk@nvidia.com>
2026-01-14 10:22:36 -05:00
Emma Qiao
01083b56bf
[TRTLLM-9849][infra] Update dependencies to 25.12 (#9818)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Signed-off-by: xxi <xxi@nvidia.com>
Signed-off-by: xxi <95731198+xxi-nv@users.noreply.github.com>
Co-authored-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: xxi <xxi@nvidia.com>
Co-authored-by: xxi <95731198+xxi-nv@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2026-01-14 21:54:04 +08:00
Emma Qiao
35c24424f6
[None][infra] Waive failed cases in post-merge on 01/14 (#10668)
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-14 21:39:32 +08:00
HuiGao-NV
b10704428d
[https://nvbugs/5787566][fix] Only keep a limited number of performance statistic data (#10569)
Signed-off-by: Hui Gao <huig@nvidia.com>
2026-01-14 07:53:01 -05:00
Bo Li
582dec5bb5
[https://nvbugs/5774869][infra] Use 2 GPUs to test skip softmax attention on H100. (#10420)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-14 07:03:01 -05:00
shuyixiong
babd5ecacc
[https://nvbugs/5760740][fix] Enable ray tests (#10272)
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
2026-01-14 19:25:46 +08:00
xinhe-nv
272688c663
[None][fix] fix L0 issues (#10670)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-14 18:09:40 +08:00