jthomson04
2db3d7eeba
[None][chore] Async Transfer Manager ( #9891 )
...
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2026-01-20 12:12:47 -05:00
Gal Hubara-Agam
e61c942d1f
[ #10707 ][fix] AutoDeploy: Super accuracy test fixes ( #10717 )
...
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
Signed-off-by: Gal Hubara-Agam <96368689+galagam@users.noreply.github.com>
2026-01-20 18:16:13 +02:00
Emma Qiao
3a894951e7
[None][infra] Waive failed cases for main branch on 01/20 ( #10829 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-20 17:58:58 +08:00
Yuxian Qiu
c8a200486d
[ https://nvbugs/5701445 ][chore] unwaive test. ( #10806 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-20 16:30:32 +08:00
Yi Zhang
58311b2345
[None][fix] Remove unused params in attn ( #10652 )
...
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
2026-01-20 03:08:59 -05:00
xinhe-nv
47e0ec2527
[None][test] Update sanity test list ( #10825 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-20 02:11:42 -05:00
xinhe-nv
fc467d06c3
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #10787 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-20 00:48:19 -05:00
benzh-2025
4c8468c5d3
[None][fix] default disable gemm+allreduce fusion ( #10656 )
2026-01-20 12:31:17 +08:00
xinhe-nv
26bc16842e
[None][chore] Add failed cases into waives.txt ( #10776 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
2026-01-19 22:45:40 -05:00
Liao Lanyu
dbb858ae0c
[TRTLLM-10029][scheduler] Re-implement MicroBatchScheduler and CapacityScheduler in Python ( #10273 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Signed-off-by: Lance Liao <108499334+lancelly@users.noreply.github.com>
Co-authored-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
2026-01-20 10:31:13 +08:00
Lizhi Zhou
c6320d924d
[ https://nvbugs/5776445 ][chore] unwaive test ( #10667 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2026-01-19 21:22:47 -05:00
Jie Li
ed95e70150
[None][chore] Remove trt flow tests in NIM ( #10731 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
2026-01-19 05:25:39 -05:00
Shi Xiaowei
442d2e8a15
[None][test] adjust the dis-agg test timeout threshold ( #10800 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2026-01-19 17:02:00 +08:00
Eran Geva
32ab809f36
[ #10607 ][chore] Add Nemotron Nano v3 FP8 autodeploy perf test ( #10603 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
Signed-off-by: Eran Geva <egeva@cw-dfw-cs-001-vscode-01.cm.cluster>
Co-authored-by: Eran Geva <egeva@cw-dfw-cs-001-vscode-01.cm.cluster>
2026-01-19 08:48:07 +02:00
Emma Qiao
935c174283
[None][infra] Waive failed cases for main on 01/19 ( #10794 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-19 00:55:26 -05:00
Zhanrui Sun
df845a028b
[TRTLLM-9581][infra] Use /home/scratch.trt_llm_data_ci in computelab ( #10616 )
...
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2026-01-19 00:40:40 -05:00
chenfeiz0326
e97af45556
[TRTLLM-10300][feat] Upload regression info to artifactory ( #10599 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2026-01-19 10:16:31 +08:00
Lucas Liebenwein
a6a63f5a36
[ https://nvbugs/5814247 ][fix] unwaive AutoDeploy multi-gpu unit tests ( #10769 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-19 10:00:54 +08:00
Chuang Zhu
4f04532ce7
[ https://nvbugs/5769890 ][fix] enable system memory to transfer active message in NIXL ucx ( #10602 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2026-01-19 09:20:12 +08:00
Lucas Liebenwein
9879400479
[ #10642 ][feat] AutoDeploy: optimized canonicalize_graph utilities [1/2] ( #10675 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-18 13:42:30 -05:00
Eran Geva
4d2916d683
[ #10688 ][fix] AutoDeploy Fix CUDA graph batch sizes exceeding max_batch_size ( #10687 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-01-18 13:31:01 -05:00
Lucas Liebenwein
b64052539d
[ https://nvbugs/5769712 ][fix] fix timeout in AutoDeploy llama accuracy test ( #10461 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-18 13:20:55 -05:00
Eran Geva
a11f0dbd61
[ #10696 ][fix] AutoDeploy prevent torch.export from specializing batch dimension when max_batch_size=1 ( #10697 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-01-18 10:42:49 +02:00
Yanchao Lu
0af1a0e478
[None][test] Waive main post-merge test failures 1/18 ( #10777 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2026-01-18 15:34:48 +08:00
Grzegorz Kwasniewski
7bf4dd9f63
[TRTLLM-10318][feat] Fixing Nemotron sharding: support for sharding buffers ( #10319 )
...
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
Signed-off-by: Lucas <11156568+lucaslie@users.noreply.github.com>
Signed-off-by: Grzegorz Kwasniewski <213329731+greg-kwasniewski1@users.noreply.github.com>
Co-authored-by: Lucas <11156568+lucaslie@users.noreply.github.com>
2026-01-17 04:02:06 -05:00
Yuxian Qiu
b65560fc32
[ https://nvbugs/5794313 ][chore] unwaive tests. ( #10660 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-17 14:15:15 +08:00
Yukun He
3d16daf696
[None][fix] Fix tmp dir being deleted too early in unit test. ( #10740 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2026-01-17 13:49:10 +08:00
chenfeiz0326
56073f501a
[TRTLLM-8263][feat] Add Aggregated Perf Tests ( #10598 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2026-01-17 13:16:36 +08:00
Frida Hou
069ad68d3c
[None][fix] AutoDeploy: skip mxfp4_moe test unless on Hopper ( #10729 )
...
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
2026-01-16 16:24:37 -05:00
Chenghao Zhang
0b748d5bba
[None][chore] update flashinfer to 0.6.0 ( #10522 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2026-01-16 16:22:06 -05:00
Chenghao Zhang
b6acd96616
[None][fix] AutoDeploy: Fix the nvfp4 fused_moe ( #10727 )
...
Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
2026-01-16 12:04:40 -08:00
Stefan Niebler
0cfd08745c
[TRTLLM-9735][feat] Add processed logprobs functionality to TorchSampler ( #9675 )
...
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2026-01-16 10:52:41 -08:00
Tian Zheng
cfebfbb505
[ https://nvbugs/5783509 ][fix] Fix a hang issue when enabling skip softmax on Blackwell ( #10490 )
...
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2026-01-16 18:59:54 +08:00
xinhe-nv
cc43edc8f4
[None][fix] waive tests on sm89 ( #10753 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-16 17:35:42 +08:00
Kaiyu Xie
4f86c5f5ce
[None] [feat] Support multiple accuracy tasks for slurm scripts ( #10500 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
Co-authored-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2026-01-16 15:50:32 +08:00
xinhe-nv
0256c7234f
[None][chore] Remove closed bugs ( #10586 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-16 15:04:11 +08:00
Emma Qiao
e2c3373749
[None][infra] Waive failed cases for main branch on 01/16 ( #10738 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-16 12:46:35 +08:00
Bo Li
7686fbbcbe
[ https://nvbugs/5810940 ][chore] Update waive lists for nvbugs/5810940. ( #10737 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-16 12:08:14 +08:00
Enwei Zhu
9f741fb254
[ https://nvbugs/5800521 ][ci] Move test_openai_chat_guided_decoding to H100 stage (to avoid potential OOM) ( #10703 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2026-01-16 10:42:52 +08:00
xxi
ce561b6a8e
[TRTLLM-9111][feat] MoE test refactor: Extend MoE quantization test utilities with comprehensive quant algorithm support ( #10691 )
...
Signed-off-by: xxi <xxi@nvidia.com>
2026-01-16 10:26:33 +08:00
Chuang Zhu
7e2cbc0756
[ https://nvbugs/5598674 ][fix] enable partial reuse in gemma and gpt oss test ( #10559 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2026-01-16 10:26:15 +08:00
heyuhhh
e3f27e06c7
[None][chore] Waive star attention unittests ( #10439 )
...
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2026-01-16 10:12:32 +08:00
Yuxian Qiu
ef838cc852
[ https://nvbugs/5701445 ][chore] isolate test. ( #10444 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-16 10:04:12 +08:00
Iman Tabrizian
5ad8cf6d5e
[ https://nvbugs/5738168 ][fix] unwaive test accuracy/test_disaggregated_serving.py::TestDeepSeekV32Exp::test_auto_dtype[False] ( #10584 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2026-01-16 06:04:45 +08:00
yufeiwu-nv
cd55fb4551
[None][test] Remove NIM test ( #10657 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2026-01-15 16:30:47 +08:00
Perkz Zheng
71ccc07d2b
[None][feat] update trtllm-gen to support groupsTokensHeadsQ ( #10261 )
...
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-15 02:24:25 -05:00
Ludwig Schneider
e12a7119cf
[ https://nvbugs/5741392 ][fix] [chore] Remove test exemptions from waivers tile ( #10517 )
...
Signed-off-by: Ludwig Schneider <lschneider@nvidia.com>
2026-01-14 22:07:52 -08:00
ruodil
22240e43eb
[None][test] store per user output and per gpu output metric in csv file ( #10658 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2026-01-15 00:51:08 -05:00
Emma Qiao
7b3b6f1161
[None][infra] Waive failed tests on main 01/15 ( #10683 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-15 13:40:37 +08:00
Anish Shanbhag
faa80e73fd
[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias ( #10099 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2026-01-14 21:06:07 -08:00