Commit Graph

4256 Commits

Author SHA1 Message Date
Balaram Buddharaju
72c5480dfb
[None][chore] Waive test blocking pre-merge 12/18 (#10145)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-18 19:12:05 -08:00
TensorRT LLM
00f70c30a6 [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-19 03:11:26 +00:00
Ivy Zhang
9aa40871c2
[TRTLLM-9840][test] switch ucx backend to default backend (#10101)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-12-18 18:54:15 -08:00
TensorRT LLM
a7ac5a6bca [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-19 02:14:37 +00:00
Wangjue Yao
9f283f330b
[None][feat] Support Mooncake transfer engine as a cache transceiver backend (#8309)
Signed-off-by: wjueyao <wyao123@terpmail.umd.edu>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
2025-12-19 10:09:51 +08:00
Chuang Zhu
e0b2a94309
[None][fix] Fix ready signal in NIXL backend (#10000)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-12-19 09:43:40 +08:00
yuanjingx87
2e88c86f10
[None][infra] Fix issue that lock file geneartion will skip dependency with comment (#10144)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-12-18 17:41:23 -08:00
Yukun He
bd5b3c2ac0
[https://nvbugs/5721912][chore] Unwaive the test (#10108)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-19 09:12:25 +08:00
Anish Shanbhag
91a9ae42d2
[TRTC-71][feat] Add regression testing for config database (#9832)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-12-18 16:15:38 -08:00
Balaram Buddharaju
799a2ae311
[https://nvbugs/5741331][fix] Fix helix accuracy test (#10021)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-18 15:27:53 -08:00
Chang Liu
a97e411b44
[https://nvbugs/5747911][fix] Use offline data path for the unit test of mmencoder server (#10135)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-12-18 15:19:23 -08:00
Lizhi Zhou
f02782a6f2
[https://nvbugs/5726066][fix] fix auto-scaling related failures (#9845)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Emma Qiao <qqiao@nvidia.com>
2025-12-18 16:37:48 -05:00
Enwei Zhu
6fe89ea00f
[TRTLLM-9819][perf] Reuse alltoall workspace for CuteDSL MoE output (#9840)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-18 10:36:38 -08:00
CarstyYou
0b279f4ad4
[https://nvbugs/5456493][feat] Add fp8 bmm on sm120 (#9687)
Signed-off-by: CarstyYou <186021327+CarstyYou@users.noreply.github.com>
2025-12-18 22:57:20 +08:00
ZhichenJiang
4e55b83101
[None][perf] Add more optimization options for MOE CuteDSL finalized kernel (#10042)
Signed-off-by: zhichen jiang <zhichenj@NVIDIA.com>
2025-12-18 22:49:28 +08:00
Nikita Korobov
3b4f26e4d1
[None][feat] update TRT-LLM Gen MoE for NvFp4 + bias with tileN=256 (#9734)
Signed-off-by: Nikita Korobov <14355239+nekorobov@users.noreply.github.com>
2025-12-18 11:58:23 +01:00
yuanjingx87
df15be3fad
[None][infra] Fix slurm job does not catch cancelled jobs (#9722)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: yuanjingx87 <197832395+yuanjingx87@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-18 00:32:43 -08:00
Bo Li
9d7e038bcb
[https://nvbugs/5753250][infra] Waive _test_openai_responses. (#10110)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-12-18 00:15:06 -08:00
Emma Qiao
33a90f2dd2
[None][infra] Waive failed cases for main branch on 12/18 (#10105)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-17 21:35:45 -08:00
Yuxian Qiu
bec864a78c
[None][fix] avoid ID conversion for non enable_configurable_moe cases. (#10003)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-18 13:29:52 +08:00
yuanjingx87
897a38978d
[None][infra] Update allowlist 2025.12.17 (#10097)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-12-17 21:11:35 -08:00
Wanli Jiang
601c29ca73
[https://nvbugs/5721644][fix] Update tests for nemotron_h (#9993)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-18 12:38:02 +08:00
Lucas Liebenwein
76ec820465
[#7532][feat] AutoDeploy: gather logits before lm head (#9962)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-12-17 19:50:13 -08:00
TensorRT LLM
cfe53e7425 [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-18 03:23:35 +00:00
xinhe-nv
4a98f190a8
[None][chore] Add failed cases into waives.txt (#10025)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-17 19:13:52 -08:00
xinhe-nv
c1cfb61b1b
[TRTLLM-9381][feat] Add kimi k2 fp4 tests (#9906)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-17 18:15:27 -08:00
TensorRT LLM
50c2b82f24 [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-17 23:45:35 +00:00
tburt-nv
27064f95c7
[None][chore] Clarify copyright header guidance (#9882)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-12-18 06:38:10 +08:00
tburt-nv
5da7879b38
[None][fix] Revert GHA upgrade for blossom-ci workflow (#10095)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-12-17 15:57:04 -05:00
Chenghao Zhang
22c6e8a424
[None][fix] Autodeploy: fix some legacy flashinfer attention test errors (#9928)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-12-17 12:27:22 -08:00
Salman Chishti
cb5cd4376e
[None][chore] Upgrade GitHub Actions for Node 24 compatibility (#10045)
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
2025-12-17 09:44:09 -08:00
Yuan Tong
f7e245668b
[TRTLLM-9680][perf] Optimize TRTLLMSampler log_probs performance (Core fix has been merged via #9353) (#9655)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-12-17 17:56:01 +08:00
Yukun He
00c0564334
[None][chore] Remove unnecessary warning log for tuning. (#10077)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-17 01:51:17 -08:00
Yukun He
18b335d584
[TRTLLM-9989][fix] Disable tvm_ffi for CuteDSL nvFP4 dense GEMM. (#10040)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-17 00:41:26 -08:00
Yukun He
2fd1a23e4c
[TRTLLM-9998][fix] Change trtllm-gen MoE distributed tuning strategy back to INDEPENDENT (#10036)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-17 00:35:22 -08:00
yufeiwu-nv
5d71f662c3
[https://nvbugs/5698434][test] Add Qwen3-4B-Eagle3 One-model perf test (#10041)
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-17 13:37:25 +08:00
Void
47404196fa
[None][fix] Enabled simultaneous support for low-precision combine and MTP. (#9091)
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
2025-12-17 13:37:08 +08:00
Emma Qiao
0dbf3948cc
[None][infra] Waive failed tests due to llm model files (#10068)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-16 20:12:57 -08:00
Kaiyu Xie
02fd13448b
[None] [feat] Enhancements to slurm scripts (#10031)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-16 19:31:27 -08:00
JunyiXu-nv
6649c3743c
[https://nvbugs/5635153][chore] Remove responses tests from waive list (#10026)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-17 11:22:02 +08:00
shuyixiong
26fb063076
[https://nvbugs/5741060][fix] Fix pg op test (#9989)
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
2025-12-17 09:44:25 +08:00
Aurelien Chartier
7175d89b48
[None][fix] Fix iteration stats for spec-dec (#9855)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-12-16 14:11:38 -08:00
QI JUN
dba9036072 [None][doc] remove nano-vl-v2 model support in release notes (#9887)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
QI JUN
3daca4fea3 [https://nvbugs/5729847][doc] fix broken links to modelopt (#9868)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
QI JUN
e6ab864066 [None][doc] Update release notes (#9739)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>
Co-authored-by: Laikh Tewari <laikhtewari1@gmail.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Zac Patel
1ffa2c8937 [IB-1920][doc] Update Perf_Overview.md with Benchmarking Results for Release 1.1 (#9723)
Signed-off-by: Zachary Patel <22306219+zbpatel@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
xiweny
2756a0da60 [TRTLLM-4629][doc] Add B300 & GB300 in documents (#9663)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
ruodil
07f307d131 [https://nvbugs/5652552][fix] cherry-pick add printing for llm args (#9206)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Iman Tabrizian
1fc8bd3cd8 [TRTLLM-9082][doc] Address Dynamo Example feedback (#9619)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Kaiyu Xie
e41b060fe6 [TRTLLM-9090] [doc] Update online benchmarking docs (#9611)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00