Lucas Liebenwein
76ec820465
[ #7532 ][feat] AutoDeploy: gather logits before lm head ( #9962 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-12-17 19:50:13 -08:00
TensorRT LLM
cfe53e7425
[None][infra] Check in most recent lock file from nightly pipeline
...
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-18 03:23:35 +00:00
xinhe-nv
4a98f190a8
[None][chore] Add failed cases into waives.txt ( #10025 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-17 19:13:52 -08:00
xinhe-nv
c1cfb61b1b
[TRTLLM-9381][feat] Add kimi k2 fp4 tests ( #9906 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-17 18:15:27 -08:00
TensorRT LLM
50c2b82f24
[None][infra] Check in most recent lock file from nightly pipeline
...
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-17 23:45:35 +00:00
tburt-nv
27064f95c7
[None][chore] Clarify copyright header guidance ( #9882 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-12-18 06:38:10 +08:00
tburt-nv
5da7879b38
[None][fix] Revert GHA upgrade for blossom-ci workflow ( #10095 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-12-17 15:57:04 -05:00
Chenghao Zhang
22c6e8a424
[None][fix] Autodeploy: fix some legacy flashinfer attention test errors ( #9928 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-12-17 12:27:22 -08:00
Salman Chishti
cb5cd4376e
[None][chore] Upgrade GitHub Actions for Node 24 compatibility ( #10045 )
...
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
2025-12-17 09:44:09 -08:00
Yuan Tong
f7e245668b
[TRTLLM-9680][perf] Optimize TRTLLMSampler log_probs performance (Core fix has been merged via #9353 ) ( #9655 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-12-17 17:56:01 +08:00
Yukun He
00c0564334
[None][chore] Remove unnecessary warning log for tuning. ( #10077 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-17 01:51:17 -08:00
Yukun He
18b335d584
[TRTLLM-9989][fix] Disable tvm_ffi for CuteDSL nvFP4 dense GEMM. ( #10040 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-17 00:41:26 -08:00
Yukun He
2fd1a23e4c
[TRTLLM-9998][fix] Change trtllm-gen MoE distributed tuning strategy back to INDEPENDENT ( #10036 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-17 00:35:22 -08:00
yufeiwu-nv
5d71f662c3
[ https://nvbugs/5698434 ][test] Add Qwen3-4B-Eagle3 One-model perf test ( #10041 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-17 13:37:25 +08:00
Void
47404196fa
[None][fix] Enabled simultaneous support for low-precision combine and MTP. ( #9091 )
...
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
2025-12-17 13:37:08 +08:00
Emma Qiao
0dbf3948cc
[None][infra] Waive failed tests due to llm model files ( #10068 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-16 20:12:57 -08:00
Kaiyu Xie
02fd13448b
[None] [feat] Enhancements to slurm scripts ( #10031 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-16 19:31:27 -08:00
JunyiXu-nv
6649c3743c
[ https://nvbugs/5635153 ][chore] Remove responses tests from waive list ( #10026 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-17 11:22:02 +08:00
shuyixiong
26fb063076
[ https://nvbugs/5741060 ][fix] Fix pg op test ( #9989 )
...
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
2025-12-17 09:44:25 +08:00
Aurelien Chartier
7175d89b48
[None][fix] Fix iteration stats for spec-dec ( #9855 )
...
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-12-16 14:11:38 -08:00
QI JUN
dba9036072
[None][doc] remove nano-vl-v2 model support in release notes ( #9887 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
QI JUN
3daca4fea3
[ https://nvbugs/5729847 ][doc] fix broken links to modelopt ( #9868 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
QI JUN
e6ab864066
[None][doc] Update release notes ( #9739 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>
Co-authored-by: Laikh Tewari <laikhtewari1@gmail.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Zac Patel
1ffa2c8937
[IB-1920][doc] Update Perf_Overview.md with Benchmarking Results for Release 1.1 ( #9723 )
...
Signed-off-by: Zachary Patel <22306219+zbpatel@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
xiweny
2756a0da60
[TRTLLM-4629][doc] Add B300 & GB300 in documents ( #9663 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
ruodil
07f307d131
[ https://nvbugs/5652552 ][fix] cherry-pick add printing for llm args ( #9206 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Iman Tabrizian
1fc8bd3cd8
[TRTLLM-9082][doc] Address Dynamo Example feedback ( #9619 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Kaiyu Xie
e41b060fe6
[TRTLLM-9090] [doc] Update online benchmarking docs ( #9611 )
...
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-16 13:33:20 -05:00
Lizhi Zhou
bd13957e70
[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic ( #9726 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-16 05:16:32 -08:00
Enwei Zhu
609d1d0383
[None][fix] Fix Illegal Memory Access for CuteDSL Grouped GEMM ( #10008 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-16 04:06:49 -08:00
Enwei Zhu
6a238ca8ad
[None][doc] Update CONTRIBUTING.md ( #10023 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-16 18:58:43 +08:00
Emma Qiao
12727ebd7f
[None][infra] Waive failed test for main branch on 12/16 ( #10029 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-16 02:54:32 -08:00
Perkz Zheng
064b67e40c
[ https://nvbugs/5727952 ][fix] a pdl bug in trtllm-gen fmha kernels ( #9913 )
...
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
2025-12-16 00:34:37 -08:00
yuanjingx87
0a4c59136a
[None][infra] Fixing credential loading in lockfile generation pipeline ( #10020 )
...
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-12-16 15:38:29 +08:00
William Zhang
28b02b4f5a
[None][docs] Add README for Nemotron Nano v3 ( #10017 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Co-authored-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-15 22:17:24 -08:00
Yihan Wang
6b5ebaae3e
[None][chore] Update internal_cutlass_kernels artifacts ( #9992 )
...
Signed-off-by: Yihan Wang <yihwang@nvidia.com>
2025-12-15 21:15:25 -08:00
Wanli Jiang
8af51211c1
[FMDL-1222][feat] Support weight and weight_scale padding for NVFP4 MoE cutlass ( #9358 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-16 12:41:17 +08:00
Eran Geva
ce7a42f4cf
[ https://nvbugs/5731717 ][fix] fixed flashinfer build race condition during test ( #9983 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-12-15 20:30:24 -08:00
Yechan Kim
8ba8699f66
[TRTLLM-8310][feat] Add Qwen3-VL-MoE ( #9689 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-12-15 20:05:20 -08:00
ChristinaZ
dff77efa2a
[None][feat] Add routing support for the new model for both cutlass and trtllm moe backend ( #9792 )
...
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
2025-12-15 19:59:08 -08:00
QI JUN
4ce35eacf1
[TRTLLM-9794][ci] move more test cases to gb200 ( #9994 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-15 19:50:41 -08:00
xinhe-nv
cdf56c278f
[TRTLLM-8638][fix] Add failed cases into waives.txt New activity. ( #9979 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-15 18:59:13 -08:00
Zhanrui Sun
b757ea73ba
[TRTLLM-9641][infra] Use public triton 3.5.0 in SBSA ( #9652 )
...
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-12-15 18:58:59 -08:00
Michal Guzek
e6187d8109
[ https://nvbugs/5708810 ][fix] Fix TRTLLMSampler ( #9710 )
...
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
2025-12-15 23:26:52 +01:00
Patrice Castonguay
9ba14263db
[ https://nvbugs/5673559 ][fix] Unwaiving disagg test for nvbug 5673559 ( #9957 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-15 12:32:15 -05:00
Emma Qiao
d5d15c06df
[None][infra] Waive failed tests for main branch on 12/15 ( #10001 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-16 01:29:43 +08:00
Faraz
0c31502fbc
[None][feat] disable fused gemm for sm121 ( #9916 )
...
Signed-off-by: list <58580514+farazkh80@users.noreply.github.com>
2025-12-15 12:07:06 -05:00
Kaiyu Xie
44b0f8c3ed
[None] [fix] Revert "[None] [feat] add eos_token_id in generation_config to sampling params" ( #10002 )
2025-12-15 08:52:52 -08:00
zackyoray
63e7a2fa70
[None][infra] Update ucx to 1.20.x ( #9977 )
...
Signed-off-by: Yoray Zack <yorayz@nvidia.com>
Signed-off-by: Yoray Zack <62789610+zackyoray@users.noreply.github.com>
2025-12-16 00:31:48 +08:00
arekay-nv
4f75a31a45
[ https://nvbugs/5540979 ][fix] Potential fix for 5540979 ( #9716 )
...
Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>
2025-12-15 10:49:31 -05:00