4869 Commits

Author SHA1 Message Date
gramnarayan
744a955cbb
[None][chore] AutoDeploy: Eagle One-Model [1/n]: PyTorch impl for Eagle3 Llama checkpoint (#10674)
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
2026-01-28 12:10:49 -08:00
Emma Qiao
0ffa77af51
[None][infra] Waive failed cases for main on 1/28 (#11053)
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-28 06:11:06 -05:00
yingguo-trt
e70a55bd94
[None][feat] support multi_acc and Lyris GB200 test (#11024)
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
2026-01-28 06:01:48 -05:00
Linda
29647d9446
[None][chore] Removing cpp/tensorrt_llm/pybind (#11026)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
2026-01-28 11:25:11 +01:00
Grzegorz Kwasniewski
38bcee189c
[TRTLLM-10362][feat] Added Mamba and MLA layers to the sharding tests (#10364)
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
Signed-off-by: Grzegorz Kwasniewski <213329731+greg-kwasniewski1@users.noreply.github.com>
2026-01-28 10:34:10 +01:00
yuanjingx87
3e17ee4e38
[None][infra] Update CI allowList (#11040)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2026-01-28 00:54:37 -08:00
Pengbo Wang
d008494232
[https://nvbugs/5779536][fix] Cherry-pick #10902: Unwaive DeepSeekR1 nvfp4 pp4 mtp test case (#10902) (#11000)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
2026-01-28 14:18:53 +08:00
xinhe-nv
dc5eda546b
[None][fix] unwaive tests (#11047)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-27 23:49:07 -05:00
TensorRT LLM
a7748ceb57
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2026-01-28 03:36:53 +00:00
dongfengy
1c2e415b3a
[https://nvbugs/5756804][fix] Re-enable passing test (#10986)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2026-01-28 11:23:43 +08:00
Yuan Tong
30348b2753
[None][fix] Proper conditional compilation of sm10x cubins (#10839)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2026-01-28 10:17:51 +08:00
Matt Lefebvre
c26a8f764c
[TRTINFRA-7379][infra] Change SLURM config access to use resolvePlatform (#11006)
Signed-off-by: Matt Lefebvre <mlefebvre@nvidia.com>
2026-01-27 12:33:16 -08:00
NVShreyas
6c1862fb33
[TRTLLM-10197][chore] Refactor to setup for RNN cache transceiver (#10957)
Signed-off-by: Shreyas Misra <shreyasm@nvidia.com>
2026-01-27 12:23:02 -08:00
Evgueni Petrov
f25a2c53bb
[#10877][fix] restore ipv6 support in serve.py (#10929)
Signed-off-by: Evgueni Petrov <evgueni.s.petrov@gmail.com>
2026-01-27 11:55:59 -08:00
Simeng Liu
bae2fac834
[https://nvbugs/5721661][chore] Unwaive fixed bug. (#11009)
Signed-off-by: SimengLiu-nv <simengl@nvidia.com>
2026-01-27 11:41:48 -08:00
Lucas Liebenwein
ff3a494f5c
[#10013][feat] AutoDeploy: native cache manager integration (#10635)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-27 11:23:22 -05:00
Gal Hubara-Agam
7f8c260601
[https://nvbugs/5843316][chore] waive overlap_scheduler test (#11025)
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
2026-01-27 09:07:52 -05:00
xinhe-nv
552aa32aa2
[None][chore] Add failed cases into waives.txt (#10993)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
2026-01-27 06:08:11 -05:00
Yukun He
b575184fca
[TRTLLM-10308][feat] AutoTuner Cache: reorganize cache file for distributed tuning (#10956)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2026-01-27 16:39:40 +08:00
Chuang Zhu
d6f76d2fae
[TRTLLM-9527][feat] change context params and disagg params (step3) (#10495)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2026-01-27 16:34:17 +08:00
ZhichenJiang
fae4985797
[TRTLLM-9831][perf] Use TMA.RED to improve effective memory bandwidth (#10987)
Signed-off-by: zhichen jiang <zhichenj@NVIDIA.com>
2026-01-27 16:15:32 +08:00
Bo Li
6b251cc7fa
[TRTLLM-9390][chore] Add Fake OPs for One-Sided AlltoAll. (#11002)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-27 15:55:07 +08:00
Lizhi Zhou
93ae8a14ab
[#10889][fix] fix pydantic deepcopy bug (#11004)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2026-01-27 02:40:13 -05:00
xinhe-nv
069ad30bdb
[None][chore] Remove closed bugs (#10982)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-27 15:35:44 +08:00
Yiqing Yan
ea5d811aec
[None][chore] Bump version to 1.3.0rc2 (#11021)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2026-01-27 15:26:03 +08:00
Emma Qiao
c761b68481
[None][infra] Waive failed cases for main on 01/27 (#11017)
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-27 15:24:54 +08:00
zhhuang-nv
ca9f70f78c
[https://nvbugs/5612438][fix] Add timeout for SeedOSS test (#8683)
Signed-off-by: Zhen Huang <145532724+zhhuang-nv@users.noreply.github.com>
2026-01-27 15:22:21 +08:00
Tailing Yuan
5553391c5e
[TRTLLM-10560][fix] Fix the time of pause() for overlap scheduler (#10943)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2026-01-27 13:18:34 +08:00
Wanli Jiang
4a206351bb
[TRTLLM-10453][feat] Update mamba decode kernel to flashinfer (#10757)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2026-01-27 13:04:40 +08:00
TensorRT LLM
da43a28b01
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2026-01-27 03:23:36 +00:00
ameynaik-hub
df8be0c50c
[TRTLLM-10276][feat] Integrate cutedsl argmax kernel (#10476)
Signed-off-by: Amey Naik <212485788+ameynaik-hub@users.noreply.github.com>
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
Co-authored-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2026-01-26 22:08:47 -05:00
sunnyqgg
ff0dd6076e
[TRTLLM-10062][feat] Enable MTP for Nemotron Super (#10754)
Signed-off-by: qgai <qgai@nvidia.com>
2026-01-26 11:23:26 -05:00
tcherckez-nvidia
43b8a5561c
[None][chore] update AD model list (#10981)
Signed-off-by: Tal Cherckez <127761168+tcherckez-nvidia@users.noreply.github.com>
2026-01-26 16:49:50 +02:00
Lucas Liebenwein
00f341be49
[#8982][feat] AutoDeploy attention dp support (#10728)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-26 09:43:33 -05:00
Linda
ce556290c9
[None][chore] Removing pybind11 bindings and references (#10550)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
2026-01-26 08:19:12 -05:00
Pengyun Lin
ce37e27066
[#10614][fix] gpt_oss first iteration streaming in trtllm-serve (#10808)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2026-01-26 20:53:11 +08:00
Pengbo Wang
5d7a5e6800
[https://nvbugs/5779536][fix] Cherry-pick #10855: Unwaive Llama 3.3 related multi GPU tests (#10942)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
2026-01-26 05:40:29 -05:00
Bo Li
e405468230
[TRTLLM-10048][feat] Fuse the AllGather for expert statistics required by the EPLB. (#10885)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-26 17:59:03 +08:00
Tian Zheng
5efee01da1
[None][feat] Add Skip Softmax MLA kernels for Blackwell and Fix an accuracy bug of NVFP4 KV (#10813)
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2026-01-26 16:46:33 +08:00
Emma Qiao
a3a3ceb17f
[None][infra] Waive failed case for main branch on 01/26 (#10994)
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-26 03:20:53 -05:00
xinhe-nv
d3406cb515
[None][chore] Add failed cases into waives.txt (#10976)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-26 02:23:05 -05:00
yingguo-trt
c8f1745a6e
[https://nvbugs/5661741][feat] Add 250K-token NVFP4 MoE + PDL regression tests (#10911)
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
2026-01-26 01:48:29 -05:00
xinhe-nv
2d8245d125
[None][chore] Add failed cases into waives.txt (#10974)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-26 00:33:50 -05:00
TensorRT LLM
d2b5954aea
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2026-01-26 03:26:18 +00:00
Enwei Zhu
ffab217974
[None][fix] Fix CuteDSL MoE unittest (#10983)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2026-01-26 08:34:17 +08:00
Yanchao Lu
45d7022cc3
[None][test] Waive failed tests on main 1/25 (#10984)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2026-01-26 00:32:02 +08:00
Enwei Zhu
72ef732bcf
[TRTLLM-10147][perf] Balanced random MoE workload generator for CuteDSL kernel UT, autotuner and layerwise benchmark (#10279)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2026-01-25 21:02:30 +08:00
Pengyun Lin
fd7fd8c39d
[https://nvbugs/5747938][infra] Unwaive trtllm serve example test (#10820)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-01-25 18:12:21 +08:00
dominicshanshan
c98c286c0f
[https://nvbugs/5814203][fix] Fix port 8000 being used issue in stress test. (#10756)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-01-25 18:12:21 +08:00
Yanchao Lu
ae58a7ed20
[None][chore] Revert NVIDIA/TensorRT-LLM#10819 (#10870)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-01-25 18:12:21 +08:00