Stefan Niebler
|
f155812eb0
|
[TRTLLM-6756][feat] Add Beam Search to TorchSampler (#8509)
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
|
2025-12-01 18:48:04 +01:00 |
|
Yanchao Lu
|
7127c4407a
|
[None][test] Waive main branch test failures 12/1 (#9566)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-12-01 21:54:53 +08:00 |
|
Shi Xiaowei
|
48b1d31895
|
[https://nvbugs/5651854][infra] Enable perf metrics during accuracy testing (#9140)
|
2025-12-01 20:15:32 +08:00 |
|
alel
|
4107254c82
|
[TRTLLM-6222][feat] Several perf opt for cuteDSL nvf4 gemm (#9428)
Signed-off-by: Yuhan Li <51736452+liyuhannnnn@users.noreply.github.com>
|
2025-12-01 18:10:45 +08:00 |
|
JadoTu
|
a92af27411
|
[None][chore] remove qwen3-next accuracy tests (#9534)
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
|
2025-12-01 11:49:37 +08:00 |
|
Pengbo Wang
|
aa3310f64f
|
[https://nvbugs/5503479][fix] Temporarily lower reference accuracy to stabilize CI (#9398)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
|
2025-12-01 11:49:14 +08:00 |
|
Enwei Zhu
|
2e3ac3c48f
|
[https://nvbugs/5684703][fix] Unwaive disagg guided decoding test (#9466)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-12-01 11:39:40 +08:00 |
|
Li Min
|
1797e91dfd
|
[TRTLLM-6222][feat] Extend cute_dsl_nvfp4_gemm to sm103. (#9543)
Signed-off-by: Mindy Li <11663212+limin2021@users.noreply.github.com>
|
2025-12-01 10:19:36 +08:00 |
|
heyuhhh
|
6e470aab72
|
[None][feat] Optimize the algorithm part of RocketKV (#9333)
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
|
2025-12-01 09:04:09 +08:00 |
|
xxi
|
c12e67bb66
|
[TRTLLM-8958][feat] and [TRTLLM-8960]: create ConfigurableMoE and support TRTLLMGenFusedMoE as backend (#9486)
|
2025-12-01 08:37:07 +08:00 |
|
JunyiXu-nv
|
3f588198dc
|
[None][fix] Fix port conflict in disagg tests (#9474)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-11-30 17:33:22 +08:00 |
|
Emma Qiao
|
c927ccf510
|
[None][infra] Waive failed tests for main branch on 11/30 (#9555)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-11-30 16:13:20 +08:00 |
|
brb-nv
|
b77f4ffe54
|
[TRTLLM-5971][feat] Integrate helix parallelism (#9342)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-11-29 15:17:30 -08:00 |
|
dominicshanshan
|
6345074686
|
[None][chore] Weekly mass integration of release/1.1 -- rebase (#9522)
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: qgai <qgai@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: sunnyqgg <159101675+sunnyqgg@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Vincent Zhang <vcheungyi@163.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Leslie Fang <leslief@nvidia.com>
Co-authored-by: Shunkangz <182541032+Shunkangz@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
|
2025-11-29 21:48:48 +08:00 |
|
Grzegorz Kwasniewski
|
cff54fcae3
|
[#8948][feat] Support custom sharding config (#9143)
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
|
2025-11-29 05:28:05 +08:00 |
|
mpikulski
|
bc355eadf5
|
[TRTLLM-9488][fix] llmapi references (#9547)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-11-28 08:54:05 -08:00 |
|
dominicshanshan
|
70efa3ac43
|
[None][infra] Waive failed case in pre-merge on 11/28 (#9537)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-11-28 20:53:45 +08:00 |
|
mpikulski
|
e5f39ec7cf
|
[TRTLLM-9488][feat] add 'disable_flashinfer_sampling' config option (#9454)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-11-28 13:00:39 +01:00 |
|
Emma Qiao
|
2d7421b314
|
[None][infra] Waive failed cases for main branch on 11/28 (#9539)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-11-28 17:19:55 +08:00 |
|
Liao Lanyu
|
bf84d9cea1
|
[None][chore] add spec_decoding configs in perf benchmark scripts and fix typos (#9533)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
|
2025-11-28 14:52:05 +08:00 |
|
yufeiwu-nv
|
08755a809d
|
[https://nvbugs/5689658][test] Fix gpu lock issue running on cluster (#9441)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
|
2025-11-28 13:59:22 +08:00 |
|
Yukun He
|
60c43a200a
|
[None][fix] Fix on-disk cache and revise logger/statistics for AutoTuner. (#9211)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
|
2025-11-28 13:32:21 +08:00 |
|
JunyiXu-nv
|
c87e81c1d8
|
[https://nvbugs/5685015][fix] Update invalid max_token test (#9435)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-11-28 11:41:16 +08:00 |
|
Bo Li
|
19f3f4e520
|
[https://nvbugs/5637037][chore] Update waive lists. (#9386)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-11-28 10:45:22 +08:00 |
|
Yueh-Ting (eop) Chen
|
4cbfc10b28
|
[https://nvbugs/5674665][chore] Add test coverage for https://nvbugspro.nvidia.com/bug/5674665 (#9518)
Signed-off-by: eopXD <yuehtingc@nvidia.com>
|
2025-11-27 21:40:34 +08:00 |
|
Bo Li
|
62b771877c
|
[TRTLLM-9389][chore] Refactor AlltoallMethodType. (#9388)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
|
2025-11-27 21:09:29 +08:00 |
|
Fanrong Li
|
2d5eadf65f
|
[None][fix] fix TP support for DeepSeek-V3.2 on hopper (#9484)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-11-27 21:02:25 +08:00 |
|
JadoTu
|
51bf7164d3
|
[None][feat] add qwen3-next CI test of accuracy on BF16 and NVFP4 (#9330)
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
|
2025-11-27 18:05:00 +08:00 |
|
Lizhi Zhou
|
8104a78931
|
[None][chore] revert batch_size=1 to prevent timeout and lower accuracy reference by 0.12% as a WAR (#9447)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-11-27 14:25:44 +08:00 |
|
Emma Qiao
|
0442510304
|
[None][infra] Waive failed case in pre-merge on 11/27 (#9507)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-11-27 13:53:33 +08:00 |
|
HuiGao-NV
|
03331bc43d
|
[https://nvbugs/5547414][fix] enable case after using local cache model (#9473)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-11-27 12:18:20 +08:00 |
|
Patrice Castonguay
|
1b2da426cd
|
[https://nvbugs/5680310][fix] Fix ctx only timed out test (#9410)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-11-27 11:21:21 +08:00 |
|
Shi Xiaowei
|
e76e149861
|
[https://nvbugs/5608930][fix] Fix a typo (#9487)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-11-27 09:05:17 +08:00 |
|
Zheyu Fu
|
dbbed1f85a
|
[None][ci] Waive blackwell test on spec gate. (#9502)
Signed-off-by: Zheyu Fu <zheyuf@NVIDIA.com>
|
2025-11-27 07:19:58 +08:00 |
|
Chenghao Zhang
|
18fbda5cdb
|
[None][feat] AutoDeploy: Add A_log fusion for Mamba layers (#9422)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
|
2025-11-26 14:39:20 -08:00 |
|
Chenghao Zhang
|
bc7b60e016
|
[None][feat] AutoDeploy: Remove redundant copies in mamba layers (#9461)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2025-11-26 14:38:33 -08:00 |
|
Chang Liu
|
b10137fdd5
|
[None][feat] Support MLA chunked prefill for DeepSeek V3.2 model (#9376)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-11-26 16:38:25 +08:00 |
|
JunyiXu-nv
|
b7308a4000
|
[https://nvbugs/5580099][fix] Cherry pick IMA issue fix from release/1.1 (#9032)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-11-26 13:09:06 +08:00 |
|
Wanli Jiang
|
d100599ea7
|
[TRTLLM-9264][fix] Add accuracy/unit tests/doc for phi4mm (#9246)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-11-26 11:12:35 +08:00 |
|
shuyixiong
|
d8acea1db3
|
[TRTLLM-9293][feat] Enable partial weight loading to support streaming update weights (#9224)
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
|
2025-11-26 10:59:06 +08:00 |
|
QI JUN
|
5972119e1c
|
[None][ci] move some slow test cases of DGX-B200 to post merge (#9467)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-11-26 10:48:53 +08:00 |
|
fredricz-20070104
|
6a64cb4c71
|
[TRTLLM-8936][test] Add disagg and wideep multi-node multi-gpu test cases (#9356)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
|
2025-11-26 10:34:49 +08:00 |
|
Chuang Zhu
|
0e9c7f8c07
|
[https://nvbugs/5685143][fix] avoid cudaFree overlap with cuda graph (#9438)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-11-25 16:20:29 -08:00 |
|
Suyog Gupta
|
e484bec82f
|
[None][chore] AutoDeploy add multi stream moe pass to default.yaml (#9430)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2025-11-25 14:16:13 -08:00 |
|
Robin Kobus
|
32f53910ef
|
[TRTLLM-909][feat] Overlap context chunks in pipeline parallel mode (#9308)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-11-25 22:11:51 +01:00 |
|
Eran Geva
|
afc52d7b93
|
[https://nvbugs/5647400][fix] Enlarged the AllReduce workspace size to 64MB. Added AllReduce strategy to AD config. (#9145)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2025-11-25 10:56:07 -08:00 |
|
mpikulski
|
899fda9e47
|
[TRTLLM-9490][feat] use FlashInfer's top_k_sampling_from_probs (#9457)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-11-25 18:53:53 +01:00 |
|
mpikulski
|
c5f52ab304
|
[TRTLLM-8376][feat] top-p optimization (removes redundant softmax) (#9411)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-11-25 18:46:48 +01:00 |
|
Fanrong Li
|
8da59103d6
|
[https://nvbugs/5680905][fix] Relax the MMLU accuracy requirement for DS-v3.2 (#9439)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-11-26 00:32:20 +08:00 |
|
Yan Chunwei
|
1f43dc8174
|
[None][ci] waive a test (#9458)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-11-25 07:04:20 -08:00 |
|