Chuang Zhu
|
914dd39127
|
[None][fix] disable cuda ipc on device without nvlink (L40s) for disagg test (#9735)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-12-22 09:29:24 +08:00 |
|
Balaram Buddharaju
|
5266475014
|
[None][feat] Cudagraph updates for helix parallelism (#10141)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-21 15:21:52 -05:00 |
|
bhsueh_NV
|
cd4b4f43fa
|
[None][feat] Support Eagle3 on Mistral Large3 (#9971)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-12-21 10:25:45 -05:00 |
|
Bo Li
|
a66eeab537
|
[TRTLLM-9805][feat] Skip Softmax Attention. (#9821)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
Co-authored-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
|
2025-12-21 02:52:42 -05:00 |
|
Balaram Buddharaju
|
dcd3f7b5ea
|
[https://nvbugs/5744427][fix] Fix accuracy test OOM (#10173)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-21 02:03:38 -05:00 |
|
Yuxian Qiu
|
3b3069b390
|
[https://nvbugs/5747930][fix] Use offline tokenizer for whisper models. (#10121)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2025-12-20 09:42:07 +08:00 |
|
Gal Hubara-Agam
|
20b69a982a
|
[#10056][test] AutoDeploy: Add accuracy test for Nemotron SuperV3 (#10131)
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
|
2025-12-19 13:28:42 -08:00 |
|
Chang Liu
|
5489d188a4
|
[None][fix] Revert the change and remove device count guard for DSv32 (#9631)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-12-19 15:00:55 -05:00 |
|
Venky
|
dfa11d810e
|
[TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005)
|
2025-12-19 13:48:43 -05:00 |
|
yufeiwu-nv
|
52cee573ad
|
[TRTLLM-8830][test] Overlap scheduler enhancement perf test: Add qwen3_0,8b and llama3.1 test cases (#10114)
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
|
2025-12-19 17:01:52 +08:00 |
|
JunyiXu-nv
|
356ad4fe3a
|
[https://nvbugs/5722653][fix] Address port conflict by assigning different port section in the same node. (#10035)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-12-19 15:34:04 +08:00 |
|
Wangjue Yao
|
9f283f330b
|
[None][feat] Support Mooncake transfer engine as a cache transceiver backend (#8309)
Signed-off-by: wjueyao <wyao123@terpmail.umd.edu>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-12-19 10:09:51 +08:00 |
|
Anish Shanbhag
|
91a9ae42d2
|
[TRTC-71][feat] Add regression testing for config database (#9832)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2025-12-18 16:15:38 -08:00 |
|
Balaram Buddharaju
|
799a2ae311
|
[https://nvbugs/5741331][fix] Fix helix accuracy test (#10021)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-18 15:27:53 -08:00 |
|
Lizhi Zhou
|
f02782a6f2
|
[https://nvbugs/5726066][fix] fix auto-scaling related failures (#9845)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Emma Qiao <qqiao@nvidia.com>
|
2025-12-18 16:37:48 -05:00 |
|
Yuxian Qiu
|
bec864a78c
|
[None][fix] avoid ID conversion for non enable_configurable_moe cases. (#10003)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2025-12-18 13:29:52 +08:00 |
|
Wanli Jiang
|
601c29ca73
|
[https://nvbugs/5721644][fix] Update tests for nemotron_h (#9993)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-12-18 12:38:02 +08:00 |
|
xinhe-nv
|
c1cfb61b1b
|
[TRTLLM-9381][feat] Add kimi k2 fp4 tests (#9906)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-12-17 18:15:27 -08:00 |
|
yufeiwu-nv
|
5d71f662c3
|
[https://nvbugs/5698434][test] Add Qwen3-4B-Eagle3 One-model perf test (#10041)
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
|
2025-12-17 13:37:25 +08:00 |
|
Aurelien Chartier
|
7175d89b48
|
[None][fix] Fix iteration stats for spec-dec (#9855)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-12-16 14:11:38 -08:00 |
|
Lizhi Zhou
|
bd13957e70
|
[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-12-16 05:16:32 -08:00 |
|
Enwei Zhu
|
609d1d0383
|
[None][fix] Fix Illegal Memory Access for CuteDSL Grouped GEMM (#10008)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-12-16 04:06:49 -08:00 |
|
Eran Geva
|
ce7a42f4cf
|
[https://nvbugs/5731717][fix] fixed flashinfer build race condition during test (#9983)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2025-12-15 20:30:24 -08:00 |
|
Yechan Kim
|
8ba8699f66
|
[TRTLLM-8310][feat] Add Qwen3-VL-MoE (#9689)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-12-15 20:05:20 -08:00 |
|
Balaram Buddharaju
|
dfc8799352
|
[https://nvbugs/5669114][fix] Switch to MMMU benchmark for Gemma3 27B (#9966)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-14 21:23:59 -08:00 |
|
Fanrong Li
|
8f144d9282
|
[TRTLLM-9416][feat] Skip DS-v3.2 indexer MQA and Top-K for short sequences. (#9524)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-12-15 12:42:25 +08:00 |
|
xxi
|
f5696df285
|
[TRTLLM-8961][feat] ConfigurableMoE support DeepGemm (#9858)
|
2025-12-15 10:47:15 +08:00 |
|
nvxuanyuc
|
a5a37227d6
|
[None][feat] Fused kernels (qknormrope + moe routing) and two-model MTP support for glm4moe (#9852)
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
|
2025-12-14 10:47:24 +08:00 |
|
Mike Iovine
|
383b13e0e5
|
[None][feat] Implement sampling on 1-model EAGLE3 (#9885)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-12-13 07:38:22 -08:00 |
|
Balaram Buddharaju
|
6a6e41f802
|
[TRTLLM-9468][chore] Update disagg benchmarking scripts to support context parallelism (#9720)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-12 22:29:41 -08:00 |
|
bhsueh_NV
|
e49c70f6df
|
[None][feat] Support Mistral Large3 LLM part (#9820)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-12-13 11:44:27 +08:00 |
|
tburt-nv
|
6147452158
|
[https://nvbugs/4141427][chore] Add more details to LICENSE file (#9881)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
|
2025-12-13 08:35:31 +08:00 |
|
ruodil
|
9b3e5e90ee
|
[None][test] fix a typo in model name in script (#9867)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2025-12-12 17:35:55 +08:00 |
|
chenfeiz0326
|
61745f034a
|
[https://nvbugs/5727481][ci] Fix Port Conflict in Perf-Sanity CI Test (#9896)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2025-12-12 17:16:50 +08:00 |
|
Ivy Zhang
|
fded6c393d
|
[TRTLLM-9262][test] add groupgemm ada case for rcca (#9833)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-12-12 13:23:33 +08:00 |
|
xinhe-nv
|
e8efeb765d
|
[TRTLLM-9717][fix] fix multi nodes tests cases (#9736)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-12-12 10:14:23 +08:00 |
|
xxi
|
488d38f88d
|
[TRTLLM-8959][feat] ConfigurableMoE support CUTLASS (#9772)
|
2025-12-12 00:22:13 +08:00 |
|
fredricz-20070104
|
341cb1a12c
|
[None][chore] Add GB300 support since it does not support segment (#9731)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
|
2025-12-10 18:36:55 -08:00 |
|
Patrice Castonguay
|
2c0293c612
|
[https://nvbugs/5601682][fix] Unwaiving disagg test (#9627)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-12-10 13:42:26 -05:00 |
|
cheshirekow
|
2f030312a8
|
[TRTLLM-9228][infra] Verify thirdparty C++ process (#9367)
Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
|
2025-12-10 21:01:19 +08:00 |
|
dhansen-nvidia
|
2d33ae94d5
|
[https://nvbugs/5508301][feat] Move D->H copies to a worker thread whe… (#8463)
Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com>
Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
|
2025-12-09 18:51:31 -05:00 |
|
QI JUN
|
252769c930
|
[TRTLLM-9794][ci] remove duplicated test cases in DGX B200 (#9817)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-12-08 21:51:30 -08:00 |
|
Shi Xiaowei
|
b050804b63
|
[TRTLLM-6537][infra] extend multi-gpu tests related file list (#9614)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-12-09 12:54:53 +08:00 |
|
JunyiXu-nv
|
90890785eb
|
[https://nvbugs/5722653][fix] Fix config file used by disagg_client (#9783)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-12-08 20:34:55 -08:00 |
|
Chenghao Zhang
|
75f5446d67
|
[#9753][feat] AutoDeploy: Implement add rms_norm fusion (#9754)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
|
2025-12-08 14:24:27 -08:00 |
|
Jhao-Ting Chen
|
0a09465089
|
[https://nvbugs/5567586][feat] Ampere xqa swa specdec for GPT-OSS Eagle3-one-model (#8383)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
|
2025-12-08 11:16:05 -08:00 |
|
Frank
|
f6df9eb2a6
|
[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench (#9250)
|
2025-12-08 10:37:40 -08:00 |
|
Lizhi Zhou
|
52f78e4000
|
[http://nvbugs/5649010][fix] fix test_auto_scaling.py::test_worker_restart timeout (#9775)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-12-08 03:26:01 -08:00 |
|
fredricz-20070104
|
96d9b67d65
|
[https://nvbugs/5527655][test] Add test case for RCCA 5527655 (#9511)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
|
2025-12-08 01:27:13 -08:00 |
|
fredricz-20070104
|
ededeecb0f
|
[None][test] Add Kimi k2 WIDEEP perf and accuracy cases (#9686)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-12-08 01:25:07 -08:00 |
|