Gal Hubara-Agam
20b69a982a
[ #10056 ][test] AutoDeploy: Add accuracy test for Nemotron SuperV3 ( #10131 )
...
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-12-19 13:28:42 -08:00
Chang Liu
5489d188a4
[None][fix] Revert the change and remove device count guard for DSv32 ( #9631 )
...
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-12-19 15:00:55 -05:00
Venky
dfa11d810e
[TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests ( #10005 )
2025-12-19 13:48:43 -05:00
yufeiwu-nv
52cee573ad
[TRTLLM-8830][test] Overlap scheduler enhancement perf test: Add qwen3_0,8b and llama3.1 test cases ( #10114 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-19 17:01:52 +08:00
JunyiXu-nv
356ad4fe3a
[ https://nvbugs/5722653 ][fix] Address port conflict by assigning different port section in the same node. ( #10035 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-19 15:34:04 +08:00
Wangjue Yao
9f283f330b
[None][feat] Support Mooncake transfer engine as a cache transceiver backend ( #8309 )
...
Signed-off-by: wjueyao <wyao123@terpmail.umd.edu>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
2025-12-19 10:09:51 +08:00
Anish Shanbhag
91a9ae42d2
[TRTC-71][feat] Add regression testing for config database ( #9832 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-12-18 16:15:38 -08:00
Balaram Buddharaju
799a2ae311
[ https://nvbugs/5741331 ][fix] Fix helix accuracy test ( #10021 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-18 15:27:53 -08:00
Lizhi Zhou
f02782a6f2
[ https://nvbugs/5726066 ][fix] fix auto-scaling related failures ( #9845 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Emma Qiao <qqiao@nvidia.com>
2025-12-18 16:37:48 -05:00
Yuxian Qiu
bec864a78c
[None][fix] avoid ID conversion for non enable_configurable_moe cases. ( #10003 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-18 13:29:52 +08:00
Wanli Jiang
601c29ca73
[ https://nvbugs/5721644 ][fix] Update tests for nemotron_h ( #9993 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-12-18 12:38:02 +08:00
xinhe-nv
c1cfb61b1b
[TRTLLM-9381][feat] Add kimi k2 fp4 tests ( #9906 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-17 18:15:27 -08:00
yufeiwu-nv
5d71f662c3
[ https://nvbugs/5698434 ][test] Add Qwen3-4B-Eagle3 One-model perf test ( #10041 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-17 13:37:25 +08:00
Aurelien Chartier
7175d89b48
[None][fix] Fix iteration stats for spec-dec ( #9855 )
...
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-12-16 14:11:38 -08:00
Lizhi Zhou
bd13957e70
[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic ( #9726 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-16 05:16:32 -08:00
Enwei Zhu
609d1d0383
[None][fix] Fix Illegal Memory Access for CuteDSL Grouped GEMM ( #10008 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-16 04:06:49 -08:00
Eran Geva
ce7a42f4cf
[ https://nvbugs/5731717 ][fix] fixed flashinfer build race condition during test ( #9983 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-12-15 20:30:24 -08:00
Yechan Kim
8ba8699f66
[TRTLLM-8310][feat] Add Qwen3-VL-MoE ( #9689 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-12-15 20:05:20 -08:00
Balaram Buddharaju
dfc8799352
[ https://nvbugs/5669114 ][fix] Switch to MMMU benchmark for Gemma3 27B ( #9966 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-14 21:23:59 -08:00
Fanrong Li
8f144d9282
[TRTLLM-9416][feat] Skip DS-v3.2 indexer MQA and Top-K for short sequences. ( #9524 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-15 12:42:25 +08:00
xxi
f5696df285
[TRTLLM-8961][feat] ConfigurableMoE support DeepGemm ( #9858 )
2025-12-15 10:47:15 +08:00
nvxuanyuc
a5a37227d6
[None][feat] Fused kernels (qknormrope + moe routing) and two-model MTP support for glm4moe ( #9852 )
...
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
2025-12-14 10:47:24 +08:00
Mike Iovine
383b13e0e5
[None][feat] Implement sampling on 1-model EAGLE3 ( #9885 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-13 07:38:22 -08:00
Balaram Buddharaju
6a6e41f802
[TRTLLM-9468][chore] Update disagg benchmarking scripts to support context parallelism ( #9720 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-12 22:29:41 -08:00
bhsueh_NV
e49c70f6df
[None][feat] Support Mistral Large3 LLM part ( #9820 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-12-13 11:44:27 +08:00
tburt-nv
6147452158
[ https://nvbugs/4141427 ][chore] Add more details to LICENSE file ( #9881 )
...
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-12-13 08:35:31 +08:00
ruodil
9b3e5e90ee
[None][test] fix a typo in model name in script ( #9867 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2025-12-12 17:35:55 +08:00
chenfeiz0326
61745f034a
[ https://nvbugs/5727481 ][ci] Fix Port Conflict in Perf-Sanity CI Test ( #9896 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-12-12 17:16:50 +08:00
Ivy Zhang
fded6c393d
[TRTLLM-9262][test] add groupgemm ada case for rcca ( #9833 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-12-12 13:23:33 +08:00
xinhe-nv
e8efeb765d
[TRTLLM-9717][fix] fix multi nodes tests cases ( #9736 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-12 10:14:23 +08:00
xxi
488d38f88d
[TRTLLM-8959][feat] ConfigurableMoE support CUTLASS ( #9772 )
2025-12-12 00:22:13 +08:00
fredricz-20070104
341cb1a12c
[None][chore] Add GB300 support since it does not support segment ( #9731 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-10 18:36:55 -08:00
Patrice Castonguay
2c0293c612
[ https://nvbugs/5601682 ][fix] Unwaiving disagg test ( #9627 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-10 13:42:26 -05:00
cheshirekow
2f030312a8
[TRTLLM-9228][infra] Verify thirdparty C++ process ( #9367 )
...
Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
2025-12-10 21:01:19 +08:00
dhansen-nvidia
2d33ae94d5
[ https://nvbugs/5508301 ][feat] Move D->H copies to a worker thread whe… ( #8463 )
...
Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com>
Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
2025-12-09 18:51:31 -05:00
QI JUN
252769c930
[TRTLLM-9794][ci] remove duplicated test cases in DGX B200 ( #9817 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-08 21:51:30 -08:00
Shi Xiaowei
b050804b63
[TRTLLM-6537][infra] extend multi-gpu tests related file list ( #9614 )
...
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-09 12:54:53 +08:00
JunyiXu-nv
90890785eb
[ https://nvbugs/5722653 ][fix] Fix config file used by disagg_client ( #9783 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-08 20:34:55 -08:00
Chenghao Zhang
75f5446d67
[ #9753 ][feat] AutoDeploy: Implement add rms_norm fusion ( #9754 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-12-08 14:24:27 -08:00
Jhao-Ting Chen
0a09465089
[ https://nvbugs/5567586 ][feat] Ampere xqa swa specdec for GPT-OSS Eagle3-one-model ( #8383 )
...
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-08 11:16:05 -08:00
Frank
f6df9eb2a6
[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench ( #9250 )
2025-12-08 10:37:40 -08:00
Lizhi Zhou
52f78e4000
[ http://nvbugs/5649010 ][fix] fix test_auto_scaling.py::test_worker_restart timeout ( #9775 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-08 03:26:01 -08:00
fredricz-20070104
96d9b67d65
[ https://nvbugs/5527655 ][test] Add test case for RCCA 5527655 ( #9511 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-08 01:27:13 -08:00
fredricz-20070104
ededeecb0f
[None][test] Add Kimi k2 WIDEEP perf and accuracy cases ( #9686 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-08 01:25:07 -08:00
Fanrong Li
2f526583fb
[None][chore] Move the rocketkv e2e test to post-merge ( #9768 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-08 13:22:16 +08:00
ruodil
d232709568
[ https://nvbugs/5666804 ][test] only adding sampler config for limited models ( #9512 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-12-07 19:40:29 -08:00
fredricz-20070104
9bfb6179ec
[ https://nvbugs/5422621 ][test] Add GB 200 WIDEEP test case for RCCA 5422621 ( #9506 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-08 10:41:40 +08:00
xxi
8e27ce7084
[TRTLLM-9603][feat] Enable ConfigurableMoE test in the CI ( #9645 )
2025-12-08 10:19:40 +08:00
Zheng Duan
4da0e1473c
[None][test] add ntp tolerance in time metrics verification ( #9741 )
...
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
2025-12-08 09:51:10 +08:00
chenfeiz0326
383178c00a
[TRTLLM-9000][feat] Add multi-node Perf Tests into CI ( #8800 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-12-08 09:00:44 +08:00