TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-05 02:31:33 +08:00

Author	SHA1	Message	Date
Kaiyu Xie	4f86c5f5ce	[None] [feat] Support multiple accuracy tasks for slurm scripts (#10500 ) Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com> Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com> Co-authored-by: Zhenhuan Chen <zhenhuanc@nvidia.com>	2026-01-16 15:50:32 +08:00
ruodil	22240e43eb	[None][test] store per user output and per gpu output metric in csv file (#10658 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2026-01-15 00:51:08 -05:00
Anish Shanbhag	faa80e73fd	[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias (#10099 ) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2026-01-14 21:06:07 -08:00
JennyLiu	2967d299fb	[TRTLLM-10271][test] Add Spark QA functional and performance cases (#10564 ) Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com> Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>	2026-01-13 13:20:15 +08:00
fredricz-20070104	bbe535fddf	[None][chore] Fix disagg assert (#10596 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2026-01-12 21:39:57 -05:00
Anish Shanbhag	dacc881993	[https://nvbugs/5761391 ][fix] Use correct model names for config database regression tests (#10192 ) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2026-01-12 10:55:07 -08:00
yingguo-trt	c5914f9085	[None][chore] update deepseekv3.2 test parameter (#10595 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-12 01:43:22 -05:00
chenfeiz0326	54459377d2	[TRTLLM-10248][feat] Support Bot to Send Perf Regression Msg to Slack Channel (#10489 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2026-01-12 14:23:23 +08:00
Eran Geva	c5d5af9e7f	[#8391 ][chore] removed llama and added deepseek to AutoDeploy's L0 perf test (#10585 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2026-01-11 16:31:24 -05:00
fredricz-20070104	f6045fac09	[None][chore] Fix Gitlab CI termination issues (#10576 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2026-01-10 07:51:18 -05:00
yingguo-trt	d80f01d205	[None][feat] Add support for DeepSeek v3.2 tests (#10561 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-09 10:20:29 -05:00
ruodil	2b72d33fdc	[TRTLLM-9932][test] add kimi_k2 single node perf test (#10436 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2026-01-09 05:36:50 -05:00
ruodil	d707286ca8	[None][test] restrict max_num_tokens in disagg mtp config (#10442 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2026-01-08 21:53:24 -05:00
Barry Kang	f57aab5255	[https://nvbugs/5775402 ][fix] Fix concurrency list in Wide-EP perf tests (#10529 ) Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>	2026-01-08 01:58:55 -05:00
yingguo-trt	f8b2a8fd30	[None][chore] Support multiple job submission at the same time (#10492 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Co-authored-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2026-01-07 21:51:36 -05:00
yingguo-trt	cbf8357e5f	[https://nvbugs/5726086 ][fix] update kimi-k2-1k1k dataset (#10473 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-07 01:24:08 -05:00
chenfeiz0326	8a04c05079	[None][fix] Only Use Throughput Metrics to Check Regression (#10404 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2026-01-06 09:21:15 +08:00
chenfeiz0326	a65b0d4efa	[None][fix] Decrease Pre Merge Perf Tests (#10390 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2026-01-04 12:21:34 -05:00
chenfeiz0326	5e0e48144f	[None][fix] Minor updates on Perf Test System (#10375 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2026-01-02 17:17:42 +08:00
fredricz-20070104	f631b25c85	[None][test] Unified slurm extra args management and session collection logic (#10332 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com> Co-authored-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-01 21:10:51 -05:00
chenfeiz0326	a23c6f1092	[TRTLLM-9834][feat] Transfer to TRTLLM-INFRA Database and Fail post-merge tests if regression (#10282 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-31 21:44:59 +08:00
ruodil	0f4ed90560	[TRTLLM-9965][test] add long-context disagg test for GB300/GB200 and remove config_index in yaml (#10225 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2025-12-30 02:39:50 -05:00
chenfeiz0326	d70aeddc7f	[TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI (#9138 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-26 22:50:53 +08:00
chenfeiz0326	48c875f8ea	[None][fix] Add OpenSearch URL in slurm_launch.sh for Multinode Perf Sanity Test (#9990 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-23 16:02:38 +08:00
fredricz-20070104	621156ad44	[None][chore] Fix GB300 support issues (#10196 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Signed-off-by: fredricz-20070104 <226039983+fredricz-20070104@users.noreply.github.com>	2025-12-23 10:42:41 +08:00
Venky	dfa11d810e	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005 )	2025-12-19 13:48:43 -05:00
yufeiwu-nv	52cee573ad	[TRTLLM-8830][test] Overlap scheduler enhancement perf test: Add qwen3_0,8b and llama3.1 test cases (#10114 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2025-12-19 17:01:52 +08:00
Anish Shanbhag	91a9ae42d2	[TRTC-71][feat] Add regression testing for config database (#9832 ) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2025-12-18 16:15:38 -08:00
yufeiwu-nv	5d71f662c3	[https://nvbugs/5698434 ][test] Add Qwen3-4B-Eagle3 One-model perf test (#10041 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2025-12-17 13:37:25 +08:00
Lizhi Zhou	bd13957e70	[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2025-12-16 05:16:32 -08:00
ruodil	9b3e5e90ee	[None][test] fix a typo in model name in script (#9867 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2025-12-12 17:35:55 +08:00
chenfeiz0326	61745f034a	[https://nvbugs/5727481 ][ci] Fix Port Conflict in Perf-Sanity CI Test (#9896 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-12 17:16:50 +08:00
fredricz-20070104	341cb1a12c	[None][chore] Add GB300 support since it does not support segment (#9731 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2025-12-10 18:36:55 -08:00
Frank	f6df9eb2a6	[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench (#9250 )	2025-12-08 10:37:40 -08:00
fredricz-20070104	96d9b67d65	[https://nvbugs/5527655 ][test] Add test case for RCCA 5527655 (#9511 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2025-12-08 01:27:13 -08:00
fredricz-20070104	ededeecb0f	[None][test] Add Kimi k2 WIDEEP perf and accuracy cases (#9686 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com> Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2025-12-08 01:25:07 -08:00
ruodil	d232709568	[https://nvbugs/5666804 ][test] only adding sampler config for limited models (#9512 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>	2025-12-07 19:40:29 -08:00
fredricz-20070104	9bfb6179ec	[https://nvbugs/5422621 ][test] Add GB 200 WIDEEP test case for RCCA 5422621 (#9506 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2025-12-08 10:41:40 +08:00
chenfeiz0326	383178c00a	[TRTLLM-9000][feat] Add multi-node Perf Tests into CI (#8800 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-08 09:00:44 +08:00
ruodil	8a392af28f	[None][test] rename wide ep and disagg metric name in perf test (#9704 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2025-12-04 18:16:06 +08:00
fredricz-20070104	80ff9015ce	[https://nvbugs/5561153 ][test] Fix log error for perf test (#9622 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2025-12-03 15:27:13 +08:00
ruodil	4586b5f42f	[https://nvbugs/5582091 ][test] increase warmup times in testing for multi-gpu cases (#9578 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2025-12-02 14:22:49 +08:00
yufeiwu-nv	08755a809d	[https://nvbugs/5689658 ][test] Fix gpu lock issue running on cluster (#9441 ) Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>	2025-11-28 13:59:22 +08:00
fredricz-20070104	6a64cb4c71	[TRTLLM-8936][test] Add disagg and wideep multi-node multi-gpu test cases (#9356 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2025-11-26 10:34:49 +08:00
Eran Geva	6af01dc664	[#8391 ][chore] test_perf.py to lock clocks read from gpu_configs.yml instead of max freq (#9409 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2025-11-25 09:20:33 +02:00
ruodil	c86e36fe38	[None][test] add deepseek and qwen cases for rtx series (#8839 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2025-11-12 22:28:02 -08:00
yufeiwu-nv	b7a2574c60	[https://nvbugs/5568991 ][test] Remove Phi-3 models (#9066 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2025-11-12 03:16:36 -08:00
Yukun He	6c8ba3be27	[None][chore] Remove duplicate log outputs in test_perf.py (#8418 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-11-04 16:42:31 +08:00
chenfeiz0326	cc4ab8d9d1	[TRTLLM-8825][feat] Support Pytest Perf Results uploading to Database (#8653 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-11-03 16:23:13 +08:00
yufeiwu-nv	b4d17d1a4c	[TRTLLM-8991][test] Add Llama 3.3 70B model with different performance config (#8753 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>	2025-11-03 13:34:06 +08:00

133 Commits