TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-13 22:18:36 +08:00

Author	SHA1	Message	Date
benzh-2025	6df2c8a074	[None][feat] add fp4 gemm + allreduce (#9729 ) Signed-off-by: benzh Signed-off-by: benzh-2025	2026-01-13 21:11:13 +08:00
Guoming Zhang	c1b0b7350f	[None][test] Unwaive qwen3 next test case. (#9877 ) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2026-01-13 20:42:31 +08:00
Tailing Yuan	38296a472b	[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562 ) Signed-off-by: Tailing Yuan <yuantailing@gmail.com>	2026-01-13 19:17:03 +08:00
Erin	55580f8ec1	[NVBUG-5670458][chore] Unwaive lp tests (#10524 ) Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com> Signed-off-by: Erin <14718778+hchings@users.noreply.github.com>	2026-01-13 04:31:27 -05:00
Guoming Zhang	bdaee87895	[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347 ) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2026-01-13 17:13:55 +08:00
JunyiXu-nv	e291a834db	[TRTLLM-8462][feat] Support GET/DELETE v1/responses/{response_id} (#9937 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2026-01-13 03:57:14 -05:00
JennyLiu	2967d299fb	[TRTLLM-10271][test] Add Spark QA functional and performance cases (#10564 ) Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com> Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>	2026-01-13 13:20:15 +08:00
fredricz-20070104	bbe535fddf	[None][chore] Fix disagg assert (#10596 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2026-01-12 21:39:57 -05:00
Iman Tabrizian	48b09e5a25	[https://nvbugs/5689235 ][fix] Fix cancellation+chunked prefill+disagg (#10111 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2026-01-12 18:23:26 -05:00
Anish Shanbhag	dacc881993	[https://nvbugs/5761391 ][fix] Use correct model names for config database regression tests (#10192 ) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2026-01-12 10:55:07 -08:00
Suyog Gupta	a1385243e1	[#10580 ][fix] re-enable NemotronH MOE MMLU test (#10594 ) Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>	2026-01-12 09:26:07 -08:00
Emma Qiao	9f044b9dd9	[None][infra] Waive failed tests for main 01/12 (#10604 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-01-12 10:24:54 -05:00
mpikulski	bf7998f1b8	[TRTLLM-9522][test] cover LLM API `multi_modal_embeddings` (#9963 ) Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>	2026-01-12 11:38:22 +01:00
Wanli Jiang	11da7e3605	[None][fix] Solve pillow version conflict (#10537 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2026-01-12 04:05:54 -05:00
Zhenhuan Chen	3bd319dc8e	[https://nvbugs/5794796 ][chore] waive test blocking premerge (#10593 ) Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>	2026-01-12 15:39:07 +08:00
yufeiwu-nv	8e806abac3	[None][test] Remove most TRT-backend test cases in llm_perf_nim.yml (#10572 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2026-01-12 15:34:55 +08:00
yingguo-trt	c5914f9085	[None][chore] update deepseekv3.2 test parameter (#10595 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-12 01:43:22 -05:00
chenfeiz0326	54459377d2	[TRTLLM-10248][feat] Support Bot to Send Perf Regression Msg to Slack Channel (#10489 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2026-01-12 14:23:23 +08:00
Jie Li	5e0dbba0c9	[None][chore]: update waive list (#10577 ) Signed-off-by: Jie Li <lijie@nvidia.com>	2026-01-11 22:18:04 -05:00
Eran Geva	c5d5af9e7f	[#8391 ][chore] removed llama and added deepseek to AutoDeploy's L0 perf test (#10585 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2026-01-11 16:31:24 -05:00
Ivy Zhang	7f018c89e9	[None][test] update core test list (#10538 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2026-01-11 14:08:20 -05:00
Yechan Kim	8e0d20d901	[TRTLLM-10195][feat] K-EXAONE support (#10355 ) Signed-off-by: Jaedeok Kim <jaedeokk@nvidia.com> Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Co-authored-by: Jaedeok Kim <jaedeokk@nvidia.com>	2026-01-12 00:29:51 +09:00
HuiGao-NV	3c65ec3c55	[None][chore] waive test case (#10581 ) Signed-off-by: Hui Gao <huig@nvidia.com>	2026-01-10 18:53:36 -05:00
fredricz-20070104	f6045fac09	[None][chore] Fix Gitlab CI termination issues (#10576 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2026-01-10 07:51:18 -05:00
William Zhang	ff7eb93f31	[https://nvbugs/5669097 ][tests] Add MMMU test for mistral small (#10530 ) Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2026-01-09 16:09:28 -08:00
Chenghao Zhang	38f249b479	[https://nvbugs/5548861 ][fix] AutoDeploy: Fix the test (#10521 ) Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>	2026-01-09 13:30:24 -08:00
yingguo-trt	d80f01d205	[None][feat] Add support for DeepSeek v3.2 tests (#10561 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-09 10:20:29 -05:00
Yechan Kim	7295af68ba	[None][fix] Enable AttentionDP on Qwen3-VL and fix test (#10435 ) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>	2026-01-10 00:13:26 +09:00
Iman Tabrizian	ced88424ef	[https://nvbugs/5756008 ][fix] unwaive test (#10523 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2026-01-09 09:40:07 -05:00
Jie Li	627d306df9	[None][chore] remove some model support; add device constraint (#10563 ) Signed-off-by: Jie Li <lijie@nvidia.com>	2026-01-09 09:36:23 -05:00
ruodil	2b72d33fdc	[TRTLLM-9932][test] add kimi_k2 single node perf test (#10436 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2026-01-09 05:36:50 -05:00
bhsueh_NV	4a09acd012	[https://nvbugs/5785206 ][infra] unwaive the accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B (#10560 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2026-01-09 03:13:29 -05:00
JadoTu	4c498bfe58	[TRTLLM-9676][fix] Fix mamba_cache_manager when enabling cuda_graph_padding and let test cover this case (#9873 ) Signed-off-by: JadoTu <107457950+JadoTu@users.noreply.github.com>	2026-01-09 14:50:16 +08:00
Jie Li	6fcd4e7099	[None][chore] Add failed cases into waives.txt (#10541 ) Signed-off-by: Jie Li <lijie@nvidia.com>	2026-01-09 01:03:47 -05:00
ruodil	d707286ca8	[None][test] restrict max_num_tokens in disagg mtp config (#10442 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2026-01-08 21:53:24 -05:00
Balaram Buddharaju	56e779d09f	[None][chore] Waive tests blocking premerge 01/08 (#10555 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2026-01-08 20:22:28 -05:00
Mike Iovine	4092a87b6f	[https://nvbugs/5740075 ][fix] Fix sm120 speculation (#10049 ) Signed-off-by: Mike Iovine <miovine@nvidia.com>	2026-01-08 19:55:43 -05:00
William Zhang	c0ae6bbdbe	[None][feat] EPD for Qwen3 VL (#10470 ) * Why? We would like to support EPD disaggregated serving for Qwen3 VL. * What? This commit adds such support, and extends existing unit tests for correctness checks. Some minor (protected) interface changes had to be made to the weight mapper as a side-effect. Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2026-01-08 06:45:54 -05:00
bhsueh_NV	bea61bb17d	[None][fix] Mistral large 3 few code refine (#10405 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2026-01-08 06:38:49 -05:00
Emma Qiao	43839c7d9b	[TRTLLM-9642][infra] Increase pytest verbosity for failed tests (#9657 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Emma Qiao <qqiao@nvidia.com>	2026-01-08 02:33:48 -05:00
HuiGao-NV	22c81cb5fa	[None][chore] Enable seg fault cases since one race condition is fixed (#10398 ) Signed-off-by: Hui Gao <huig@nvidia.com>	2026-01-08 02:15:30 -05:00
Barry Kang	f57aab5255	[https://nvbugs/5775402 ][fix] Fix concurrency list in Wide-EP perf tests (#10529 ) Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>	2026-01-08 01:58:55 -05:00
Lucas Liebenwein	30f8455d29	[https://nvbugs/5747878 ][fix] unwaive llama4 scout tests (#10468 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-01-07 23:33:45 -05:00
yingguo-trt	f8b2a8fd30	[None][chore] Support multiple job submission at the same time (#10492 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Co-authored-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2026-01-07 21:51:36 -05:00
Yuxian Qiu	b85c447ceb	[https://nvbugs/5784543 ][fix] Setup dist before using autotuner. (#10491 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2026-01-08 10:32:50 +08:00
xxi	81f878c279	[https://nvbugs/5707392 ][fix] unwaive test_fused_moe_fp8_blockwise_wide_ep[NotEnabled] (#10428 ) Signed-off-by: xxi <xxi@nvidia.com>	2026-01-08 09:17:59 +08:00
Lucas Liebenwein	d736c7f290	[https://nvbugs/5761665 ][fix] AutoDeploy: handle bugs for 25.12 dlfw upgrade (#10511 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-01-07 20:16:53 -05:00
yufeiwu-nv	b130d58c88	[None][test] Remove most TRT-backend test cases in llm_perf_nim.yml (#10487 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2026-01-07 17:18:43 +08:00
xinhe-nv	872210468b	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10474 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-07 03:23:43 -05:00
yingguo-trt	cbf8357e5f	[https://nvbugs/5726086 ][fix] update kimi-k2-1k1k dataset (#10473 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-07 01:24:08 -05:00

2543 Commits