TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Author	SHA1	Message	Date
TensorRT LLM	3fec7e411c	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2026-01-07 03:10:22 +00:00
xinhe-nv	1fbadd2dde	[None][chore] Add failed cases into waives.txt (#10365 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Jie Li <lijie@nvidia.com> Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Jie Li <lijie@nvidia.com> Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>	2026-01-06 22:08:06 -05:00
Ivy Zhang	4a1b2e23b3	[https://nvbugs/5698434 ][test] add qwen3-4b accuracy test case (#10382 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2026-01-06 21:56:34 -05:00
Lucas Liebenwein	6095c80e56	[https://nvbugs/5721907 ][fix] AutoDeploy: improve numerical stability of flashinfer attention test (#10467 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-01-06 21:11:06 -05:00
Zongfei Jing	bb2f883296	[None] [feat] Add test script and raster M for gather fc1 kernel (#10429 ) Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>	2026-01-07 09:31:49 +08:00
Lucas Liebenwein	bb6a3973aa	[https://nvbugs/5732942 ][fix] AutoDeploy: handle transformers 4.57.1 upgrade fixes (#10466 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-01-06 19:55:49 -05:00
Lucas Liebenwein	00355b24b7	[None][feat] precompiled installation from local src dir with fnmatch only (#10430 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-01-06 15:31:59 -05:00
Mike Iovine	77be1b7572	[https://nvbugs/5749988 ][fix] Remove redundant qwen3 spec dec test (#10387 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-01-06 11:46:34 -05:00
Enwei Zhu	037753f65b	[https://nvbugs/5748600 ][ci] Unwaive disagg guided decoding test (#10409 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2026-01-06 11:38:12 -05:00
Lizhi Zhou	6a4bebcd01	[None][chore] remove redundant retries while binding to arbitrary port (#10452 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2026-01-06 10:39:15 -05:00
JunyiXu-nv	7d62773c6c	[https://nvbugs/5760726 ][fix] Use random port in container port section (#10432 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2026-01-06 23:25:46 +08:00
xinhe-nv	704f58dfbe	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10427 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-06 04:47:54 -05:00
Emma Qiao	6507087c3f	[None][infra] Waive failed cases on 1/6 (#10440 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-01-06 16:54:54 +08:00
Bo Li	df0b976b99	[https://nvbugs/5785206 ][infra] Waive TestQwen3_30B_A3B::test_fp8[latency-torch_compile=False]. (#10441 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2026-01-06 03:32:19 -05:00
William Zhang	ab58d7cac1	[https://nvbugs/5772361 ][ci] Unwaive tests that have been fixed (#10424 ) These tests were all failing due to the same issue, and were fixed in #10394. Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2026-01-05 23:49:54 -08:00
Kaiyu Xie	2eaabd7461	[None] [fix] Fix undefined tokens_per_block (#10438 ) Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2026-01-06 02:42:37 -05:00
Ivy Zhang	1e828587e5	[TRTLLM-9896][test] add vswa test cases coverage (#10146 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2026-01-06 02:02:29 -05:00
Yiqing Yan	5108a69fc0	[TRTLLM-9622][infra] Enable DGX_B300 multi-gpu testing in pre-merge pipeline (#9699 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2026-01-06 14:39:55 +08:00
xinhe-nv	998527724c	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10367 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-06 01:09:21 -05:00
Kaiyu Xie	810249c304	[https://nvbugs/5769926 ] [fix] Add no container mount home WAR (#10431 ) Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2026-01-06 13:09:25 +08:00
Ivy Zhang	22a1d31a27	[None][test] update test case constraint (#10381 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2026-01-06 12:28:59 +08:00
xinhe-nv	1b1058279c	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10384 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-05 23:02:27 -05:00
kris1025	3e98265682	[None][chore] unwaive qwen3 30b test (#10115 ) Signed-off-by: linquanh <linquanh@nvidia.com>	2026-01-06 11:17:08 +08:00
TensorRT LLM	596d4f16fb	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2026-01-06 03:16:01 +00:00
Karthik	617f728903	[#8460 ][feat] Revive and simplify Model Explorer visualization integration (#10150 ) Signed-off-by: Karthik Vetrivel <kvetrivel@nvidia.com>	2026-01-05 22:15:25 -05:00
Venky	aa1fe931de	[None][docs] Add `--config` preference over `--extra_llm_api_options` in CODING_GUIDELINES.md (#10426 ) Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>	2026-01-05 22:05:47 -05:00
Xiao Xuan	46f035befe	[#2511 ][fix] eagle: qwen2 capture hidden states (#10091 ) Signed-off-by: SpicyNoodle <522169030@qq.com>	2026-01-05 21:46:41 -05:00
Min Yu	9cae7277ea	[https://nvbugs/5726962 ][feat] Apply fusion for W4AFP8_AWQ MoE (#9838 ) Signed-off-by: Min Yu <171526537+yumin066@users.noreply.github.com> Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com> Co-authored-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>	2026-01-06 10:16:41 +08:00
alel	6b8ae6fa81	[None][feat] CuteDSL MOE FC1 Enhancement (#10088 ) Signed-off-by: Yuhan Li <51736452+liyuhannnnn@users.noreply.github.com>	2026-01-06 09:30:43 +08:00
Mike Iovine	77712ed4ab	[None][chore] Update SWA + spec dec support matrix (#10421 ) Signed-off-by: Mike Iovine <miovine@nvidia.com>	2026-01-05 20:26:23 -05:00
JadoTu	82aaf98070	[None][feat] add the eos tokens in generation config to stop words in the sampler (#10389 ) Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>	2026-01-06 09:24:03 +08:00
chenfeiz0326	8a04c05079	[None][fix] Only Use Throughput Metrics to Check Regression (#10404 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2026-01-06 09:21:15 +08:00
Chuang Zhu	536a8f6a9c	[TRTLLM-9527][feat] Add transferAgent binding (step 1) (#10113 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2026-01-06 08:40:38 +08:00
Lucas Liebenwein	846e54aa09	[None][feat] precompiled installation from local src dir (#10419 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-01-05 19:16:38 -05:00
Simeng Liu	3b56548fcf	[https://nvbugs/5777044 ][chore] Remove solved bugs from waives.txt (#10422 ) Signed-off-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>	2026-01-05 16:56:58 -05:00
Karthik	4e50cb5708	[#10170 ][fix] Add export patch for GraniteMoe MoE models to enable torch.export compatibility (#10169 ) Signed-off-by: Karthik Vetrivel <kvetrivel@nvidia.com>	2026-01-05 16:13:45 -05:00
Mike Iovine	91ff46d418	[https://nvbugs/5745152 ][fix] Unwaive gpt oss spec decode test (#10370 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-01-05 16:06:58 -05:00
Mike Iovine	7a2dab8e85	[https://nvbugs/5695984 ][fix] Unwaive llama3 eagle test (#10092 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-01-05 16:03:35 -05:00
Yan Chunwei	6b71b03947	[TRTLLM-9551][infra] Partition test_llm_pytorch.py for parallel execution (#10400 ) Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2026-01-05 13:58:03 -05:00
Grzegorz Kwasniewski	ea380ff45c	[TRTLLM-9767][feat] Fixed recursive node traversals (#10379 ) Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>	2026-01-05 18:42:06 +02:00
Mike Iovine	db2614ef10	[https://nvbugs/5772414 ][fix] Fix draft token tree depth=1 corner case (#10385 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-01-05 17:20:14 +01:00
Mike Iovine	bedfff4f00	[https://nvbugs/5772521 ][fix] Fix draft token tree chain crash (#10386 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-01-05 17:18:44 +01:00
Gal Hubara-Agam	e98c27ee4f	[TRTLLM-10053][feat] AutoDeploy: Add Super v3 config file, improve test runtime (#10397 ) Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>	2026-01-05 18:17:27 +02:00
Anthony Chang	225d3a9001	[None][perf] TRTLLM MoE maps to lower tuning buckets when ep>1 (#9998 ) Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>	2026-01-05 17:16:12 +01:00
Balaram Buddharaju	a792c23dcf	[TRTLLM-9465][fix] Swap TP-CP grouping order (#10350 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2026-01-05 20:08:03 +08:00
Eran Geva	3749a2ce1c	[#10374 ][fix] fixed race condition in AutoDeploy's mp tests port acquisition (#10366 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2026-01-05 13:33:01 +02:00
xinhe-nv	b1733d56f6	[TRTLLM-9381][test] add disag-serving kimi k2 thinking tests (#10357 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2026-01-05 05:15:52 -05:00
Fanrong Li	4931c5eb3a	[None][feat] update deepgemm to the DeepGEMM/nv_dev branch (#9898 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2026-01-05 16:43:42 +08:00
Yukun He	d272f1a9bc	[TRTLLM-8821][feat] Apply AutoTuner to AllReduce Op for strategy tuning. (#8531 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2026-01-05 15:44:37 +08:00
HuiGao-NV	2f768b76f8	[https://nvbugs/5715568 ][fix] Force release torch memory when LLM is destroyed (#10314 ) Signed-off-by: Hui Gao <huig@nvidia.com>	2026-01-05 15:30:18 +08:00

4569 Commits