TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-12 05:53:33 +08:00

Author	SHA1	Message	Date
Balaram Buddharaju	72c5480dfb	[None][chore] Waive test blocking pre-merge 12/18 (#10145 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-18 19:12:05 -08:00
TensorRT LLM	00f70c30a6	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2025-12-19 03:11:26 +00:00
Ivy Zhang	9aa40871c2	[TRTLLM-9840][test] switch ucx backend to default backend (#10101 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2025-12-18 18:54:15 -08:00
TensorRT LLM	a7ac5a6bca	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2025-12-19 02:14:37 +00:00
Wangjue Yao	9f283f330b	[None][feat] Support Mooncake transfer engine as a cache transceiver backend (#8309 ) Signed-off-by: wjueyao <wyao123@terpmail.umd.edu> Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co> Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>	2025-12-19 10:09:51 +08:00
Zheyu Fu	7c638f155b	Merge branch 'main' into fix_spec_gate Signed-off-by: Zheyu Fu <zheyuf@nvidia.com>	2025-12-18 18:06:59 -08:00
Zheyu Fu	5ab0d1edec	Fix thread leak for test_draft_len_schedule. Enhance stability for test_spec_gate. Signed-off-by: Zheyu Fu <zheyuf@NVIDIA.com>	2025-12-19 02:01:38 +00:00
Chuang Zhu	e0b2a94309	[None][fix] Fix ready signal in NIXL backend (#10000 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-12-19 09:43:40 +08:00
yuanjingx87	2e88c86f10	[None][infra] Fix issue that lock file geneartion will skip dependency with comment (#10144 ) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>	2025-12-18 17:41:23 -08:00
Yukun He	bd5b3c2ac0	[https://nvbugs/5721912 ][chore] Unwaive the test (#10108 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-19 09:12:25 +08:00
Anish Shanbhag	91a9ae42d2	[TRTC-71][feat] Add regression testing for config database (#9832 ) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2025-12-18 16:15:38 -08:00
Balaram Buddharaju	799a2ae311	[https://nvbugs/5741331 ][fix] Fix helix accuracy test (#10021 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-18 15:27:53 -08:00
Chang Liu	a97e411b44	[https://nvbugs/5747911 ][fix] Use offline data path for the unit test of mmencoder server (#10135 ) Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>	2025-12-18 15:19:23 -08:00
Lizhi Zhou	f02782a6f2	[https://nvbugs/5726066 ][fix] fix auto-scaling related failures (#9845 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com> Co-authored-by: Emma Qiao <qqiao@nvidia.com>	2025-12-18 16:37:48 -05:00
Enwei Zhu	6fe89ea00f	[TRTLLM-9819][perf] Reuse alltoall workspace for CuteDSL MoE output (#9840 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-12-18 10:36:38 -08:00
CarstyYou	0b279f4ad4	[https://nvbugs/5456493 ][feat] Add fp8 bmm on sm120 (#9687 ) Signed-off-by: CarstyYou <186021327+CarstyYou@users.noreply.github.com>	2025-12-18 22:57:20 +08:00
ZhichenJiang	4e55b83101	[None][perf] Add more optimization options for MOE CuteDSL finalized kernel (#10042 ) Signed-off-by: zhichen jiang <zhichenj@NVIDIA.com>	2025-12-18 22:49:28 +08:00
Nikita Korobov	3b4f26e4d1	[None][feat] update TRT-LLM Gen MoE for NvFp4 + bias with tileN=256 (#9734 ) Signed-off-by: Nikita Korobov <14355239+nekorobov@users.noreply.github.com>	2025-12-18 11:58:23 +01:00
yuanjingx87	df15be3fad	[None][infra] Fix slurm job does not catch cancelled jobs (#9722 ) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com> Signed-off-by: yuanjingx87 <197832395+yuanjingx87@users.noreply.github.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-12-18 00:32:43 -08:00
Bo Li	9d7e038bcb	[https://nvbugs/5753250 ][infra] Waive _test_openai_responses. (#10110 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2025-12-18 00:15:06 -08:00
Emma Qiao	33a90f2dd2	[None][infra] Waive failed cases for main branch on 12/18 (#10105 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-17 21:35:45 -08:00
Yuxian Qiu	bec864a78c	[None][fix] avoid ID conversion for non enable_configurable_moe cases. (#10003 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-12-18 13:29:52 +08:00
yuanjingx87	897a38978d	[None][infra] Update allowlist 2025.12.17 (#10097 ) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>	2025-12-17 21:11:35 -08:00
Wanli Jiang	601c29ca73	[https://nvbugs/5721644 ][fix] Update tests for nemotron_h (#9993 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2025-12-18 12:38:02 +08:00
Lucas Liebenwein	76ec820465	[#7532 ][feat] AutoDeploy: gather logits before lm head (#9962 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com> Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>	2025-12-17 19:50:13 -08:00
TensorRT LLM	cfe53e7425	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2025-12-18 03:23:35 +00:00
xinhe-nv	4a98f190a8	[None][chore] Add failed cases into waives.txt (#10025 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-17 19:13:52 -08:00
xinhe-nv	c1cfb61b1b	[TRTLLM-9381][feat] Add kimi k2 fp4 tests (#9906 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-17 18:15:27 -08:00
Zheyu Fu	8922ca839f	Change from correctness check to functional check and unwaive the test. Signed-off-by: Zheyu Fu <zheyuf@NVIDIA.com>	2025-12-18 01:09:14 +00:00
TensorRT LLM	50c2b82f24	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2025-12-17 23:45:35 +00:00
tburt-nv	27064f95c7	[None][chore] Clarify copyright header guidance (#9882 ) Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>	2025-12-18 06:38:10 +08:00
tburt-nv	5da7879b38	[None][fix] Revert GHA upgrade for blossom-ci workflow (#10095 ) Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>	2025-12-17 15:57:04 -05:00
Chenghao Zhang	22c6e8a424	[None][fix] Autodeploy: fix some legacy flashinfer attention test errors (#9928 ) Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>	2025-12-17 12:27:22 -08:00
Salman Chishti	cb5cd4376e	[None][chore] Upgrade GitHub Actions for Node 24 compatibility (#10045 ) Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>	2025-12-17 09:44:09 -08:00
Yuan Tong	f7e245668b	[TRTLLM-9680][perf] Optimize TRTLLMSampler log_probs performance (Core fix has been merged via #9353 ) (#9655 ) Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>	2025-12-17 17:56:01 +08:00
Yukun He	00c0564334	[None][chore] Remove unnecessary warning log for tuning. (#10077 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-17 01:51:17 -08:00
Yukun He	18b335d584	[TRTLLM-9989][fix] Disable tvm_ffi for CuteDSL nvFP4 dense GEMM. (#10040 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-17 00:41:26 -08:00
Yukun He	2fd1a23e4c	[TRTLLM-9998][fix] Change trtllm-gen MoE distributed tuning strategy back to INDEPENDENT (#10036 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-17 00:35:22 -08:00
yufeiwu-nv	5d71f662c3	[https://nvbugs/5698434 ][test] Add Qwen3-4B-Eagle3 One-model perf test (#10041 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2025-12-17 13:37:25 +08:00
Void	47404196fa	[None][fix] Enabled simultaneous support for low-precision combine and MTP. (#9091 ) Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>	2025-12-17 13:37:08 +08:00
Emma Qiao	0dbf3948cc	[None][infra] Waive failed tests due to llm model files (#10068 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-16 20:12:57 -08:00
Kaiyu Xie	02fd13448b	[None] [feat] Enhancements to slurm scripts (#10031 ) Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2025-12-16 19:31:27 -08:00
JunyiXu-nv	6649c3743c	[https://nvbugs/5635153 ][chore] Remove responses tests from waive list (#10026 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-17 11:22:02 +08:00
shuyixiong	26fb063076	[https://nvbugs/5741060 ][fix] Fix pg op test (#9989 ) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>	2025-12-17 09:44:25 +08:00
Aurelien Chartier	7175d89b48	[None][fix] Fix iteration stats for spec-dec (#9855 ) Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>	2025-12-16 14:11:38 -08:00
QI JUN	dba9036072	[None][doc] remove nano-vl-v2 model support in release notes (#9887 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com> Signed-off-by: Mike Iovine <miovine@nvidia.com>	2025-12-16 13:33:20 -05:00
QI JUN	3daca4fea3	[https://nvbugs/5729847 ][doc] fix broken links to modelopt (#9868 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com> Signed-off-by: Mike Iovine <miovine@nvidia.com>	2025-12-16 13:33:20 -05:00
QI JUN	e6ab864066	[None][doc] Update release notes (#9739 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com> Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com> Co-authored-by: Laikh Tewari <laikhtewari1@gmail.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com> Signed-off-by: Mike Iovine <miovine@nvidia.com>	2025-12-16 13:33:20 -05:00
Zac Patel	1ffa2c8937	[IB-1920][doc] Update Perf_Overview.md with Benchmarking Results for Release 1.1 (#9723 ) Signed-off-by: Zachary Patel <22306219+zbpatel@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com> Signed-off-by: Mike Iovine <miovine@nvidia.com>	2025-12-16 13:33:20 -05:00
xiweny	2756a0da60	[TRTLLM-4629][doc] Add B300 & GB300 in documents (#9663 ) Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com> Signed-off-by: Mike Iovine <miovine@nvidia.com>	2025-12-16 13:33:20 -05:00

4309 Commits