TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Author	SHA1	Message	Date
Yukun He	072f236002	[None][fix] Fully resolve the tactic recovery issues in AutoTuner serialized cache (#9835 ) Restrict tactic types to those compatible with AutoTuner cache serialization and deserialization. Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-10 20:41:04 +08:00
Matt Lefebvre	df1adfbb50	[TRTINFRA-7328][infra] - Move half B200 tests to lbd (#9853 ) Signed-off-by: Matt Lefebvre <mlefebvre@nvidia.com>	2025-12-10 04:24:30 -08:00
Brian K. Ryu	8cec2da375	[None][feat] Port fp4 quantization kernel optimization from FlashInfer (#9854 ) Signed-off-by: Brian Ryu <bryu@nvidia.com> Co-authored-by: Nikita Korobov <14355239+nekorobov@users.noreply.github.com>	2025-12-10 13:13:48 +01:00
Matt Lefebvre	8fefa2c9d1	[None][infra] Fail fast if SLURM entrypoint fails (#9744 ) Signed-off-by: Matt Lefebvre <mlefebvre@nvidia.com>	2025-12-10 02:31:29 -08:00
Perkz Zheng	e34302986d	[https://nvbugs/5727952 ][fix] PDL bugs with trtllm-gen fmha kernels (#9863 ) Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>	2025-12-10 01:47:03 -08:00
Guoming Zhang	12693a526b	[None][chore] Enable L0 multi-gpus testing for Qwen3-next (#9789 ) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2025-12-10 17:11:32 +08:00
Zhanrui Sun	49fe089470	[TRTLLM-9811][infra] Update urllib3 version >= 2.6.0 to fix high vulnerability issue (#9823 ) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com> Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-12-10 00:18:11 -08:00
dominicshanshan	0e78a4b244	[https://nvbugs/5702791 ][fix] Unwaive fixed test (#9844 ) Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2025-12-10 14:01:44 +08:00
Yukun He	979f37e443	[None][fix] Fix nvfp4 gemm allowed backends arg passing (#9837 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-09 20:09:53 -08:00
QI JUN	2c46126a93	[TRTLLM-9794][ci] move some deepseek test cases to gb200 (#9841 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-12-09 19:54:51 -08:00
Bo Li	9d3c675a0b	[None][chore] Support larger topK for NVLinkOneSided AlltoAll. (#9816 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2025-12-10 11:10:55 +08:00
TensorRT LLM	6a39bb983c	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2025-12-10 03:07:34 +00:00
zhanghaotong	36c9e7cfe6	[None][chore] Add unittest for otlp tracing (#8716 ) Signed-off-by: zhanghaotong <zhanghaotong.zht@antgroup.com> Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>	2025-12-09 18:34:08 -08:00
dhansen-nvidia	2d33ae94d5	[https://nvbugs/5508301 ][feat] Move D->H copies to a worker thread whe… (#8463 ) Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com> Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com> Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>	2025-12-09 18:51:31 -05:00
Patrice Castonguay	414448bb37	[https://nvbugs/5719561 ][chore] Unwaive tests for nvbug 5719561 (#9801 ) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2025-12-09 18:21:50 -05:00
Patrice Castonguay	ff0ef19ee9	[https://nvbugs/5688388 ][chore] Unwaiving fixed disagg test (#9800 ) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2025-12-09 16:51:46 -05:00
Matt Lefebvre	5de4e3f621	[TRTINFRA-7328][infra] Consume SlurmCluster scratchPath and cleanup mounts (#9600 ) Signed-off-by: Matt Lefebvre <mlefebvre@nvidia.com>	2025-12-09 13:34:09 -08:00
Eran Geva	4da3121363	[#8921 ][chore] AutoDeploy NanoV3 to use SYMM_MEM allreduce strategy (#9797 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2025-12-09 13:05:38 -08:00
Patrice Castonguay	7d7d05d8db	[None][chore] Adding flaky auto scaling test to waives (#9851 ) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2025-12-09 15:05:19 -05:00
Mike Iovine	07c76a5fac	[None][feat] Make 2-model spec dec use the 1-model kernels (Hopper) (#8810 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-12-09 11:06:31 -05:00
Dom Brown	3156f2e852	[https://nvbugs/5575841 ][fix] Nvbug 5575841: Remove additional test waivers for TestMoEFP4 (#9788 ) Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>	2025-12-09 13:37:55 +00:00
Emma Qiao	75bc386b65	[None][infra] Waive failed cases for main branch on 12/09 (#9839 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-09 19:39:29 +08:00
QI JUN	58c29957d9	[TRTLLM-9794][ci] move qwen3-next test cases to gb200 (#9827 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-12-09 01:58:25 -08:00
Stefan Niebler	d600b9f851	[TRTLLM-6756][feat] Update BeamSearch for TorchSampler (#9660 ) Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>	2025-12-09 10:44:01 +01:00
Robin Kobus	76f49c903b	[None][fix] Additional model outputs for pipeline parallelism (#9794 ) Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>	2025-12-09 10:41:22 +01:00
Yiqing Yan	2ddcb45b2a	[None][chore] Generate lock file for release/1.2.0rc4.post1 branch automatically (#9829 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2025-12-09 16:34:17 +08:00
yufeiwu-nv	fbcf03040f	[None][test] Refactor qa/llm_perf_nim.yml test list (#9700 ) Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>	2025-12-08 22:00:43 -08:00
QI JUN	252769c930	[TRTLLM-9794][ci] remove duplicated test cases in DGX B200 (#9817 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-12-08 21:51:30 -08:00
Zhanrui Sun	309f92ec09	[None][infra] Use artifactory pypi mirror for Cython install (#9774 ) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-12-09 13:49:41 +08:00
Shi Xiaowei	b050804b63	[TRTLLM-6537][infra] extend multi-gpu tests related file list (#9614 ) Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>	2025-12-09 12:54:53 +08:00
JunyiXu-nv	90890785eb	[https://nvbugs/5722653 ][fix] Fix config file used by disagg_client (#9783 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com> Signed-off-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-08 20:34:55 -08:00
Balaram Buddharaju	bafb60c1bc	[None][chore] Fix tests failing on pre-merge 12/08 (#9819 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-08 20:08:52 -08:00
Bo Li	f2006a1f74	[https://nvbugs/5726066 ][infra] Waive timeout disaggregated/test_auto_scaling tests. (#9815 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2025-12-08 19:51:43 -08:00
TensorRT LLM	c7a2568872	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2025-12-09 03:19:48 +00:00
JunyiXu-nv	f521f6d910	[None][fix] Fix unterminated process issue for RemoteOpenAIServer (#9490 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-09 11:15:40 +08:00
Jiagan Cheng	4a3a66b124	[https://nvbugs/5677746 ][fix] Use first PP rank's schedule result in other PP ranks to fix PP hang (#9659 ) Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>	2025-12-08 18:43:52 -08:00
bhsueh_NV	d6f961d3fe	[None][feat] Add llama4 scaling (#9771 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2025-12-09 10:27:39 +08:00
Tri Dao	1c4dacb19a	[None][fix] Fix PDL in TRTLLM MOE for dsv3 (#9799 ) Signed-off-by: Tri Dao <daominhtri0503@gmail.com>	2025-12-09 10:16:29 +08:00
yuanjingx87	390391ebf1	[None][infra] Correct the waived test names due to a merge conflict (#9803 ) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>	2025-12-09 09:48:21 +08:00
Chenghao Zhang	75f5446d67	[#9753 ][feat] AutoDeploy: Implement add rms_norm fusion (#9754 ) Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>	2025-12-08 14:24:27 -08:00
Jhao-Ting Chen	da074be037	[None][fix] Fix #8383 introduced TRTLLM backend python error (#9804 ) Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>	2025-12-08 13:31:37 -08:00
Eran Geva	23cf72b0f8	[#8921 ][feat] Added symetric memory AllReduce strategy (#8919 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2025-12-08 13:12:56 -08:00
Thor Johnsen	f9380581c5	[https://nvbugs/5508267 ][fix] Proper handling of inactive canceled requests (#9280 ) Signed-off-by: thorjohnsen <41591019+thorjohnsen@users.noreply.github.com>	2025-12-08 13:11:44 -08:00
Yibin Li	faabc1a387	[TRTLLM-7967][chore] Add more tests (#9415 ) Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>	2025-12-08 11:57:32 -08:00
Jhao-Ting Chen	0a09465089	[https://nvbugs/5567586 ][feat] Ampere xqa swa specdec for GPT-OSS Eagle3-one-model (#8383 ) Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>	2025-12-08 11:16:05 -08:00
Frank	f6df9eb2a6	[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench (#9250 )	2025-12-08 10:37:40 -08:00
sunnyqgg	1c7b7cdd47	[TRTLLM-9506][fix] Fix AR for DeepSeek-R1 2 model path (#9661 ) Signed-off-by: qgai <qgai@nvidia.com>	2025-12-08 10:12:32 -05:00
Eran Geva	98db262a67	[None][fix] Switch AutoDeploy's default allreduce strategy to NCCL (#9666 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2025-12-08 03:26:21 -08:00
Lizhi Zhou	52f78e4000	[http://nvbugs/5649010 ][fix] fix test_auto_scaling.py::test_worker_restart timeout (#9775 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2025-12-08 03:26:01 -08:00
fredricz-20070104	96d9b67d65	[https://nvbugs/5527655 ][test] Add test case for RCCA 5527655 (#9511 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2025-12-08 01:27:13 -08:00

4093 Commits