TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Author	SHA1	Message	Date
TensorRT LLM	f6b0ddd61d	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2025-12-14 03:29:59 +00:00
nvxuanyuc	a5a37227d6	[None][feat] Fused kernels (qknormrope + moe routing) and two-model MTP support for glm4moe (#9852) Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>	2025-12-14 10:47:24 +08:00
Faraz	64d7796234	[None][chore] Add namespace to header to fix tot failure (#9973)	2025-12-13 12:18:10 -05:00
Mike Iovine	383b13e0e5	[None][feat] Implement sampling on 1-model EAGLE3 (#9885) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com> Signed-off-by: Mike Iovine <miovine@nvidia.com>	2025-12-13 07:38:22 -08:00
jellysnack	079ef8ae77	[None][feat] Graceful Error Handling for Guided Decoder (#9078) Signed-off-by: jellysnack <oleg.jellysnack@gmail.com> Signed-off-by: jellysnack <158609015+jellysnack@users.noreply.github.com> Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-12-13 19:57:59 +08:00
Yan Chunwei	85406f9dda	[https://nvbugs/5720482][fix] Fix test rpc streaming (#9902) Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>	2025-12-13 01:14:43 -08:00
shuyixiong	8cbf2d958c	[TRTLLM-9738][chore] Guard accuracy with nccl allreduce strategy (#9793) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>	2025-12-13 01:02:11 -08:00
Balaram Buddharaju	6a6e41f802	[TRTLLM-9468][chore] Update disagg benchmarking scripts to support context parallelism (#9720) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-12 22:29:41 -08:00
shuyixiong	7fc720a397	[TRTLLM-9784][fix] Resolve port conflicts (#9780) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>	2025-12-12 22:10:01 -08:00
bhsueh_NV	e49c70f6df	[None][feat] Support Mistral Large3 LLM part (#9820) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2025-12-13 11:44:27 +08:00
Faraz	98d72c7648	[None][feat] spark cublas LUT table for llama-8b-bf16 perf (#9811) Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>	2025-12-12 22:37:56 -05:00
TensorRT LLM	e4e09867d1	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2025-12-13 03:26:42 +00:00
Balaram Buddharaju	461446045e	[TRTLLM-9493][feat] Add helixPostProcessNative kernel for cp_dim=2 (#9924) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-12 16:49:25 -08:00
tburt-nv	6147452158	[https://nvbugs/4141427][chore] Add more details to LICENSE file (#9881) Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>	2025-12-13 08:35:31 +08:00
yuanjingx87	246a877571	[None][infra] Remove generate lockfile schedule for 1.2.0rc4.post1 branch (#9945) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>	2025-12-12 09:10:32 -08:00
Yuxian Qiu	cd4e639536	[None][feat] Async pp send. (#9952) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-12-13 00:52:30 +08:00
Chuang Zhu	4cc4cbe926	[https://nvbugs/5716787][fix] terminate nixl running when exiting (#9785) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com> Co-authored-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2025-12-12 11:15:02 -05:00
Chuang Zhu	9c59c9f920	[https://nvbugs/5643787][fix] remove the war path for notify to itself (#9834) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-12-12 11:10:05 -05:00
JunyiXu-nv	2fec53dfa5	[TRTLLM-9637][feat] Support tool parser for Kimi K2 (#9830) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-12 23:32:39 +08:00
Yihan Wang	9df4dad3b6	[None][fix] Introduce inline namespace to avoid symbol collision (#9541) Signed-off-by: Yihan Wang <yihwang@nvidia.com>	2025-12-12 23:32:15 +08:00
Balaram Buddharaju	af315d8ef1	[TRTLLM-5972][chore] Load balance decode token KV cache with helix parallelism (#9757) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-12 22:29:05 +08:00
zackyoray	d5b9ad91c9	[None][feat] Upgrade NIXL to v0.8.0 (#9707) Signed-off-by: Yoray Zack <62789610+zackyoray@users.noreply.github.com> Signed-off-by: zackyoray Signed-off-by: Bo Deng Co-authored-by: Bo Deng	2025-12-12 20:21:10 +08:00
Lucas Liebenwein	e767fc649a	[None][feat] AutoDeploy: prepare_metadata revisited (#9764) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2025-12-12 20:14:14 +08:00
Yukun He	a6263a127f	[None][chore] Degrade log level in cublas fp4 runner when using default configs (#9951) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-12 18:53:54 +08:00
ruodil	9b3e5e90ee	[None][test] fix a typo in model name in script (#9867) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2025-12-12 17:35:55 +08:00
chenfeiz0326	61745f034a	[https://nvbugs/5727481][ci] Fix Port Conflict in Perf-Sanity CI Test (#9896) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-12 17:16:50 +08:00
kris1025	2fc94e5dd7	[None][chore] unwaive qwen3 accuracy test (#9895) Signed-off-by: linquanh <linquanh@nvidia.com>	2025-12-12 16:30:09 +08:00
yufeiwu-nv	fd3d3a553d	[None][chore] Modify python ipc_util to align with C++ path (#9894) Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>	2025-12-12 15:55:22 +08:00
Yihan Wang	711016c799	[https://nvbugs/5736923][infra] Waive timeout disaggregated/test_auto_scaling[http-round_robin] test (#9942) Signed-off-by: Yihan Wang <yihwang@nvidia.com>	2025-12-12 15:15:13 +08:00
yuanjingx87	eeb03f314a	[None][infra] Replace the deprecated github token (#9915) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>	2025-12-11 22:46:14 -08:00
Yifei Wang	9d1f2a9925	[#6425][fix] address CUDA stream sync issue in ModelRunnerCPP (#6426) Signed-off-by: yifei.w <yifei.w@bytedance.com>	2025-12-12 13:33:22 +08:00
Ivy Zhang	fded6c393d	[TRTLLM-9262][test] add groupgemm ada case for rcca (#9833) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2025-12-12 13:23:33 +08:00
Kaiyu Xie	110820bb15	[TRTLLM-9792] [feat] Support multiple instances on single node for slurm scripts (#9900) Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2025-12-12 12:12:08 +08:00
Chuang Zhu	bd441e9822	[None][infra] revert ucx to 1.19 (#9936) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-12-12 11:37:19 +08:00
Yiteng Niu	3e39afea9a	[None][infra] update nspect version for api change (#9899) Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>	2025-12-12 11:27:42 +08:00
dominicshanshan	093465ed29	[https://nvbugs/5599176][fix] Unwaive fixed test for Ray (#9861) Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2025-12-12 11:24:05 +08:00
TensorRT LLM	0132769c22	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2025-12-12 03:20:43 +00:00
Yiqing Yan	5065b60cd1	[None][infra] Fix mergeWaiveList stage (#9892) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2025-12-12 11:19:42 +08:00
xinhe-nv	e8efeb765d	[TRTLLM-9717][fix] fix multi nodes tests cases (#9736) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-12 10:14:23 +08:00
Chuang Zhu	4670e0c297	[None][infra] update ucx to 1.20 (#9786) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-12-12 09:49:46 +08:00
JunyiXu-nv	710c592d7c	[https://nvbugs/5727517][fix] Preserve ip:port for disagg (#9859) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-12 09:45:34 +08:00
Kanghwan	98c68c195b	[None][infra] Ignore comments from bots and CI accounts (#9929) Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>	2025-12-12 09:20:51 +08:00
jthomson04	4f6d4da035	[None][perf] Fix TPOT when `min_tokens` set (#9862) Signed-off-by: jthomson04 <jwillthomson19@gmail.com>	2025-12-11 13:55:31 -08:00
Kanghwan	95d928f071	[None][infra] Add workflow to auto-label 'waiting for feedback' on team comments (#9886) Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>	2025-12-12 05:43:30 +08:00
Venky	fd1270b9ab	[TRTC-43] [feat] Add config db and docs (#9420) Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com> Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com> Co-authored-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>	2025-12-12 04:00:03 +08:00
Simeng Liu	24f92721f2	[https://nvbugs/5597647][ci] Unwaive fixed tests. (#9812) Signed-off-by: SimengLiu-nv <simengl@nvidia.com>	2025-12-12 02:29:30 +08:00
Erin	89dabf5aa1	[TRTLLM-9736][feat] AsyncLLM and verl integ (#9353) Signed-off-by: Liwei Ma <liweim@nvidia.com> Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com> Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com> Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com> Co-authored-by: Liwei Ma <liweim@nvidia.com> Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com> Co-authored-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2025-12-11 09:33:25 -08:00
JadoTu	02edb19f43	[None] [feat] add eos_token_id in generation_config to sampling params (#9514) Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>	2025-12-12 00:52:03 +08:00
xxi	488d38f88d	[TRTLLM-8959][feat] ConfigurableMoE support CUTLASS (#9772)	2025-12-12 00:22:13 +08:00
Fanrong Li	af2849cc7a	[None][doc] Add DeepSeek-V3.2 to the supported models (#9893) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2025-12-11 18:04:48 +08:00

4155 Commits