TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-04 02:02:01 +08:00

Author	SHA1	Message	Date
yuanjingx87	e1cc8d2337	[None][infra] Add sonarqube scanning in lockfile generation pipeline (#10700 ) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>	2026-01-18 01:11:28 -08:00
Eran Geva	a11f0dbd61	[#10696 ][fix] AutoDeploy prevent torch.export from specializing batch dimension when max_batch_size=1 (#10697 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2026-01-18 10:42:49 +02:00
Yanchao Lu	0af1a0e478	[None][test] Waive main post-merge test failures 1/18 (#10777 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2026-01-18 15:34:48 +08:00
TensorRT LLM	f8c26409f9	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2026-01-18 03:07:08 +00:00
Yanchao Lu	0096b50ba0	[None][infra] Update upgrade related docs for release 1.2 (#10760 ) (#10773 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Emma Qiao <qqiao@nvidia.com>	2026-01-18 00:14:27 +08:00
Grzegorz Kwasniewski	7bf4dd9f63	[TRTLLM-10318][feat] Fixing Nemotron sharding: support for sharding buffers (#10319 ) Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com> Signed-off-by: Lucas <11156568+lucaslie@users.noreply.github.com> Signed-off-by: Grzegorz Kwasniewski <213329731+greg-kwasniewski1@users.noreply.github.com> Co-authored-by: Lucas <11156568+lucaslie@users.noreply.github.com>	2026-01-17 04:02:06 -05:00
Yuxian Qiu	cef67b4f8d	[None][fix] convert to CUDA tensor before calling _resmooth_kernel. (#10770 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2026-01-17 16:18:34 +08:00
Yuxian Qiu	b65560fc32	[https://nvbugs/5794313 ][chore] unwaive tests. (#10660 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2026-01-17 14:15:15 +08:00
Yukun He	3d16daf696	[None][fix] Fix tmp dir being deleted too early in unit test. (#10740 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2026-01-17 13:49:10 +08:00
chenfeiz0326	56073f501a	[TRTLLM-8263][feat] Add Aggregated Perf Tests (#10598 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2026-01-17 13:16:36 +08:00
TensorRT LLM	24d7e499b4	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2026-01-17 03:10:32 +00:00
Frida Hou	069ad68d3c	[None][fix] AutoDeploy: skip mxfp4_moe test unless on Hopper (#10729 ) Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>	2026-01-16 16:24:37 -05:00
Chenghao Zhang	0b748d5bba	[None][chore] update flashinfer to 0.6.0 (#10522 ) Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>	2026-01-16 16:22:06 -05:00
Chenghao Zhang	b6acd96616	[None][fix] AutoDeploy: Fix the nvfp4 fused_moe (#10727 ) Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>	2026-01-16 12:04:40 -08:00
Stefan Niebler	0cfd08745c	[TRTLLM-9735][feat] Add processed logprobs functionality to TorchSampler (#9675 ) Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com> Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com> Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com> Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com> Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>	2026-01-16 10:52:41 -08:00
Tian Zheng	cfebfbb505	[https://nvbugs/5783509 ][fix] Fix a hang issue when enabling skip softmax on Blackwell (#10490 ) Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>	2026-01-16 18:59:54 +08:00
xinhe-nv	cc43edc8f4	[None][fix] waive tests on sm89 (#10753 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2026-01-16 17:35:42 +08:00
Stefan Niebler	c4db030b88	[TRTLLM-8425][doc] Update sampling documentation (#10083 ) Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>	2026-01-16 16:58:49 +08:00
Wanli Jiang	722978b837	[TRTLLM-10305][feat] Support customized seq len larger than model config (#10600 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2026-01-16 16:07:36 +08:00
Kaiyu Xie	4f86c5f5ce	[None] [feat] Support multiple accuracy tasks for slurm scripts (#10500 ) Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com> Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com> Co-authored-by: Zhenhuan Chen <zhenhuanc@nvidia.com>	2026-01-16 15:50:32 +08:00
dongfengy	6dfb8d7084	[None][fix] Fix Piecewise Cuda Graph for GPTOSS (#10631 ) Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>	2026-01-16 15:47:34 +08:00
xinhe-nv	0256c7234f	[None][chore] Remove closed bugs (#10586 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-16 15:04:11 +08:00
jmydurant	b163e66182	[None][doc] update doc (add minimax model) (#10746 ) Signed-off-by: Mingyang Jiang <13463932+jmydurant@users.noreply.github.com>	2026-01-16 14:54:52 +08:00
Necofish	03cdf5804f	[None][fix] impl fused triton kernel for e8m0 resmooth to reduce memory footprint (#10327 ) Signed-off-by: Nekofish-L <liuxiangyang@mail.ustc.edu.cn> Co-authored-by: Kanghwan <861393+karljang@users.noreply.github.com>	2026-01-15 22:13:18 -08:00
Yukun He	f001c4946d	[https://nvbugs/5782112 ][fix] Fix hanging issue for MNNVL Allreduce under PP (#10633 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2026-01-16 13:03:36 +08:00
Emma Qiao	e2c3373749	[None][infra] Waive failed cases for main branch on 01/16 (#10738 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-01-16 12:46:35 +08:00
Bo Li	7686fbbcbe	[https://nvbugs/5810940 ][chore] Update waive lists for nvbugs/5810940. (#10737 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2026-01-16 12:08:14 +08:00
Chuang Zhu	8257b67ea5	[https://nvbugs/5791936 ][fix] Add warning for gen-only paused (#10664 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2026-01-16 11:18:24 +08:00
TensorRT LLM	6541e41c74	[None][infra] Check in most recent lock file from nightly pipeline Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>	2026-01-16 03:13:42 +00:00
Enwei Zhu	7b8b9ccbaf	[https://nvbugs/5669671 ][fix] Support GuidedDecoder with sharded logits (#10698 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2026-01-16 11:04:26 +08:00
Enwei Zhu	9f741fb254	[https://nvbugs/5800521 ][ci] Move test_openai_chat_guided_decoding to H100 stage (to avoid potential OOM) (#10703 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2026-01-16 10:42:52 +08:00
xxi	ce561b6a8e	[TRTLLM-9111][feat] MoE test refactor: Extend MoE quantization test utilities with comprehensive quant algorithm support (#10691 ) Signed-off-by: xxi <xxi@nvidia.com>	2026-01-16 10:26:33 +08:00
Chuang Zhu	7e2cbc0756	[https://nvbugs/5598674 ][fix] enable partial reuse in gemma and gpt oss test (#10559 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2026-01-16 10:26:15 +08:00
heyuhhh	e3f27e06c7	[None][chore] Waive star attention unittests (#10439 ) Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>	2026-01-16 10:12:32 +08:00
Yuxian Qiu	ef838cc852	[https://nvbugs/5701445 ][chore] isolate test. (#10444 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2026-01-16 10:04:12 +08:00
Lucas Liebenwein	49c6f73554	[None][bug] AutoDeploy: fix regression in kv cache resize memory estimation (#10726 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-01-16 09:52:03 +08:00
Iman Tabrizian	5ad8cf6d5e	[https://nvbugs/5738168 ][fix] unwaive test accuracy/test_disaggregated_serving.py::TestDeepSeekV32Exp::test_auto_dtype[False] (#10584 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2026-01-16 06:04:45 +08:00
Thor Johnsen	0998a7bf20	[https://nvbugs/5721661 ][fix] Prevent out-of-bounds read (#9879 ) Signed-off-by: thorjohnsen <41591019+thorjohnsen@users.noreply.github.com>	2026-01-15 10:51:40 -06:00
heyuhhh	dfac07c045	[None][feat] Support to export data in trtllm-eval (#10075 ) Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>	2026-01-15 23:27:08 +08:00
forrestl	43b9db3364	[None][doc] doc updates (#10711 ) Signed-off-by: forrestl	2026-01-15 21:46:49 +08:00
Lizhi Zhou	93db0d5e18	[TRTLLM-9942][feat] new request states and kvcache transceiver APIs in generation-first disagg (#10406 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2026-01-15 19:18:21 +08:00
Jun Yang	3bc17e1aa3	[None][doc] doc updates (#10704 ) Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>	2026-01-15 19:05:26 +08:00
Lizhi Zhou	ff277b591e	[https://nvbugs/5791830 ][fix] fix pp loop hang caused by i-sending new requests (#10665 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2026-01-15 16:33:55 +08:00
yufeiwu-nv	cd55fb4551	[None][test] Remove NIM test (#10657 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2026-01-15 16:30:47 +08:00
Pengbo Wang	683515b1bd	[None][feat] Use XQA JIT impl by default and mitigate perf loss with sliding window (#10335 ) Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>	2026-01-15 15:47:00 +08:00
Perkz Zheng	71ccc07d2b	[None][feat] update trtllm-gen to support groupsTokensHeadsQ (#10261 ) Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com> Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com> Co-authored-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2026-01-15 02:24:25 -05:00
Ludwig Schneider	e12a7119cf	[https://nvbugs/5741392 ][fix] [chore] Remove test exemptions from waivers tile (#10517 ) Signed-off-by: Ludwig Schneider <lschneider@nvidia.com>	2026-01-14 22:07:52 -08:00
Yiqing Yan	f4ace99218	[None][chore] Bump version to 1.3.0rc0 (#10681 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2026-01-15 13:55:44 +08:00
ruodil	22240e43eb	[None][test] store per user output and per gpu output metric in csv file (#10658 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2026-01-15 00:51:08 -05:00
Emma Qiao	7b3b6f1161	[None][infra] Waive failed tests on main 01/15 (#10683 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-01-15 13:40:37 +08:00

4700 Commits