TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-13 22:18:36 +08:00

Author	SHA1	Message	Date
Fanrong Li	4931c5eb3a	[None][feat] update deepgemm to the DeepGEMM/nv_dev branch (#9898 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2026-01-05 16:43:42 +08:00
Kaiyu Xie	85b4c92d60	[None] [chore] Update to cutlass 4.3 (#8637 ) Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2025-11-28 08:54:34 +08:00
Fanrong Li	5a99c9734d	[TRTLLM-8777][feat] Update DeepGEMM to the latest commit to include optimizations for DeepSeek-v3.2 (#9380 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2025-11-25 08:58:08 +08:00
cheshirekow	6e5384d03c	[TRTLLM-9299][infra] Add third-party docs for python (#9366 ) In this change we rename 3rdparty/README.md (which contains the process playboook for C++ dependencies) to 3rdparty/cpp-thirdparty.md and add a new 3rdparty/py-thirdparty.md file which contains the process playbook for python dependencies. We also update the main 3rdparty/README.md file to serve as a starting-point referring to both of these files. Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com> Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>	2025-11-23 22:58:25 -08:00
cheshirekow	2810be7b3b	[TRTLLM-9211][infra] Minor fixes to 3rdparty/CMakelists (#9365 ) This change addresses the nitpick comments from coderabbit on the previous pull request !8986. None of the changes appear to be critical as the build is healthy without them, but they should provide some protection against future breakages if we change CMake version or or modify other build logic. This change consists of the following: 1. Add GIT_SUBMODULE_RECURSE ON to FetchContent_Declare calls for deepgemm and flashmla to ensure submodules are initialized in cmake versions where it is not the default. 2. Modify error messages in deep_gemm and flash_mla CMakeLists to indicate that submodule initialization failed if the expected submodule directories are not present. 3. Remove the NVTX include directories if the build is configured with NVTX_DISABLE off, to avoid potential confusions if NVTX is included on the compile commands when disabled. 4. Fix a minor CMake syntax issue in cpp/CMakeLists.txt where a message() call was missing parentheses around a string. Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com> Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>	2025-11-23 22:57:02 -08:00
cheshirekow	9b2abb8d28	[TRTLLM-9208][infra] Document the process for C++ deps (#9016 ) Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com> Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>	2025-11-21 09:22:11 -08:00
cheshirekow	1379cfac3a	[TRTLLM-9197][infra] Move thirdparty stuff to it's own listfile (#8986 ) Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com> Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>	2025-11-20 16:44:23 -08:00
Chang Liu	e47c787dd7	[TRTLLM-8535][feat] Support DeepSeek V3.2 with FP8 + BF16 KV cache/NVFP4 + BF16 KV cache (#8405 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com> Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com> Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>	2025-10-24 13:40:41 -04:00
Ruoqian Guo	984d4fe0fe	[None][feat] Update 3rdparty/DeepGEMM to latest commit (#8488 ) Signed-off-by: Ruoqian Guo <22525902+ruoqianguo@users.noreply.github.com> Co-authored-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>	2025-10-21 06:56:50 +08:00
Jin Li	47e6eea3fa	[https://nvbugs/5543770 ][fix] Update to Cutlass v4.2.1 (#8055 ) Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-10-16 22:46:19 +08:00
Enwei Zhu	8330d5363a	[TRTLLM-8209][feat] Support new structural tag API (upgrade XGrammar to 0.1.25) (#7893 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-09-23 09:10:09 +08:00
Barry Kang	8484aa9858	[None][fix] Fix DeepGEMM commit (#7875 ) Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>	2025-09-22 13:52:50 +08:00
xiweny	423e5f6a3c	[TRTLLM-6286] [feat] Update CUTLASS to 4.2 and enable SM103 group gemm (#7832 ) Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>	2025-09-19 09:50:54 +08:00
Barry Kang	4f0e6b5f96	[None][feat] Cherry-pick DeepGEMM related commits from release/1.1.0rc2 (#7716 ) Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>	2025-09-18 13:51:48 +08:00
xiweny	c076a02b38	[TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568 ) Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com> Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com> Signed-off-by: Daniel Stokes <dastokes@nvidia.com> Signed-off-by: Zhanrui Sun <zhanruis@nvidia.com> Signed-off-by: Xiwen Yu <xiweny@nvidia.com> Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com> Signed-off-by: Yiqing Yan <yiqingy@nvidia.com> Signed-off-by: Bo Deng <deemod@nvidia.com> Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com> Signed-off-by: xiweny <13230610+VALLIS-NERIA@users.noreply.github.com> Co-authored-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com> Co-authored-by: Daniel Stokes <dastokes@nvidia.com> Co-authored-by: Zhanrui Sun <zhanruis@nvidia.com> Co-authored-by: Jiagan Cheng <jiaganc@nvidia.com> Co-authored-by: Yiqing Yan <yiqingy@nvidia.com> Co-authored-by: Bo Deng <deemod@nvidia.com> Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-09-16 09:56:18 +08:00
Zongfei Jing	0ff8df95b7	[https://nvbugs/5433581 ][fix] DeepGEMM installation on SBSA (#6588 ) Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>	2025-08-06 16:44:21 +08:00
Chuang Zhu	4d040b50b7	[None][chore] ucx establish connection with zmq (#6090 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-08-05 02:50:45 -04:00
Enwei Zhu	4b299cb77e	feat: Support structural tag in C++ runtime and upgrade xgrammar to 0.1.21 (#6408 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-07-31 09:53:52 +08:00
Linda	4d071eb2d1	feat: binding type build argument (pybind, nanobind) (#5802 ) Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>	2025-07-11 00:48:50 +09:00
Daniel Stokes	942841417e	opensource: Opensource MOE MXFP8-MXFP4 implementation (#5222 ) Signed-off-by: Daniel Stokes <40156487+djns99@users.noreply.github.com>	2025-06-26 12:18:19 +08:00
Yukun He	5097c86168	chore: Change cutlass version back to 4.0 (#5041 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-06-09 22:57:05 +08:00
Yukun He	137fe35539	fix: Fix warmup phase batch size out of range. (#4986 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com> Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>	2025-06-09 19:19:16 +08:00
yunruis	4e2fefc076	upgrade cutlass to 4.0 (#4794 ) Signed-off-by: yunruis <yunruis@nvidia.com>	2025-06-03 09:55:02 +08:00
shaharmor98	ede7058544	Feat/ Integrate peftCacheManager in PyExecutor creation (#3372 ) * integrate peftCacheManager in PyExecutor creation Signed-off-by: Shahar Mor <smor@nvidia.com>	2025-04-15 15:14:43 +08:00
Julien Debache	d7a0bf934c	fix: updating ucxx, which appears to avoid occasional segfaults when profiling (#3420 ) Signed-off-by: jdebache <jdebache@nvidia.com>	2025-04-10 19:48:20 +08:00
Robin Kobus	d9522c5906	feat: Update cutlass (#2981 ) * chore: update cutlass to v3.8.0 Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com> * refactor: update include directives for consistency and organization in weightOnlyBatchedGemv headers Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com> * Fix fpA_intB_gemm compilation Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com> --------- Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>	2025-03-26 22:36:27 +08:00
Kaiyu Xie	ab5b19e027	Update TensorRT-LLM (#2820 )	2025-02-25 21:21:49 +08:00
Kaiyu Xie	e88da961c5	Update TensorRT-LLM (#2783 )	2025-02-13 18:40:22 +08:00
Dan Blanaru	16d2467ea8	Update TensorRT-LLM (#2755 ) * Update TensorRT-LLM --------- Co-authored-by: Denis Kayshev <topenkoff@gmail.com> Co-authored-by: akhoroshev <arthoroshev@gmail.com> Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com> Update	2025-02-11 03:01:00 +00:00
石晓伟	548b5b7310	Update TensorRT-LLM (#2532 ) * blossom-ci.yml: run vulnerability scan on blossom * open source efb18c1256f8c9c3d47b7d0c740b83e5d5ebe0ec --------- Co-authored-by: niukuo <6831097+niukuo@users.noreply.github.com> Co-authored-by: pei0033 <59505847+pei0033@users.noreply.github.com> Co-authored-by: Kyungmin Lee <30465912+lkm2835@users.noreply.github.com> Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2024-12-04 21:16:56 +08:00
Kaiyu Xie	f14d1d433c	Update TensorRT-LLM (#2389 ) * Update TensorRT-LLM --------- Co-authored-by: Alessio Netti <netti.alessio@gmail.com>	2024-10-29 22:24:38 +08:00
Kaiyu Xie	8681b3a4c0	open source 4dbf696ae9b74a26829d120b67ab8443d70c8e58 (#2297 ) * Update TensorRT-LLM --------- Co-authored-by: Bhuvanesh Sridharan <bhuvanesh.sridharan@sprinklr.com> Co-authored-by: Qingquan Song <ustcsqq@gmail.com>	2024-10-08 12:19:19 +02:00
Kaiyu Xie	31ac30e928	Update TensorRT-LLM (#2215 ) * Update TensorRT-LLM --------- Co-authored-by: Sherlock Xu <65327072+Sherlock113@users.noreply.github.com>	2024-09-10 18:21:22 +08:00
Kaiyu Xie	74b324f667	Update TensorRT-LLM (#2110 )	2024-08-13 22:34:33 +08:00
Kaiyu Xie	be9cd719f7	Update TensorRT-LLM (#2094 ) * Update TensorRT-LLM --------- Co-authored-by: akhoroshev <arthoroshev@gmail.com> Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com> Co-authored-by: Tayef Shah <tayefshah@gmail.com> Co-authored-by: lfz941 <linfanzai941@gmail.com>	2024-08-07 16:44:43 +08:00
Kaiyu Xie	66ef1df492	Update TensorRT-LLM (#1492 ) * Update TensorRT-LLM --------- Co-authored-by: Loki <lokravi@amazon.com>	2024-04-24 14:44:22 +08:00
Kaiyu Xie	728cc0044b	Update TensorRT-LLM (#1233 ) * Update TensorRT-LLM --------- Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com> Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>	2024-03-05 18:32:53 +08:00
Kaiyu Xie	0f041b7b57	Update TensorRT-LLM (#1098 ) * Update TensorRT-LLM * update submodule * Remove unused binaries	2024-02-18 15:48:08 +08:00
Kaiyu Xie	6755a3f077	Update TensorRT-LLM (#422 ) * Update TensorRT-LLM --------- Co-authored-by: Tltin <TltinDeng01@gmail.com> Co-authored-by: zhaohb <zhaohbcloud@126.com> Co-authored-by: Bradley Heilbrun <brad@repl.it> Co-authored-by: nqbao11 <nqbao11.01@gmail.com> Co-authored-by: Nikhil Varghese <nikhil@bot-it.ai>	2023-11-18 00:05:54 +08:00
Kevin Xie	6111f5210b	Update submodule	2023-09-28 10:28:36 -07:00
Kevin Xie	496456efec	Update submodule	2023-09-28 10:00:48 -07:00
Kaiyu Xie	7736d528a1	Add 3rd party dependency	2023-09-20 00:50:59 -07:00

42 Commits