TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-13 22:18:36 +08:00

Author	SHA1	Message	Date
Chang Liu	31bc14b350	[TRTLLM-9654][feat] Support DeepSeek-V32 chat template (#9814 ) Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>	2025-12-19 17:05:38 +08:00
Mike Iovine	69b4e52757	[None][chore] Update linter rules for mass integration Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com> Signed-off-by: Mike Iovine <miovine@nvidia.com>	2025-11-20 12:43:13 -05:00
Tailing Yuan	cc4c980e03	[None][feat] Add Qwen3-Next to layer-wise benchmarks (#9065 ) Signed-off-by: Tailing Yuan <yuantailing@gmail.com>	2025-11-14 10:03:00 +08:00
chenfeiz0326	cc4ab8d9d1	[TRTLLM-8825][feat] Support Pytest Perf Results uploading to Database (#8653 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-11-03 16:23:13 +08:00
Robin Kobus	1b3ad7259d	[None][feat] Use ruff for formatting and linting new files by default (#8629 ) Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>	2025-11-01 16:11:40 +01:00
mpikulski	97ce0ecefe	[TRTLLM-8436][feat] batched sampling and top-k logprobs improvements (#8398 ) Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>	2025-10-20 11:15:41 +02:00
mpikulski	8298e93bd8	[TRTLLM-8414][chore] BREAKING CHANGE: refine sampling strategy selection (#8132 ) Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>	2025-10-08 15:46:50 +02:00
William Zhang	92576488d3	[None][feat] Skip prefetching consolidated safetensors when appropriate (#7013 ) * Why? Some models (e.g. anything produced by Mistral) can have both sharded safetensors and a consolidated safetensor in the same checkpoint directory. In such cases, prefetching both to memory is a waste of time, and memory. * What? This commit skips over consolidated safetensors when they are not the only safetensor file present in the checkpoint directory Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2025-08-25 23:56:21 -04:00
Jiagan Cheng	afb116f703	[None][fix] Fix python-only build that uses TRTLLM_USE_PRECOMPILED (#6825 ) Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>	2025-08-14 23:26:35 +08:00
Sergey Klevtsov	27fc35175e	[None][feat] CUTLASS MoE FC2+Finalize fusion (#3294 ) Signed-off-by: Sergey Klevtsov <sklevtsov@nvidia.com>	2025-08-12 15:56:48 +08:00
2ez4bz	87fe44fd29	feat(models): Mistral3.1 VLM pytorch backend support (#5529 ) Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2025-07-09 13:17:40 -07:00
Daniel Stokes	942841417e	opensource: Opensource MOE MXFP8-MXFP4 implementation (#5222 ) Signed-off-by: Daniel Stokes <40156487+djns99@users.noreply.github.com>	2025-06-26 12:18:19 +08:00
2ez4bz	dc52b67492	linting(python): Enable ruff on more files (wave 1/N) (#5140 ) Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2025-06-14 19:19:34 +08:00
QI JUN	d51ae53940	move the reset models into `examples/models/core` directory (#3555 ) * move rest models to examples/models/core directory * update multimodal readme * fix example path * fix ci * fix cpp test * fix tensorrt test Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-04-19 20:48:59 -07:00
Kaiyu Xie	3aa6b11d13	Update TensorRT-LLM (#2936 ) * Update TensorRT-LLM --------- Co-authored-by: changcui <cuichang147@gmail.com>	2025-03-18 21:25:19 +08:00
Kaiyu Xie	ab5b19e027	Update TensorRT-LLM (#2820 )	2025-02-25 21:21:49 +08:00
Dan Blanaru	16d2467ea8	Update TensorRT-LLM (#2755 ) * Update TensorRT-LLM --------- Co-authored-by: Denis Kayshev <topenkoff@gmail.com> Co-authored-by: akhoroshev <arthoroshev@gmail.com> Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com> Update	2025-02-11 03:01:00 +00:00

17 Commits