TensorRT-LLM/tensorrt_llm/models
Latest commit: 7568deb2f1 by Yan Chunwei, 2025-07-16 16:05:38 +08:00
[nvbug/5387226] chore: add propogation for trust_remote_code to AutoConfig (#6001)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
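The commit subject refers to propagation of trust_remote_code, the Hugging Face flag that allows executing model code bundled with a checkpoint, down to the AutoConfig lookup handled by automodel.py. As a rough, hedged sketch of what such forwarding usually looks like (the helper load_hf_config below is invented for illustration; the only API assumed is transformers.AutoConfig.from_pretrained and its trust_remote_code keyword from the public transformers library, not this repository's own wrapper):

```python
# Illustration only: forwarding trust_remote_code when resolving a Hugging Face
# config. load_hf_config is an invented helper name; the only real API assumed
# here is transformers.AutoConfig.from_pretrained and its trust_remote_code kwarg.
from transformers import AutoConfig


def load_hf_config(model_dir: str, trust_remote_code: bool = False):
    """Resolve a HF config, passing trust_remote_code through instead of dropping it."""
    return AutoConfig.from_pretrained(
        model_dir,
        trust_remote_code=trust_remote_code,
    )


if __name__ == "__main__":
    # Checkpoints that ship custom config/model classes only load when the flag is set.
    cfg = load_hf_config("Qwen/Qwen2-0.5B", trust_remote_code=True)
    print(cfg.model_type)
```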
Name                      Latest commit (date)
baichuan
bert
bloom
chatglm
clip
cogvlm
commandr
dbrx
deepseek_v1
deepseek_v2
dit
eagle                     [refactor] Simplification of Speculative decoding configs (#5639) (2025-07-10 11:37:30 -04:00)
enc_dec
falcon
gemma                     [5305318] fix: Fix the accuracy issue when reduce_fusion is enabled for GEMMA model. (#5801) (2025-07-08 19:51:05 +08:00)
gpt                       feat: Add support for fp8 rowwise quantization (#4876) (2025-06-14 06:37:48 -07:00)
gptj
gptneox
grok
llama
mamba
medusa                    [refactor] Simplification of Speculative decoding configs (#5639) (2025-07-10 11:37:30 -04:00)
mllama
mmdit_sd3
mpt
multimodal_encoders
nemotron_nas
opt
phi
phi3                      fix: Unable to load phi4-model with tp_size>1 (#5962) (2025-07-16 11:39:41 +08:00)
qwen                      [feat] Add TensorRT-Engine Qwen3 (dense) model support (#5650) (2025-07-10 10:26:06 +08:00)
recurrentgemma
redrafter                 [refactor] Simplification of Speculative decoding configs (#5639) (2025-07-10 11:37:30 -04:00)
stdit
unet
__init__.py               [feat] Add TensorRT-Engine Qwen3 (dense) model support (#5650) (2025-07-10 10:26:06 +08:00)
automodel.py              [nvbug/5387226] chore: add propogation for trust_remote_code to AutoConfig (#6001) (2025-07-16 16:05:38 +08:00)
convert_utils.py
generation_mixin.py
model_weights_loader.py
modeling_utils.py         [TRTLLM-6291] feat: Add user-provided speculative decoding support (#5204) (2025-07-07 16:30:43 +02:00)
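Broadly, each subdirectory above holds the TensorRT engine implementation of one model family, while the top-level .py files provide the shared plumbing (automodel.py for the Hugging Face style auto classes, convert_utils.py and model_weights_loader.py for weight conversion and loading, modeling_utils.py for the common pretrained-model machinery). As a hedged usage sketch of how these classes are typically driven (class and method names follow the conventional TensorRT-LLM checkpoint-conversion flow, but exact signatures vary between releases, so treat them as assumptions rather than the package's definitive API):

```python
# Hedged sketch of the usual TensorRT-LLM checkpoint-conversion flow built on
# the classes exported from tensorrt_llm.models. Assumption: the installed
# release exposes LLaMAForCausalLM.from_hugging_face() and save_checkpoint()
# with roughly these shapes; check the matching examples/ scripts for your version.
from tensorrt_llm.models import LLaMAForCausalLM

hf_model_dir = "meta-llama/Llama-3.1-8B-Instruct"  # any LLaMA-style HF checkpoint
ckpt_dir = "./trtllm_ckpt"                         # TensorRT-LLM checkpoint output

# Load the Hugging Face weights and convert them into TensorRT-LLM's checkpoint format.
model = LLaMAForCausalLM.from_hugging_face(hf_model_dir, dtype="bfloat16")
model.save_checkpoint(ckpt_dir, save_config=True)

# The saved checkpoint directory is then compiled into an engine with the CLI, e.g.:
#   trtllm-build --checkpoint_dir ./trtllm_ckpt --output_dir ./engine
```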