TensorRT-LLM/tensorrt_llm/models
Latest commit: 7568deb2f1 by Yan Chunwei, 2025-07-16 16:05:38 +08:00
[nvbug/5387226] chore: add propogation for trust_remote_code to AutoConfig (#6001)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
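The commit subject refers to propagation of trust_remote_code, the Hugging Face flag that allows executing model code bundled with a checkpoint, down to the AutoConfig lookup handled by automodel.py. As a rough, hedged sketch of what such forwarding usually looks like (the helper load_hf_config below is invented for illustration; the only API assumed is transformers.AutoConfig.from_pretrained and its trust_remote_code keyword from the public transformers library, not this repository's own wrapper):

```python
# Illustration only: forwarding trust_remote_code when resolving a Hugging Face
# config. load_hf_config is an invented helper name; the only real API assumed
# here is transformers.AutoConfig.from_pretrained and its trust_remote_code kwarg.
from transformers import AutoConfig


def load_hf_config(model_dir: str, trust_remote_code: bool = False):
    """Resolve a HF config, passing trust_remote_code through instead of dropping it."""
    return AutoConfig.from_pretrained(
        model_dir,
        trust_remote_code=trust_remote_code,
    )


if __name__ == "__main__":
    # Checkpoints that ship custom config/model classes only load when the flag is set.
    cfg = load_hf_config("Qwen/Qwen2-0.5B", trust_remote_code=True)
    print(cfg.model_type)
```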
Name                      Latest commit (date)
baichuan
bert
bloom
chatglm
clip
cogvlm
commandr
dbrx
deepseek_v1
deepseek_v2
dit
eagle                     [refactor] Simplification of Speculative decoding configs (#5639) (2025-07-10 11:37:30 -04:00)
enc_dec
falcon
gemma                     [5305318] fix: Fix the accuracy issue when reduce_fusion is enabled for GEMMA model. (#5801) (2025-07-08 19:51:05 +08:00)
gpt                       feat: Add support for fp8 rowwise quantization (#4876) (2025-06-14 06:37:48 -07:00)
gptj
gptneox
grok
llama
mamba
medusa                    [refactor] Simplification of Speculative decoding configs (#5639) (2025-07-10 11:37:30 -04:00)
mllama
mmdit_sd3
mpt
multimodal_encoders
nemotron_nas
opt
phi
phi3                      fix: Unable to load phi4-model with tp_size>1 (#5962) (2025-07-16 11:39:41 +08:00)
qwen                      [feat] Add TensorRT-Engine Qwen3 (dense) model support (#5650) (2025-07-10 10:26:06 +08:00)
recurrentgemma
redrafter                 [refactor] Simplification of Speculative decoding configs (#5639) (2025-07-10 11:37:30 -04:00)
stdit
unet
__init__.py               [feat] Add TensorRT-Engine Qwen3 (dense) model support (#5650) (2025-07-10 10:26:06 +08:00)
automodel.py              [nvbug/5387226] chore: add propogation for trust_remote_code to AutoConfig (#6001) (2025-07-16 16:05:38 +08:00)
convert_utils.py
generation_mixin.py
model_weights_loader.py
modeling_utils.py         [TRTLLM-6291] feat: Add user-provided speculative decoding support (#5204) (2025-07-07 16:30:43 +02:00)
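Broadly, each subdirectory above holds the TensorRT engine implementation of one model family, while the top-level .py files provide the shared plumbing (automodel.py for the Hugging Face style auto classes, convert_utils.py and model_weights_loader.py for weight conversion and loading, modeling_utils.py for the common pretrained-model machinery). As a hedged usage sketch of how these classes are typically driven (class and method names follow the conventional TensorRT-LLM checkpoint-conversion flow, but exact signatures vary between releases, so treat them as assumptions rather than the package's definitive API):

```python
# Hedged sketch of the usual TensorRT-LLM checkpoint-conversion flow built on
# the classes exported from tensorrt_llm.models. Assumption: the installed
# release exposes LLaMAForCausalLM.from_hugging_face() and save_checkpoint()
# with roughly these shapes; check the matching examples/ scripts for your version.
from tensorrt_llm.models import LLaMAForCausalLM

hf_model_dir = "meta-llama/Llama-3.1-8B-Instruct"  # any LLaMA-style HF checkpoint
ckpt_dir = "./trtllm_ckpt"                         # TensorRT-LLM checkpoint output

# Load the Hugging Face weights and convert them into TensorRT-LLM's checkpoint format.
model = LLaMAForCausalLM.from_hugging_face(hf_model_dir, dtype="bfloat16")
model.save_checkpoint(ckpt_dir, save_config=True)

# The saved checkpoint directory is then compiled into an engine with the CLI, e.g.:
#   trtllm-build --checkpoint_dir ./trtllm_ckpt --output_dir ./engine
```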