Name                     Last commit date            Last commit message
baichuan                 2025-02-25 21:21:49 +08:00  Update TensorRT-LLM (#2820)
bert                     2024-11-26 16:51:34 +08:00  Update TensorRT-LLM (#2502)
bloom                    2024-08-20 18:55:15 +08:00  Update TensorRT-LLM
chatglm                  2025-02-25 21:21:49 +08:00  Update TensorRT-LLM (#2820)
clip                     2025-03-23 16:39:35 +08:00  Update (#2978)
cogvlm                   2024-12-11 00:31:05 -08:00  Update TensorRT-LLM (#2562)
commandr                 2024-12-11 00:31:05 -08:00  Update TensorRT-LLM (#2562)
dbrx                     2024-06-18 18:18:23 +08:00  Update TensorRT-LLM (#1793)
deepseek_v1              2025-02-11 03:01:00 +00:00  Update TensorRT-LLM (#2755)
deepseek_v2              2025-02-13 18:40:22 +08:00  Update TensorRT-LLM (#2783)
dit                      2024-09-10 18:21:22 +08:00  Update TensorRT-LLM (#2215)
eagle                    2025-02-13 18:40:22 +08:00  Update TensorRT-LLM (#2783)
enc_dec                  2025-04-05 13:44:28 +08:00  chore: remove usernames from comments (#3291)
falcon                   2024-12-11 00:31:05 -08:00  Update TensorRT-LLM (#2562)
gemma                    2025-04-10 12:34:58 +08:00  feat: Add Gemma3 text-only model support (#3247)
gpt                      2025-04-01 14:52:09 +08:00  Support prequantized fp8 ckpt for nemotron-mini-4b-instruct (#3046)
gptj                     2024-12-11 00:31:05 -08:00  Update TensorRT-LLM (#2562)
gptneox                  2024-07-04 14:37:19 +08:00  Update TensorRT-LLM (#1891)
grok                     2024-12-11 00:31:05 -08:00  Update TensorRT-LLM (#2562)
llama                    2025-04-11 21:25:29 +08:00  feat: Add NVFP4 UB pattern optimization pass in torch compile (#3371)
mamba                    2025-02-11 03:01:00 +00:00  Update TensorRT-LLM (#2755)
medusa                   2025-02-13 18:40:22 +08:00  Update TensorRT-LLM (#2783)
mllama                   2025-04-05 13:44:28 +08:00  chore: remove usernames from comments (#3291)
mmdit_sd3                2025-03-04 18:44:00 +08:00  Update TensorRT-LLM (#2849)
mpt                      2024-06-11 16:59:02 +08:00  Update TensorRT-LLM (#1763)
multimodal_encoders      2025-03-23 16:39:35 +08:00  Update (#2978)
nemotron_nas             2025-02-25 21:21:49 +08:00  Update TensorRT-LLM (#2820)
opt                      2025-03-29 22:31:24 +08:00  Add initial EAGLE-3 implementation (#3035)
phi                      2025-03-23 16:39:35 +08:00  Update (#2978)
phi3                     2025-04-02 08:34:39 +08:00  Add support for Phi-4-mini (#2990)
qwen                     2025-03-04 18:44:00 +08:00  Update TensorRT-LLM (#2849)
recurrentgemma           2025-02-11 03:01:00 +00:00  Update TensorRT-LLM (#2755)
redrafter                2025-04-08 07:49:32 +08:00  fix: redrafter sampling (#3278)
stdit                    2025-03-11 21:13:42 +08:00  Update TensorRT-LLM (#2873)
unet                     2025-04-05 13:44:28 +08:00  chore: remove usernames from comments (#3291)
__init__.py              2025-04-10 12:34:58 +08:00  feat: Add Gemma3 text-only model support (#3247)
automodel.py             2025-02-13 18:40:22 +08:00  Update TensorRT-LLM (#2783)
convert_utils.py         2024-12-04 21:16:56 +08:00  Update TensorRT-LLM (#2532)
generation_mixin.py      2025-04-05 13:44:28 +08:00  chore: remove usernames from comments (#3291)
model_weights_loader.py  2025-04-02 08:34:39 +08:00  Add support for Phi-4-mini (#2990)
modeling_utils.py        2025-04-10 12:34:58 +08:00  feat: Add Gemma3 text-only model support (#3247)