TensorRT-LLMs/tensorrt_llm/_torch/models
Venky b3146d095d
[TRTC-122][feat] Eagle3 Specdec UX improvements (#10124)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2026-01-22 07:24:11 -08:00
checkpoints [#8241][feat] Support model_kwargs for pytorch backend (#10351) 2026-01-21 20:51:38 -08:00
__init__.py [None][feat] MiniMax M2 support (#10532) 2026-01-14 17:38:58 +08:00
modeling_auto.py [TRTC-122][feat] Eagle3 Specdec UX improvements (#10124) 2026-01-22 07:24:11 -08:00
modeling_bert.py
modeling_clip.py
modeling_deepseekv3.py [https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578) 2026-01-14 09:36:35 +08:00
modeling_exaone4.py [https://nvbugs/5569713][fix] Disable fp8 deep gemm for EXAONE-4.0-32B-FP8 (#8429) 2025-11-20 12:43:13 -05:00
modeling_exaone_moe.py [None][feat] K-EXAONE MTP support (#10796) 2026-01-22 13:43:00 +09:00
modeling_gemma3.py
modeling_gemma3vl.py [None][fix] Multimodal InputProcessor dummy builder fix (#8916) 2025-11-19 22:32:21 -08:00
modeling_glm.py [None][feat] GLM-4.5-Air support (#10653) 2026-01-22 11:42:09 +08:00
modeling_gpt_oss.py [None][feat] Support nvfp4 for gptoss (#8956) 2026-01-04 08:57:44 -05:00
modeling_hunyuan_dense.py
modeling_hunyuan_moe.py [TRTLLM-8958][feat] and [TRTLLM-8960]: create ConfigurableMoE and support TRTLLMGenFusedMoE as backend (#9486) 2025-12-01 08:37:07 +08:00
modeling_hyperclovax.py [None][fix] Multimodal InputProcessor dummy builder fix (#8916) 2025-11-19 22:32:21 -08:00
modeling_llama_min_latency.py [TRTLLM-9293][feat] Enable partial weight loading to support streaming update weights (#9224) 2025-11-26 10:59:06 +08:00
modeling_llama.py [None][fix] default disable gemm+allreduce fusion (#10656) 2026-01-20 12:31:17 +08:00
modeling_llava_next.py [TRTLLM-9409][feat] Pass MRoPE tensors for EPD disagg (#9758) 2025-12-22 06:32:49 -05:00
modeling_minimaxm2.py [None][feat] MiniMax M2 support (#10532) 2026-01-14 17:38:58 +08:00
modeling_mistral_large3.py [None][feat] Support Mistral Large3 LLM part (#9820) 2025-12-13 11:44:27 +08:00
modeling_mistral.py [None][fix] Mistral large 3 few code refine (#10405) 2026-01-08 06:38:49 -05:00
modeling_mixtral.py
modeling_mllama.py
modeling_multimodal_encoder.py
modeling_multimodal_utils.py [TRTLLM-8310][feat] Add Qwen3-VL-MoE (#9689) 2025-12-15 20:05:20 -08:00
modeling_nemotron_h.py [TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347) 2026-01-13 17:13:55 +08:00
modeling_nemotron_nano.py [None][fix] Multimodal InputProcessor dummy builder fix (#8916) 2025-11-19 22:32:21 -08:00
modeling_nemotron_nas.py [None][feat] add specdec to nemotron nas (#8985) 2025-11-19 19:28:35 +01:00
modeling_nemotron.py
modeling_phi3.py
modeling_phi4mm.py [None][fix] Multimodal InputProcessor dummy builder fix (#8916) 2025-11-19 22:32:21 -08:00
modeling_pixtral.py
modeling_qwen2vl.py [None][fix] Enable AttentionDP on Qwen3-VL and fix test (#10435) 2026-01-10 00:13:26 +09:00
modeling_qwen3_moe.py [TRTLLM-9992][perf] Enable PDL for CuteDSL kernels and overlap MoeOutputMemset (#10043) 2025-12-20 03:12:41 -05:00
modeling_qwen3_next.py [None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562) 2026-01-13 19:17:03 +08:00
modeling_qwen3.py [#4745][fix] Pass lora_params through Qwen2/3 model forward (#10174) 2026-01-07 15:30:17 +08:00
modeling_qwen3vl_moe.py [TRTLLM-8310][feat] Add Qwen3-VL-MoE (#9689) 2025-12-15 20:05:20 -08:00
modeling_qwen3vl.py [None][fix] Enable AttentionDP on Qwen3-VL and fix test (#10435) 2026-01-10 00:13:26 +09:00
modeling_qwen_moe.py
modeling_qwen.py [#4745][fix] Pass lora_params through Qwen2/3 model forward (#10174) 2026-01-07 15:30:17 +08:00
modeling_radio.py
modeling_seedoss.py
modeling_siglip.py
modeling_speculative.py [TRTC-122][feat] Eagle3 Specdec UX improvements (#10124) 2026-01-22 07:24:11 -08:00
modeling_starcoder2.py [TRTLLM-7967][feat] Adding Starcoder2 PyTorch Backend Support (#8923) 2025-11-24 11:23:22 -08:00
modeling_utils.py [https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578) 2026-01-14 09:36:35 +08:00
modeling_vila.py [None][fix] Multimodal InputProcessor dummy builder fix (#8916) 2025-11-19 22:32:21 -08:00