TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-05 18:51:38 +08:00

Venky b3146d095d [TRTC-122][feat] Eagle3 Specdec UX improvements (#10124) Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>	2026-01-22 07:24:11 -08:00
checkpoints	[#8241][feat] Support model_kwargs for pytorch backend (#10351)	2026-01-21 20:51:38 -08:00
__init__.py	[None][feat] MiniMax M2 support (#10532)	2026-01-14 17:38:58 +08:00
modeling_auto.py	[TRTC-122][feat] Eagle3 Specdec UX improvements (#10124)	2026-01-22 07:24:11 -08:00
modeling_bert.py
modeling_clip.py
modeling_deepseekv3.py	[https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578)	2026-01-14 09:36:35 +08:00
modeling_exaone4.py	[https://nvbugs/5569713][fix] Disable fp8 deep gemm for EXAONE-4.0-32B-FP8 (#8429)	2025-11-20 12:43:13 -05:00
modeling_exaone_moe.py	[None][feat] K-EXAONE MTP support (#10796)	2026-01-22 13:43:00 +09:00
modeling_gemma3.py
modeling_gemma3vl.py	[None][fix] Multimodal InputProcessor dummy builder fix (#8916)	2025-11-19 22:32:21 -08:00
modeling_glm.py	[None][feat] GLM-4.5-Air support (#10653)	2026-01-22 11:42:09 +08:00
modeling_gpt_oss.py	[None][feat] Support nvfp4 for gptoss (#8956)	2026-01-04 08:57:44 -05:00
modeling_hunyuan_dense.py
modeling_hunyuan_moe.py	[TRTLLM-8958][feat] and [TRTLLM-8960]: create ConfigurableMoE and support TRTLLMGenFusedMoE as backend (#9486)	2025-12-01 08:37:07 +08:00
modeling_hyperclovax.py	[None][fix] Multimodal InputProcessor dummy builder fix (#8916)	2025-11-19 22:32:21 -08:00
modeling_llama_min_latency.py	[TRTLLM-9293][feat] Enable partial weight loading to support streaming update weights (#9224)	2025-11-26 10:59:06 +08:00
modeling_llama.py	[None][fix] default disable gemm+allreduce fusion (#10656)	2026-01-20 12:31:17 +08:00
modeling_llava_next.py	[TRTLLM-9409][feat] Pass MRoPE tensors for EPD disagg (#9758)	2025-12-22 06:32:49 -05:00
modeling_minimaxm2.py	[None][feat] MiniMax M2 support (#10532)	2026-01-14 17:38:58 +08:00
modeling_mistral_large3.py	[None][feat] Support Mistral Large3 LLM part (#9820)	2025-12-13 11:44:27 +08:00
modeling_mistral.py	[None][fix] Mistral large 3 few code refine (#10405)	2026-01-08 06:38:49 -05:00
modeling_mixtral.py
modeling_mllama.py
modeling_multimodal_encoder.py
modeling_multimodal_utils.py	[TRTLLM-8310][feat] Add Qwen3-VL-MoE (#9689)	2025-12-15 20:05:20 -08:00
modeling_nemotron_h.py	[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)	2026-01-13 17:13:55 +08:00
modeling_nemotron_nano.py	[None][fix] Multimodal InputProcessor dummy builder fix (#8916)	2025-11-19 22:32:21 -08:00
modeling_nemotron_nas.py	[None][feat] add specdec to nemotron nas (#8985)	2025-11-19 19:28:35 +01:00
modeling_nemotron.py
modeling_phi3.py
modeling_phi4mm.py	[None][fix] Multimodal InputProcessor dummy builder fix (#8916)	2025-11-19 22:32:21 -08:00
modeling_pixtral.py
modeling_qwen2vl.py	[None][fix] Enable AttentionDP on Qwen3-VL and fix test (#10435)	2026-01-10 00:13:26 +09:00
modeling_qwen3_moe.py	[TRTLLM-9992][perf] Enable PDL for CuteDSL kernels and overlap MoeOutputMemset (#10043)	2025-12-20 03:12:41 -05:00
modeling_qwen3_next.py	[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562)	2026-01-13 19:17:03 +08:00
modeling_qwen3.py	[#4745][fix] Pass lora_params through Qwen2/3 model forward (#10174)	2026-01-07 15:30:17 +08:00
modeling_qwen3vl_moe.py	[TRTLLM-8310][feat] Add Qwen3-VL-MoE (#9689)	2025-12-15 20:05:20 -08:00
modeling_qwen3vl.py	[None][fix] Enable AttentionDP on Qwen3-VL and fix test (#10435)	2026-01-10 00:13:26 +09:00
modeling_qwen_moe.py
modeling_qwen.py	[#4745][fix] Pass lora_params through Qwen2/3 model forward (#10174)	2026-01-07 15:30:17 +08:00
modeling_radio.py
modeling_seedoss.py
modeling_siglip.py
modeling_speculative.py	[TRTC-122][feat] Eagle3 Specdec UX improvements (#10124)	2026-01-22 07:24:11 -08:00
modeling_starcoder2.py	[TRTLLM-7967][feat] Adding Starcoder2 PyTorch Backend Support (#8923)	2025-11-24 11:23:22 -08:00
modeling_utils.py	[https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578)	2026-01-14 09:36:35 +08:00
modeling_vila.py	[None][fix] Multimodal InputProcessor dummy builder fix (#8916)	2025-11-19 22:32:21 -08:00