TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-06 19:21:52 +08:00

benzh-2025 4c8468c5d3 [None][fix] default disable gemm+allreduce fusion (#10656)	2026-01-20 12:31:17 +08:00
checkpoints	[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)	2026-01-13 17:13:55 +08:00
__init__.py	[None][feat] MiniMax M2 support (#10532)	2026-01-14 17:38:58 +08:00
modeling_auto.py
modeling_bert.py
modeling_clip.py
modeling_deepseekv3.py	[https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578)	2026-01-14 09:36:35 +08:00
modeling_exaone4.py
modeling_exaone_moe.py	[TRTLLM-10195][feat] K-EXAONE support (#10355)	2026-01-12 00:29:51 +09:00
modeling_gemma3.py
modeling_gemma3vl.py
modeling_glm.py	[https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578)	2026-01-14 09:36:35 +08:00
modeling_gpt_oss.py
modeling_hunyuan_dense.py
modeling_hunyuan_moe.py
modeling_hyperclovax.py
modeling_llama_min_latency.py
modeling_llama.py	[None][fix] default disable gemm+allreduce fusion (#10656)	2026-01-20 12:31:17 +08:00
modeling_llava_next.py
modeling_minimaxm2.py	[None][feat] MiniMax M2 support (#10532)	2026-01-14 17:38:58 +08:00
modeling_mistral_large3.py
modeling_mistral.py
modeling_mixtral.py
modeling_mllama.py
modeling_multimodal_encoder.py
modeling_multimodal_utils.py
modeling_nemotron_h.py	[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)	2026-01-13 17:13:55 +08:00
modeling_nemotron_nano.py
modeling_nemotron_nas.py
modeling_nemotron.py
modeling_phi3.py
modeling_phi4mm.py
modeling_pixtral.py
modeling_qwen2vl.py
modeling_qwen3_moe.py
modeling_qwen3_next.py	[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562)	2026-01-13 19:17:03 +08:00
modeling_qwen3.py
modeling_qwen3vl_moe.py
modeling_qwen3vl.py
modeling_qwen_moe.py
modeling_qwen.py
modeling_radio.py
modeling_seedoss.py
modeling_siglip.py
modeling_speculative.py	[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias (#10099)	2026-01-14 21:06:07 -08:00
modeling_starcoder2.py
modeling_utils.py	[https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578)	2026-01-14 09:36:35 +08:00
modeling_vila.py