kanshan/TensorRT-LLMs
mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-02-06 19:21:52 +08:00)
a4152c80f6
TensorRT-LLMs/tensorrt_llm/_torch/models
benzh-2025
4c8468c5d3
[None][fix] default disable gemm+allreduce fusion (#10656)
2026-01-20 12:31:17 +08:00
..
checkpoints
[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)
2026-01-13 17:13:55 +08:00
__init__.py
[None][feat] MiniMax M2 support (#10532)
2026-01-14 17:38:58 +08:00
modeling_auto.py
modeling_bert.py
modeling_clip.py
modeling_deepseekv3.py
[https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578)
2026-01-14 09:36:35 +08:00
modeling_exaone4.py
modeling_exaone_moe.py
[TRTLLM-10195][feat] K-EXAONE support (#10355)
2026-01-12 00:29:51 +09:00
modeling_gemma3.py
modeling_gemma3vl.py
modeling_glm.py
[https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578)
2026-01-14 09:36:35 +08:00
modeling_gpt_oss.py
modeling_hunyuan_dense.py
modeling_hunyuan_moe.py
modeling_hyperclovax.py
modeling_llama_min_latency.py
modeling_llama.py
[None][fix] default disable gemm+allreduce fusion (#10656)
2026-01-20 12:31:17 +08:00
modeling_llava_next.py
modeling_minimaxm2.py
[None][feat] MiniMax M2 support (#10532)
2026-01-14 17:38:58 +08:00
modeling_mistral_large3.py
modeling_mistral.py
modeling_mixtral.py
modeling_mllama.py
modeling_multimodal_encoder.py
modeling_multimodal_utils.py
modeling_nemotron_h.py
[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)
2026-01-13 17:13:55 +08:00
modeling_nemotron_nano.py
modeling_nemotron_nas.py
modeling_nemotron.py
modeling_phi3.py
modeling_phi4mm.py
modeling_pixtral.py
modeling_qwen2vl.py
modeling_qwen3_moe.py
modeling_qwen3_next.py
[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562)
2026-01-13 19:17:03 +08:00
modeling_qwen3.py
modeling_qwen3vl_moe.py
modeling_qwen3vl.py
modeling_qwen_moe.py
modeling_qwen.py
modeling_radio.py
modeling_seedoss.py
modeling_siglip.py
modeling_speculative.py
[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias (#10099)
2026-01-14 21:06:07 -08:00
modeling_starcoder2.py
modeling_utils.py
[https://nvbugs/5781589][fix] Implement pp skip forward for all spec workers. (#10578)
2026-01-14 09:36:35 +08:00
modeling_vila.py