TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-17 00:04:57 +08:00

Lucas Liebenwein a2fb5afecf [#11032][feat] MLA revisited and GLM 4.7 Flash support (#11324)		2026-02-09 23:26:51 -05:00
attn_backend_triton.yaml	[None][chore] Update AutoDeploy model list (#10505)	2026-01-10 08:47:37 +02:00
compile_backend_torch_cudagraph.yaml
dashboard_default.yaml
demollm_triton.yaml
gemma3_1b.yaml
glm-4.7-flash.yaml	[#11032][feat] MLA revisited and GLM 4.7 Flash support (#11324)	2026-02-09 23:26:51 -05:00
llama3_3_70b.yaml	[#10013][feat] AutoDeploy: native cache manager integration (#10635)	2026-01-27 11:23:22 -05:00
llama4_maverick_lite.yaml
llama4_scout.yaml	[#10013][feat] AutoDeploy: native cache manager integration (#10635)	2026-01-27 11:23:22 -05:00
multimodal.yaml
num_hidden_layers_5.yaml	[None][chore] Update model list (#11364)	2026-02-09 21:27:39 +02:00
openelm.yaml
qwen3_vl.yaml	[None][chore] Update model list (#11364)	2026-02-09 21:27:39 +02:00
simple_shard_only.yaml
world_size_1.yaml
world_size_2.yaml
world_size_4.yaml
world_size_8.yaml