Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>

2025-09-14 20:10:10 -07:00

Multimodal Feature Support Matrix (PyTorch Backend)

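| Model | CUDA Graph | Encoder IFB | KV Cache Reuse | Chunked Prefill |
| :--- | :--- | :--- | :--- | :--- |
| Gemma 3 | Yes | Yes | N/A | N/A |
| HyperCLOVA | Yes | Yes | No | No |
| VILA | Yes | No | No | No |
| LLaVA-NeXT | Yes | Yes | Yes | Yes |
| Llama 4 | Yes | Yes | No | No |
| Mistral-Small-3.1 | Yes | Yes | No | No |
| Phi-4-multimodal | Yes | Yes | No | No |
| Qwen2-VL | Yes | Yes | Yes | Yes |
| Qwen2.5-VL | Yes | Yes | Yes | Yes |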
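
To check support from a script rather than by reading the table, the matrix can be transcribed into a small lookup. The following is a minimal sketch, not part of TensorRT-LLM: `SUPPORT_MATRIX`, `supports`, and `models_supporting` are hypothetical names introduced here for illustration, the feature keys are assumed spellings of the column headers, and `None` stands for the "N/A" entries.

```python
from typing import List, Optional

# Column order of the matrix above (hypothetical key names, not library identifiers).
FEATURES = ("cuda_graph", "encoder_ifb", "kv_cache_reuse", "chunked_prefill")

# Values transcribed from the matrix: True = Yes, False = No, None = N/A.
SUPPORT_MATRIX = {
    "Gemma 3": (True, True, None, None),
    "HyperCLOVA": (True, True, False, False),
    "VILA": (True, False, False, False),
    "LLaVA-NeXT": (True, True, True, True),
    "Llama 4": (True, True, False, False),
    "Mistral-Small-3.1": (True, True, False, False),
    "Phi-4-multimodal": (True, True, False, False),
    "Qwen2-VL": (True, True, True, True),
    "Qwen2.5-VL": (True, True, True, True),
}


def supports(model: str, feature: str) -> Optional[bool]:
    """Return True/False for Yes/No, or None when the matrix says N/A."""
    row = SUPPORT_MATRIX[model]
    return row[FEATURES.index(feature)]


def models_supporting(feature: str) -> List[str]:
    """List models whose entry for `feature` is an explicit Yes."""
    return [m for m in SUPPORT_MATRIX if supports(m, feature) is True]


if __name__ == "__main__":
    print(models_supporting("kv_cache_reuse"))
    print(supports("Gemma 3", "chunked_prefill"))
```

Keeping "N/A" as `None` rather than `False` preserves the distinction the matrix draws between a feature that is unsupported and one that does not apply to the model.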