Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-14 06:27:45 +08:00)

# Multimodal Feature Support Matrix (PyTorch Backend)

| Model | CUDA Graph | Encoder IFB | KV Cache Reuse | Chunked Prefill |
|---|---|---|---|---|
| Gemma 3 | Yes | Yes | N/A | N/A |
| HyperCLOVA | Yes | Yes | No | No |
| VILA | Yes | No | No | No |
| LLaVA-NeXT | Yes | Yes | Yes | Yes |
| Llama 4 | Yes | Yes | No | No |
| Mistral-Small-3.1 | Yes | Yes | No | No |
| Phi-4-multimodal | Yes | Yes | Yes | No |
| Qwen2-VL | Yes | Yes | Yes | Yes |
| Qwen2.5-VL | Yes | Yes | Yes | Yes |
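
When choosing a model for a deployment that needs a particular combination of features, it can help to query the matrix programmatically. The following is a minimal sketch that transcribes the table above into a plain Python dictionary. The names `FEATURES`, `SUPPORT_MATRIX`, `supports`, and `models_supporting` are hypothetical helpers introduced for illustration and are not part of the TensorRT-LLM API. The sketch assumes that only an explicit "Yes" counts as supported, so "No" and "N/A" are both treated as unsupported.

```python
# Column order matches the table header above.
FEATURES = ("CUDA Graph", "Encoder IFB", "KV Cache Reuse", "Chunked Prefill")

# Rows transcribed verbatim from the table; "N/A" is kept as its own value.
SUPPORT_MATRIX = {
    "Gemma 3": ("Yes", "Yes", "N/A", "N/A"),
    "HyperCLOVA": ("Yes", "Yes", "No", "No"),
    "VILA": ("Yes", "No", "No", "No"),
    "LLaVA-NeXT": ("Yes", "Yes", "Yes", "Yes"),
    "Llama 4": ("Yes", "Yes", "No", "No"),
    "Mistral-Small-3.1": ("Yes", "Yes", "No", "No"),
    "Phi-4-multimodal": ("Yes", "Yes", "Yes", "No"),
    "Qwen2-VL": ("Yes", "Yes", "Yes", "Yes"),
    "Qwen2.5-VL": ("Yes", "Yes", "Yes", "Yes"),
}


def supports(model: str, feature: str) -> bool:
    """Return True only when the table lists an explicit "Yes" (assumption)."""
    row = SUPPORT_MATRIX[model]
    return row[FEATURES.index(feature)] == "Yes"


def models_supporting(*features: str) -> list[str]:
    """List models, in table order, that support every requested feature."""
    return [
        model
        for model in SUPPORT_MATRIX
        if all(supports(model, feature) for feature in features)
    ]


if __name__ == "__main__":
    # Models that list both KV cache reuse and chunked prefill as supported.
    print(models_supporting("KV Cache Reuse", "Chunked Prefill"))
```

Keeping "N/A" distinct from "No" in the data, and collapsing the two only inside `supports`, leaves room to treat them differently later without re-transcribing the table.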