Multimodal Feature Support Matrix (PyTorch Backend)

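| Model             | CUDA Graph | Encoder IFB | KV Cache Reuse | Chunked Prefill |
|-------------------|------------|-------------|----------------|-----------------|
| Gemma 3           | Yes        | Yes         | N/A            | N/A             |
| HyperCLOVA        | Yes        | Yes         | No             | No              |
| VILA              | Yes        | No          | No             | No              |
| LLaVA-NeXT        | Yes        | Yes         | Yes            | Yes             |
| Llama 4           | Yes        | Yes         | No             | No              |
| Mistral-Small-3.1 | Yes        | Yes         | No             | No              |
| Phi-4-multimodal  | Yes        | Yes         | No             | No              |
| Qwen2-VL          | Yes        | Yes         | Yes            | Yes             |
| Qwen2.5-VL        | Yes        | Yes         | Yes            | Yes             |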
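
The matrix above can also be encoded as a small lookup table, for example so that a launch script can check a feature before enabling it. The following is a minimal sketch only: the names `FEATURES`, `SUPPORT_MATRIX`, `is_supported`, and `models_supporting` are hypothetical helpers invented for illustration and are not part of any library API. The values are copied from the matrix, with `None` standing in for "N/A".

```python
from typing import Dict, List, Optional, Tuple

# Feature columns, in the same order as the matrix above.
FEATURES: Tuple[str, ...] = (
    "CUDA Graph",
    "Encoder IFB",
    "KV Cache Reuse",
    "Chunked Prefill",
)

# Hypothetical encoding: True = "Yes", False = "No", None = "N/A".
SUPPORT_MATRIX: Dict[str, Tuple[Optional[bool], ...]] = {
    "Gemma 3": (True, True, None, None),
    "HyperCLOVA": (True, True, False, False),
    "VILA": (True, False, False, False),
    "LLaVA-NeXT": (True, True, True, True),
    "Llama 4": (True, True, False, False),
    "Mistral-Small-3.1": (True, True, False, False),
    "Phi-4-multimodal": (True, True, False, False),
    "Qwen2-VL": (True, True, True, True),
    "Qwen2.5-VL": (True, True, True, True),
}


def is_supported(model: str, feature: str) -> Optional[bool]:
    """Return True/False for Yes/No, or None where the matrix says N/A."""
    row = SUPPORT_MATRIX[model]
    return row[FEATURES.index(feature)]


def models_supporting(feature: str) -> List[str]:
    """List the models whose entry for `feature` is an explicit Yes."""
    idx = FEATURES.index(feature)
    return [name for name, row in SUPPORT_MATRIX.items() if row[idx] is True]


if __name__ == "__main__":
    print(is_supported("VILA", "Encoder IFB"))
    print(models_supporting("Chunked Prefill"))
```

Keeping "N/A" as `None` rather than folding it into `False` preserves the distinction the matrix draws between a feature that is unsupported and one that does not apply to the model.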