Files
llama.cpp/ggml
Jeff Bolz c74759a244 vulkan: Use cm2 decode_vector for mul_mat_id B matrix loads (#23991)
This allows vec4 loads of the B elements. Also increase BK to 64 when this is
enabled. Neither of these alone is consistently faster, but together these give
a nice speedup.

In ggml-vulkan.cpp, we need to make sure the B matrix alignment and stride are
multiples of 4.
2026-06-08 10:40:37 +02:00
..
2024-07-13 18:12:39 +02:00