mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2026-06-30 08:10:20 +00:00
9c96465f99
* opencl: add `copy_to_contiguous` and utilize mm kernels * opencl: only copy to cont for f32 and f16 tensors * opencl: use cont mm for fallback when dst is large * opencl: use nb local to copy-to-cont * opencl: use local offset as well