Files
llama.cpp/ggml
Masashi Yoshimura 7c908502ea ggml-webgpu: improve MTP inference by using mat-vec path for small batches (#24811)
* ggml-webgpu: improve small batches decoding

* Add barrier to the NUM_COLS loop in mul-mat-vec
2026-06-23 17:13:55 +09:00
..
2024-07-13 18:12:39 +02:00