mirror of https://github.com/ggml-org/llama.cpp.git (synced 2026-06-29 15:50:22 +00:00)
24af22fc36
* ggml : optimize cuda ssm_scan using warp-level reduction
* ggml : apply code review suggestions (style, const, constexpr)
* ggml : add TODO regarding stride consistency