mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2026-07-01 16:50:20 +00:00
cc9e331213
- change `k_copy_src1_to_contiguous` so that it uses a precomputed contiguous mapping, where all rows "owned" by an expert are in one slice with a known start and end
- switch the `O(n_as * n_routed_rows)` contraption to a counting sort-based procedure with `O(n_as + n_routed_rows)` complexity
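
The counting-sort idea can be illustrated with a minimal host-side sketch. This is not the code from the commit: `ExpertGrouping`, `group_rows_by_expert`, and `expert_of_row` are hypothetical names chosen for illustration, and the sketch assumes each routed row carries a single expert id in `[0, n_as)`. It only shows how one histogram pass, one prefix sum, and one scatter pass yield a contiguous slice per expert in `O(n_as + n_routed_rows)` time.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type: routed rows grouped by expert, plus slice bounds.
struct ExpertGrouping {
    std::vector<int32_t> row_order; // routed-row indices, grouped by expert
    std::vector<int32_t> offsets;   // expert e owns row_order[offsets[e] .. offsets[e + 1])
};

// Counting sort keyed on expert id.
// expert_of_row[i] is the expert selected for routed row i (assumed in [0, n_as)).
inline ExpertGrouping group_rows_by_expert(const std::vector<int32_t> & expert_of_row, int32_t n_as) {
    ExpertGrouping g;
    g.offsets.assign(n_as + 1, 0);
    g.row_order.resize(expert_of_row.size());

    // Pass 1: histogram, storing the count for expert e at offsets[e + 1].
    for (int32_t e : expert_of_row) {
        assert(e >= 0 && e < n_as);
        g.offsets[e + 1]++;
    }

    // Pass 2: prefix sum, so offsets[e] becomes the start of expert e's slice.
    for (int32_t e = 0; e < n_as; ++e) {
        g.offsets[e + 1] += g.offsets[e];
    }

    // Pass 3: scatter each row into the next free slot of its expert's slice.
    std::vector<int32_t> cursor(g.offsets.begin(), g.offsets.end() - 1);
    for (int32_t i = 0; i < static_cast<int32_t>(expert_of_row.size()); ++i) {
        g.row_order[cursor[expert_of_row[i]]++] = i;
    }
    return g;
}
```

With such a mapping, a copy routine can handle expert `e` by reading only `row_order[offsets[e] .. offsets[e + 1])` instead of scanning every routed row once per expert, which is where the `n_as * n_routed_rows` factor would otherwise come from.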