mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2026-06-30 08:10:20 +00:00
ad51c0a720
* vulkan: remove the need for the dryrun

  Allocate pipelines and descriptor sets when requested.

  Reallocate the prealloc buffers when needed, and flush any pending work before reallocating.

  For rms_partials and total_mul_mat_bytes, use the sizes computed the last time the graph was executed.

* remove dryrun parameters
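
The "reallocate when needed, flush before reallocating" idea can be illustrated with a minimal C++ sketch. This is not the ggml-vulkan code: `ScratchBuffer`, `Context`, `ensure_prealloc`, `record_op`, and the counters are hypothetical names invented for illustration, and a plain `std::vector` stands in for a device buffer. It only shows the ordering constraint: a buffer is grown on demand instead of being sized by an up-front dry run, and any queued work is flushed first so nothing still refers to the old allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device-side scratch buffer (not a real ggml-vulkan type).
struct ScratchBuffer {
    std::vector<unsigned char> storage;
    std::size_t size() const { return storage.size(); }
};

// Hypothetical context: tracks queued work and a lazily sized scratch buffer.
struct Context {
    ScratchBuffer prealloc;
    int pending_submissions = 0;  // operations recorded but not yet flushed
    int flush_count = 0;          // how many times queued work was flushed
    int realloc_count = 0;        // how many times the buffer was (re)allocated

    // Pretend to submit and wait for all queued work.
    void flush() {
        if (pending_submissions > 0) {
            pending_submissions = 0;
            ++flush_count;
        }
    }

    // Ensure the scratch buffer holds at least `needed` bytes.
    // It grows only when too small, and flushes first so that no
    // queued work still refers to the old allocation.
    void ensure_prealloc(std::size_t needed) {
        if (prealloc.size() >= needed) {
            return;
        }
        flush();
        prealloc.storage.assign(needed, 0);
        ++realloc_count;
    }

    // Record one operation that needs `scratch_bytes` of scratch space.
    void record_op(std::size_t scratch_bytes) {
        ensure_prealloc(scratch_bytes);
        ++pending_submissions;
    }
};
```

In this sketch, recording operations that fit in the current buffer costs nothing extra; only an operation that needs more space triggers a flush followed by a reallocation.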