Files
llama.cpp/ggml
Aman Gupta f8f0a47a55 cuda: reserve space for quantize kv-cache at startup (#23907)
* cuda: reserve space for quantize kv-cache at startup

* address review comments

* remove forward decl

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* remove assert in ggml-cuda.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2026-06-03 18:39:59 +08:00
..
2024-07-13 18:12:39 +02:00