mirror of https://github.com/ggml-org/llama.cpp.git, synced 2026-07-02 01:00:20 +00:00
0f7fada56b
* cuda: reset device in get_memory function if no backend is active
* also count device and host buffers
* exclude hip and musa from counting and device reset
* use device mutex instead of atomic
* undo backend_free function move