mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2026-06-26 14:20:21 +00:00
ef22b3e4ac
* docs: fix metrics endpoint description in server README

Required model query parameter for router mode described.

Removed metrics:
- llamacpp:kv_cache_usage_ratio
- llamacpp:kv_cache_tokens

Added metrics:
- llamacpp:prompt_seconds_total
- llamacpp:tokens_predicted_seconds_total
- llamacpp:n_decode_total
- llamacpp:n_busy_slots_per_decode

* server: fix metrics type for n_busy_slots_per_decode metric
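
A minimal sketch of how a client might account for this change, assuming the server exposes its metrics at `/metrics` in the Prometheus text format and that router mode takes the model name as a `model` query parameter. The base URL, the model name `my-model`, the helper names, and the sample values are all hypothetical; only the metric names come from the commit message.

```python
from urllib.parse import urlencode

# Metric names copied from the commit message.
ADDED_METRICS = (
    "llamacpp:prompt_seconds_total",
    "llamacpp:tokens_predicted_seconds_total",
    "llamacpp:n_decode_total",
    "llamacpp:n_busy_slots_per_decode",
)
REMOVED_METRICS = (
    "llamacpp:kv_cache_usage_ratio",
    "llamacpp:kv_cache_tokens",
)


def metrics_url(base_url, model=None):
    """Build the metrics URL (hypothetical helper).

    In router mode the model is passed as a `model` query parameter;
    the exact value format is an assumption here.
    """
    url = base_url.rstrip("/") + "/metrics"
    if model is not None:
        url += "?" + urlencode({"model": model})
    return url


def parse_metrics(text):
    """Parse un-labelled Prometheus text-format samples into {name: float}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (# HELP, # TYPE, ...).
        if not line or line.startswith("#"):
            continue
        name, _, rest = line.partition(" ")
        # The first field after the name is the value; a timestamp may follow.
        samples[name] = float(rest.split()[0])
    return samples


# Illustrative sample only: these values are made up, not real server output.
SAMPLE = """\
# illustrative sample, not real server output
llamacpp:prompt_seconds_total 1.5
llamacpp:tokens_predicted_seconds_total 4.25
llamacpp:n_decode_total 12
llamacpp:n_busy_slots_per_decode 0.75
"""

if __name__ == "__main__":
    print(metrics_url("http://localhost:8080", model="my-model"))
    parsed = parse_metrics(SAMPLE)
    for name in ADDED_METRICS:
        print(name, parsed[name])
    # Dashboards that still reference removed metrics would find nothing.
    print([name for name in REMOVED_METRICS if name in parsed])
```

The parser deliberately ignores label syntax and `# TYPE` lines, so it does not depend on which metric type the second commit assigns to `llamacpp:n_busy_slots_per_decode`.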