llama.cpp/tools at e93666076038c0bd26397feed6cfb8a6c6d04f74

kanshan/llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2026-06-26 14:20:21 +00:00

willjoha ef22b3e4ac docs: fix metrics endpoint description in server README (#22879)

* docs: fix metrics endpoint description in server README

Described the required model query parameter for router mode.

Removed metrics:
- llamacpp:kv_cache_usage_ratio
- llamacpp:kv_cache_tokens

Added metrics:
- llamacpp:prompt_seconds_total
- llamacpp:tokens_predicted_seconds_total
- llamacpp:n_decode_total
- llamacpp:n_busy_slots_per_decode

* server: fix metrics type for n_busy_slots_per_decode metric

2026-05-11 18:32:26 +02:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

spec : parallel drafting support (#22838)

2026-05-11 19:09:43 +03:00

docs : update speculative decoding parameters after refactor (#22397) (#22539)

2026-05-04 08:52:07 +03:00

cvector-generator

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

fit-params : refactor + add option to output estimated memory per device (#22171)

2026-04-21 09:54:36 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

spec : refactor params (#22397)

2026-04-28 09:07:33 +03:00

mtmd: fix whisper audio tail truncation by exposing padded buffer to FFT (#22770)

2026-05-07 14:01:01 +02:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

fit-params : refactor + add option to output estimated memory per device (#22171)

2026-04-21 09:54:36 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

fix: rpc-server cache may not work in Windows environments (#22394)

2026-04-27 17:25:09 +03:00

docs: fix metrics endpoint description in server README (#22879)

2026-05-11 18:32:26 +02:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

CMakeLists.txt

llama: end-to-end tests (#19802)

2026-03-08 12:30:21 +01:00