llama.cpp/tools at b9510

kanshan/llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2026-06-28 15:20:20 +00:00

Yongyue Sun 6f3a9f3dee server: avoid unnecessary checkpoint restore when new tokens are present (#24110)

* server: avoid unnecessary checkpoint restore when new tokens are present

The pos_min_thold calculation unconditionally subtracts 1 to ensure at
least one token is evaluated for logits when no new tokens exist.
However, when the request contains new tokens beyond the cached prefix,
this -1 is overly conservative and may trigger an unnecessary checkpoint
restore.

Conditionally apply the -1 only when n_past >= task.n_tokens() (no new
tokens), avoiding redundant KV state restoration when there is actual
work to do.

* cont : add ref

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

2026-06-04 16:09:01 +03:00
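
The threshold change described in the commit message above can be illustrated with a minimal sketch. This is not the code from the server sources: the function name `pos_min_thold_sketch` and the `n_window` parameter are hypothetical, chosen only to show the idea that the `-1` is applied when `n_past >= task.n_tokens()` and skipped otherwise.

```cpp
#include <algorithm>

// Sketch only: the function name and the n_window parameter are hypothetical
// and are not taken from the llama.cpp sources.
//
// n_past        : number of request tokens already covered by the cache
// n_task_tokens : total tokens in the request (task.n_tokens() in the commit text)
// n_window      : assumed look-back distance subtracted from n_past
inline int pos_min_thold_sketch(int n_past, int n_task_tokens, int n_window) {
    // With no new tokens, hold one position back so that at least one token
    // is still evaluated and produces logits.
    const bool no_new_tokens = n_past >= n_task_tokens;
    const int  reserve       = no_new_tokens ? 1 : 0;

    // Clamp at zero so the threshold is never negative.
    return std::max(0, n_past - n_window - reserve);
}
```

Under these assumptions, `n_past = 10` and `n_window = 4` give a threshold of 5 when the request has exactly 10 tokens, and 6 when it has 15 tokens, since the extra position is only reserved when there is nothing new to evaluate.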

cmake : add install() for impl libraries + fix apple builds (#23511)

2026-05-22 11:46:26 +03:00

app: add llama update self updater (#23865)

2026-05-29 23:02:40 +02:00

common : fix state save in common_prompt_batch_decode (#23468)

2026-06-02 15:44:15 +02:00

cvector-generator

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

cmake : add install() for impl libraries + fix apple builds (#23511)

2026-05-22 11:46:26 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

Support -fa auto in llama-bench (#23714)

2026-05-31 02:03:57 +05:30

fix(mtmd): handle Gemma 4 audio projector embedding size (#24091)

2026-06-04 11:51:23 +02:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

perplexity : fix format specifier in LOG_ERR (#23788)

2026-05-28 10:34:58 +03:00

cmake : add install() for impl libraries + fix apple builds (#23511)

2026-05-22 11:46:26 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

fix: rpc-server cache may not work in Windows environments (#22394)

2026-04-27 17:25:09 +03:00

server: avoid unnecessary checkpoint restore when new tokens are present (#24110)

2026-06-04 16:09:01 +03:00

libs : rename libcommon -> libllama-common (#21936)

2026-04-17 11:11:46 +03:00

logs : reduce (#23021)

2026-05-14 13:05:52 +03:00

webui: fix tool selector toggle/counter, key tools by stable identity (#24065)

2026-06-04 13:09:49 +02:00

CMakeLists.txt

cmake: skip cvector-generator and export-lora when CPU backend is disabled (#24053)

2026-06-04 13:13:19 +03:00