llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-26 14:20:21 +00:00

Files

Radoslav Gerganov bcf7546160 server : add arg for disabling prompt caching (#18776)

* server : add arg for disabling prompt caching

Disabling prompt caching is useful for clients who are restricted to
sending only OpenAI-compat requests and want deterministic
responses.

* address review comments

* address review comments

2026-01-12 19:21:34 +02:00

batched-bench

tool/ex/tests: consistently free ctx, then model (#18168)

2025-12-22 11:00:37 +01:00

cli

server: update docs for sleeping [no ci] (#18777)

2026-01-12 13:01:24 +01:00

completion

server: update docs for sleeping [no ci] (#18777)

2026-01-12 13:01:24 +01:00

cvector-generator

common : refactor common_sampler + grammar logic changes (#17937)

2025-12-14 10:11:13 +02:00

export-lora

cmake : Do not install tools on iOS targets (#15903)

2025-09-16 09:54:44 +07:00

fit-params

llama-fit-params: free memory target per device (#18679)

2026-01-08 10:07:58 +01:00

gguf-split

cli: new CLI experience (#17824)

2025-12-10 15:28:59 +01:00

imatrix

common : refactor common_sampler + grammar logic changes (#17937)

2025-12-14 10:11:13 +02:00

llama-bench

common: fix return value check for setpriority (#18412)

2025-12-29 11:07:49 +02:00

mtmd

mtmd: Add Gemma3n multimodal support with MobileNetV5 vision encoder (#18256)

2026-01-09 23:42:38 +01:00

perplexity

common : refactor common_sampler + grammar logic changes (#17937)

2025-12-14 10:11:13 +02:00

quantize

quantize: prevent input/output file collision (#18451)

2025-12-31 23:29:03 +08:00

rpc

Install rpc-server when GGML_RPC is ON (#17149)

2025-11-11 10:53:59 +00:00

server

server : add arg for disabling prompt caching (#18776)

2026-01-12 19:21:34 +02:00

tokenize

cmake : Do not install tools on iOS targets (#15903)

2025-09-16 09:54:44 +07:00

tts

common : refactor common_sampler + grammar logic changes (#17937)

2025-12-14 10:11:13 +02:00

CMakeLists.txt

cmake: only build cli when server is enabled (#18670)

2026-01-09 16:43:26 +01:00