llama.cpp/common at b7445

kanshan/llama.cpp

Mirror of https://github.com/ggml-org/llama.cpp.git, synced 2026-06-28 15:20:20 +00:00

TrevorS 4b2a4778f8 arg: allow -kvu flag for llama-perplexity (#18117)

The -kvu (--kv-unified) flag is required for hellaswag and winogrande
benchmarks which use coupled sequences. Without unified KV cache,
these benchmarks fail with:

  split_equal: sequential split is not supported when there are
  coupled sequences in the input batch (you may need to use the -kvu flag)

This change adds LLAMA_EXAMPLE_PERPLEXITY to the allowed examples for
the -kvu argument, enabling its use with llama-perplexity.
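
The gating described above can be illustrated with a minimal, self-contained sketch. It is not the code from arg.cpp: the names `example_id`, `arg_def`, `parse_flag`, and `make_defs` are hypothetical stand-ins, and the assumption that only a server-like tool was allowed before the change is made purely for illustration. The idea shown is that each argument carries a set of tools allowed to use it, and the parser rejects the flag for any tool outside that set.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for the per-tool identifiers (not the real llama.cpp enum).
enum example_id { EXAMPLE_SERVER, EXAMPLE_PERPLEXITY, EXAMPLE_CLI };

// Hypothetical subset of the parsed parameters.
struct params {
    bool kv_unified = false;
};

// One argument definition: its spellings, the tools allowed to use it, and a handler.
struct arg_def {
    std::vector<std::string> names;
    std::set<example_id>     examples;
    void (*handler)(params &);

    bool allowed_in(example_id ex) const { return examples.count(ex) > 0; }
};

// Returns true only if the flag is known AND allowed for the given tool.
bool parse_flag(const std::vector<arg_def> & defs, example_id ex,
                const std::string & flag, params & out) {
    for (const auto & def : defs) {
        for (const auto & name : def.names) {
            if (name == flag) {
                if (!def.allowed_in(ex)) {
                    return false; // flag exists, but not for this tool
                }
                def.handler(out);
                return true;
            }
        }
    }
    return false; // unknown flag
}

// Builds the argument table; allow_perplexity models the effect of the change.
std::vector<arg_def> make_defs(bool allow_perplexity) {
    std::set<example_id> allowed = { EXAMPLE_SERVER }; // assumed prior state
    if (allow_perplexity) {
        allowed.insert(EXAMPLE_PERPLEXITY);
    }
    return { arg_def{ { "-kvu", "--kv-unified" }, allowed,
                      [](params & p) { p.kv_unified = true; } } };
}

// Small demonstration of the before/after behaviour.
void run_demo() {
    params before;
    assert(!parse_flag(make_defs(false), EXAMPLE_PERPLEXITY, "-kvu", before));
    assert(!before.kv_unified);

    params after;
    assert(parse_flag(make_defs(true), EXAMPLE_PERPLEXITY, "-kvu", after));
    assert(after.kv_unified);
}
```

In this sketch, widening the allowed set is the only difference between the two argument tables, which mirrors the one-line nature of the change described in the commit message.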

2025-12-17 08:33:02 +02:00

| File | Last commit | Date |
| --- | --- | --- |
| arg.cpp | arg: allow -kvu flag for llama-perplexity (#18117) | 2025-12-17 08:33:02 +02:00 |
| arg.h | arg: fix common_params_parse not accepting negated arg (#17991) | 2025-12-13 12:53:37 +01:00 |
| base64.hpp | llava : expose as a shared library for downstream projects (#3613) | 2023-11-07 00:36:23 +03:00 |
| build-info.cpp.in | cmake: Add ability to pass in LLAMA_BUILD_NUMBER/COMMIT (#14167) | 2025-06-13 10:38:52 +02:00 |
| chat-parser-xml-toolcall.cpp | Fix Kimi-K2 tool-call parsing issues (#17376) | 2025-12-08 14:32:04 +01:00 |
| chat-parser-xml-toolcall.h | Fix Kimi-K2 tool-call parsing issues (#17376) | 2025-12-08 14:32:04 +01:00 |
| chat-parser.cpp | Fix Kimi-K2 tool-call parsing issues (#17376) | 2025-12-08 14:32:04 +01:00 |
| chat-parser.h | common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932) | 2025-11-18 18:54:15 +01:00 |
| chat-peg-parser.cpp | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| chat-peg-parser.h | common : introduce composable PEG parser combinators for chat parsing (#17136) | 2025-12-03 12:45:32 +02:00 |
| chat.cpp | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| chat.h | chat : reserve memory in compute_diffs and improve naming (#17729) | 2025-12-03 17:22:10 +02:00 |
| CMakeLists.txt | server: add presets (config) when using multiple models (#17859) | 2025-12-10 22:18:21 +01:00 |
| common.cpp | llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) | 2025-12-15 09:24:59 +01:00 |
| common.h | llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) | 2025-12-15 09:24:59 +01:00 |
| console.cpp | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| console.h | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| download.cpp | common : add minimalist multi-thread progress bar (#17602) | 2025-12-12 12:44:35 +01:00 |
| download.h | server: introduce API for serving / loading / unloading multiple models (#17470) | 2025-12-01 19:41:04 +01:00 |
| http.h | common: introduce http.h for httplib-based client (#16373) | 2025-10-01 20:22:18 +03:00 |
| json-partial.cpp | common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932) | 2025-11-18 18:54:15 +01:00 |
| json-partial.h | sync : vendor (#13901) | 2025-05-30 16:25:45 +03:00 |
| json-schema-to-grammar.cpp | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| json-schema-to-grammar.h | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| llguidance.cpp | llguidance : set tokenizer slices to default (#13424) | 2025-05-10 17:19:52 +02:00 |
| log.cpp | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| log.h | cli: new CLI experience (#17824) | 2025-12-10 15:28:59 +01:00 |
| ngram-cache.cpp | ggml : portability fixes for VS 2017 (#12150) | 2025-03-04 18:53:26 +02:00 |
| ngram-cache.h | llama : use LLAMA_TOKEN_NULL (#11062) | 2025-01-06 10:52:15 +02:00 |
| peg-parser.cpp | common : add nemotron 3 parsing (#18077) | 2025-12-16 04:05:23 -06:00 |
| peg-parser.h | common : introduce composable PEG parser combinators for chat parsing (#17136) | 2025-12-03 12:45:32 +02:00 |
| preset.cpp | preset: handle negated arg, reverse the meaning if needed (#18041) | 2025-12-14 22:08:10 +01:00 |
| preset.h | server: add presets (config) when using multiple models (#17859) | 2025-12-10 22:18:21 +01:00 |
| regex-partial.cpp | common: add partial regex support (#12808) | 2025-05-14 19:50:57 +01:00 |
| regex-partial.h | common: add partial regex support (#12808) | 2025-05-14 19:50:57 +01:00 |
| sampling.cpp | common : refactor common_sampler + grammar logic changes (#17937) | 2025-12-14 10:11:13 +02:00 |
| sampling.h | common : refactor common_sampler + grammar logic changes (#17937) | 2025-12-14 10:11:13 +02:00 |
| speculative.cpp | common : refactor common_sampler + grammar logic changes (#17937) | 2025-12-14 10:11:13 +02:00 |
| speculative.h | server : implement universal assisted decoding (#12635) | 2025-07-31 14:25:23 +02:00 |
| unicode.cpp | common : introduce composable PEG parser combinators for chat parsing (#17136) | 2025-12-03 12:45:32 +02:00 |
| unicode.h | common : introduce composable PEG parser combinators for chat parsing (#17136) | 2025-12-03 12:45:32 +02:00 |