llama.cpp/tools/server/tests/unit at 8f83d6c271d194bde2d410145a0ce73bc42e85cd

kanshan/llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-26 14:20:21 +00:00

Xuan-Son Nguyen f5c6ae1827 mtmd, server: add "placeholder bitmap" for counting tokens, add */input_tokens API (#23913)

* mtmd: add "placeholder bitmap" for counting tokens w/o preprocessing

* fast path skip preproc for placeholder

* fix build

* correct the api

* add server endpoint + tests

* add object name

* update docs

* add proxy handling

* fix build

* fix audio input path

* use is_placeholder in process_mtmd_prompt()

* nits

* nits (2)

* docs: clarify chat/completions/input_tokens is not official

* fix merge problem

2026-06-06 11:06:51 +02:00

test_basic.py

server : support multiple model aliases via comma-separated --alias (#19926)

2026-02-27 07:05:23 +01:00

test_chat_completion.py

mtmd, server: add "placeholder bitmap" for counting tokens, add */input_tokens API (#23913)

2026-06-06 11:06:51 +02:00

test_compat_anthropic.py

server: Add cached_tokens info to oaicompat responses (#19361)

2026-03-19 19:09:33 +01:00

test_compat_gcp.py

server: support Vertex AI compatible API (#22545)

2026-05-08 15:23:04 +02:00

test_compat_oai_responses.py

server: /v1/responses (partial) (#18486)

2026-01-21 17:47:23 +01:00

test_completion.py

backend sampling: support returning post-sampling probs (#22622)

2026-05-10 19:12:02 +02:00

test_ctx_shift.py

memory : remove KV cache size padding (#16812)

2025-10-28 20:19:44 +02:00

test_embedding.py

llama : fix pooling assertion crash in chunked GDN detection path (#20468)

2026-03-13 20:53:42 +02:00

test_ignore_eos.py

server: respect the ignore eos flag (#21203)

2026-04-08 17:12:15 +02:00

test_infill.py

server : support unified cache across slots (#16736)

2025-11-02 18:14:04 +02:00

test_kv_keep_only_active.py

server: rename debug tags to match --cache-idle-slots naming (#22292)

2026-04-24 09:28:44 +03:00

test_lora.py

server : disable context shift by default (#15416)

2025-08-19 16:46:37 +03:00

test_proxy.py

server: Parse port numbers from MCP server URLs in CORS proxy (#20208)

2026-03-09 17:47:54 +01:00

test_rerank.py

server / ranking : add sorting and management of top_n (#16403)

2025-10-11 16:39:04 +03:00

test_router.py

server: implement /models?reload=1 (#21848)

2026-05-04 16:23:26 +02:00

test_security.py

server: Bypass API Key validation for WebUI static bundle assets (#21269)

2026-04-01 21:32:15 +02:00

test_sleep.py

server: add auto-sleep after N seconds of idle (#18228)

2025-12-21 02:24:42 +01:00

test_slot_save.py

server : disable context shift by default (#15416)

2025-08-19 16:46:37 +03:00

test_speculative.py

spec : parallel drafting support (#22838)

2026-05-11 19:09:43 +03:00

test_template.py

tests : use reasoning instead of reasoning_budget in server tests (#20432)

2026-03-12 13:41:01 +01:00

test_tokenize.py

server : disable context shift by default (#15416)

2025-08-19 16:46:37 +03:00

test_tool_call.py

common/autoparser: fixes for newline handling / forced tool calls (#22654)

2026-05-04 13:18:11 +02:00

test_vision_api.py

mtmd, server: add "placeholder bitmap" for counting tokens, add */input_tokens API (#23913)

2026-06-06 11:06:51 +02:00