mirror of https://github.com/ggml-org/llama.cpp.git, synced 2026-06-28 15:20:20 +00:00
ebc10770ac
When reasoning-budget is set in model.ini, the per-request thinking_budget_tokens from the WebUI was ignored because the model.ini value took unconditional precedence.

Swap the precedence so the WebUI per-request value is checked first, with the model.ini value serving as a fallback default.

Assisted-by: pi:llama.cpp/Qwen3.6-27B
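
The precedence described above can be illustrated with a minimal sketch. This is not the code changed by this commit: the function name resolve_reasoning_budget, its parameters, and the fallback of -1 (assumed here to mean "unrestricted") are hypothetical and chosen only for illustration.

```cpp
#include <optional>

// Hypothetical helper, not the actual llama.cpp implementation.
// request_budget: thinking_budget_tokens sent by the WebUI for this request, if any.
// ini_budget:     reasoning-budget from model.ini, if set.
// fallback:       used when neither source supplies a value
//                 (assumption: -1 stands for an unrestricted budget).
constexpr int resolve_reasoning_budget(std::optional<int> request_budget,
                                       std::optional<int> ini_budget,
                                       int fallback = -1) {
    // The per-request value is checked first.
    if (request_budget.has_value()) {
        return *request_budget;
    }
    // model.ini only serves as a default when the request is silent.
    if (ini_budget.has_value()) {
        return *ini_budget;
    }
    return fallback;
}

// Compile-time checks of the intended precedence.
static_assert(resolve_reasoning_budget(512, 2048) == 512);
static_assert(resolve_reasoning_budget(std::nullopt, 2048) == 2048);
static_assert(resolve_reasoning_budget(std::nullopt, std::nullopt) == -1);
```

The sketch uses std::optional rather than a sentinel comparison so that an explicit per-request value of 0 is still treated as "set" and is not silently replaced by the model.ini default.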