2 Commits

Author SHA1 Message Date
Xuan-Son Nguyen f5c6ae1827 mtmd, server: add "placeholder bitmap" for counting tokens, add */input_tokens API (#23913)
* mtmd: add "placeholder bitmap" for counting tokens w/o preprocessing

* fast path: skip preproc for placeholder

* fix build

* correct the api

* add server endpoint + tests

* add object name

* update docs

* add proxy handling

* fix build

* fix audio input path

* use is_placeholder in process_mtmd_prompt()

* nits

* nits (2)

* docs: clarify chat/completions/input_tokens is not official

* fix merge problem
2026-06-06 11:06:51 +02:00
Junwon Hwang 48b88c3b00 model: Add EXAONE 4.5 implementations (#21733)
* Add EXAONE 4.5 and Add GQA for MMproj

* mtmd: EXAONE 4.5 vision markers and projector path

EXAONE 4.5 uses <vision> and </vision> for image boundaries; Qwen keeps
<|vision_start|> and <|vision_end|>.
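
As a rough illustration of the marker difference described above, here is a minimal Python sketch. Only the four marker strings come from this commit message; the table `VISION_MARKERS`, the family keys, the function `wrap_image`, and the `<image>` placeholder are hypothetical names chosen for the example and are not taken from the mtmd code.

```python
# Hypothetical lookup table: only the marker strings themselves are taken
# from the commit message; the keys and structure are assumptions.
VISION_MARKERS = {
    "exaone4_5": ("<vision>", "</vision>"),
    "qwen": ("<|vision_start|>", "<|vision_end|>"),
}


def wrap_image(family: str, placeholder: str = "<image>") -> str:
    """Surround an image placeholder with the family's boundary markers."""
    if family not in VISION_MARKERS:
        raise ValueError(f"unknown model family: {family!r}")
    begin, end = VISION_MARKERS[family]
    return f"{begin}{placeholder}{end}"


if __name__ == "__main__":
    for family in VISION_MARKERS:
        print(family, "->", wrap_image(family))
```

Keeping the markers in a per-family table means a new model family only needs a new entry, which is one plausible way to keep the shared encode path free of model-specific branches.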

Route EXAONE 4.5 through the Qwen2.5-VL-style encode path (window attention
pattern, optional mmproj input norm). Update exaone4_5 projector weights and
convert_hf_to_gguf for mmproj export.

* mtmd: load EXAONE4 nextn tensors correctly

Align EXAONE4 tensor registration with EXAONE_MOE for NextN/MTP slots and avoid skip-flag propagation on duplicated rope_freqs so model loading succeeds for EXAONE 4.5 GGUF.

* Minor fixes

* Address PR feedback

* Address PR feedback

* Fix EXAONE after merge

* Fix EXAONE 4.5 conversion

* Address PR feedback

* Refactor EXAONE 4.5 conversion

* Address PR feedback

* Fix unintended deletion

* Minor fix

---------

Co-authored-by: LG-AI-EXAONE <exaonemodels@lgresearch.ai>
2026-06-01 11:48:53 +02:00