llama.cpp

mirror of https://github.com/ggml-org/llama.cpp.git synced 2026-06-28 07:10:21 +00:00

Files

Hans Florian bfb4308b05 model : support granite multilingual embeddings R2 (ibm-granite/granite-embedding-{97,311}m-multilingual-r2) (#22716 )

* Add support for the ibm-granite/granite-embedding-{97m,311m}-multilingual-r2 embedding models:

* Added a variant of the gpt4o tokenizer with a fixed regex (better handling of marks) and a different token-merging setting for the 97m model
* Reused gemma4 tokenizer for the 311m model

* granite-embedding-*-multilingual-r2 : add SwiGLU FFN support for Granite Embedding Multilingual R2

* added new GGUF key <arch>.hidden_activation (LLM_KV_HIDDEN_ACT) + writer
* added a forward declaration of llm_ffn_op_type to llama-hparams.h
* added llm_ffn_op in hparams
* added an LLM_FFN_NONE = 0 sentinel to llm_ffn_op_type (the value-initialized state); modern-bert explicitly assigns LLM_FFN_GEGLU before reading the GGUF (default unchanged).
* centralized hidden_act mapping in llama-model.cpp, added llm_ffn_op_type_from_string() helper, mirroring rope_scaling_type/llama_rope_scaling_type_from_string()
* modern-bert reads the GGUF key (when present) and uses the resulting op in its FFN graph
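The flow described in the bullets above (a zero-valued sentinel, an architecture default assigned first, and an optional GGUF key that overrides it) can be sketched in Python. This is only an illustration of the pattern, not the C++ code from the commit: `FfnOp`, `ffn_op_from_string`, `resolve_ffn_op`, the accepted strings, the non-zero enum values, and the choice to raise on an unknown string are all assumptions made for the example.

```python
from enum import IntEnum
from typing import Optional


class FfnOp(IntEnum):
    """Illustrative stand-in for llm_ffn_op_type; only NONE = 0 comes from the commit text."""
    NONE = 0    # sentinel: the state a value-initialized field would hold
    GEGLU = 1   # numeric value is arbitrary in this sketch
    SWIGLU = 2  # numeric value is arbitrary in this sketch


# Hypothetical string table; the commit message does not list the accepted strings.
_FFN_OP_BY_NAME = {
    "geglu": FfnOp.GEGLU,
    "swiglu": FfnOp.SWIGLU,
}


def ffn_op_from_string(name: str) -> FfnOp:
    """Map an activation name to an op, returning the NONE sentinel for unknown names."""
    return _FFN_OP_BY_NAME.get(name.lower(), FfnOp.NONE)


def resolve_ffn_op(default: FfnOp, gguf_value: Optional[str]) -> FfnOp:
    """Keep the architecture default unless the optional key is present."""
    if gguf_value is None:
        return default  # key absent: older files behave as before
    op = ffn_op_from_string(gguf_value)
    if op is FfnOp.NONE:
        # Assumption of this sketch: an unrecognized string is treated as an error.
        raise ValueError(f"unknown hidden activation: {gguf_value!r}")
    return op
```

With these hypothetical names, `resolve_ffn_op(FfnOp.GEGLU, None)` keeps the GEGLU default, while `resolve_ffn_op(FfnOp.GEGLU, "swiglu")` selects SwiGLU.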

* Added granite-embedding-{97m,311m}-multilingual-r2 to the converter code

* Added the hashes for the granite embedding multilingual R2 models
* Set the hidden_activation in the GGUF if the field is present in config.json (as it is for the granite embedding models)
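The converter-side rule above ("write the key only when config.json has the field") might look roughly like the following. This is a sketch under stated assumptions, not the converter's actual code: the helper name `hidden_activation_metadata` is hypothetical, a plain dict stands in for the GGUF writer, and the config field name `hidden_activation`, the architecture string, and the sample values are assumed for illustration.

```python
import json
from typing import Any, Dict


def hidden_activation_metadata(config: Dict[str, Any], arch: str) -> Dict[str, str]:
    """Return the metadata entry to write, or an empty dict if the field is absent.

    The key follows the <arch>.hidden_activation pattern named in the commit message.
    """
    value = config.get("hidden_activation")  # assumed config.json field name
    if value is None:
        return {}  # nothing is written, so the loader keeps its default
    return {f"{arch}.hidden_activation": str(value)}


# Illustrative config.json contents; the values here are made up for the example.
config = json.loads('{"hidden_size": 384, "hidden_activation": "silu"}')
metadata = hidden_activation_metadata(config, arch="modern-bert")
```

Returning an empty dict when the field is missing keeps files converted from other models free of the new key, which matches the "when present" wording on the reader side.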

2026-06-02 17:55:11 +02:00

__init__.py

convert : support Step3.7-Flash (#23845 )

2026-06-02 09:54:49 +02:00

afmoe.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

arctic.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

baichuan.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

bailingmoe.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

base.py

model : support granite multilingual embeddings R2 (ibm-granite/granite-embedding-{97,311}m-multilingual-r2) (#22716 )

2026-06-02 17:55:11 +02:00

bert.py

model : support granite multilingual embeddings R2 (ibm-granite/granite-embedding-{97,311}m-multilingual-r2) (#22716 )

2026-06-02 17:55:11 +02:00

bitnet.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

bloom.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

chameleon.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

chatglm.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

codeshell.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

cogvlm.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

command_r.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

dbrx.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

deci.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

deepseek.py

mtmd: Add DeepSeekOCR 2 Support (#20975 )

2026-05-29 16:13:51 +02:00

dots1.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

dotsocr.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

dream.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

ernie.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

exaone.py

model: Add EXAONE 4.5 implementations (#21733 )

2026-06-01 11:48:53 +02:00

falcon_h1.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

falcon.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

gemma.py

mtmd: fix gemma 4 audio rms norm eps (#23815 )

2026-05-28 16:31:37 +02:00

glm.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

gpt2.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

gpt_oss.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

gptneox.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

granite.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

grok.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

grovemoe.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

hunyuan.py

mtmd, model : merge HunyuanOCR into HunyuanVL and fix OCR vision precision (#23329 )

2026-05-21 00:35:37 +02:00

internlm.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

internvl.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

jais.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

jamba.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

januspro.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

kimi_linear.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

kimivl.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

lfm2.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

lighton_ocr.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

llada.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

llama4.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

llama.py

vocab : add Carbon-3B (HybridDNATokenizer) support (#23410 )

2026-05-21 08:34:32 +02:00

llava.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

maincoder.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

mamba.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

mimo.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

minicpm.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

minimax.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

mistral3.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

mistral.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

mpt.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

nemotron.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

olmo.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

openelm.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

orion.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

pangu.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

phi.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

pixtral.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

plamo.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

plm.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

qwen3vl.py

convert : fix Qwen3 ASR conversion (#23081 )

2026-05-15 18:38:39 +02:00

qwen.py

convert : add compressed-tensors NVFP4 support (#21095 )

2026-05-25 14:16:11 +02:00

qwenvl.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

refact.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

rwkv.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

sarashina2.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

smallthinker.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

smolvlm.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

stablelm.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

starcoder.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

step3.py

StepFun 3.5 MTP (#23274 )

2026-06-02 17:44:35 +02:00

t5.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

talkie.py

model : add support for talkie-1930-13b (#22596 )

2026-05-26 07:57:38 +03:00

ultravox.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

wavtokenizer.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

xverse.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00

youtuvl.py

Refactor: convert_hf_to_gguf.py (#17114 )

2026-05-15 15:18:12 +02:00