llama.cpp/src
Aarnav Pai d73cd07674 graph: Fix granite speech model inference by applying embedding scale when deepstack is not used (#24357)
* llama-graph : apply embedding scale when deepstack is not used

* nits: remove non-existent hunyuan-vl from the tests

* apply suggestion from @gabe-l-hart

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-06-09 19:46:27 +02:00