* chat: fix whitespace problems once and for all
* Purge trailing spaces from grammar generation
* Revert "Purge trailing spaces from grammar generation"
This reverts commit b0827ecb7d.
* ui: add svg block visualizer based on allozaur's mermaid PR
* ui: rationalise diagram block styling and pre transforms shared by mermaid and svg
* ui: live render streaming svg blocks
* ui: also render svg authored in xml code fences
* ui: refactor svg block rendering, address review from allozaur
- Move the svg size ceiling and DOMPurify config out of sanitize-svg.ts into /constants.
- Rename the svg-diagram class to svg-block so the name no longer implies diagrams only.
- Replace the svg, xml and svg tag magic strings in the markdown pipeline with shared constants.
- Promote the data-svg-rendered marker and its sibling data attributes to constants.
* ui: render svg blocks in a shadow root for animation and live zoom
Mount each sanitized svg inside an open shadow root so author <style> and
keyframe or smil animations run while staying scoped to the host element.
Relax the sanitizer to forbid only foreignObject and script, which lets
animation, href and external resource refs through for wider compatibility.
Render the inline block and the zoom dialog from the same reactive source,
so a streaming svg keeps drawing live inside the open zoom popup.
* Add boilerplate for file types
* Add heic-to and implement conversion
* Load heic library from CDN
* Use jpg instead of png for conversion
* Move const to constants file
* ui: make mobile layout keyboard-aware via interactive-widget and dvh shell anchor
* ui: fix duplicate PWA refresh popup by scoping the storage check to non-PWA pages
* Add arch support for cohere2-MoE
* Removed redundant gating_func checks
* Changed ffn lookup to prefer prefix_dense_intermediate_size
* Renamed arch to cohere2moe
* Removed redundant lmhead check and chat template changes
* Removed lm_head.weight check from modify tensors, load output tensor not required, fallback to token_embd.weight
* Changed to (routed+shared)*0.5 for shared expert combined avg
* fixed sliding_window_pattern issue and pattern
* Fixed transformers crash 'first_k_dense_replace' error
* Remove comment
* Removed cohere2-moe as a tokenizer type and kept as tiny_aya. Renamed North-Mini-Code-1.0.
* Fixed MTP fail, changed to use iSWA
* Fixed remaining todos: cohere2moe renamed, changed swa parsing to use get_key_or_arr, removed extra get_arr use
* Force metadata usage
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Remove Cohere2 checkpoint comment
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Remove MTP comment
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Regenerate cohere2moe tokenizer hash
* Add cohere2moe to Llama Model Saver supported list
* Check for zero-bias tensors and add support for Command to use LayerNorm
* Map expert_selection_fn to sigmoid in base.py instead of command.py
* use bools for foundnorm/foundnormrms
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* vulkan: support non-contig unary/glu ops
Change unary/glu ops to pass in all strides and use fastdiv for the index
calculation. Put all unary ops in one file, similar to glu, to share the
code. codex went ahead and added expm1 without me asking, but I had to
make it do a real precision analysis rather than just making stuff up.
unary.comp initially couldn't use generic_unary_head because there wasn't
space for xielu's additional constants. Fixing this required packing the
fastdiv 'L' values.
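The fastdiv index calculation above can be sketched as follows. This is a minimal Python model of the usual multiply-high-and-shift division trick, not the shader code itself: `init_fastdiv`, `fastdiv`, `pack_shifts` and `unpack_shift` are hypothetical names, and the 8-bits-per-field layout is only an assumption about how several 'L' values might share one 32-bit word.

```python
from typing import Sequence, Tuple


def init_fastdiv(d: int) -> Tuple[int, int]:
    """Precompute (mp, L) so that n // d needs only a multiply, an add and a shift."""
    assert 1 <= d < 2**32
    L = 0
    while L < 32 and (1 << L) < d:
        L += 1  # L = ceil(log2(d))
    mp = ((1 << 32) * ((1 << L) - d) // d + 1) & 0xFFFFFFFF
    return mp, L


def fastdiv(n: int, mp: int, L: int) -> int:
    """Equivalent of n // d for the d that produced (mp, L)."""
    hi = (n * mp) >> 32  # high 32 bits of the 64-bit product
    return (hi + n) >> L


def pack_shifts(shifts: Sequence[int]) -> int:
    """Pack up to four L values into one 32-bit word (assumed layout: 8 bits each)."""
    assert len(shifts) <= 4
    word = 0
    for i, s in enumerate(shifts):
        assert 0 <= s < 256
        word |= s << (8 * i)
    return word


def unpack_shift(word: int, i: int) -> int:
    """Read back the i-th packed L value."""
    return (word >> (8 * i)) & 0xFF
```

Since L never exceeds 32, each value fits comfortably in a byte, which is what makes packing a plausible way to free up room for extra constants.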
* attempt to work around compiler bug
* resolve conflict from #23991
* use expm1
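Why a dedicated expm1 matters can be illustrated with a small double-precision comparison. This is a sketch only: `compare_expm1` and `relative_error` are hypothetical helpers, the reference value is a short Taylor series that is accurate only for tiny inputs, and the shader's own precision analysis is not reproduced here.

```python
import math
from typing import Tuple


def relative_error(approx: float, exact: float) -> float:
    return abs(approx - exact) / abs(exact)


def compare_expm1(x: float) -> Tuple[float, float]:
    """Return (error of exp(x) - 1, error of expm1(x)) for a tiny x."""
    # Taylor series of e^x - 1; the dropped terms are negligible for |x| << 1.
    reference = x + x * x / 2 + x**3 / 6
    naive = math.exp(x) - 1.0  # loses digits: exp(x) rounds to a value near 1
    fused = math.expm1(x)      # computes e^x - 1 without the cancellation
    return relative_error(naive, reference), relative_error(fused, reference)
```

Calling `compare_expm1(1e-10)` shows the subtraction form losing most of its significant digits while `math.expm1` stays close to the reference.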
* server: clean up static assets handling
* nits
* simplify file name handling, use static file name everywhere
* cmake/ui : bundle UI assets in an archive
* ui : run prettier on post-build.js
---------
Co-authored-by: Alde Rojas <hello@alde.dev>
When reasoning-budget is set in model.ini, the per-request
thinking_budget_tokens from the WebUI was ignored because the
model.ini value took unconditional precedence.
Swap the precedence so the WebUI per-request value is checked
first, with the model.ini value serving as a fallback default.
Assisted-by: pi:llama.cpp/Qwen3.6-27B
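The swapped precedence can be sketched as below. This is an illustration, not the server code: `resolve_thinking_budget` and its parameter names are hypothetical, and representing "not set" as `None` is an assumption.

```python
from typing import Optional


def resolve_thinking_budget(
    request_budget: Optional[int],
    ini_budget: Optional[int],
) -> Optional[int]:
    """Pick the effective budget: per-request value first, model.ini as fallback."""
    if request_budget is not None:
        return request_budget  # value sent by the WebUI for this request
    return ini_budget          # default from model.ini, possibly unset as well
```

Testing against `None` rather than truthiness keeps an explicit per-request budget of 0 from silently falling through to the model.ini default.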
* ui: bake jpeg exif orientation into uploaded images
stb_image in mtmd ignores exif metadata, so rotated smartphone photos
reach the model with raw pixel orientation. The webui now reads the
exif orientation tag at send time and feeds it into the existing
capImageDataURLSize canvas pass: the browser applies the rotation when
decoding, so capped images come out upright for free, and images under
the cap threshold get a single plain redraw when orientation > 1.
At most one re-encode ever happens per image. Upright jpegs with
capping disabled pass through untouched, bit perfect.
Adds jpeg-orientation.ts with a minimal exif parser working on a
bounded base64 prefix (both endianness, returns 1 on any malformed
input) and unit tests against handcrafted jpeg byte streams.
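A minimal exif orientation parser along the lines described above might look like the following. It is a Python sketch rather than the contents of jpeg-orientation.ts: the function names and the 65536-character prefix bound are assumptions, and marker handling is deliberately simplified (every segment is assumed to carry a length field).

```python
import base64
import struct

ORIENTATION_TAG = 0x0112  # standard exif tag id for orientation


def jpeg_exif_orientation(data: bytes) -> int:
    """Return the exif orientation (1..8), or 1 on any missing or malformed input."""
    try:
        if data[:2] != b"\xff\xd8":  # not a jpeg start-of-image marker
            return 1
        pos = 2
        while pos + 4 <= len(data):
            if data[pos] != 0xFF:
                return 1
            marker = data[pos + 1]
            (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
            if marker == 0xE1 and data[pos + 4:pos + 10] == b"Exif\x00\x00":
                tiff = pos + 10
                order = data[tiff:tiff + 2]
                if order == b"II":
                    e = "<"  # little endian
                elif order == b"MM":
                    e = ">"  # big endian
                else:
                    return 1
                if struct.unpack(e + "H", data[tiff + 2:tiff + 4])[0] != 42:
                    return 1
                ifd = tiff + struct.unpack(e + "I", data[tiff + 4:tiff + 8])[0]
                (count,) = struct.unpack(e + "H", data[ifd:ifd + 2])
                for i in range(count):
                    entry = ifd + 2 + 12 * i
                    tag, typ, n = struct.unpack(e + "HHI", data[entry:entry + 8])
                    if tag == ORIENTATION_TAG:
                        if typ != 3 or n != 1:  # must be a single SHORT
                            return 1
                        (value,) = struct.unpack(e + "H", data[entry + 8:entry + 10])
                        return value if 1 <= value <= 8 else 1
                return 1
            if marker == 0xDA:  # start of scan: no exif header further on
                return 1
            pos += 2 + seg_len
        return 1
    except struct.error:  # truncated field somewhere in the stream
        return 1


def orientation_from_base64(b64: str, max_chars: int = 65536) -> int:
    """Decode only a bounded prefix of the base64 payload before parsing."""
    prefix = b64[: max_chars - max_chars % 4]  # keep whole 4-char groups
    try:
        return jpeg_exif_orientation(base64.b64decode(prefix))
    except ValueError:  # includes binascii.Error for bad base64
        return 1
```

Bounding the prefix keeps the cost independent of image size, since the exif segment sits near the start of the file.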
* ui: move jpeg exif constants into lib/constants
* ui: add browser test for jpeg orientation and capping
Covers capImageDataURLSize end to end in chromium with real Pillow
generated jpeg fixtures across exif orientations 1/3/5/6/8: upright
quadrant colors checked pixel-wise, expected dimensions with and
without capping, no orientation tag left in the output, and strict
passthrough when nothing needs rewriting.
* restore SYCL build and release, remove github cache
* modify for test only
* verify the ccache is used
* remove debug code change
* rm duplicate action, update key in ccache
* add action ccache-clear after building in both ubuntu and windows
* set %NUMBER_OF_PROCESSORS% in Windows build