Neo Zhang
6f1034b32a
[SYCL] support OPs: conv_2d, conv_2d_dw, conv2d_transpose ( #24600 )
...
* fix conflict
* fix format issue, rename
* rm debug code
* correct the file name
2026-06-18 09:40:03 +03:00
Neo Zhang
d1759e4156
[SYCL] Add conv_3d ( #24691 )
...
* add conv_3d
* optimize
* update ops.md
* restore test script
* rm unused code
* rm copyright notes
2026-06-17 17:20:01 +03:00
Neo Zhang
58728bdbf0
sycl : Enable fp16 support for OPs: SQR, SQRT, LOG, SIN, COS, CLAMP ( #24692 )
2026-06-17 08:58:03 +03:00
Neo Zhang
fdd109883d
[SYCL] Support OP EXPM1, support all UT cases of FLOOR, TRUNC, ROUND ( #24363 )
...
* support OP EXPM1, support all UT cases of FLOOR, TRUNC, ROUND
* fix conflict
* rebase, support new UT case of repeat, concat
2026-06-16 08:34:29 +03:00
Neo Zhang
987fbd821d
[SYCL] add support for pool_1d, move pool_1d/2d code to pool.cpp/hpp ( #24584 )
...
* add support for pool_1d, move pool_1d/2d code to pool.cpp/hpp
* update ops.md
2026-06-15 10:01:07 +03:00
Jeff Bolz
1a7718b4c5
vulkan: support non-contig unary/glu ops ( #24215 )
...
* vulkan: support non-contig unary/glu ops
Change unary/glu ops to pass in all strides and use fastdiv for the index
calculation. Put all unary ops in one file, similar to glu, to share the
code. codex went ahead and added expm1 without me asking, but I had to
make it do a real precision analysis rather than just making stuff up.
unary.comp initially couldn't use generic_unary_head because there wasn't
space for xielu's additional constants. Fixing this required packing the
fastdiv 'L' values.
* attempt to workaround compiler bug
* resolve conflict from #23991
* use expm1
2026-06-13 08:44:15 -05:00
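The commit message above mentions using fastdiv for the index calculation and packing the fastdiv 'L' values. As a rough illustration only, the following Python sketch shows the general division-by-invariant-integer technique (a precomputed multiplier plus a shift) and how it can turn a flat index into a strided offset. It is not the Vulkan shader code: the function names, the 8-bit packing layout, and the innermost-first dimension order are all assumptions made for this example.

```python
def init_fastdiv(d: int) -> tuple[int, int]:
    """Precompute (mp, L) for dividing unsigned 32-bit values by d."""
    assert 1 <= d < 2**32
    L = 0
    while (1 << L) < d:  # L = ceil(log2(d))
        L += 1
    mp = ((1 << 32) * ((1 << L) - d)) // d + 1
    return mp, L


def fastdiv(n: int, mp: int, L: int) -> int:
    """Compute n // d with a multiply-high, an add, and a shift (n < 2**32)."""
    hi = (n * mp) >> 32  # upper 32 bits of the 64-bit product
    return (hi + n) >> L


def pack_shifts(shifts: list[int]) -> int:
    """Hypothetical layout: pack up to four L values into one 32-bit word."""
    assert len(shifts) <= 4
    packed = 0
    for k, s in enumerate(shifts):
        assert 0 <= s < 256
        packed |= s << (8 * k)
    return packed


def unpack_shift(packed: int, k: int) -> int:
    """Recover the k-th L value from the packed word."""
    return (packed >> (8 * k)) & 0xFF


def strided_offset(i: int, ne: list[int], nb: list[int]) -> int:
    """Map flat index i over shape ne (innermost first) to an element offset
    using per-dimension strides nb, so non-contiguous tensors are supported."""
    magic = [init_fastdiv(n) for n in ne]  # normally precomputed once on the host
    offset = 0
    for dim in range(len(ne)):
        q = fastdiv(i, *magic[dim])
        r = i - q * ne[dim]  # remainder = coordinate along this dimension
        offset += r * nb[dim]
        i = q
    return offset
```

For a contiguous tensor the strides are the running products of the shape, so `strided_offset(i, [3, 2], [1, 3])` returns `i` unchanged. A transposed view only swaps the strides, which is why passing all strides lets a single kernel handle both cases.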
Neo Zhang
4162522688
[SYCL] Add more types in GET_ROWS OP ( #23710 )
...
* add support for Q1_0, NVFP4, IQ2_XXS, IQ2_XS, IQ2_S, IQ3_XXS, IQ1_S, IQ1_M, IQ3_S, IQ4_NL, IQ4_XS, I32, MXFP4, Q2_K, Q3_K, Q5_K, and Q6_K in GET_ROWS OP
* correct the link
2026-06-01 09:53:04 +03:00
Masashi Yoshimura
927dada6c9
ggml-webgpu: Enables running gpt-oss-20b ( #22906 )
...
* Enable to run gpt-oss-20b and refactor mulmat-q
* disable test-backend-ops in ubuntu-24-webgpu
2026-05-12 07:27:40 -07:00
Neo Zhang
7d442abf5c
[SYCL] Add OP im2col_3d ( #22903 )
...
* add im2col_3d
* format code
* update the ops.md
2026-05-11 08:01:47 +03:00
Intel AI Get-to Market Customer Success and Solutions
ad09224658
sycl: add FILL, CUMSUM, DIAG, SOLVE_TRI, SSM_SCAN, GATED_DELTA_NET ( #22149 )
...
* sycl: add FILL, CUMSUM, DIAG, SOLVE_TRI, SSM_SCAN, GATED_DELTA_NET
Signed-off-by: Chun Tao <chun.tao@intel.com >
* Fix abort during test-backend-ops
Signed-off-by: Todd Malsbary <todd.malsbary@intel.com >
* Regenerate ops.md
Signed-off-by: Todd Malsbary <todd.malsbary@intel.com >
* Add scope_dbg_print to newly added SYCL ops.
Also add scope_dbg_print to existing ssm_conv op.
Signed-off-by: Todd Malsbary <todd.malsbary@intel.com >
---------
Signed-off-by: Chun Tao <chun.tao@intel.com >
Signed-off-by: Todd Malsbary <todd.malsbary@intel.com >
Co-authored-by: Chun Tao <chun.tao@intel.com >
Co-authored-by: Todd Malsbary <todd.malsbary@intel.com >
2026-05-07 18:51:33 +03:00
Reese Levine
dd2914dc81
ggml-webgpu: support for SSM_SCAN and disable set_rows error checking ( #22327 )
...
* Implement ssm_scan
* Remove blocking in graph_compute and check for set rows
* Fix bindings
* Update op support
2026-04-25 09:18:15 +03:00
Kusha Gharahi
ae2d34899e
metal: Implement ROLL op ( #21946 )
...
* nix: support unified apple-sdk
* Impl roll op for Metal
* Revert "nix: support unified apple-sdk"
This reverts commit abfa473360 .
* update ops.md
* update op docs
2026-04-16 11:54:37 +03:00
Masashi Yoshimura
d0a6dfeb28
ggml-webgpu: Add the support of MUL_MAT_ID ( #21147 )
...
* Add mul_mat_id support to WebGPU
* Apply suggestion from @reeselevine
---------
Co-authored-by: Reese Levine <reeselevine1@gmail.com >
2026-04-06 13:08:46 -07:00
Vishal Singh
f1ac84119c
ggml-zendnn : add MUL_MAT_ID op support for MoE models ( #21315 )
...
* ggml-zendnn : add MUL_MAT_ID op support for MoE models
- Add MUL_MAT_ID op acceleration for Mixture-of-Experts models
- MUL_MAT_ID op fallback to CPU backend if total experts > 32
- Point ZenDNN lib to latest bits ZenDNN-2026-WW13
* ggml-zendnn : add braces to sgemm failure condition for consistency
Co-authored-by: Aaron Teo <taronaeo@gmail.com >
---------
Co-authored-by: Aaron Teo <taronaeo@gmail.com >
2026-04-03 12:19:08 +03:00
Seyoung Jeong
6d99b44c7e
docs : fix Metal backend op support status in ops.md ( #20779 )
...
Regenerate docs/ops/Metal.csv using test-backend-ops on Apple M5
and rebuild docs/ops.md via scripts/create_ops_docs.py.
Five ops were incorrectly marked as not supported (❌) for Metal:
- DIAG: ❌ → ✅
- POOL_1D: ❌ → ✅
- SET: ❌ → ✅
- SOLVE_TRI: ❌ → ✅
- GATED_DELTA_NET: ❌ → 🟡 (partial, depends on head_size % 32)
2026-03-20 11:06:38 +02:00
Reese Levine
c1258830b2
ggml webgpu: ops support for qwen3.5 (SET, TRI_SOLVE, SSM_CONV, GATED_DELTA_NET) + GET_ROWS optimization ( #20687 )
...
* Implement l2_norm, set, tri
* Add DIAG/SOLVE_TRI
* Add SSM_CONV
* Better get_rows and gated_delta_net to support qwen3.5
* Clean up, update ops.md
* Fix binding_index type for wasm
* Fix read write annotations
* cleanups
2026-03-19 08:45:28 -07:00
Masashi Yoshimura
509a31d00f
ggml-webgpu: Update the RMS_NORM preprocessor and add L2_NORM ( #20665 )
...
* Update the preprocessor of RMS_NORM and add L2_NORM.
* Fix the name of rms_norm to row_norm.
2026-03-18 21:08:59 -07:00
Masashi Yoshimura
ea01d196d7
ggml-webgpu: Add support for DIAG and TRI ( #20664 )
...
* Add support for DIAG and TRI.
* Remove extra ttype and add a comment for TRI op.
2026-03-18 21:08:35 -07:00
Neo Zhang
b6c83aad55
[SYCL] enhance UPSCALE to support all UT cases ( #20637 )
...
* [SYCL] enhance UPSCALE to support more cases
* rm test case result of SYCL1
2026-03-17 10:01:52 +08:00
Neo Zhang
a93c0ef0fa
add op gated_delta_net ( #20455 )
2026-03-14 22:01:57 +08:00
Masashi Yoshimura
f2ab047f27
ggml-webgpu: Add support for GGML_OP_REPEAT ( #20230 )
...
* Add GGML_OP_REPEAT to webgpu backend.
* Add i16 support for GGML_OP_REPEAT.
2026-03-11 14:40:36 -07:00
Neo Zhang
0cec84f999
fix op rope, add rope_back ( #20293 )
2026-03-11 09:53:34 +08:00
a3894281
0f1e9d14cc
docs: update CPU backend ops to mark POOL_1D as supported ( #20304 )
2026-03-10 21:31:24 +08:00
Bertay Eren
0beb8db3a0
ggml-vulkan: add SGN operator, auto-generate Vulkan.csv and ops.md ( #20219 )
2026-03-09 07:24:16 +01:00
GiantPrince
d088d5b74f
ggml-vulkan: Add ELU op support ( #20183 )
...
* ggml-Vulkan: add ELU support
* ggml-Vulkan: remove extra spaces and variables
* ggml-Vulkan: fix format issue
* ggml-Vulkan: fix format issue
* fix whitespace issue
* Update Vulkan.csv and ops.md
2026-03-08 12:38:17 +01:00
Neo Zhang
213c4a0b81
[SYCL] support Flash Attention for fp32/fp16/Q4/Q5/Q8 ( #20190 )
...
* support flash-attention for fp32/fp16/Q4/Q5/Q8
* rm warning
* update for JIT
2026-03-08 12:00:07 +08:00
Masashi Yoshimura
541bf37622
Add concat op to webgpu. ( #20068 )
2026-03-04 11:19:00 -08:00
Masashi Yoshimura
11c325c6e0
ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support. ( #19700 )
...
* ggml-webgpu: Add unary op (SQR, SQRT, SIN, COS) support.
* Fix to cast the src value to f32 before sin/cos computing.
2026-02-19 09:18:30 -07:00
Nechama Krashinski
537eadb1b9
sycl: add F16 support for GGML_OP_CEIL ( #19306 )
...
* Fix SYCL CEIL operator
* sycl: implement GGML_OP_CEIL
2026-02-06 23:13:44 +08:00
Tamar
4d5e972673
sycl: implement GGML_OP_TOP_K ( #19242 )
2026-02-02 21:05:51 +08:00
s8322
1025fd2c09
sycl: implement GGML_UNARY_OP_SOFTPLUS ( #19114 )
...
* sycl: add softplus unary op implementation
* sycl: add softplus unary op implementation
* docs(ops): mark SYCL SOFTPLUS as supported
* docs: update SYCL status for SOFTPLUS
2026-01-30 12:01:38 +08:00
RachelMantel
c7358ddf64
sycl: implement GGML_OP_TRI ( #19089 )
...
* sycl: implement GGML_OP_TRI
* docs: update ops.md for SYCL TRI
* docs: regenerate ops.md
* docs: update SYCL support for GGML_OP_TRI
2026-01-30 12:00:49 +08:00
Reese Levine
a89002f07b
ggml webgpu: support for backend sampling ( #18880 )
...
* ggml webgpu: add SOFTPLUS unary operator
Implements SOFTPLUS (log(1 + exp(x))) with f16/f32 support. Uses f32
precision for intermediate calculations to prevent f16 overflow.
* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support
* Follow Vulkan backend numerical stability pattern
* ggml webgpu: add EXPM1 unary operator
Implements EXPM1 (exp(x) - 1) with f16/f32 support.
* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support
* ggml webgpu: add FLOOR unary operator
Implements FLOOR (rounds down to nearest integer) with f16/f32 support.
* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support
* ggml webgpu: add CEIL unary operator
Implements CEIL (rounds up to nearest integer) with f16/f32 support.
* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support
* ggml webgpu: add ROUND unary operator
Implements ROUND (rounds to nearest integer) with f16/f32 support.
* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support
* ggml webgpu: add TRUNC unary operator
Implements TRUNC (truncates towards zero) with f16/f32 support.
* Add shader implementation and 4 variants (f32/f16, inplace/non-inplace)
* Register pipelines and device support
* docs : update WebGPU support for unary operators (FLOOR, CEIL, ROUND, TRUNC, EXPM1, SOFTPLUS)
* Updates to webgpu get_memory
* Add argmax
* Add argmax,cumsum,sum,sum_rows
* Add necessary CPY/GET_ROWS operators
* Support for argsort using multi-pass strategy
* Update set_rows for i32 indices, move to pre-wgsl
* Port unary operators to pre-wgsl and support FILL
* Implement PAD
* Add support for top-k
* clean up, scope pipeline init mutex
* fix newline
* Add support for log
* Update LOG for better precision, and ops doc
---------
Co-authored-by: Abhijit Ramesh <abhijitramesh2k@gmail.com >
2026-01-16 16:12:43 -08:00
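The SOFTPLUS and EXPM1 notes in the commit above describe two numerical concerns: `exp(x)` overflowing half precision, and the plain `exp(x) - 1` formula. The Python sketch below illustrates both under stated assumptions. The cutoff of 20 and the function names are chosen for this example and are not taken from the WebGPU or Vulkan shaders.

```python
import math

F16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def softplus(x: float, threshold: float = 20.0) -> float:
    """log(1 + exp(x)), evaluated in wider precision.

    Above the (assumed) threshold, exp(x) dominates and the result equals x
    to within rounding, so exp() is skipped and cannot overflow.
    """
    if x > threshold:
        return x
    return math.log1p(math.exp(x))


def expm1_naive(x: float) -> float:
    """exp(x) - 1 computed directly; loses digits when x is close to zero."""
    return math.exp(x) - 1.0


# exp(12) does not fit in f16 even though softplus(12) itself does,
# which is why the intermediate is kept in a wider type.
intermediate = math.exp(12.0)
result = softplus(12.0)
```

Here `intermediate` exceeds `F16_MAX` while `result` is only slightly above 12. For tiny inputs such as `1e-10`, `math.expm1` keeps full precision, whereas `expm1_naive` is off by a relative error of roughly 1e-7 because `exp(x)` is rounded to the nearest representable value near 1 before the subtraction.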
hipudding
6ba6a3c76f
docs : update ops.md for CANN backend ( #18654 )
2026-01-16 13:32:17 +01:00
Aaron Teo
2656c0d265
docs(ggml): update backend ops ( #18734 )
...
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com >
2026-01-10 18:48:17 +08:00
nwyin
e443fbcfa5
ggml webgpu: add CEIL operation support ( #18605 )
...
* ggml-webgpu: add CEIL operation support
Add support for the CEIL unary operation in the WebGPU backend:
- Add CEIL_FUNC shader template in unary_op.wgsl
- Add 4 shader variants (f32, f16, inplace versions)
- Initialize CEIL pipelines in ggml-webgpu.cpp
- Register CEIL in supports_op function
* docs: update WebGPU ops support for CEIL
2026-01-05 11:38:57 -08:00
gatbontonpc
9a6369bb60
metal : add count_equal op ( #18314 )
...
* add count equal for metal
* remove trailing whitespace
* updated doc ops table
* changed shmem to i32
* added multi tg and templating
* removed BLAS support from Metal docs
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* add memset to set dst to 0
* metal : cleanup
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
2025-12-31 10:39:48 +02:00
Neo Zhang Jianyu
4aced7a631
[SYCL] Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai ( #17826 )
...
* support gpt-oss GPU by OP add-id, mul_mat for mxfp4, swiglu_oai, fix warning
* fix fault ut case, update ops.md
* rebase, fix format issue
2025-12-15 10:35:15 +08:00
lhez
2d2e1030e3
docs : update opencl ops ( #17904 )
2025-12-10 15:20:00 +01:00
Sigbjørn Skjæret
63391852b0
docs : update cpu and cuda ops ( #17890 )
...
* update cuda ops
* update CPU as well
2025-12-09 23:31:29 +01:00
Vishal Singh
017761daf5
ggml-zendnn : add ZenDNN backend for AMD CPUs ( #17690 )
...
* ggml-zendnn: add ZenDNN backend support
* ggml-zendnn : address ZenDNN backend review fixes and suggestions
* docs : apply blockquote syntax to ZenDNN docs
---------
Co-authored-by: Manoj Kumar <mkumar@zettabolt.com >
2025-12-07 00:13:33 +08:00
Reese Levine
fd57b24c0f
ggml webgpu: unary op support, code refactoring, ops support ( #17764 )
...
* Squashed commit of the following:
commit b3c6bf4b0450d8d452b934df27a0fb7cb53cd755
Author: Abhijit Ramesh <abhijitramesh2k@gmail.com >
Date: Mon Dec 1 18:29:00 2025 -0800
ggml webgpu: fix xielu parameter passing (#11 )
The XIELU operation was incorrectly using static_cast to convert
float parameters to uint32_t, which converted numeric values instead
of preserving IEEE 754 bit patterns. This caused incorrect values
to be interpreted by the GPU shader.
* Use reinterpret_cast to preserve float bit patterns when passing
through uint32_t params buffer
* Update WGSL shader parameter types from u32 to f32
* Re-enable XIELU support (was disabled due to numerical issues)
Fixes NMSE test failures for XIELU operation on WebGPU backend.
commit 5ca9b5e49e
Author: neha-ha <137219201+neha-ha@users.noreply.github.com >
Date: Tue Nov 18 12:17:00 2025 -0800
Refactored pipelines and workgroup calculations (#10 )
* refactored pipelines
* refactored workgroup calculation
* removed commented out block of prior maps
* Clean up ceiling division pattern
---------
Co-authored-by: Neha Abbas <nehaabbas@eduroam-169-233-141-223.ucsc.edu >
Co-authored-by: Reese Levine <reeselevine1@gmail.com >
Author: James Contini <jamescontini@gmail.com >
Date: Wed Oct 29 23:13:06 2025 -0700
formatted embed wgsl and ggml-webgpu.cpp
commit e1f6baea31
Author: James Contini <jamescontini@gmail.com >
Date: Wed Oct 29 23:08:37 2025 -0700
implemented REPL_Template support and removed bug in unary operators kernel
commit 8c70b8fece
Author: James Contini <jamescontini@gmail.com >
Date: Wed Oct 15 16:14:20 2025 -0700
responded and dealt with PR comments
commit f9282c660c
Author: James Contini <jamescontini@gmail.com >
Date: Sun Oct 12 13:41:41 2025 -0700
removed unnecessary checking if node->src[1] exists for unary operators
commit 4cf28d7dec
Author: James Contini <jamescontini@gmail.com >
Date: Sun Oct 12 13:32:45 2025 -0700
All operators (including xielu) working
commit 74c6add176
Author: James Contini <jamescontini@gmail.com >
Date: Fri Oct 10 13:16:48 2025 -0700
fixed autoconfig
commit 362749910b
Author: James Contini <jamescontini@gmail.com >
Date: Fri Oct 10 13:10:46 2025 -0700
removed vestigial files
commit cb08583337
Author: James Contini <jamescontini@gmail.com >
Date: Fri Oct 10 12:59:32 2025 -0700
abides by editor-config
commit 5360e2852a
Author: James Contini <jamescontini@gmail.com >
Date: Fri Oct 10 12:45:57 2025 -0700
rms_norm double declaration bug atoned
commit 7b09baa4aa
Merge: 8a6ec843 74b8fc17
Author: James Contini <jamescontini@gmail.com >
Date: Fri Oct 10 11:50:03 2025 -0700
resolving merge conflicts
commit 8a6ec843a5
Author: James Contini <jamescontini@gmail.com >
Date: Wed Oct 8 18:06:47 2025 -0700
unary operators pass ggml tests
commit c3ae38278a
Author: James Contini <jamescontini@gmail.com >
Date: Wed Oct 1 16:22:40 2025 -0700
neg passes backend test
commit aa1c9b2f88
Author: James Contini <jamescontini@gmail.com >
Date: Tue Sep 30 23:55:27 2025 -0700
neg f16xf32xip builds and runs, haven't actually run a model that uses neg kernel yet though
Co-authored-by: James Contini <jamescontini@gmail.com >
Co-authored-by: Neha Abbas <neabbas@ucsc.edu >
Co-authored-by: Abhijit Ramesh <abhijitramesh2k@gmail.com >
* Remove extra code and format
* Add ops documentation (finally)
* Update ggml/src/ggml-webgpu/wgsl-shaders/embed_wgsl.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com >
---------
Co-authored-by: James Contini <jamescontini@gmail.com >
Co-authored-by: Neha Abbas <neabbas@ucsc.edu >
Co-authored-by: Abhijit Ramesh <abhijitramesh2k@gmail.com >
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com >
2025-12-05 12:25:51 -08:00
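The XIELU fix in the squashed commit above turns on the difference between converting a float's numeric value to an integer and preserving its IEEE 754 bit pattern when it travels through a `uint32_t` params buffer. The Python sketch below is an analogue of that distinction using the standard `struct` module. The helper names are hypothetical, and the original fix was made in C++ and WGSL, not Python.

```python
import struct


def numeric_convert(x: float) -> int:
    """Value conversion (what a numeric cast does): truncates toward zero."""
    return int(x)


def float_to_u32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned 32-bit int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def u32_bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a binary32 float (the reader's view)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


param = 1.5

# Wrong: the reader reinterprets the integer 1 as float bits
# and sees a tiny denormal instead of 1.5.
wrong = u32_bits_to_float(numeric_convert(param))

# Right: the bit pattern survives the round trip through the u32 slot.
right = u32_bits_to_float(float_to_u32_bits(param))
```

With `param = 1.5`, the bit pattern is `0x3FC00000`, so `right` comes back as exactly 1.5, while `wrong` is a denormal on the order of 1e-45. That kind of silent corruption is consistent with the incorrect shader values the commit message describes.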
Gabe Goodhart
3143a755c8
docs : update ops.md (Metal, BLAS) ( #17768 )
...
* docs: Regen Metal.csv
Branch: UpdateOpsMd
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com >
* docs: Regen BLAS.csv
Branch: UpdateOpsMd
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com >
* docs: Update ops.md
Branch: UpdateOpsMd
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com >
---------
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com >
2025-12-05 00:55:34 +01:00
Jeff Bolz
9810cb8247
ops.md: update vulkan support ( #17661 )
2025-12-01 15:26:21 -06:00
Giuseppe Scrivano
7d77f07325
vulkan: implement ADD1, ARANGE, FILL, SOFTPLUS, STEP, ROUND, CEIL, FLOOR, TRUNC ( #17319 )
...
* vulkan: initialize array
* vulkan: implement ADD1
* vulkan: implement ARANGE
* vulkan: implement FILL
* vulkan: implement SOFTPLUS
* vulkan: implement STEP
* vulkan: implement ROUND
* vulkan: implement CEIL
* vulkan: implement FLOOR
* vulkan: implement TRUNC
* docs: update Vulkan ops
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com >
2025-11-19 17:29:45 +01:00
Pavels Zaicenkovs
dbed61294a
vulkan: add LOG operation support for F32 and F16 ( #17183 )
...
* vulkan: add LOG operation support for F32 and F16
Part of #14909 .
* vulkan: Fix LOG operation types
* docs: Update operation support documentation for Vulkan LOG operation
* vulkan: fix log_f16 shader
* docs: restore missing LOG test cases and regenerate ops.md
2025-11-16 22:50:09 +01:00
shani-f
72bd7321a7
sycl : unify unary kernels with a generic implementation and enable wide operator support ( #17213 )
...
* SYCL: add generic unary op implementation for multiple ops (ABS/SGN/…); unify non-contiguous access
* SYCL: update documentation and sycl.csv to reflect new unary op support
* update ops.md after syncing SYCL.csv changes
* Fix SYCL.csv merge conflict
* Update ops.md after fixing SYCL.csv conflicts
* Fix SYCL.csv tail after merge conflict and regenerate ops.md
* Fix line endings and final newline in SYCL.csv
* Remove TOPK_MOE entries from SYCL.csv as requested
* Update ops.md after removing TOPK_MOE from SYCL.csv
* Regenerated SYCL.csv and synced ops.md with upstream
* Update ops.md using create_ops_docs.py
2025-11-16 00:52:42 +01:00
Giuseppe Scrivano
1568d13c2c
vulkan: implement ABS and NEG ( #17245 )
...
* docs: update Vulkan ops
* vulkan: add NEG op
* vulkan: add ABS op
---------
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com >
2025-11-15 12:00:29 +01:00
Piotr Wilkin (ilintar)
389ac78b26
ggml : add ops SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM ( #17063 )
...
* Add ops needed for new hybrid models: SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM
* Update ggml/include/ggml.h
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* Update tests/test-backend-ops.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* Code review
* Whitespace
* Update tests/test-backend-ops.cpp
Co-authored-by: Diego Devesa <slarengh@gmail.com >
* This is actually sigmoid, duh.
* Add CONST, remove TRI_KEEP, other changes from review
* Update tests/test-backend-ops.cpp
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* Update ggml/src/ggml.c
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* Update ggml/src/ggml.c
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
* Update ggml/src/ggml-cuda/unary.cu
Co-authored-by: Aman Gupta <amangupta052@gmail.com >
* Remove extra script
* Update ggml/src/ggml.c
Co-authored-by: Diego Devesa <slarengh@gmail.com >
* Update tests/test-backend-ops.cpp
Co-authored-by: Diego Devesa <slarengh@gmail.com >
* moving changes from laptop [no ci]
* pre-rebase
* Update tests/test-backend-ops.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com >
* Update tests/test-backend-ops.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com >
* Refactor tests
* ggml : cleanup
* cont : fix ggml_fill srcs
* tests : add note
* ggml : add ggml_fill_inplace
* ggml : add asserts
* ggml : fix ggml_fill constant cast
* cont : ggml_tri minor
* Use TENSOR_LOCALS
* Fix regression from #14596 , regenerate
* Don't make commits at night...
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com >
Co-authored-by: Diego Devesa <slarengh@gmail.com >
Co-authored-by: Aman Gupta <amangupta052@gmail.com >
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com >
2025-11-13 20:54:47 +02:00
Neo Zhang Jianyu
07751f8d44
update SYCL support OPs ( #17208 )
...
Co-authored-by: Zhang Jianyu <zhang.jianyu@outlook.com >
2025-11-13 08:42:23 +08:00