9 Commits

Author SHA1 Message Date
lyd1992 f351455f0f [CPU][RISC-V] Add RVV-optimized attention kernels for RISC-V Vector Extension (#40119)
Signed-off-by: liuyudong <liuyudong@iscas.ac.cn>
Co-authored-by: Claude <noreply@anthropic.com>
2026-05-15 12:08:23 +08:00
Akash kaothalkar 420b0a5c95 [Hardware][Power]Add Power VSX Attention Backend and fix l2 Cache Crash (#40451)
Signed-off-by: Akash Kaothalkar <akashkaothalkar@akashs-mbp.bl1-in.ibm.com>
Signed-off-by: Akash Kaothalkar <akash.kaothalkar@ibm.com>
Signed-off-by: Akash kaothalkar <akash.kaothalkar@ibm.com>
Co-authored-by: Akash Kaothalkar <akashkaothalkar@akashs-mbp.bl1-in.ibm.com>
Co-authored-by: Akash Kaothalkar <akash.kaothalkar@ibm.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2026-05-04 20:51:09 -07:00
Tianmu Li 22524f7a92 [Feat] CPU fp8 attn for AMX/AVX-512 (#39445)
Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2026-04-29 20:43:21 +08:00
R3hankhan 34ce0ffd1f [CPU][Perf] Accelerate Attention head for s390x using vector intrinsics (#34434)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2026-02-24 07:25:39 -08:00
R3hankhan 4dffc5e044 [CPU] Split attention dispatch by head_dim alignment (#32161)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
2026-02-03 19:37:15 -08:00
R3hankhan 8e27663b6a [CPU] Add head sizes 80 and 112 with vec16 fallback (#31968)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
2026-01-09 22:14:46 +08:00
Aditya Tewari cebda2a4af [CPU] Support for Whisper (#30062)
Signed-off-by: Aditya Tewari <aditya.tewari@arm.com>
2025-12-10 04:58:42 -08:00
Fadi Arafeh 730bd35378 [perf][cpu] Accelerate paged attention GEMMs (QK, PV) on Arm CPUs with NEON (#29193)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
2025-11-22 09:04:36 -08:00
Li, Jiang 7f829be7d3 [CPU] Refactor CPU attention backend (#27954)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-11-12 09:43:06 +08:00