Commit Graph

11 Commits

Author SHA1 Message Date
Julien Debache
6bddaf6df6
chore: Improve documentation of Kv_block_array (#5765)
Signed-off-by: Julien Debache <julien.debache@hotmail.com>
2025-07-05 22:25:27 +02:00
qsang-nv
e9cd810071
keep sm90 headsize 128 cubins (#5320)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
2025-06-26 12:14:01 +08:00
qsang-nv
faca19c2f0
update setup.py for special cases (#5227)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
2025-06-17 16:41:07 +08:00
qsang-nv
5a01ba5260
use cu for fmha_v2 (#4694)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
2025-06-15 18:40:44 +08:00
qsang-nv
180b91f957
update fmha_v2 (#4895)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
2025-06-05 22:14:28 +08:00
Perkz Zheng
40a7161f4f
fix: fmha_v2 compilation (#4659)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
2025-05-27 17:39:39 +08:00
qsang-nv
157fe62965
fix fmha v2 tests (#4661)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
2025-05-27 09:47:01 +08:00
nv-guomingz
3549b68c1c
chore: clean useless flag (#4567)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-05-23 07:05:15 +08:00
Perkz Zheng
6a35c599ef
Clean: fmha codes (#4496)
clean codes

Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
2025-05-21 11:45:47 +08:00
Perkz Zheng
1c5b0d6a13
[Feat] add chunked-attention kernels on Hopper (for llama4) (#4291)
* update cubins
* add mtp for fmha_v2 MLA kernels and add chunked-attention support for hopper fmha kernels

---------

Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
2025-05-19 09:57:10 -07:00
qsang-nv
0fd59d64ab
infra: open source fmha v2 kernels (#4185)
* add fmha repo
* fix format
* fix code style
* fix header
* fix header kernel_traits.h
* add .gitignore file
* add SLIDING_WINDOW_ATTENTION
* fix style
* fix format
* update setup.py
* update build_wheel.py

---------

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
Signed-off-by: qsang-nv <200703406+qsang-nv@users.noreply.github.com>
2025-05-15 10:56:34 +08:00