Julien Debache
|
6bddaf6df6
|
chore: Improve documentation of Kv_block_array (#5765)
Signed-off-by: Julien Debache <julien.debache@hotmail.com>
|
2025-07-05 22:25:27 +02:00 |
|
qsang-nv
|
e9cd810071
|
keep sm90 headsize 128 cubins (#5320)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
|
2025-06-26 12:14:01 +08:00 |
|
qsang-nv
|
faca19c2f0
|
update setup.py for special cases (#5227)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
|
2025-06-17 16:41:07 +08:00 |
|
qsang-nv
|
5a01ba5260
|
use cu for fmha_v2 (#4694)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
|
2025-06-15 18:40:44 +08:00 |
|
qsang-nv
|
180b91f957
|
update fmha_v2 (#4895)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
|
2025-06-05 22:14:28 +08:00 |
|
Perkz Zheng
|
40a7161f4f
|
fix: fmha_v2 compilation (#4659)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
|
2025-05-27 17:39:39 +08:00 |
|
qsang-nv
|
157fe62965
|
fix fmha v2 tests (#4661)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
|
2025-05-27 09:47:01 +08:00 |
|
nv-guomingz
|
3549b68c1c
|
chore: clean useless flag (#4567)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-05-23 07:05:15 +08:00 |
|
Perkz Zheng
|
6a35c599ef
|
Clean: fmha codes (#4496)
clean codes
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
|
2025-05-21 11:45:47 +08:00 |
|
Perkz Zheng
|
1c5b0d6a13
|
[Feat] add chunked-attention kernels on Hopper (for llama4) (#4291)
* update cubins
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
* add mtp for fmha_v2 MLA kernels and add chunked-attention support for hopper fmha kernels
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
---------
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
|
2025-05-19 09:57:10 -07:00 |
|
qsang-nv
|
0fd59d64ab
|
infra: open source fmha v2 kernels (#4185)
* add fmha repo
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix format
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix code style
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix header
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix header kernel_traits.h
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* add .gitignore file
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* add SLIDING_WINDOW_ATTENTION
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix style
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix format
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* update setup.py
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* update build_wheel.py
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
---------
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
Signed-off-by: qsang-nv <200703406+qsang-nv@users.noreply.github.com>
|
2025-05-15 10:56:34 +08:00 |
|