| Name | Last commit | Date |
| --- | --- | --- |
| fmha | [TRTLLM-6674][feat] (Breaking Change) Hopper SWA non-cyclic kernels + KV reuse + Spec Dec (#6379) | 2025-08-05 07:47:41 +00:00 |
| convert.cu | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_demo_bert_params.h | hopper-style context MLA (#5713) | 2025-07-23 14:37:20 +08:00 |
| fused_multihead_attention_kernel_1xN_multi_cta.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_kernel_1xN_noloop.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_kernel_1xN.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_kernel_2x2.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_kernel_4x1_hopper_noloop.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_kernel_4x1_hopper.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_kernel_4xN_hopper_noloop.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_kernel_4xN_hopper.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_kernel.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_attention_utils.h | hopper-style context MLA (#5713) | 2025-07-23 14:37:20 +08:00 |
| fused_multihead_attention.cpp | [TRTLLM-6674][feat] (Breaking Change) Hopper SWA non-cyclic kernels + KV reuse + Spec Dec (#6379) | 2025-08-05 07:47:41 +00:00 |
| fused_multihead_attention.h | hopper-style context MLA (#5713) | 2025-07-23 14:37:20 +08:00 |
| fused_multihead_cross_attention_kernel_1xN_noloop.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_cross_attention_kernel_1xN.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_cross_attention.cpp | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_cross_attention.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| fused_multihead_flash_attention_kernel_noloop_tiled.h | [TRTLLM-6674][feat] (Breaking Change) Hopper SWA non-cyclic kernels + KV reuse + Spec Dec (#6379) | 2025-08-05 07:47:41 +00:00 |
| fused_multihead_flash_attention_kernel_noloop.h | [TRTLLM-6674][feat] (Breaking Change) Hopper SWA non-cyclic kernels + KV reuse + Spec Dec (#6379) | 2025-08-05 07:47:41 +00:00 |
| fused_multihead_flash_attention_kernel.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| softmax_bf16.cu | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| softmax_fp8.cu | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| softmax_fp16.cu | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| softmax_fp32.cu | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| softmax_impl.h | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |
| softmax_int8.cu | infra: open source fmha v2 kernels (#4185) | 2025-05-15 10:56:34 +08:00 |