TensorRT-LLMs/cpp/tensorrt_llm/kernels/flashMLA
yunruis 30c5b4183a
refactoring: port customized kernels with public cutlass version (#5027)
Signed-off-by: yunruis 

Merge this to unblock others since the full CI has been run through
2025-06-13 16:19:31 +08:00
CMakeLists.txt Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
flash_fwd_mla_bf16_sm90.cu refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
flash_fwd_mla_fp8_sm90.cu refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
flash_fwd_mla_fp16_sm90.cu refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
flash_fwd_mla_kernel.h refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
flash_fwd_mla_metadata.cu refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
flash_mla.h refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
fp8_transpose_v.h refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
named_barrier.h refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
softmax.h refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
static_switch.h refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00
utils.h refactoring: port customized kernels with public cutlass version (#5027) 2025-06-13 16:19:31 +08:00