TensorRT-LLMs/cpp/tensorrt_llm/kernels/decoderMaskedMultiheadAttention/instantiation
Kaiyu Xie db4edea1e1
Update TensorRT-LLM (#1763)
* Update TensorRT-LLM

---------

Co-authored-by: Kota Tsuyuzaki <bloodeagle40234@gmail.com>
Co-authored-by: Pzzzzz <hello-cd.plus@hotmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
2024-06-11 16:59:02 +08:00
decoderMaskedMultiheadAttention32_bf16_implicit_relative_attn.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention32_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention32_float_implicit_relative_attn.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention32_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention32_half_implicit_relative_attn.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention32_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention48_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention48_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention48_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention64_bf16_implicit_relative_attn.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention64_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention64_float_implicit_relative_attn.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention64_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention64_half_implicit_relative_attn.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention64_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention80_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention80_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention80_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention96_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention96_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention96_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention104_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention104_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention104_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention112_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention112_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention112_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention128_bf16_block_sparse_attn.cu Update TensorRT-LLM (#1763) 2024-06-11 16:59:02 +08:00
decoderMaskedMultiheadAttention128_bf16_implicit_relative_attn.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention128_bf16_qk_tanh_scale.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention128_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention128_float_block_sparse_attn.cu Update TensorRT-LLM (#1763) 2024-06-11 16:59:02 +08:00
decoderMaskedMultiheadAttention128_float_implicit_relative_attn.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention128_float_qk_tanh_scale.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention128_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention128_half_block_sparse_attn.cu Update TensorRT-LLM (#1763) 2024-06-11 16:59:02 +08:00
decoderMaskedMultiheadAttention128_half_implicit_relative_attn.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention128_half_qk_tanh_scale.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention128_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention144_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention144_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention144_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention160_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention160_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention160_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention192_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention192_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention192_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention224_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention224_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention224_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention256_bf16.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention256_float.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00
decoderMaskedMultiheadAttention256_half.cu Update TensorRT-LLM (#1688) 2024-05-28 20:07:49 +08:00