TensorRT-LLMs/cpp/tensorrt_llm/kernels/decoderMaskedMultiheadAttention
Kaiyu Xie 4de32a86ae
Update TensorRT-LLM (#188)
* Update batch manager
* Update src

---------

Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: jdemouth-nvidia <11447840+jdemouth-nvidia@users.noreply.github.com>
2023-10-30 16:06:41 +08:00
copy_cu.py Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention32_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention32_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention32_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention48_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention48_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention48_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention64_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention64_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention64_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention80_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention80_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention80_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention96_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention96_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention96_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention112_bf16.cu Kaiyu/update main (#5) 2023-10-18 22:38:53 +08:00
decoderMaskedMultiheadAttention112_float.cu Kaiyu/update main (#5) 2023-10-18 22:38:53 +08:00
decoderMaskedMultiheadAttention112_half.cu Kaiyu/update main (#5) 2023-10-18 22:38:53 +08:00
decoderMaskedMultiheadAttention128_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention128_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention128_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention144_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention144_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention144_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention160_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention160_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention160_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention192_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention192_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention192_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention224_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention224_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention224_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention256_bf16.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention256_float.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention256_half.cu Initial commit 2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttentionLaunch.h Kaiyu/update main (#5) 2023-10-18 22:38:53 +08:00
decoderMaskedMultiheadAttentionTemplate.h Update TensorRT-LLM (#188) 2023-10-30 16:06:41 +08:00
mmha_notes.md Initial commit 2023-09-20 00:29:41 -07:00