TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Kaiyu Xie 6755a3f077 Update TensorRT-LLM (#422)	2023-11-18 00:05:54 +08:00
Co-authored-by: Tltin <TltinDeng01@gmail.com>, zhaohb <zhaohbcloud@126.com>, Bradley Heilbrun <brad@repl.it>, nqbao11 <nqbao11.01@gmail.com>, Nikhil Varghese <nikhil@bot-it.ai>
copy_cu.py	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention32_bf16.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention32_float.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention32_half.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention48_bf16.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention48_float.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention48_half.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention64_bf16.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention64_float.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention64_half.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention80_bf16.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention80_float.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention80_half.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention96_bf16.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention96_float.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention96_half.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention112_bf16.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention112_float.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention112_half.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention128_bf16.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention128_float.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention128_half.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention144_bf16.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention144_float.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention144_half.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention160_bf16.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention160_float.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention160_half.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention192_bf16.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention192_float.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention192_half.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention224_bf16.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention224_float.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention224_half.cu	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00
decoderMaskedMultiheadAttention256_bf16.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention256_float.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttention256_half.cu	Initial commit	2023-09-20 00:29:41 -07:00
decoderMaskedMultiheadAttentionLaunch.h	Update TensorRT-LLM (#422)	2023-11-18 00:05:54 +08:00
decoderMaskedMultiheadAttentionTemplate.h	Update TensorRT-LLM (#422)	2023-11-18 00:05:54 +08:00
mmha_notes.md	Initial commit	2023-09-20 00:29:41 -07:00