TensorRT-LLMs/cpp/tensorrt_llm/kernels/trtllmGenKernels/fmha/cubin
Kaiyu Xie 2ea17cdad2
Update TensorRT-LLM (#2792)

Co-authored-by: jlee <jungmoolee@clika.io>
2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H32LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H64LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QBfloat16KvBfloat16AccFp32OBfloat16H128LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OBfloat16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H32LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H64LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE2m1H128LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H32LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H64LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OE4m3H128LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QE4m3KvE4m3AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H32LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H64LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseMultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskDenseVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutContiguousKvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPackedQkvMaskCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPackedQkvMaskDenseVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPackedQkvMaskSlidingWindowCausalVarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP32VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP64VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128MultiCtasKvModeVarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ8StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16PersistentSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ16StaticSwapsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128PersistentSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticKeepsMmaAbForGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskDenseP128VarSeqLenTileSizeQ128StaticSpecDecodingGeneration_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskSlidingWindowCausalP32VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskSlidingWindowCausalP64VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128PersistentContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
FmhaSm100Kernel_QFp16KvFp16AccFp32OFp16H128LayoutPagedKvMaskSlidingWindowCausalP128VarSeqLenTileSizeQ128StaticContext_cubin.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
kernelMetaInfo.h Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00