TensorRT-LLMs/cpp/tensorrt_llm/plugins
DylanChen-NV 74dca0aa7b
[NVBUG-5304516/5319741]Qwen2.5VL FP8 support (#5029)
Signed-off-by: Dylan Chen <191843203+DylanChen-NV@users.noreply.github.com>
2025-07-09 23:16:42 +08:00
api
bertAttentionPlugin
common
cpSplitPlugin
cudaStreamPlugin
cumsumLastDimPlugin
doraPlugin
eaglePlugin
fp4GemmPlugin
fp8RowwiseGemmPlugin
fusedLayernormPlugin
gemmAllReducePlugin Fix GEMM+AR fusion on blackwell (#5563) 2025-07-09 08:48:47 +08:00
gemmPlugin [NVBUG-5304516/5319741]Qwen2.5VL FP8 support (#5029) 2025-07-09 23:16:42 +08:00
gemmSwigluPlugin
gptAttentionCommon
gptAttentionPlugin
identityPlugin
layernormQuantizationPlugin
lookupPlugin
loraPlugin
lowLatencyGemmPlugin
lowLatencyGemmSwigluPlugin
lruPlugin
mambaConv1dPlugin
mixtureOfExperts
ncclPlugin
qserveGemmPlugin
quantizePerTokenPlugin
quantizeTensorPlugin
quantizeToFP4Plugin
rmsnormQuantizationPlugin
selectiveScanPlugin
smoothQuantGemmPlugin
topkLastDimPlugin
weightOnlyGroupwiseQuantMatmulPlugin
weightOnlyQuantMatmulPlugin
CMakeLists.txt
exports.def
exports.map