TensorRT-LLMs/cpp/tensorrt_llm/plugins
Tracin 6c91f1c7ac
Mxfp8xmxfp4 quant mode(#4978)
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-06-10 22:01:37 +08:00
api
bertAttentionPlugin
common
cpSplitPlugin
cudaStreamPlugin
cumsumLastDimPlugin
doraPlugin
eaglePlugin
fp4GemmPlugin
fp8RowwiseGemmPlugin Mxfp8xmxfp4 quant mode(#4978) 2025-06-10 22:01:37 +08:00
fusedLayernormPlugin
gemmAllReducePlugin
gemmPlugin
gemmSwigluPlugin Mxfp8xmxfp4 quant mode(#4978) 2025-06-10 22:01:37 +08:00
gptAttentionCommon
gptAttentionPlugin
identityPlugin
layernormQuantizationPlugin
lookupPlugin
loraPlugin
lowLatencyGemmPlugin
lowLatencyGemmSwigluPlugin
lruPlugin
mambaConv1dPlugin
mixtureOfExperts feat: Add Mixture of Experts FP8xMXFP4 support (#4750) 2025-06-09 13:25:04 +08:00
ncclPlugin
qserveGemmPlugin
quantizePerTokenPlugin
quantizeTensorPlugin
quantizeToFP4Plugin
rmsnormQuantizationPlugin
selectiveScanPlugin
smoothQuantGemmPlugin Mxfp8xmxfp4 quant mode(#4978) 2025-06-10 22:01:37 +08:00
topkLastDimPlugin
weightOnlyGroupwiseQuantMatmulPlugin chore: Mass integration of release/0.20. (#4871) 2025-06-04 14:12:27 +08:00
weightOnlyQuantMatmulPlugin
CMakeLists.txt
exports.def
exports.map