TensorRT-LLMs/cpp/tensorrt_llm/plugins
Enwei Zhu 4b82b8b4c7
[TRTLLM-5330] perf: Optimize MoE supplementary kernels for large-scale EP (#5215)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-06-17 15:23:24 +08:00
api
bertAttentionPlugin
common
cpSplitPlugin
cudaStreamPlugin
cumsumLastDimPlugin
doraPlugin
eaglePlugin
fp4GemmPlugin
fp8RowwiseGemmPlugin
fusedLayernormPlugin
gemmAllReducePlugin
gemmPlugin
gemmSwigluPlugin
gptAttentionCommon
gptAttentionPlugin
identityPlugin
layernormQuantizationPlugin
lookupPlugin
loraPlugin
lowLatencyGemmPlugin
lowLatencyGemmSwigluPlugin
lruPlugin
mambaConv1dPlugin
mixtureOfExperts
ncclPlugin
qserveGemmPlugin
quantizePerTokenPlugin
quantizeTensorPlugin
quantizeToFP4Plugin
rmsnormQuantizationPlugin
selectiveScanPlugin
smoothQuantGemmPlugin
topkLastDimPlugin
weightOnlyGroupwiseQuantMatmulPlugin
weightOnlyQuantMatmulPlugin
CMakeLists.txt
exports.def
exports.map