kanshan / TensorRT-LLMs
Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-02-02 01:01:35 +08:00)
TensorRT-LLMs / cpp / tensorrt_llm / plugins (at commit 1e4fa13d33)
Latest commit: 4b82b8b4c7 by Enwei Zhu
[TRTLLM-5330] perf: Optimize MoE supplementary kernels for large-scale EP (#5215)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-06-17 15:23:24 +08:00
api
bertAttentionPlugin
common
cpSplitPlugin
cudaStreamPlugin
cumsumLastDimPlugin
doraPlugin
eaglePlugin
fp4GemmPlugin
fp8RowwiseGemmPlugin
fusedLayernormPlugin
gemmAllReducePlugin
gemmPlugin
gemmSwigluPlugin
gptAttentionCommon
gptAttentionPlugin
identityPlugin
layernormQuantizationPlugin
lookupPlugin
loraPlugin
lowLatencyGemmPlugin
lowLatencyGemmSwigluPlugin
lruPlugin
mambaConv1dPlugin
mixtureOfExperts
ncclPlugin
qserveGemmPlugin
quantizePerTokenPlugin
quantizeTensorPlugin
quantizeToFP4Plugin
rmsnormQuantizationPlugin
selectiveScanPlugin
smoothQuantGemmPlugin
topkLastDimPlugin
weightOnlyGroupwiseQuantMatmulPlugin
weightOnlyQuantMatmulPlugin
CMakeLists.txt
exports.def
exports.map