kanshan/TensorRT-LLMs
mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-01-14 06:27:45 +08:00
41ef1ade19
TensorRT-LLMs/cpp/tensorrt_llm/plugins
DylanChen-NV
74dca0aa7b
[NVBUG-5304516/5319741]Qwen2.5VL FP8 support (#5029)
...
Signed-off-by: Dylan Chen <191843203+DylanChen-NV@users.noreply.github.com>
2025-07-09 23:16:42 +08:00
api
bertAttentionPlugin
common
cpSplitPlugin
cudaStreamPlugin
cumsumLastDimPlugin
doraPlugin
eaglePlugin
fp4GemmPlugin
fp8RowwiseGemmPlugin
fusedLayernormPlugin
gemmAllReducePlugin
Fix GEMM+AR fusion on blackwell (#5563)
2025-07-09 08:48:47 +08:00
gemmPlugin
[NVBUG-5304516/5319741]Qwen2.5VL FP8 support (#5029)
2025-07-09 23:16:42 +08:00
gemmSwigluPlugin
gptAttentionCommon
gptAttentionPlugin
identityPlugin
layernormQuantizationPlugin
lookupPlugin
loraPlugin
lowLatencyGemmPlugin
lowLatencyGemmSwigluPlugin
lruPlugin
mambaConv1dPlugin
mixtureOfExperts
ncclPlugin
qserveGemmPlugin
quantizePerTokenPlugin
quantizeTensorPlugin
quantizeToFP4Plugin
rmsnormQuantizationPlugin
selectiveScanPlugin
smoothQuantGemmPlugin
topkLastDimPlugin
weightOnlyGroupwiseQuantMatmulPlugin
weightOnlyQuantMatmulPlugin
CMakeLists.txt
exports.def
exports.map