| Name | Last commit | Date |
| --- | --- | --- |
| api | Update TensorRT-LLM (#2783) | 2025-02-13 18:40:22 +08:00 |
| bertAttentionPlugin | Support RingAttention in the BertAttention plugin and the DiT model (#3661) | 2025-05-09 08:06:54 +08:00 |
| common | feat: support add internal cutlass kernels as subproject (#3658) | 2025-05-06 11:35:07 +08:00 |
| cpSplitPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| cudaStreamPlugin | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| cumsumLastDimPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| doraPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| eaglePlugin | fix: Eagle decoding in TRT flow (#4229) | 2025-05-14 16:10:49 +02:00 |
| fp4GemmPlugin | feat: support add internal cutlass kernels as subproject (#3658) | 2025-05-06 11:35:07 +08:00 |
| fp8RowwiseGemmPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| fusedLayernormPlugin | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| gemmAllReducePlugin | feat: support add internal cutlass kernels as subproject (#3658) | 2025-05-06 11:35:07 +08:00 |
| gemmPlugin | feat: Add FP8 support for SM 120 (#3248) | 2025-04-14 16:05:41 -07:00 |
| gemmSwigluPlugin | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| gptAttentionCommon | fix: fix for cp > kvHeadNum (#3002) | 2025-03-26 12:39:02 +08:00 |
| gptAttentionPlugin | Feat: Variable-Beam-Width-Search (VBWS) part3 (#3338) | 2025-04-08 23:51:27 +08:00 |
| identityPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| layernormQuantizationPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| lookupPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| loraPlugin | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| lowLatencyGemmPlugin | feat: support add internal cutlass kernels as subproject (#3658) | 2025-05-06 11:35:07 +08:00 |
| lowLatencyGemmSwigluPlugin | feat: support add internal cutlass kernels as subproject (#3658) | 2025-05-06 11:35:07 +08:00 |
| lruPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| mambaConv1dPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| mixtureOfExperts | [perf] Reduce the workspace size of FP4 activation scales for MoE (#4303) | 2025-05-30 09:03:52 +08:00 |
| ncclPlugin | refactor: Introduce MpiTag enumeration and update MPI function signatures (#3893) | 2025-05-04 13:24:29 +02:00 |
| qserveGemmPlugin | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| quantizePerTokenPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| quantizeTensorPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| quantizeToFP4Plugin | update FP4 quantize layout (#3045) | 2025-04-03 13:13:54 -04:00 |
| rmsnormQuantizationPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| selectiveScanPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| smoothQuantGemmPlugin | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| topkLastDimPlugin | Update TensorRT-LLM (#2873) | 2025-03-11 21:13:42 +08:00 |
| weightOnlyGroupwiseQuantMatmulPlugin | chore: Mass integration of release/0.20. (#4871) | 2025-06-04 14:12:27 +08:00 |
| weightOnlyQuantMatmulPlugin | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| CMakeLists.txt | feat: support add internal cutlass kernels as subproject (#3658) | 2025-05-06 11:35:07 +08:00 |
| exports.def | Update | 2023-10-10 23:22:17 -07:00 |
| exports.map | Update TensorRT-LLM (#1530) | 2024-04-30 17:19:10 +08:00 |