cudaCoreGemm
[NVBUG-5304516/5319741]Qwen2.5VL FP8 support ( #5029 )
2025-07-09 23:16:42 +08:00
fused_gated_gemm
[None][chroe] Rename TensorRT-LLM to TensorRT LLM for source code. ( #7851 )
2025-09-25 21:02:35 +08:00
routing
[None][feat] Add routing support for the new model for both cutlass and trtllm moe backend ( #9792 )
2025-12-15 19:59:08 -08:00
sampling
[None][feat] Support ignored prompt length for penalties via new sampling config parameter ( #8127 )
2025-10-27 13:12:31 -04:00
smoothQuant
[None] [feat] Add model gpt-oss ( #6645 )
2025-08-07 03:04:18 -04:00
weightOnly
[feat] Optimizations on weight-only batched gemv kernel ( #5420 )
2025-06-30 10:20:16 +08:00
banRepeatNGramsKernelsTest.cpp
chore: remove usernames from comments ( #3291 )
2025-04-05 13:44:28 +08:00
CMakeLists.txt
[ #8476 ][chore] Update license ( #8807 )
2025-11-19 15:05:25 -08:00
decodingKernelTest.cpp
refactor: Clean up DecodingInput and DecodingOutput ( #5617 )
2025-07-01 14:31:42 +02:00
eaglePackDataTest.cpp
[None] [ci] Reorganize CMake and Python integration test infrastructure for C++ tests ( #6754 )
2025-08-24 20:53:17 +02:00
fusedMoeCommKernelTest.cpp
[TRTLLM-6876][feat] Add low precision all2all for mnnvl ( #7155 )
2025-08-28 18:26:16 +08:00
logitsBitmaskTest.cpp
Update TensorRT-LLM ( #2755 )
2025-02-11 03:01:00 +00:00
mixtureOfExpertsTest.cu
[ https://nvbugs/5726962 ][feat] Apply fusion for W4AFP8_AWQ MoE ( #9838 )
2026-01-06 10:16:41 +08:00
mlaChunkedPrefillTest.cu
[TRTLLM-7192][feat] optimize MLA chunked prefill && support fp8 mla chunked prefill ( #7477 )
2025-09-15 21:43:49 +08:00
mlaPreprocessTest.cu
[None][feat] Use Separate QKV Input Layout for Context MLA ( #6538 )
2025-08-19 22:04:48 +08:00
moeLoadBalanceKernelTest.cpp
[None] [ci] Reorganize CMake and Python integration test infrastructure for C++ tests ( #6754 )
2025-08-24 20:53:17 +02:00
prepareCustomMaskTest.cpp
[TRTLLM-8778][feat] Add tree attention support for blackwell arch ( #8975 )
2025-11-17 09:01:53 +08:00
ropeTest.cu
[None][feat] Support NVFP4 KV Cache ( #6244 )
2025-09-01 09:24:52 +08:00
shiftKCacheKernelTest.cu
Update TensorRT-LLM ( #2755 )
2025-02-11 03:01:00 +00:00
sparseAttentionKernelsTest.cpp
[None] [feat] Use triton kernels for RocketKV prediction module ( #8682 )
2025-11-13 18:51:09 -08:00
sparseKvCacheTest.cu
[TRTLLM-8536][feat] Add the sparse attention framework and one use case--RocketKV support ( #8086 )
2025-10-14 08:23:16 -07:00
stopCriteriaKernelsTest.cpp
chore: remove usernames from comments ( #3291 )
2025-04-05 13:44:28 +08:00