Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-22 11:42:41 +08:00)
* feat: use NVRTC for DeepGEMM JIT compilation
* fix: add license
* feat: store NVRTC JIT results in memory by default
* feat: refinement
* feat: refinement
* test: set timeout to 7200

Signed-off-by: Zihua Wu

| Name | | |
|---|---|---|
| allreduce_gemm | ||
| fp8_blockscale_gemm | ||
| fp8_rowwise_gemm | ||
| fpA_intB_gemm | ||
| fused_gated_gemm | ||
| int8_gemm | ||
| python | ||
| CMakeLists.txt | ||
| cutlass_heuristic.cpp | ||
| cutlass_heuristic.h | ||
| cutlass_preprocessors.cpp | ||
| cutlass_preprocessors.h | ||
| cutlass_type_conversion.h | ||