TensorRT-LLMs/tensorrt_llm/_torch/custom_ops
Chang Liu 5f737b8dbe
[None][perf] Use fp8 quant kernel in DS3.2 indexer module (#8701)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-10-29 12:45:09 +08:00
File | Last commit | Date
__init__.py | [None][feat] Support Qwen3 next (#7892) | 2025-09-29 21:16:07 +08:00
cpp_custom_ops.py | [None][perf] Use fp8 quant kernel in DS3.2 indexer module (#8701) | 2025-10-29 12:45:09 +08:00
cute_dsl_custom_ops.py | [TRTLLM-6898][feat] Add swapab, tileN64, cga sync support for cute dsl nvfp4 gemm (#7764) | 2025-09-18 21:20:04 +08:00
flashinfer_custom_ops.py | [None][feat] Support Qwen3 next (#7892) | 2025-09-29 21:16:07 +08:00
torch_custom_ops.py | [TRTLLM-7318][feat] MnnvlThroughput AlltoAll implementation. (#7499) | 2025-10-27 13:23:06 -04:00
trtllm_gen_custom_ops.py | [None][feat] Update TRTLLM MoE MxFP4 cubins; autotune tileN (#8156) | 2025-10-23 09:14:18 +08:00
userbuffers_custom_ops.py | feat: Introduce UB allocator for pytorch flow (#3257) | 2025-04-08 18:39:49 +08:00