TensorRT-LLMs/cpp/tensorrt_llm/kernels/fusedLayernormKernels
Yuan Tong 32b244af38
feat: reduce unnecessary kernel generation (#5476)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-07-04 14:37:49 +08:00
File                        Last commit                                                    Date
CMakeLists.txt              Update TensorRT-LLM (#2820)                                    2025-02-25 21:21:49 +08:00
fp4_converter.cuh           feat: Add Mixture of Experts FP8xMXFP4 support (#4750)         2025-06-09 13:25:04 +08:00
layernorm_param.h           Update TensorRT-LLM (#2783)                                    2025-02-13 18:40:22 +08:00
low_latency_layernorm.cuh   feat: reduce unnecessary kernel generation (#5476)             2025-07-04 14:37:49 +08:00
ws_layernorm_fp4_traits.cu  opensource: Opensource MOE MXFP8-MXFP4 implementation (#5222)  2025-06-26 12:18:19 +08:00
ws_layernorm.cuh            feat: reduce unnecessary kernel generation (#5476)             2025-07-04 14:37:49 +08:00
ws_layernorm.h              Update TensorRT-LLM (#2783)                                    2025-02-13 18:40:22 +08:00