TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Enwei Zhu 7cd5a67e25 [TRTLLM-9372][feat] Enable CuteDSL MoE with Large EP (#9592) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-12-05 22:08:52 -08:00
CMakeLists.txt	feat: [Deepseek] Add trtllm-gen MOE FP4 MOE backend (#3387)	2025-04-21 10:01:33 +08:00
DevKernel.cu	[None][feat] TRT-LLM Gen MoE optimize DeepSeek Fp8 activation kernel (#9175)	2025-11-21 15:35:00 +01:00
DevKernel.h	[None][feat] TRT-LLM Gen MoE optimize DeepSeek Fp8 activation kernel (#9175)	2025-11-21 15:35:00 +01:00
IntFastDiv.h	[fix] Fix comment to pass guardwords check (#5191)	2025-06-13 15:49:59 +08:00
RoutingDeepSeek.cu	[TRTLLM-9372][feat] Enable CuteDSL MoE with Large EP (#9592)	2025-12-05 22:08:52 -08:00
RoutingKernel.cuh	[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM (#8880)	2025-11-18 17:40:12 -08:00
RoutingKernel.h	[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM (#8880)	2025-11-18 17:40:12 -08:00
RoutingKernelTopK.cuh	[TRTLLM-8637][feat] Optimize the routing kernel for DeepseekV3 (MoE CUTLASS backend); Add support for KimiK2 and Qwen-next (MoE TRTLLM backend) (#7761)	2025-10-20 10:08:31 +08:00
RoutingLlama4.cu	[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM (#8880)	2025-11-18 17:40:12 -08:00
RoutingRenormalize.cu	[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM (#8880)	2025-11-18 17:40:12 -08:00
runner.cu	[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM (#8880)	2025-11-18 17:40:12 -08:00
runner.h	[None][feat] Update TRTLLM MoE cubins; reduce mxfp4 weight padding requirement; tighten TMA bound (#9025)	2025-11-17 10:04:29 +08:00