TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Min Yu 9cae7277ea [https://nvbugs/5726962][feat] Apply fusion for W4AFP8_AWQ MoE (#9838) Signed-off-by: Min Yu <171526537+yumin066@users.noreply.github.com> Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com> Co-authored-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>	2026-01-06 10:16:41 +08:00
allreduce_gemm_runner.h	[None][fix] Introduce inline namespace to avoid symbol collision (#9541)	2025-12-12 23:32:15 +08:00
common.h	[None][fix] Introduce inline namespace to avoid symbol collision (#9541)	2025-12-12 23:32:15 +08:00
cutlass_kernel_selector.h	opensource: Opensource MOE MXFP8-MXFP4 implementation (#5222)	2025-06-26 12:18:19 +08:00
fp4_gemm.h	[None][fix] Introduce inline namespace to avoid symbol collision (#9541)	2025-12-12 23:32:15 +08:00
low_latency_gemm.h	[None][fix] Introduce inline namespace to avoid symbol collision (#9541)	2025-12-12 23:32:15 +08:00
moe_gemm_kernels.h	[None][fix] Introduce inline namespace to avoid symbol collision (#9541)	2025-12-12 23:32:15 +08:00
moe_kernels.h	[https://nvbugs/5726962][feat] Apply fusion for W4AFP8_AWQ MoE (#9838)	2026-01-06 10:16:41 +08:00
moe_util_kernels.h	[None][fix] Introduce inline namespace to avoid symbol collision (#9541)	2025-12-12 23:32:15 +08:00