| Author | Commit | Message | Date |
|--------|--------|---------|------|
| Bo Li | bf1b958f1a | [TRTLLM-7319][perf] Fuse slicing into MoE. (#6728) (Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>; Signed-off-by: Sergey Klevtsov <sklevtsov@nvidia.com>; Co-authored-by: Sergey Klevtsov <sklevtsov@nvidia.com>) | 2025-08-25 16:52:30 -04:00 |
| NVJiangShao | 2f2f5cc72c | [TRTLLM-6744][feat] Remove input_sf swizzle for module WideEPMoE (#6231) (Signed-off-by: Jiang Shao <91270701+StudyingShao@users.noreply.github.com>) | 2025-08-08 11:13:42 +08:00 |
| Daniel Stokes | ec6c7dff1a | feat: Add support for MXFP8xMXFP4 in pytorch (#5535) (Signed-off-by: Daniel Stokes <40156487+djns99@users.noreply.github.com>) | 2025-07-06 15:32:06 -07:00 |
| Li Min | 16fc99391f | refactor: [TRTLLM-6150] Refactor moe permute and finalize op by removing duplicated code (#5557) (Signed-off-by: Mindy Li <11663212+limin2021@users.noreply.github.com>) | 2025-06-30 08:48:04 -07:00 |
| Enwei Zhu | b4dab23e7b | [TRTLLM-5965] perf: Optimize MoE sort kernels for large-scale EP (#5435) (Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>) | 2025-06-30 01:02:07 +08:00 |
| Li Min | 6021a439ab | Make moe permute and final as custom op (#5412) (Signed-off-by: Mindy Li <11663212+limin2021@users.noreply.github.com>) | 2025-06-27 15:48:33 -07:00 |