ChristinaZ | dff77efa2a | [None][feat] Add routing support for the new model for both cutlass and trtllm moe backend (#9792) Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com> | 2025-12-15 19:59:08 -08:00 |
Enwei Zhu | 7cd5a67e25 | [TRTLLM-9372][feat] Enable CuteDSL MoE with Large EP (#9592) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> | 2025-12-05 22:08:52 -08:00 |
Enwei Zhu | 13fbd4366a | [TRTLLM-9370][feat] Integration of CuteDSL NVFP4 grouped GEMM (Part 2: SwiGLU Fusion and Finalize Fusion) (#9288) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> | 2025-11-21 14:03:38 -08:00 |
ChristinaZ | fbf6c16cd2 | [None][fix] Update the default invalid value for deepseek mode of routing (#9222) Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com> | 2025-11-19 10:14:06 +08:00 |
Enwei Zhu | 7c4777a571 | [TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM (#8880) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> | 2025-11-18 17:40:12 -08:00 |
dongxuy04 | a370643b26 | [None][fix] support topk autotuner input for expert slot per group larger than 32 (#9087) Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com> | 2025-11-14 08:37:20 +08:00 |
ChristinaZ | c8b9998acb | [TRTLLM-8637][feat] Optimize the routing kernel for DeepseekV3 (MoE CUTLASS backend); Add support for KimiK2 and Qwen-next (MoE TRTLLM backend) (#7761) Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com> | 2025-10-20 10:08:31 +08:00 |
ChristinaZ | db1c271bc6 | [None][feat] Revise the calculation related to TileN in routing of MOE TRTLLM backend (#8148) Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com> | 2025-10-16 09:15:46 +08:00 |
ChristinaZ | be576a3152 | [None] [feat] Enable run_post_quant_allgather for MoE TRTLLM backend (#6794) Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com> | 2025-09-23 08:24:21 +08:00 |
ChristinaZ | c5fb692a7d | Refactor the rest routing part for the routing kernels in the MoE TRT-LLM backend (#5771) Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com> | 2025-07-11 16:37:56 +08:00 |