Zongfei Jing | bb2f883296 | [None] [feat] Add test script and raster M for gather fc1 kernel (#10429); Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com> | 2026-01-07 09:31:49 +08:00 |
alel | 6b8ae6fa81 | [None][feat] CuteDSL MOE FC1 Enhancement (#10088); Signed-off-by: Yuhan Li <51736452+liyuhannnnn@users.noreply.github.com> | 2026-01-06 09:30:43 +08:00 |
ZhichenJiang | 46e4af5688 | [TRTLLM-9831][perf] Enable 2CTA with autotune for CuteDSL MoE and Grouped GEMM optimizations (#10201); Signed-off-by: zhichen jiang <zhichenj@NVIDIA.com>; Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>; Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> | 2025-12-25 09:04:20 -05:00 |
ZhichenJiang | 4e55b83101 | [None][perf] Add more optimization options for MOE CuteDSL finalized kernel (#10042); Signed-off-by: zhichen jiang <zhichenj@NVIDIA.com> | 2025-12-18 22:49:28 +08:00 |
tburt-nv | 6147452158 | [https://nvbugs/4141427][chore] Add more details to LICENSE file (#9881); Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com> | 2025-12-13 08:35:31 +08:00 |
alel | 4107254c82 | [TRTLLM-6222][feat] Several perf opt for cuteDSL nvf4 gemm (#9428); Signed-off-by: Yuhan Li <51736452+liyuhannnnn@users.noreply.github.com> | 2025-12-01 18:10:45 +08:00 |