Wanli Jiang | 421eb9e39c | [None][feat] Optimize NemotronH model with elementwise and nvfp4 fusion (#11273) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com> | 2026-02-12 09:25:31 -05:00 |
Iman Tabrizian | 7d992972b2 | [TRTLLM-10273][feat] Move MambaCacheManager from Python to C++ (#10540) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com> | 2026-02-10 07:20:56 -08:00 |
Tailing Yuan | 91528365a9 | [None][feat] Add performance alignment to layer-wise benchmarks (#11018) Signed-off-by: Tailing Yuan <yuantailing@gmail.com> | 2026-01-29 14:01:51 +08:00 |
Enwei Zhu | 72ef732bcf | [TRTLLM-10147][perf] Balanced random MoE workload generator for CuteDSL kernel UT, autotuner and layerwise benchmark (#10279) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> | 2026-01-25 21:02:30 +08:00 |
Tailing Yuan | 38296a472b | [None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562) Signed-off-by: Tailing Yuan <yuantailing@gmail.com> | 2026-01-13 19:17:03 +08:00 |