TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-18 16:55:08 +08:00


William Zhang 2dd3ebf037 [#9150][feat] Add code for nano v3 to custom implementation in AD (#9465)	2025-12-02 08:56:44 -08:00
* Why? We would like to show an alternative to monkey-patching in AutoDeploy.
* What? This commit builds on the existing custom model implementation for NemotronH and adds the bits relevant for MoE layers.
Part of #9150.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
_utils_test	[#9271][perf] Enable multi-stream MOE optimization in AutoDeploy (#9322)	2025-11-24 19:50:10 -08:00
unit	[#9150][feat] Add code for nano v3 to custom implementation in AD (#9465)	2025-12-02 08:56:44 -08:00