TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-18 16:55:08 +08:00

William Zhang 2dd3ebf037 [#9150][feat] Add code for nano v3 to custom implementation in AD (#9465)	2025-12-02 08:56:44 -08:00

* Why? We would like to show an alternative to monkey-patching in AutoDeploy.
* What? This commit builds on the existing custom model implementation for NemotronH and adds the bits relevant for MoE layers.

Part of #9150.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>

multigpu	[#9198][feat] Refactor dist ops in AutoDeploy (#9301)	2025-12-02 02:36:32 +08:00
singlegpu	[#9150][feat] Add code for nano v3 to custom implementation in AD (#9465)	2025-12-02 08:56:44 -08:00