TensorRT-LLMs/tests/unittest/_torch/auto_deploy/unit
William Zhang a4049fc557
[#9413][fix] Minor fixes to nemotron H and custom models in AD (#9416)
* Why?

There were a couple of issues with the recently merged custom model
injection for AutoDeploy and the reference implementation of nemotron
H:
- `d_mlp` was left in despite being mathematically always null (could
  lead to runtime issues during sharding).
- the custom model mapping was inherited by child factories.

* What?

This commit fixes both issues, and also refactors the key of the custom
implementation mapping to be based on the name of the configuration
class.
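
The inheritance issue and the config-class-name key can be illustrated with
a minimal sketch. This is not the actual AutoDeploy code: `ModelFactory`,
`register_custom_model_cls`, `lookup`, and the demo classes are hypothetical
names chosen for illustration. It assumes the mapping is a class-level dict,
which a subclass would otherwise share with its parent.

```python
from typing import Dict, Optional


class ModelFactory:
    """Hypothetical factory base class (not the real AutoDeploy API)."""

    # Maps the *name* of a configuration class to a custom model class.
    _custom_model_mapping: Dict[str, type] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Give each subclass its own empty mapping so that entries
        # registered on a parent factory are not inherited by children.
        cls._custom_model_mapping = {}

    @classmethod
    def register_custom_model_cls(cls, config_cls_name: str, model_cls: type) -> None:
        cls._custom_model_mapping[config_cls_name] = model_cls

    @classmethod
    def lookup(cls, config: object) -> Optional[type]:
        # Key on the configuration class name rather than on the class object.
        return cls._custom_model_mapping.get(type(config).__name__)


# Hypothetical stand-ins for a config class and a custom model class.
class DemoConfig:
    pass


class DemoCustomModel:
    pass


class ParentFactory(ModelFactory):
    pass


class ChildFactory(ParentFactory):
    pass


ParentFactory.register_custom_model_cls("DemoConfig", DemoCustomModel)
parent_hit = ParentFactory.lookup(DemoConfig())  # the registered class
child_hit = ChildFactory.lookup(DemoConfig())    # nothing registered on the child
```

Without the `__init_subclass__` hook, `ChildFactory` would resolve
`_custom_model_mapping` to the parent's dict and see the parent's entries.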

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-11-24 20:17:33 -08:00
multigpu [TRTLLM-8201][feat] Nemotron H MoE Sharding (#8744) 2025-11-05 12:35:29 -08:00
singlegpu [#9413][fix] Minor fixes to nemotron H and custom models in AD (#9416) 2025-11-24 20:17:33 -08:00