mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-02-06 11:11:36 +08:00
* Why? There were a couple of issues with the recently merged custom model injection for AutoDeploy + the reference implementation of nemotron H: - `d_mlp` was left in despite being mathematically always null (could lead to runtime issues during sharding). - the custom model mapping was inherited by children factories. * What? This commit fixes these issues, and refactors the key of the custom implementation to be based on the name of the configuration class as well. Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com> |
||
|---|---|---|
| .. | ||
| _utils_test | ||
| unit | ||