Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-14 06:27:45 +08:00)
* Why? We would like to support EPD disaggregated serving for Qwen3 VL.
* What? This commit adds such support, and extends existing unit tests for correctness checks. Some minor (protected) interface changes had to be made to the weight mapper as a side-effect.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>

| Name | | |
|---|---|---|
| hf | ||
| mistral | ||
| __init__.py | ||
| auto_mapper.py | ||
| base_checkpoint_loader.py | ||
| base_config_loader.py | ||
| base_weight_loader.py | ||
| base_weight_mapper.py | ||