mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-02-22 02:35:21 +08:00
* Why? We would like to support EPD disaggregated serving for Qwen3 VL.
* What? This commit adds such support, and extends existing unit tests for correctness checks. Some minor (protected) interface changes had to be made to the weight mapper as a side-effect.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>

| Name |
|---|
| test_external_embedding.py |
| test_find_num_image_tokens.py |
| test_fuse_input_embeds.py |
| test_mm_encoder_standalone.py |
| test_multimodal_runtime.py |
| test_share_multiparams.py |