TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-17 00:04:57 +08:00

William Zhang abb8106c01 [https://nvbugs/5835925][fix] Add EPD disagg support for Qwen3 VL MoE (#10962)	2026-02-09 23:53:40 +08:00

* Why?
  Trying to instantiate a `MultimodalEncoder` for a Qwen3 VL MoE model would fail during weight loading.
* What?
  This commit fixes the bug, alongside:
  - explicit, intentional support for EPD for Qwen3 VL MoE.
  - extends EPD unit tests for Qwen3 VL MoE, albeit with dummy weights.
  - unit tests for the weight mapper fixes.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

hf	[https://nvbugs/5835925][fix] Add EPD disagg support for Qwen3 VL MoE (#10962)	2026-02-09 23:53:40 +08:00
mistral	[None][fix] Reduce host memory usage during model loading (#11119)	2026-02-05 08:57:40 -08:00
__init__.py	[None][feat] support Qwen3-VL dense model in pytorch backend (#9060)	2025-12-31 17:54:26 +09:00
auto_mapper.py	[TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372)	2025-07-17 00:50:30 +08:00
base_checkpoint_loader.py	[TRTLLM-7136][feat] Update load_weights method to include mapping parameter in checkpoint loaders (#9583)	2025-12-05 16:07:20 +01:00
base_config_loader.py	[TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372)	2025-07-17 00:50:30 +08:00
base_weight_loader.py	[None][fix] Reduce host memory usage during model loading (#11119)	2026-02-05 08:57:40 -08:00
base_weight_mapper.py	[None][fix] Reduce host memory usage during model loading (#11119)	2026-02-05 08:57:40 -08:00