TensorRT-LLMs/tensorrt_llm/_torch/models/checkpoints
William Zhang 92576488d3
[None][feat] Skip prefetching consolidated safetensors when appropriate (#7013)
* Why?

Some models (e.g. anything produced by Mistral) can have both sharded
safetensors and a consolidated safetensor in the same checkpoint
directory. In such cases, prefetching both into memory wastes both time
and memory.

* What?

This commit skips prefetching consolidated safetensors when they are not
the only safetensor files present in the checkpoint directory.
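
As a rough illustration of that rule, here is a minimal sketch (the helper
name and the `consolidated` filename prefix are assumptions, not the actual
TensorRT-LLM implementation) of filtering the prefetch list:

```python
from pathlib import Path
from typing import List


def select_safetensors_to_prefetch(checkpoint_dir: str) -> List[Path]:
    """Return the safetensor files worth prefetching from ``checkpoint_dir``."""
    all_files = sorted(Path(checkpoint_dir).glob("*.safetensors"))
    # Mistral-style checkpoints may ship e.g. ``consolidated.safetensors``
    # alongside sharded ``model-0000x-of-0000y.safetensors`` files.
    consolidated = [f for f in all_files if f.name.startswith("consolidated")]
    sharded = [f for f in all_files if f not in consolidated]
    # Skip the consolidated file(s) whenever sharded files are also present;
    # otherwise the same weights would be read into memory twice.
    return sharded if sharded else consolidated
```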

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-08-25 23:56:21 -04:00
hf [None][feat] Skip prefetching consolidated safetensors when appropriate (#7013) 2025-08-25 23:56:21 -04:00
__init__.py [TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372) 2025-07-17 00:50:30 +08:00
auto_mapper.py [TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372) 2025-07-17 00:50:30 +08:00
base_checkpoint_loader.py [TRTLLM-6823][doc] Add checkpoint refactor docs (#6592) 2025-08-10 19:47:39 -04:00
base_config_loader.py [TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372) 2025-07-17 00:50:30 +08:00
base_weight_loader.py [TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372) 2025-07-17 00:50:30 +08:00
base_weight_mapper.py [nvbug 5380101][fix] Fix nemotronNAS loading for TP>1 (#6447) 2025-07-30 07:22:32 -04:00