TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

tomeras91 6e712dd1cc [None][fix] enable NvFP4/FP8 quantization for Nemotron-H architecture (#7589) Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>	2025-09-09 11:42:22 +03:00
hf	[None][fix] enable NvFP4/FP8 quantization for Nemotron-H architecture (#7589)	2025-09-09 11:42:22 +03:00
__init__.py	[TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372)	2025-07-17 00:50:30 +08:00
auto_mapper.py	[TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372)	2025-07-17 00:50:30 +08:00
base_checkpoint_loader.py	[TRTLLM-6823][doc] Add checkpoint refactor docs (#6592)	2025-08-10 19:47:39 -04:00
base_config_loader.py	[TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372)	2025-07-17 00:50:30 +08:00
base_weight_loader.py	[TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372)	2025-07-17 00:50:30 +08:00
base_weight_mapper.py	[nvbug 5380101][fix] Fix nemotronNAS loading for TP>1 (#6447)	2025-07-30 07:22:32 -04:00