mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-14 06:27:45 +08:00
This commit adds some level of FP8 support to Mistral Small 3.1 by:

* disabling quantization for the vision sub-model, since `modelopt` does not support quantizing it (yet);
* extending the existing accuracy tests to use a modelopt-produced FP8 checkpoint.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
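The first bullet above, disabling quantization for the vision sub-model, can be sketched as a name-prefix filter over a model's layers. This is a minimal, hypothetical illustration, not the actual TensorRT-LLM or `modelopt` API: the layer names and the `vision_prefix` default are assumptions made for the example.

```python
# Hypothetical sketch: when quantizing a multimodal checkpoint to FP8,
# exclude layers under the vision sub-model so they keep their original
# precision (the quantizer does not support them yet).
# The prefix "vision_model." and the layer names below are illustrative.

def split_quantizable_layers(layer_names, vision_prefix="vision_model."):
    """Return (to_quantize, to_skip): layers under the vision sub-model
    are skipped; everything else is eligible for FP8 quantization."""
    to_quantize = [n for n in layer_names if not n.startswith(vision_prefix)]
    to_skip = [n for n in layer_names if n.startswith(vision_prefix)]
    return to_quantize, to_skip

layers = [
    "vision_model.encoder.layers.0.mlp.fc1",
    "language_model.layers.0.mlp.gate_proj",
    "language_model.layers.0.self_attn.q_proj",
]
to_quantize, to_skip = split_quantizable_layers(layers)
# to_quantize holds only the language-model layers;
# to_skip holds the vision-sub-model layer left unquantized
```

In real quantization tooling this kind of exclusion is usually expressed as a skip/exclude list in the quantization config rather than a manual filter, but the effect is the same: the vision tower's weights are never converted to FP8.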