mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-23 12:12:39 +08:00
This commit adds some level of FP8 support to Mistral Small 3.1 by:

* disabling quantization for the vision sub-model, since `modelopt` does not support quantizing it (yet).
* extending the existing accuracy tests to use a `modelopt`-produced FP8 checkpoint.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
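The first bullet — keeping the vision sub-model in its original precision while quantizing the rest — can be sketched as name-pattern filtering. This is an illustrative example only: the pattern `*vision*`, the module names, and the `should_quantize` helper are assumptions for demonstration, not the actual `modelopt` API or the real Mistral Small 3.1 module layout.

```python
# Hypothetical sketch: skip quantization for any module whose name matches
# a "vision" pattern, quantize everything else. Names are illustrative.
import fnmatch

# Sub-modules to leave in their original precision (assumed pattern).
SKIP_PATTERNS = ["*vision*"]

def should_quantize(module_name: str) -> bool:
    """Return False for modules matching any skip pattern."""
    return not any(fnmatch.fnmatch(module_name, p) for p in SKIP_PATTERNS)

# Illustrative module names for a multimodal model.
modules = [
    "language_model.layers.0.mlp",
    "vision_tower.encoder.layers.0.attn",
    "multi_modal_projector",
]

quantized = [m for m in modules if should_quantize(m)]
print(quantized)  # only the non-vision modules remain
```

Real quantization toolkits typically expose an equivalent mechanism through their quantization config (e.g. per-pattern enable/disable entries), so the filtering itself usually lives in the config rather than in user code.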
Accuracy test configuration files in this directory:

- cnn_dailymail.yaml
- gpqa_diamond.yaml
- gsm8k.yaml
- humaneval.yaml
- json_mode_eval.yaml
- mmlu.yaml
- passkey_retrieval_64k.yaml
- passkey_retrieval_128k.yaml
- SlimPajama-6B.yaml
- zero_scrolls.yaml