mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-23 12:12:39 +08:00
This commit adds some level of FP8 support to Mistral Small 3.1 by:

* disabling quantization for the vision sub-model, since `modelopt` does not support quantizing it (yet).
* extending the existing accuracy tests to use a `modelopt`-produced FP8 checkpoint.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
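The first bullet — keeping the vision sub-model in its original precision while quantizing the rest — can be sketched as name-pattern filtering. This is an illustrative example only: the pattern `*vision*`, the module names, and the `should_quantize` helper are assumptions for demonstration, not the actual `modelopt` API or the real Mistral Small 3.1 module layout.

```python
# Hypothetical sketch: skip quantization for any module whose name matches
# a "vision" pattern, quantize everything else. Names are illustrative.
import fnmatch

# Sub-modules to leave in their original precision (assumed pattern).
SKIP_PATTERNS = ["*vision*"]

def should_quantize(module_name: str) -> bool:
    """Return False for modules matching any skip pattern."""
    return not any(fnmatch.fnmatch(module_name, p) for p in SKIP_PATTERNS)

# Illustrative module names for a multimodal model.
modules = [
    "language_model.layers.0.mlp",
    "vision_tower.encoder.layers.0.attn",
    "multi_modal_projector",
]

quantized = [m for m in modules if should_quantize(m)]
print(quantized)  # only the non-vision modules remain
```

Real quantization toolkits typically expose an equivalent mechanism through their quantization config (e.g. per-pattern enable/disable entries), so the filtering itself usually lives in the config rather than in user code.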
Accuracy test configuration files in this directory:

- cnn_dailymail.yaml
- gpqa_diamond.yaml
- gsm8k.yaml
- humaneval.yaml
- json_mode_eval.yaml
- mmlu.yaml
- passkey_retrieval_64k.yaml
- passkey_retrieval_128k.yaml
- SlimPajama-6B.yaml
- zero_scrolls.yaml