mirror of https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-14 06:27:45 +08:00
test: add accuracy reference (#6479)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
parent 17e0d0fb1a
commit ca534e4798
@@ -22,6 +22,7 @@ meta-llama/Llama-4-Scout-17B-16E-Instruct:
     kv_cache_quant_algo: FP8
     accuracy: 79.62
   - quant_algo: FP8
+    kv_cache_quant_algo: FP8
     accuracy: 80.37
 deepseek-ai/DeepSeek-V3-Lite:
   - accuracy: 64.74
@@ -70,9 +70,10 @@ meta-llama/Llama-4-Scout-17B-16E-Instruct:
   - accuracy: 80.00
   - quant_algo: NVFP4
     kv_cache_quant_algo: FP8
-    accuracy: 88.63
+    accuracy: 79.60
   - quant_algo: FP8
-    accuracy: 89.46
+    kv_cache_quant_algo: FP8
+    accuracy: 78.58
 mistralai/Mistral-7B-v0.1:
   - accuracy: 66
 mistralai/Mistral-7B-Instruct-v0.3:
@@ -433,5 +433,3 @@ examples/test_qwen.py::test_llm_qwen_smooth_quant_single_gpu_summary[qwen2_vl_7b
 examples/test_recurrentgemma.py::test_llm_recurrentgemma_1gpu[use_cpp_session-recurrentgemma-2b-use_paged_cache-fp8-float16-enable_attn_plugin-enable_gemm_plugin] SKIP (https://nvbugs/5419070)
 examples/test_bert.py::test_llm_bert_general[compare_hf-enable_remove_input_padding-use_attention_plugin-enable_context_fmha-tp:1-pp:1-float16-BertForSequenceClassification-bert/bert-base-uncased-yelp-polarity] SKIP (https://nvbugs/5421989)
 examples/test_bert.py::test_llm_bert_general[compare_hf-enable_remove_input_padding-use_attention_plugin-enable_context_fmha-tp:1-pp:1-float16-RobertaForSequenceClassification-bert/twitter-roberta-base-emotion] SKIP (https://nvbugs/5421989)
-accuracy/test_llm_api_pytorch.py::TestLlama4ScoutInstruct::test_fp8[tp8ep8-cuda_graph=True] SKIP (https://nvbugs/5409414)
-accuracy/test_llm_api_pytorch.py::TestLlama4ScoutInstruct::test_fp8[tp4-cuda_graph=True] SKIP (https://nvbugs/5409414)