TensorRT-LLMs/tests/integration/defs/accuracy/references/longbench_v1.yaml
Tian Zheng 5efee01da1
[None][feat] Add Skip Softmax MLA kernels for Blackwell and Fix an accuracy bug of NVFP4 KV (#10813)
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2026-01-26 16:46:33 +08:00

Qwen3/Qwen3-30B-A3B-Instruct-2507:
  # Skip Softmax Attention ref accuracy
  - extra_acc_spec: "target_sparsity=0.0"
    accuracy: 47.357
  - extra_acc_spec: "target_sparsity=0.5"
    accuracy: 47.102
  - extra_acc_spec: "target_sparsity=0.9"
    accuracy: 46.169
deepseek-ai/DeepSeek-V3-0324:
  - quant_algo: NVFP4
    extra_acc_spec: "target_sparsity=0.9"
    accuracy: 44.94