TensorRT-LLMs/longbench_v2.yaml at main

kanshan/TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Fanrong Li 2f526583fb

[None][chore] Move the rocketkv e2e test to post-merge (#9768)

Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>

2025-12-08 13:22:16 +08:00

DeepSeek-R1-0528:
  - quant_algo: FP8_BLOCK_SCALES
    kv_cache_quant_algo: FP8
    spec_dec_algo: MTP
    accuracy: 52.093
  - quant_algo: NVFP4
    kv_cache_quant_algo: FP8
    spec_dec_algo: MTP
    accuracy: 52.093
meta-llama/Llama-3.1-8B-Instruct:
  - accuracy: 26.00
    sigma: 25.8
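
The file maps each model name to a list of reference entries, each keyed by an optional combination of `quant_algo`, `kv_cache_quant_algo`, and `spec_dec_algo`. As a minimal sketch of how such a table could be queried, the Python below transcribes the entries into a dict and selects the one whose settings match exactly. The names `REFERENCES`, `MATCH_KEYS`, and `find_reference` are hypothetical, and treating a missing key as `None` is an assumption made for illustration; this is not necessarily how the TensorRT-LLM test harness reads the file.

```python
from typing import Optional

# Entries transcribed by hand from the YAML above (what a YAML parser would return).
REFERENCES = {
    "DeepSeek-R1-0528": [
        {"quant_algo": "FP8_BLOCK_SCALES", "kv_cache_quant_algo": "FP8",
         "spec_dec_algo": "MTP", "accuracy": 52.093},
        {"quant_algo": "NVFP4", "kv_cache_quant_algo": "FP8",
         "spec_dec_algo": "MTP", "accuracy": 52.093},
    ],
    "meta-llama/Llama-3.1-8B-Instruct": [
        {"accuracy": 26.00, "sigma": 25.8},
    ],
}

# Hypothetical: the keys that identify a configuration within one model's list.
MATCH_KEYS = ("quant_algo", "kv_cache_quant_algo", "spec_dec_algo")


def find_reference(model: str,
                   quant_algo: Optional[str] = None,
                   kv_cache_quant_algo: Optional[str] = None,
                   spec_dec_algo: Optional[str] = None) -> Optional[dict]:
    """Return the entry whose settings match exactly, or None if there is none.

    Assumption: a key absent from an entry is treated as None, so the
    Llama entry matches only a query with no settings given.
    """
    wanted = {
        "quant_algo": quant_algo,
        "kv_cache_quant_algo": kv_cache_quant_algo,
        "spec_dec_algo": spec_dec_algo,
    }
    for entry in REFERENCES.get(model, []):
        if all(entry.get(key) == wanted[key] for key in MATCH_KEYS):
            return entry
    return None
```

For example, `find_reference("DeepSeek-R1-0528", "NVFP4", "FP8", "MTP")` selects the second DeepSeek entry, while a query for the same model with no settings finds nothing, because every DeepSeek entry specifies all three keys.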