TensorRT-LLMs/json_mode_eval.yaml at 2b2781019892461ba27db8aee3215440d8aed76d

kanshan/TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-28 06:33:15 +08:00

Enwei Zhu 5ff3a65b23

[TRTLLM-7028][feat] Enable guided decoding with speculative decoding (part 2: one-model engine) (#6948)

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

2025-09-03 15:16:11 -07:00

meta-llama/Llama-3.1-8B-Instruct:
  - accuracy: 74.00
  - spec_dec_algo: Eagle
    accuracy: 74.00
  - spec_dec_algo: NGram
    accuracy: 74.00
deepseek-ai/DeepSeek-V3-Lite:
  - accuracy: 77.00
  - spec_dec_algo: MTP
    accuracy: 77.00
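
The file maps each model name to a list of entries, where each entry carries an `accuracy` value and an optional `spec_dec_algo` key. A minimal sketch of how such a structure could be queried follows. It is not taken from the TensorRT-LLM test harness: the helper name `reference_accuracy` is hypothetical, and treating an entry without `spec_dec_algo` as the baseline (no speculative decoding) is an assumption. The `REFERENCE` literal mirrors what a YAML parser such as PyYAML's `yaml.safe_load` would return for this file, so the sketch runs without extra dependencies.

```python
from typing import Optional

# Parsed form of the YAML above, written as a literal so the sketch is
# self-contained. yaml.safe_load (PyYAML) would yield the same structure.
REFERENCE = {
    "meta-llama/Llama-3.1-8B-Instruct": [
        {"accuracy": 74.00},
        {"spec_dec_algo": "Eagle", "accuracy": 74.00},
        {"spec_dec_algo": "NGram", "accuracy": 74.00},
    ],
    "deepseek-ai/DeepSeek-V3-Lite": [
        {"accuracy": 77.00},
        {"spec_dec_algo": "MTP", "accuracy": 77.00},
    ],
}


def reference_accuracy(model: str, spec_dec_algo: Optional[str] = None) -> float:
    """Hypothetical helper: look up the accuracy for a model/algorithm pair."""
    for entry in REFERENCE[model]:
        # Assumption: an entry with no spec_dec_algo key is the baseline,
        # so it matches a query with spec_dec_algo=None.
        if entry.get("spec_dec_algo") == spec_dec_algo:
            return entry["accuracy"]
    raise KeyError(
        f"no reference for {model!r} with spec_dec_algo={spec_dec_algo!r}"
    )


print(reference_accuracy("meta-llama/Llama-3.1-8B-Instruct"))
print(reference_accuracy("deepseek-ai/DeepSeek-V3-Lite", "MTP"))
```

Matching on the full set of optional keys, rather than on list position, keeps the lookup stable if more entries are appended per model.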