mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Enwei Zhu 74df12bbaa [TRTLLM-4480][doc] Documentation for new accuracy test suite and trtllm-eval (#3946 ) * fix formula Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> * update doc Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> * fix Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> * 1st version Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> * polish Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> * fix Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> --------- Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-05-08 19:35:23 +08:00
README.md	[TRTLLM-4480][doc] Documentation for new accuracy test suite and trtllm-eval (#3946 )	2025-05-08 19:35:23 +08:00
requirements.txt	[TRTLLM-4480][doc] Documentation for new accuracy test suite and trtllm-eval (#3946 )	2025-05-08 19:35:23 +08:00

README.md

Accuracy Evaluation Tool `trtllm-eval`

We provide a CLI tool, trtllm-eval, for evaluating model accuracy. It shares its core evaluation logic with the TensorRT-LLM accuracy test suite.

trtllm-eval is built on the offline LLM API and gives developers a unified entry point for accuracy evaluation. Compared with the online API, trtllm-serve, the offline API provides clearer error messages and simplifies the debugging workflow.

trtllm-eval follows the same command-line conventions as trtllm-serve.

pip install -r requirements.txt

# Evaluate Llama-3.1-8B-Instruct on MMLU
wget https://people.eecs.berkeley.edu/~hendrycks/data.tar && tar -xf data.tar
trtllm-eval --model meta-llama/Llama-3.1-8B-Instruct mmlu --dataset_path data

# Evaluate Llama-3.1-8B-Instruct on GSM8K
trtllm-eval --model meta-llama/Llama-3.1-8B-Instruct gsm8k

# Evaluate Llama-3.3-70B-Instruct on GPQA Diamond
trtllm-eval --model meta-llama/Llama-3.3-70B-Instruct gpqa_diamond

The --model argument accepts either a Hugging Face model ID or a local checkpoint path. By default, trtllm-eval runs the model with the PyTorch backend; pass --backend tensorrt to switch to the TensorRT backend. The --model argument also accepts a local path to pre-built TensorRT engines; in that case, pass the Hugging Face tokenizer path to the --tokenizer argument.
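The three ways of passing --model described above can be sketched as follows. This is an illustration, not an official recipe: the engine directory `./engines/llama-3.1-8b` is a hypothetical path, the `run` helper and the `DRY_RUN` variable are introduced here only so the commands can be previewed without a GPU, and placing --backend and --tokenizer before the task name is an assumption based on where --model appears in the examples above.

```bash
#!/usr/bin/env bash
MODEL="meta-llama/Llama-3.1-8B-Instruct"
ENGINE_DIR="./engines/llama-3.1-8b"   # hypothetical path to pre-built TensorRT engines

# Hypothetical helper: print each command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# 1. Hugging Face model ID with the default PyTorch backend
run trtllm-eval --model "$MODEL" gsm8k

# 2. Same model, switched to the TensorRT backend
run trtllm-eval --model "$MODEL" --backend tensorrt gsm8k

# 3. Pre-built TensorRT engines; the tokenizer must be given explicitly
run trtllm-eval --model "$ENGINE_DIR" --tokenizer "$MODEL" gsm8k
```

By default the script only prints the commands; set `DRY_RUN=0` to actually run the evaluations.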

Run trtllm-eval --help for more details.
