mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

TensorRT LLM test definitions

This folder contains the test definitions for TensorRT LLM.

Directory structure

.
└── integration              # Root directory for integration tests
    ├── defs            #     Test definitions
    ├── perf_configs    #     Configs for perf tests
    └── test_lists      #     Test lists
        ├── bloom       #         Legacy test lists used by TURTLE (Do not add any new test lists here)
        ├── test-db     #         Test-DB (New test list convention adopted by pytest)
        ├── dev         #         Other test lists used by TRT LLM developers
        ├── qa          #         Test lists used by QA
        └── waives.txt  #         Test waive list

To run perf tests, you must first build the C++ benchmarks by calling build_wheel.py with the --benchmarks flag.

Run perf tests

All perf test names have the form perf/test_perf.py::test_perf[...], where ... stands for the test parameters.
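As a rough illustration of this naming scheme, the sketch below splits a perf test ID into its file path, function name, and parameter fields. The helper name split_perf_test_id is hypothetical, and the idea that the fields inside the brackets are separated by "-" is an assumption drawn only from the example ID used later in this README; the meaning of each field is not interpreted here.

```python
import re

# Matches IDs of the form "<path>::<function>[<params>]".
_TEST_ID_RE = re.compile(r"^(?P<path>[^:]+)::(?P<func>\w+)\[(?P<params>.*)\]$")


def split_perf_test_id(test_id: str) -> tuple[str, str, list[str]]:
    """Split a perf test ID into (file path, function name, parameter fields).

    Assumption: the fields inside the brackets are separated by "-".
    """
    match = _TEST_ID_RE.match(test_id.strip())
    if match is None:
        raise ValueError(f"not a parametrized pytest ID: {test_id!r}")
    return match["path"], match["func"], match["params"].split("-")


example = (
    "perf/test_perf.py::test_perf"
    "[llama_7b-cppmanager-exe-plugin_ifb-float16-input_output_len:128,128,+512,32]"
)
path, func, fields = split_perf_test_id(example)
print(path, func)
for field in fields:
    print("  ", field)
```

A helper like this can be handy for grouping or filtering test IDs from a test list before rerunning a subset.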

The commands below show the pytest options that are specific to perf tests:

# Run these commands from the root directory of the tensorrt-llm source repo.
# Install dependencies; skip this step if they are already installed.
pip install -r requirements-dev.txt

# example 1: run a test case
# For example, if QA reports a perf bug for `perf/test_perf.py::test_perf[llama_7b-cppmanager-exe-plugin_ifb-float16-input_output_len:128,128,+512,32]`, then you can repro it by running:
cd LLM_ROOT/tests/integration/defs
echo "perf/test_perf.py::test_perf[llama_7b-cppmanager-exe-plugin_ifb-float16-input_output_len:128,128,+512,32]" > perf.txt
pytest --perf --test-list=perf.txt --output-dir=/workspace/test-log --perf-log-formats csv --perf-log-formats yaml
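When the same run is launched from a script, the argument list can be assembled programmatically. The following is a minimal sketch, not part of the test infrastructure: build_perf_command is a hypothetical helper, and it uses only the flags that appear in the command above.

```python
import shlex


def build_perf_command(test_list: str, output_dir: str, log_formats: list[str]) -> list[str]:
    """Assemble the pytest argument vector for a perf run."""
    argv = ["pytest", "--perf", f"--test-list={test_list}", f"--output-dir={output_dir}"]
    # --perf-log-formats is repeated once per requested format.
    for fmt in log_formats:
        argv += ["--perf-log-formats", fmt]
    return argv


command = build_perf_command("perf.txt", "/workspace/test-log", ["csv", "yaml"])
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) from the defs directory.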

The captured perf metrics are saved in /workspace/test-log/perf_scripts_test_results.csv or /workspace/test-log/perf_scripts_test_results.yaml, depending on the --perf-log-formats option, and the test logs are saved in /workspace/test-log/result.xml. Currently, the following perf metrics are captured:

test_perf_metric_build_time: The engine build time in seconds.
test_perf_metric_build_peak_cpu_memory: The build-phase peak CPU memory usage in MB.
test_perf_metric_build_peak_gpu_memory: The build-phase peak GPU memory usage in MB.
test_perf_metric_inference_time: The inference latency in ms.
test_perf_metric_inference_peak_gpu_memory: The inference-phase peak GPU memory usage in GB.
test_perf_metric_context_gpu_memory: The context GPU memory usage in MB.
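Because the metrics use different units, it can help to keep the units next to the metric names when post-processing results. The sketch below simply encodes the units listed above in a lookup table; METRIC_UNITS and format_metric are hypothetical names, and the layout of the CSV and YAML result files is deliberately not assumed here.

```python
# Units as listed above for each captured perf metric.
METRIC_UNITS = {
    "test_perf_metric_build_time": "s",
    "test_perf_metric_build_peak_cpu_memory": "MB",
    "test_perf_metric_build_peak_gpu_memory": "MB",
    "test_perf_metric_inference_time": "ms",
    "test_perf_metric_inference_peak_gpu_memory": "GB",
    "test_perf_metric_context_gpu_memory": "MB",
}


def format_metric(name: str, value: float) -> str:
    """Render a metric value together with its documented unit."""
    unit = METRIC_UNITS.get(name)
    if unit is None:
        raise KeyError(f"unknown perf metric: {name}")
    return f"{name} = {value:g} {unit}"


print(format_metric("test_perf_metric_inference_time", 12.5))
```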

Common Issues and solutions

No package 'libffi' found

Install libffi with sudo apt-get install libffi-dev, remove the turtle venv with rm -fr build/turtle_venv, and then rerun.