# TensorRT-LLM test definitions

This folder contains the test definitions for TensorRT-LLM.
## Directory structure

```
.
└── integration            # Root directory for integration tests
    ├── defs               # Test definitions
    ├── perf_configs       # Configs for perf tests
    └── test_lists         # Test lists
        ├── test-db        # Test-DB, the test list convention adopted by CI
        ├── dev            # Other test lists used by TensorRT-LLM developers
        ├── qa             # Test lists used by QA
        └── waives.txt     # Test waive list
```
- To run perf tests, you also need to build the C++ benchmarks first by calling `build_wheel.py` with the `--benchmarks` flag (see the sketch right after this note).
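As a quick reference, a minimal sketch of that build step, assuming the script lives at `scripts/build_wheel.py` under the repository root (adjust the path to your checkout layout):

```bash
# Build the TensorRT-LLM wheel together with the C++ benchmarks,
# as required by the note above before running perf tests.
python3 scripts/build_wheel.py --benchmarks
```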
## Run perf tests
All perf test names have the form `perf/test_perf.py::test_perf[...]`, where the `...` part encodes the test parameters (e.g. model, runtime, precision, and input/output lengths).

Below are the pytest options used for perf tests:
```bash
# Execute these in the TensorRT-LLM source repo root dir.

# Install dependencies; this can be skipped if they are already installed.
pip install -r requirements-dev.txt

# Example 1: run a single test case.
# If QA reports a perf bug for
# `perf/test_perf.py::test_perf[llama_7b-cppmanager-exe-plugin_ifb-float16-input_output_len:128,128,+512,32]`,
# you can reproduce it by running:
cd LLM_ROOT/tests/integration/defs
echo "perf/test_perf.py::test_perf[llama_7b-cppmanager-exe-plugin_ifb-float16-input_output_len:128,128,+512,32]" > perf.txt
pytest --perf --test-list=perf.txt --output-dir=/workspace/test-log --perf-log-formats csv --perf-log-formats yaml
```
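To run several perf cases in one invocation, you can put multiple test IDs in the same list file; a minimal sketch, assuming the `--test-list` file accepts one pytest node ID per line as in the example above:

```bash
# Start the list with the test ID shown above, append further IDs one per line,
# then invoke pytest once over the whole list.
echo "perf/test_perf.py::test_perf[llama_7b-cppmanager-exe-plugin_ifb-float16-input_output_len:128,128,+512,32]" > perf.txt
# echo "perf/test_perf.py::test_perf[...]" >> perf.txt   # additional cases go here
pytest --perf --test-list=perf.txt --output-dir=/workspace/test-log --perf-log-formats csv
```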
The captured perf metrics are saved in `/workspace/test-log/perf_scripts_test_results.csv` or `/workspace/test-log/perf_scripts_test_results.yaml`, depending on the `--perf-log-formats` option, and the test logs are saved in `/workspace/test-log/result.xml`. Currently, the following perf metrics are captured:
- `test_perf_metric_build_time`: the engine build time in seconds.
- `test_perf_metric_build_peak_cpu_memory`: the build-phase peak CPU memory usage in MB.
- `test_perf_metric_build_peak_gpu_memory`: the build-phase peak GPU memory usage in MB.
- `test_perf_metric_inference_time`: the inference latency in ms.
- `test_perf_metric_inference_peak_gpu_memory`: the inference-phase peak GPU memory usage in GB.
- `test_perf_metric_context_gpu_memory`: the context GPU memory usage in MB.
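For a quick look at the captured numbers, you can pretty-print the CSV output from the shell; a minimal sketch, assuming the output paths above and the `column` utility from util-linux:

```bash
# Render the comma-separated results as an aligned table for quick inspection.
column -t -s ',' /workspace/test-log/perf_scripts_test_results.csv
```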
## Common issues and solutions
- **No package 'libffi' found**

  Install libffi with `sudo apt-get install libffi-dev` and rerun.
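  For completeness, the fix as a copy-pastable command (assuming a Debian/Ubuntu environment with sudo available):

  ```bash
  # Install the libffi development headers, then rerun the failing command.
  sudo apt-get update && sudo apt-get install -y libffi-dev
  ```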