mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-25 21:22:57 +08:00
TensorRT LLM test definitions
The following subfolder contains the test definitions for TensorRT LLM.
Directory structure
.
└── integration           # Root directory for integration tests
    ├── defs              # Test definitions
    ├── perf_configs      # Configs for perf tests
    └── test_lists        # Test lists
        ├── bloom         # Legacy test lists used by TURTLE (Do not add any new test lists here)
        ├── test-db       # Test-DB (New test list convention adopted by pytest)
        ├── dev           # Other test lists used by TRT LLM developers
        ├── qa            # Test lists used by QA
        └── waives.txt    # Test waive list
- To run perf tests, you also need to first build the cpp benchmark by calling build_wheel.py with the --benchmarks flag.
Run perf tests
All the perf test names are in the form of perf/test_perf.py::test_perf[...] where the ... part is the test parameters.
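As a minimal sketch of that naming scheme, the snippet below splits a perf test name into its file path, test function, and parameter string. The helper `split_perf_test_id` is hypothetical and not part of the TensorRT LLM test suite; it relies only on the general pytest node ID shape `path::function[params]` and does not try to interpret the individual parameters.

```python
import re

# Hypothetical helper (not part of the TensorRT LLM test suite): split a pytest
# node ID of the form path::function[params] into its three parts.
_NODE_ID = re.compile(r"^(?P<path>[^:]+)::(?P<func>[^\[]+)\[(?P<params>.*)\]$")


def split_perf_test_id(node_id: str) -> tuple[str, str, str]:
    """Return (file path, test function, raw parameter string)."""
    match = _NODE_ID.match(node_id.strip())
    if match is None:
        raise ValueError(f"not a parametrized pytest node ID: {node_id!r}")
    return match["path"], match["func"], match["params"]


path, func, params = split_perf_test_id(
    "perf/test_perf.py::test_perf[llama_7b-cppmanager-exe-plugin_ifb-float16"
    "-input_output_len:128,128,+512,32]"
)
print(path)    # perf/test_perf.py
print(func)    # test_perf
print(params)  # the text between the square brackets, left uninterpreted
```

The parameter string is kept as one opaque value because the meaning of its fields is defined by the perf test itself, not by this sketch.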
Below are some specific pytest options used for perf tests:
# execute these in the tensorrt-llm source repo root dir.
# install dependencies; skip this step if they are already installed.
pip install -r requirements-dev.txt
# example 1: run a test case
# For example, if QA reports a perf bug for `perf/test_perf.py::test_perf[llama_7b-cppmanager-exe-plugin_ifb-float16-input_output_len:128,128,+512,32]`, then you can repro it by running:
cd LLM_ROOT/tests/integration/defs
echo "perf/test_perf.py::test_perf[llama_7b-cppmanager-exe-plugin_ifb-float16-input_output_len:128,128,+512,32]" > perf.txt
pytest --perf --test-list=perf.txt --output-dir=/workspace/test-log --perf-log-formats csv --perf-log-formats yaml
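The repro steps above can also be scripted. The following is a rough sketch, assuming you want to generate the test list file and the pytest command line from Python. The helper name `prepare_perf_run` is an assumption made for illustration; the pytest flags are the ones used in the command above. The sketch only builds the command and does not run it.

```python
import shlex
import tempfile
from pathlib import Path


def prepare_perf_run(test_ids, list_file, output_dir, log_formats=("csv", "yaml")):
    """Write the test list file and return the pytest argv (hypothetical helper)."""
    # One test name per line, matching what `echo ... > perf.txt` produces.
    Path(list_file).write_text("\n".join(test_ids) + "\n")
    argv = ["pytest", "--perf", f"--test-list={list_file}", f"--output-dir={output_dir}"]
    # --perf-log-formats is repeated once per requested format.
    for fmt in log_formats:
        argv += ["--perf-log-formats", fmt]
    return argv


# Demo in a temporary directory so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp:
    demo_argv = prepare_perf_run(
        ["perf/test_perf.py::test_perf[llama_7b-cppmanager-exe-plugin_ifb-float16"
         "-input_output_len:128,128,+512,32]"],
        list_file=str(Path(tmp) / "perf.txt"),
        output_dir="/workspace/test-log",
    )
    print(shlex.join(demo_argv))  # shell-quoted command line, ready to copy
```

To actually launch the run, the returned list could be passed to `subprocess.run(argv, cwd=...)` with the working directory set to LLM_ROOT/tests/integration/defs, as in the shell example.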
The captured perf metrics will be saved in /workspace/test-log/perf_scripts_test_results.csv or /workspace/test-log/perf_scripts_test_results.yaml, depending on the --perf-log-formats option, and the test logs are saved in /workspace/test-log/result.xml. Currently, we capture these perf metrics:
- test_perf_metric_build_time: The engine building time in seconds.
- test_perf_metric_build_peak_cpu_memory: The build-phase peak CPU mem usage in MB.
- test_perf_metric_build_peak_gpu_memory: The build-phase peak GPU mem usage in MB.
- test_perf_metric_inference_time: The inference latency in ms.
- test_perf_metric_inference_peak_gpu_memory: The inference-phase peak GPU mem usage in GB.
- test_perf_metric_context_gpu_memory: The context GPU mem usage in MB.
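Because the metrics listed above use different units (seconds vs. ms, MB vs. GB), it can help to keep the units next to the metric names when post-processing results. The sketch below records the units exactly as stated in this README. The dictionary `PERF_METRIC_UNITS` and the function `format_metric` are illustrative helpers, not part of the test suite, and nothing is assumed here about the layout of the CSV or YAML result files.

```python
# Units exactly as listed in this README; the mapping itself is an illustrative helper.
PERF_METRIC_UNITS = {
    "test_perf_metric_build_time": "s",
    "test_perf_metric_build_peak_cpu_memory": "MB",
    "test_perf_metric_build_peak_gpu_memory": "MB",
    "test_perf_metric_inference_time": "ms",
    "test_perf_metric_inference_peak_gpu_memory": "GB",
    "test_perf_metric_context_gpu_memory": "MB",
}


def format_metric(name: str, value: float) -> str:
    """Render a metric value together with its documented unit."""
    unit = PERF_METRIC_UNITS.get(name)
    if unit is None:
        raise KeyError(f"unknown perf metric: {name}")
    return f"{name} = {value:g} {unit}"


print(format_metric("test_perf_metric_inference_time", 12.5))
# test_perf_metric_inference_time = 12.5 ms
```

Note that the inference-phase peak GPU memory is reported in GB while the other memory metrics are in MB, so values should not be compared across metrics without converting.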
Common Issues and solutions
- No package 'libffi' found
Install libffi by running sudo apt-get install libffi-dev, then remove the turtle-venv by running rm -fr build/turtle_venv, and rerun.