mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-14 06:27:45 +08:00
171 lines
4.2 KiB
ReStructuredText
171 lines
4.2 KiB
ReStructuredText
trtllm-bench
|
|
===========================
|
|
|
|
trtllm-bench is a comprehensive benchmarking tool for TensorRT LLM engines. It provides three main subcommands for different benchmarking scenarios:
|
|
|
|
.. include:: ../_includes/note_sections.rst
|
|
:start-after: .. start-note-config-flag-alias
|
|
:end-before: .. end-note-config-flag-alias
|
|
|
|
Syntax
|
|
------
|
|
|
|
.. click:: tensorrt_llm.commands.bench:main
|
|
:prog: trtllm-bench
|
|
:nested: full
|
|
:commands: throughput, latency, build
|
|
|
|
|
|
|
|
Dataset preparation
|
|
------------------
|
|
|
|
prepare_dataset.py
|
|
^^^^^^^^^^^^^^^^^^
|
|
|
|
trtllm-bench is designed to work with the `prepare_dataset.py <https://github.com/NVIDIA/TensorRT-LLM/blob/main/benchmarks/cpp/prepare_dataset.py>`_ script, which generates benchmark datasets in the required format. The prepare_dataset script supports:
|
|
|
|
**Dataset Types:**
|
|
|
|
- Real datasets from various sources
|
|
- Synthetic datasets with normal or uniform token distributions
|
|
- LoRA task-specific datasets
|
|
|
|
**Key Features:**
|
|
|
|
- Tokenizer integration for proper text preprocessing
|
|
- Configurable random seeds for reproducible results
|
|
- Support for LoRA adapters and task IDs
|
|
- Output in JSON format compatible with trtllm-bench
|
|
|
|
.. important::
|
|
The ``--stdout`` flag is **required** when using prepare_dataset.py with trtllm-bench to ensure proper data streaming format.
|
|
|
|
**Usage:**
|
|
|
|
prepare_dataset
|
|
"""""""""""""""
|
|
|
|
.. code-block:: bash
|
|
|
|
python prepare_dataset.py [OPTIONS]
|
|
|
|
**Options**
|
|
|
|
----
|
|
|
|
.. list-table::
|
|
:widths: 20 80
|
|
:header-rows: 1
|
|
|
|
* - Option
|
|
- Description
|
|
* - ``--tokenizer``
|
|
- Tokenizer directory or HuggingFace model name (required)
|
|
* - ``--output``
|
|
- Output JSON filename (default: preprocessed_dataset.json)
|
|
* - ``--stdout``
|
|
- Print output to stdout with JSON dataset entry on each line (**required for trtllm-bench**)
|
|
* - ``--random-seed``
|
|
- Random seed for token generation (default: 420)
|
|
* - ``--task-id``
|
|
- LoRA task ID (default: -1)
|
|
* - ``--rand-task-id``
|
|
- Random LoRA task range (two integers)
|
|
* - ``--lora-dir``
|
|
- Directory containing LoRA adapters
|
|
* - ``--log-level``
|
|
- Logging level: info or debug (default: info)
|
|
|
|
dataset
|
|
"""""""
|
|
|
|
Process real datasets from various sources.
|
|
|
|
.. code-block:: bash
|
|
|
|
python prepare_dataset.py dataset [OPTIONS]
|
|
|
|
**Options**
|
|
|
|
----
|
|
|
|
.. list-table::
|
|
:widths: 20 80
|
|
:header-rows: 1
|
|
|
|
* - Option
|
|
- Description
|
|
* - ``--input``
|
|
- Input dataset file or directory (required)
|
|
* - ``--max-input-length``
|
|
- Maximum input sequence length (default: 2048)
|
|
* - ``--max-output-length``
|
|
- Maximum output sequence length (default: 512)
|
|
* - ``--num-samples``
|
|
- Number of samples to process (default: all)
|
|
* - ``--format``
|
|
- Input format: json, jsonl, csv, or txt (default: auto-detect)
|
|
|
|
|
|
token_norm_dist
|
|
"""""""""""""""
|
|
|
|
Generate synthetic datasets with normal token distribution.
|
|
|
|
.. code-block:: bash
|
|
|
|
python prepare_dataset.py token_norm_dist [OPTIONS]
|
|
|
|
**Options**
|
|
|
|
----
|
|
|
|
.. list-table::
|
|
:widths: 20 80
|
|
:header-rows: 1
|
|
|
|
* - Option
|
|
- Description
|
|
* - ``--num-requests``
|
|
- Number of requests to be generated (required)
|
|
* - ``--input-mean``
|
|
- Normal distribution mean for input tokens (required)
|
|
* - ``--input-stdev``
|
|
- Normal distribution standard deviation for input tokens (required)
|
|
* - ``--output-mean``
|
|
- Normal distribution mean for output tokens (required)
|
|
* - ``--output-stdev``
|
|
- Normal distribution standard deviation for output tokens (required)
|
|
|
|
|
|
token_unif_dist
|
|
"""""""""""""""
|
|
|
|
Generate synthetic datasets with uniform token distribution
|
|
|
|
.. code-block:: bash
|
|
|
|
python prepare_dataset.py token_unif_dist [OPTIONS]
|
|
|
|
**Options**
|
|
|
|
----
|
|
|
|
.. list-table::
|
|
:widths: 20 80
|
|
:header-rows: 1
|
|
|
|
* - Option
|
|
- Description
|
|
* - ``--num-requests``
|
|
- Number of requests to be generated (required)
|
|
* - ``--input-min``
|
|
- Uniform distribution minimum for input tokens (required)
|
|
* - ``--input-max``
|
|
- Uniform distribution maximum for input tokens (required)
|
|
* - ``--output-min``
|
|
- Uniform distribution minimum for output tokens (required)
|
|
* - ``--output-max``
|
|
- Uniform distribution maximum for output tokens (required)
|