# TensorRT-LLM Wide-EP Benchmark Scripts
This directory contains scripts for benchmarking TensorRT-LLM wide-EP (wide expert parallelism) performance using the SLURM job scheduler.
## ⚠️ DISCLAIMER
**These scripts are currently not QA'ed and are provided for demonstration purposes only.**

Please note that:
- These scripts have not undergone formal quality assurance testing
- They are intended for demonstration and educational purposes
- Use at your own risk in production environments
- Always review and test the scripts thoroughly before running them in your specific environment
## Scripts Overview
### Core Scripts
Note that the core implementation of the SLURM scripts lives in `examples/disaggregated/slurm`.
1. `submit.sh` - Main entry point for submitting benchmark jobs
2. `process_gen_iterlog.py` - Processes benchmark results and generates reports
## Usage
### Prerequisites
Before running the scripts, ensure you have the following (a rough pre-flight check is sketched after this list):
- Access to a SLURM cluster
- Container image with TensorRT-LLM installed
- Model files accessible on the cluster
- Required environment variables set
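
The scripts themselves define which environment variables they expect, so check `submit.sh` and `disaggr_torch.slurm` for the exact names. As a rough pre-flight sketch under that caveat, something like the following can confirm the basics on the submission host; `CONTAINER_IMAGE` and `MODEL_DIR` are illustrative placeholder names, not variables the scripts are guaranteed to read:

```bash
# Pre-flight sketch only; the variable names are illustrative placeholders,
# not necessarily the ones expected by submit.sh or disaggr_torch.slurm.
set -euo pipefail

# SLURM must be available on the submission host.
command -v sbatch >/dev/null || { echo "sbatch not found: is SLURM installed?"; exit 1; }

# Container image with TensorRT-LLM installed (e.g. a .sqsh file or registry URI).
: "${CONTAINER_IMAGE:?set CONTAINER_IMAGE to your TensorRT-LLM container image}"

# Model files must be reachable from the cluster.
: "${MODEL_DIR:?set MODEL_DIR to the model checkpoint directory}"
[ -d "$MODEL_DIR" ] || { echo "MODEL_DIR does not exist: $MODEL_DIR"; exit 1; }

echo "Basic prerequisites look OK."
```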
### Running Benchmarks
```bash
# The `disaggr_torch.slurm` script lives in the `examples/disaggregated/slurm/` directory.
# Make sure the SLURM parameters in `disaggr_torch.slurm` are correctly set for your cluster
# before executing this script.
./submit.sh
```
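
The SLURM parameters referred to above live inside `disaggr_torch.slurm` and depend entirely on your cluster. The header below is only an illustration of the kind of `#SBATCH` settings to review; the account, partition, node counts, and job name are placeholders, not values taken from the actual script:

```bash
#!/bin/bash
# Illustrative #SBATCH header only; the real directives are defined in
# examples/disaggregated/slurm/disaggr_torch.slurm and will differ per cluster.
#SBATCH --account=your_account        # placeholder SLURM account
#SBATCH --partition=your_partition    # placeholder partition with the required GPU nodes
#SBATCH --nodes=2                     # placeholder node count for the disaggregated setup
#SBATCH --ntasks-per-node=4           # placeholder number of ranks per node
#SBATCH --time=02:00:00               # placeholder wall-clock limit
#SBATCH --job-name=trtllm-wide-ep-bench
```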
### Post-processing Benchmark Results with `process_gen_iterlog.py`

This script does the following (a simplified sketch of the idea appears after the list):
- Parses iteration logs from workers
- Calculates throughput metrics
- Generates CSV reports
- Supports MTP (Multi-Token Prediction) analysis
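
The exact command-line interface, log format, and CSV schema are defined by `process_gen_iterlog.py` itself, so refer to the script for the real behavior. The sketch below only illustrates the general idea of aggregating per-worker iteration logs into a throughput CSV; the `iter=... tokens=... elapsed_ms=...` log format and the `logs/worker_*.log` layout are made up for the example:

```bash
# Illustrative sketch only: assumes a made-up log line format
# "iter=<n> tokens=<t> elapsed_ms=<ms>", not the actual worker log format.
echo "rank,total_tokens,total_seconds,tokens_per_second" > throughput.csv
for log in logs/worker_*.log; do
    rank=$(basename "$log" .log | sed 's/^worker_//')
    # Sum generated tokens and elapsed time over all iterations, then compute tokens/sec.
    awk -v rank="$rank" '
        /iter=/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^tokens=/)     { split($i, kv, "="); tokens += kv[2] }
                if ($i ~ /^elapsed_ms=/) { split($i, kv, "="); ms += kv[2] }
            }
        }
        END {
            secs = ms / 1000.0
            printf "%s,%d,%.3f,%.2f\n", rank, tokens, secs, (secs > 0 ? tokens / secs : 0)
        }
    ' "$log" >> throughput.csv
done
```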