mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00


TensorRT-LLM Wide-EP Benchmark Scripts

This directory contains scripts for benchmarking TensorRT-LLM wide-ep performance using the SLURM job scheduler.

⚠️ DISCLAIMER

These scripts are currently not QA'ed and are provided for demonstration purposes only.

Please note that:

These scripts have not undergone formal quality assurance testing
They are intended for demonstration and educational purposes
Use at your own risk in production environments
Always review and test scripts thoroughly before running in your specific environment

Scripts Overview

Core Scripts

Note that the core implementation of the SLURM scripts is located in examples/disaggregated/slurm.

submit.sh - Main entry point for submitting benchmark jobs
process_gen_iterlog.py - Processes benchmark results and generates reports

Usage

Prerequisites

Before running the scripts, ensure you have:

Access to a SLURM cluster
Container image with TensorRT-LLM installed
Model files accessible on the cluster
Required environment variables set

Running Benchmarks

# Refer to `examples/disaggregated/slurm/`
# Please find the `disaggr_torch.slurm` script in the `examples/disaggregated/slurm/` directory.
# Make sure that SLURM parameters are correctly set in `disaggr_torch.slurm` before executing this script.
./submit.sh

`process_gen_iterlog.py` post-processes the benchmark results:

Parses iteration logs from workers
Calculates throughput metrics
Generates CSV reports
Supports MTP (Multi-Token Prediction) analysis
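
The post-processing steps listed above (parse iteration logs, compute throughput, write a CSV report) can be sketched roughly as follows. This is only an illustration and not the actual logic of `process_gen_iterlog.py`. The log line format (`iter=... elapsed_ms=... num_tokens=...`) and the function names `parse_iterlog`, `throughput_tokens_per_sec`, and `write_csv` are hypothetical and assumed purely for this example; check the real script and your worker logs for the actual fields.

```python
import csv
import io
import re

# ASSUMPTION: a hypothetical log line format used only for illustration, e.g.
#   "iter=12 elapsed_ms=35.0 num_tokens=256"
# The real TensorRT-LLM iteration logs may look different.
LINE_RE = re.compile(r"iter=(\d+)\s+elapsed_ms=([\d.]+)\s+num_tokens=(\d+)")


def parse_iterlog(lines):
    """Extract per-iteration records, skipping lines that do not match."""
    records = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # unrelated worker output
        records.append({
            "iter": int(match.group(1)),
            "elapsed_ms": float(match.group(2)),
            "num_tokens": int(match.group(3)),
        })
    return records


def throughput_tokens_per_sec(records):
    """Total generated tokens divided by total elapsed time in seconds."""
    total_ms = sum(r["elapsed_ms"] for r in records)
    total_tokens = sum(r["num_tokens"] for r in records)
    if total_ms == 0:
        return 0.0
    return total_tokens * 1000.0 / total_ms


def write_csv(records, out):
    """Write the per-iteration records as CSV to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=["iter", "elapsed_ms", "num_tokens"])
    writer.writeheader()
    writer.writerows(records)


# Made-up sample lines, not real benchmark data.
sample = [
    "iter=0 elapsed_ms=40.0 num_tokens=256",
    "some unrelated worker output",
    "iter=1 elapsed_ms=60.0 num_tokens=256",
]
records = parse_iterlog(sample)
tps = throughput_tokens_per_sec(records)
buf = io.StringIO()
write_csv(records, buf)
print(f"{len(records)} iterations, {tps:.1f} tokens/s")
print(buf.getvalue())
```

Aggregating over the total elapsed time, rather than averaging per-iteration rates, keeps long iterations from being underweighted; the real script may use a different definition of throughput.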