# TensorRT LLM Wide-EP Benchmark Scripts

This directory contains scripts for benchmarking TensorRT LLM wide-EP performance using the SLURM job scheduler.
## ⚠️ DISCLAIMER
These scripts are currently not QA'ed and are provided for demonstration purposes only.
Please note that:
- These scripts have not undergone formal quality assurance testing
- They are intended for demonstration and educational purposes
- Use at your own risk in production environments
- Always review and test scripts thoroughly before running in your specific environment
## Scripts Overview

### Core Scripts

Note that the core implementation of the SLURM scripts is located in `examples/disaggregated/slurm/benchmark`.

- `process_gen_iterlog.py` - Processes benchmark results and generates reports.
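
A minimal sketch of how the post-processing step might be invoked. The actual command-line interface is defined by `process_gen_iterlog.py` itself, and the log-directory path below is a placeholder, so check the script's arguments before running.

```bash
# Hypothetical invocation -- inspect process_gen_iterlog.py for its actual
# arguments; the path below is a placeholder for the generation iteration
# logs produced by a benchmark run.
python3 process_gen_iterlog.py /path/to/gen_iter_logs
```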
## Usage

### Prerequisites
Before running the scripts, ensure you have:
- Access to a SLURM cluster
- Container image with TensorRT LLM installed
- Model files accessible on the cluster
- Required environment variables set (see the pre-flight sketch after this list)
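
A quick pre-flight sketch covering the prerequisites above. The variable names and paths are placeholders for illustration only; the actual settings for a run come from `config.yaml`.

```bash
# Verify the SLURM cluster is reachable and partitions are visible.
sinfo

# Placeholder paths -- the real container image and model locations are
# configured in config.yaml, not through these variables.
CONTAINER_IMAGE=/path/to/tensorrt_llm_container.sqsh
MODEL_PATH=/path/to/model_checkpoint

# Confirm both are accessible from the cluster filesystem.
ls -ld "$CONTAINER_IMAGE" "$MODEL_PATH"
```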
### Run Benchmarks

```bash
# Please find the `submit.py` script in the `examples/disaggregated/slurm/benchmark/` directory.
# An example `config.yaml` for wide EP: `examples/wide_ep/slurm_scripts/config.yaml`.
python3 submit.py -c config.yaml
```
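
`submit.py` enqueues the benchmark as a SLURM job, so it typically returns before the run finishes. The commands below are standard SLURM tooling (not part of these scripts) for tracking the job; replace `<job_id>` with the ID printed at submission.

```bash
# List your queued and running jobs.
squeue -u "$USER"

# Inspect state and timing of a specific job (replace <job_id>).
sacct -j <job_id> --format=JobID,State,Elapsed
```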