TensorRT-LLMs/tests/scripts/perf-sanity
chenfeiz0326 d70aeddc7f
[TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI (#9138)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-12-26 22:50:53 +08:00
benchmark-serve.sh [TRTLLM-8260][feat] Add Server-Client Perf Test in pytest for B200 and B300 (#7985) 2025-10-22 10:17:22 +08:00
config_database_b200_nvl.yaml [None][fix] enable KV cache reuse for config database (#10094) 2025-12-19 15:16:56 -08:00
config_database_h200_sxm.yaml [None][fix] enable KV cache reuse for config database (#10094) 2025-12-19 15:16:56 -08:00
deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml [TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI (#9138) 2025-12-26 22:50:53 +08:00
deepseek_r1_fp4_v2_blackwell.yaml [TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI (#9138) 2025-12-26 22:50:53 +08:00
deepseek_r1_fp4_v2_grace_blackwell.yaml [TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI (#9138) 2025-12-26 22:50:53 +08:00
deepseek_r1_fp8_blackwell.yaml [TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI (#9138) 2025-12-26 22:50:53 +08:00
gpt_oss_120b_fp4_blackwell.yaml [TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI (#9138) 2025-12-26 22:50:53 +08:00
parse_benchmark_results.py [TRTLLM-8260][feat] Add Server-Client Perf Test in pytest for B200 and B300 (#7985) 2025-10-22 10:17:22 +08:00
README.md [TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI (#9138) 2025-12-26 22:50:53 +08:00
run_benchmark_serve.py [TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726) 2025-12-16 05:16:32 -08:00

TensorRT-LLM Perf Sanity Test System

Performance sanity testing scripts for TensorRT-LLM with configuration-driven test cases supporting single-node, multi-node aggregated, and multi-node disaggregated architectures.

Overview

  • Run performance sanity benchmarks across multiple model configs
  • Support three deployment architectures: single-node, multi-node aggregated, and multi-node disaggregated
  • Manage test cases through YAML config files
  • Automate resource calculation and job submission via SLURM

Configuration File Types

There are three types of YAML config files for different deployment architectures. Aggregated config files are in tests/scripts/perf-sanity. Disaggregated config files are in tests/integration/defs/perf/disagg/test_configs/disagg/perf.
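As a rough illustration of the directory layout described above, the following Python sketch lists the YAML files found in each of the two locations. The helper name `list_config_files` is hypothetical and not part of the repository. The sketch assumes it is given the repository root, and it makes no attempt to tell test configs apart from any other YAML files that happen to live in the same directory.

```python
from pathlib import Path

# Directories named in this README, relative to the repository root.
AGGREGATED_DIR = Path("tests/scripts/perf-sanity")
DISAGGREGATED_DIR = Path("tests/integration/defs/perf/disagg/test_configs/disagg/perf")


def list_config_files(repo_root):
    """Hypothetical helper: group YAML file names by the directory they live in."""
    root = Path(repo_root)
    return {
        # Every *.yaml file here is reported, whether or not it is a test config.
        "aggregated": sorted(p.name for p in (root / AGGREGATED_DIR).glob("*.yaml")),
        "disaggregated": sorted(p.name for p in (root / DISAGGREGATED_DIR).glob("*.yaml")),
    }
```

Calling `list_config_files(".")` from the repository root would return a dictionary with two lists of file names. A directory that does not exist simply yields an empty list.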

1. Single-Node Aggregated Test Configuration

File Example: deepseek_r1_fp4_v2_grace_blackwell.yaml

Use Case: Performance tests on a single server with multiple GPUs.

2. Multi-Node Aggregated Test Configuration

File Example: deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml

Use Case: Multi-node aggregated architecture where the model runs across multiple nodes with unified execution.

3. Multi-Node Disaggregated Test Configuration

File Example: deepseek-r1-fp4_1k1k_ctx1_gen1_dep8_bs768_eplb0_mtp0_ccb-UCX.yaml

Use Case: Disaggregated architecture where the model runs across multiple nodes with separate context (prefill) and generation (decode) servers.
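The disaggregated file name shown above packs several settings into underscore-separated tokens. The sketch below splits such a name into its tokens and collects the ones that look like a lowercase key followed by a number. The function name `split_config_name` is hypothetical, and the meaning of each token is not documented in this section, so the code only separates the tokens and does not interpret them.

```python
import re
from pathlib import Path


def split_config_name(filename):
    """Hypothetical helper: split a config file name into tokens and key/number pairs."""
    # Drop the .yaml extension, then split on underscores.
    tokens = Path(filename).stem.split("_")
    numbered = {}
    for token in tokens:
        # Keep only tokens shaped like lowercase letters followed by digits, e.g. "bs768".
        match = re.fullmatch(r"([a-z]+)(\d+)", token)
        if match:
            numbered[match.group(1)] = int(match.group(2))
    return tokens, numbered


tokens, numbered = split_config_name(
    "deepseek-r1-fp4_1k1k_ctx1_gen1_dep8_bs768_eplb0_mtp0_ccb-UCX.yaml"
)
print(tokens)
print(numbered)
```

Tokens that do not match the key/number shape, such as `deepseek-r1-fp4`, `1k1k`, and `ccb-UCX`, stay in the token list but are left out of the dictionary.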