TensorRT-LLM Perf Sanity Test System

Performance sanity testing scripts for TensorRT-LLM. Test cases are configuration-driven and cover single-node, multi-node aggregated, and multi-node disaggregated deployment architectures.

Overview

  • Run performance sanity benchmarks across multiple model configs
  • Support three deployment architectures: single-node, multi-node aggregated, and multi-node disaggregated
  • Manage test cases through YAML config files
  • Automated resource calculation and job submission via SLURM

Configuration File Types

There are two modes for perf sanity tests: aggregated (aggr) and disaggregated (disagg).

Aggregated Mode (aggr)

Config Location: tests/scripts/perf-sanity

File Naming: xxx.yaml where words are connected by _ (underscore), not - (hyphen).

File Examples:

  • deepseek_r1_fp4_v2_grace_blackwell.yaml - Single-node aggregated test
  • deepseek_r1_fp4_v2_2_nodes_grace_blackwell.yaml - Multi-node aggregated test
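
To see which aggregated config files are available, list the Config Location above from the repository root (a convenience command, assuming the default checkout layout; adjust the path otherwise):

# List the aggregated perf sanity config files
ls tests/scripts/perf-sanity/*.yaml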

Use Cases:

  • Single-node: Performance tests on a single server with multiple GPUs
  • Multi-node: The model runs across multiple nodes with unified execution

Test Case Names:

perf/test_perf_sanity.py::test_e2e[aggr_upload-{config yaml file base name}]
perf/test_perf_sanity.py::test_e2e[aggr_upload-{config yaml file base name}-{server_config_name}]
  • Without server config name: runs all server configs in the YAML file
  • With server config name: runs only the specified server config (the name field in server_configs; see the sketch after the examples below)

Examples:

perf/test_perf_sanity.py::test_e2e[aggr_upload-deepseek_r1_fp4_v2_grace_blackwell]
perf/test_perf_sanity.py::test_e2e[aggr_upload-deepseek_r1_fp4_v2_grace_blackwell-r1_fp4_v2_dep4_mtp1_1k1k]
perf/test_perf_sanity.py::test_e2e[aggr_upload-deepseek_r1_fp4_v2_grace_blackwell-r1_fp4_v2_tep4_mtp3_1k1k]
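
The server config names in the test IDs above come from the config YAML itself. As a quick way to list them, a simple grep works if each name field sits on its own line, as is typical for these files (run from the repository root; a convenience sketch, not part of the test tooling):

# Show the server config names defined in an aggregated config file
# (may also match name fields outside server_configs, if the file has any)
grep -E '^[[:space:]]*-?[[:space:]]*name:' tests/scripts/perf-sanity/deepseek_r1_fp4_v2_grace_blackwell.yaml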

Disaggregated Mode (disagg)

Config Location: tests/integration/defs/perf/disagg/test_configs/disagg/perf

File Naming: xxx.yaml (names may contain - hyphens).

File Example: deepseek-r1-fp4_1k1k_ctx1_gen1_dep8_bs768_eplb0_mtp0_ccb-UCX.yaml

Use Case: Disaggregated architecture where the model runs across multiple nodes with separate context (prefill) and generation (decode) servers.

Test Case Name:

perf/test_perf_sanity.py::test_e2e[disagg_upload-{config yaml file base name}]

Example:

perf/test_perf_sanity.py::test_e2e[disagg_upload-deepseek-r1-fp4_1k1k_ctx1_gen1_dep8_bs768_eplb0_mtp0_ccb-UCX]

Running Tests

Important: Do NOT add the --perf flag when running pytest. Perf sanity tests are static test cases and do not use perf mode.

# Run all server configs in an aggregated test
pytest perf/test_perf_sanity.py::test_e2e[aggr_upload-deepseek_r1_fp4_v2_grace_blackwell]

# Run a specific server config in an aggregated test
pytest perf/test_perf_sanity.py::test_e2e[aggr_upload-deepseek_r1_fp4_v2_grace_blackwell-r1_fp4_v2_dep4_mtp1_1k1k]

# Run a specific disaggregated test
pytest perf/test_perf_sanity.py::test_e2e[disagg_upload-deepseek-r1-fp4_1k1k_ctx1_gen1_dep8_bs768_eplb0_mtp0_ccb-UCX]
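
To discover which parametrized test IDs are available without running anything, pytest's standard collection mode can be used from the same working directory as the commands above:

# List the collected perf sanity test IDs without executing them
pytest perf/test_perf_sanity.py --collect-only -q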