TensorRT-LLM QA Test Lists (tests/integration/test_lists/qa)

Description

This folder contains the QA test definitions for TensorRT-LLM, which are executed on a regular schedule (see Test Execution Schedule below). These tests focus on end-to-end validation, accuracy verification, disaggregated testing, and performance benchmarking.

Test Categories

QA tests are organized into three main categories:

1. Functional Tests

Functional tests cover end-to-end (E2E), accuracy, and disaggregated test cases:

  • E2E Tests: Complete workflow validation from model loading to inference output
  • Accuracy Tests: Model accuracy verification against reference implementations
  • Disaggregated Tests: Distributed deployment and multi-node scenario validation

2. Performance Tests

Performance tests focus on benchmarking and performance validation:

  • Baseline performance measurements
  • Performance regression detection
  • Throughput and latency benchmarking
  • Resource utilization analysis

3. Triton Backend Tests

Triton backend tests validate the integration with NVIDIA Triton Inference Server:

  • Backend functionality validation
  • Model serving capabilities
  • API compatibility testing
  • Integration performance testing

Dependencies

The following Python packages are required for running QA tests:

pip install mako oyaml rouge_score lm_eval

Dependency Details

  • mako: Template engine for test generation and configuration
  • oyaml: YAML parser with ordered dictionary support
  • rouge_score: ROUGE evaluation metrics for text generation quality assessment
  • lm_eval: Language model evaluation framework
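
As a quick optional check (a minimal sketch, not part of any official workflow), you can confirm that the four packages import cleanly before running tests:

# Optional one-liner: verify the QA test dependencies are importable
python -c "import mako, oyaml, rouge_score, lm_eval; print('QA test dependencies OK')"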

Test Files

This directory contains various test configuration files:

Functional Test Lists

  • llm_function_full.txt - Primary test list for single-node multi-GPU scenarios; all new test cases should be added here first (see the example entries after this list)
  • llm_function_sanity.txt - Subset of test cases for quick validation of the PyTorch flow
  • llm_function_nim.txt - NIM-specific functional test cases
  • llm_function_multinode.txt - Multi-node functional test cases
  • llm_function_gb20x.txt - GB20X release test cases
  • llm_function_rtx6kd.txt - RTX 6000 Ada specific tests
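
These .txt lists are consumed through pytest's --test-list option (see Running Tests below). As an illustration only: entries are pytest test IDs, one per line. The first line below is the real case used later in this document; the second is a hypothetical placeholder, not an actual test, and real files may carry additional per-entry annotations.

accuracy/test_cli_flow.py::TestLlama3_1_8B::test_auto_dtype
examples/test_hypothetical_model.py::test_placeholder_case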

Performance Test Files

  • llm_perf_full.yml - Main performance test configuration
  • llm_perf_cluster.yml - Cluster-based performance tests
  • llm_perf_sanity.yml - Performance sanity checks
  • llm_perf_nim.yml - NIM-specific performance tests
  • llm_trt_integration_perf.yml - Integration performance tests
  • llm_trt_integration_perf_sanity.yml - Integration performance sanity checks

Triton Backend Tests

  • llm_triton_integration.txt - Triton backend integration tests

Release-Specific Tests

  • llm_digits_func.txt - Functional tests for DIGITS release
  • llm_digits_perf.txt - Performance tests for DIGITS release

Test Execution Schedule

QA tests are executed on a regular schedule:

  • Weekly: Automated regression testing
  • Release: Comprehensive validation before each release
  • On-demand: Manual execution for specific validation needs

Running Tests

Manual Execution

To run specific test categories:

# Change to the defs folder
cd tests/integration/defs
# Run all fp8 functional tests from the full test list
pytest --no-header -vs --test-list=../test_lists/qa/llm_function_full.txt -k fp8
# Run a single test case
pytest -vs accuracy/test_cli_flow.py::TestLlama3_1_8B::test_auto_dtype
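
The same pattern works for the other lists in this folder; for example (an illustrative variation of the command above, not an official recipe), the quick sanity subset can be run with:

# From tests/integration/defs, run the sanity subset
pytest --no-header -vs --test-list=../test_lists/qa/llm_function_sanity.txt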

Automated Execution

QA tests are typically executed through CI/CD pipelines with appropriate test selection based on:

  • Release requirements
  • Hardware availability
  • Test priority and scope

Test Guidelines

Adding New Test Cases

  • Primary Location: For functional testing, new test cases should be added to llm_function_full.txt first
  • Categorization: Test cases should be categorized based on their scope and execution time
  • Validation: Ensure test cases are properly validated before adding them to any test list (one possible validation flow is sketched below)
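
As one possible validation flow (a sketch that reuses only the pytest options shown above), first confirm that a newly added entry is collected from the list, then run the case on its own:

# From tests/integration/defs: dry-run collection to confirm the new entry resolves to a real test
pytest --collect-only -q --test-list=../test_lists/qa/llm_function_full.txt -k TestLlama3_1_8B
# Run the new case in isolation before submitting the change
pytest -vs accuracy/test_cli_flow.py::TestLlama3_1_8B::test_auto_dtype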