Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-01-14 06:27:45 +08:00
[None][doc] add introduction doc on qa test (#6535)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
parent d101a6cebc → commit 08ed9d7305
tests/integration/test_lists/qa/README.md (new file, 108 lines)
# Description

This folder contains QA test definitions for TensorRT-LLM, which are executed on a recurring schedule (weekly, per release, and on demand). These tests focus on end-to-end validation, accuracy verification, disaggregated testing, and performance benchmarking.

## Test Categories

QA tests are organized into three main categories:

### 1. Functional Tests

Functional tests include E2E (end-to-end), accuracy, and disaggregated test cases:

- **E2E Tests**: Complete workflow validation, from model loading to inference output
- **Accuracy Tests**: Model accuracy verification against reference implementations
- **Disaggregated Tests**: Distributed deployment and multi-node scenario validation

### 2. Performance Tests

Performance tests focus on benchmarking and performance validation:

- Baseline performance measurements
- Performance regression detection
- Throughput and latency benchmarking
- Resource utilization analysis

### 3. Triton Backend Tests

Triton backend tests validate the integration with NVIDIA Triton Inference Server:

- Backend functionality validation
- Model serving capabilities
- API compatibility testing
- Integration performance testing

## Dependencies

The following Python packages are required for running QA tests:

```bash
pip install mako oyaml rouge_score lm_eval
```

### Dependency Details

- **mako**: Template engine for test generation and configuration
- **oyaml**: YAML parser with ordered dictionary support
- **rouge_score**: ROUGE evaluation metrics for text generation quality assessment
- **lm_eval**: Language model evaluation framework

## Test Files

This directory contains various test configuration files:

### Functional Test Lists

- `llm_function_full.txt` - Primary test list for single-node multi-GPU scenarios (all new test cases should be added here)
- `llm_function_sanity.txt` - Subset of test cases for quick validation of the torch flow
- `llm_function_nim.txt` - NIM-specific functional test cases
- `llm_function_multinode.txt` - Multi-node functional test cases
- `llm_function_gb20x.txt` - GB20X release test cases
- `llm_function_rtx6kd.txt` - RTX 6000 Ada specific tests
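
The `.txt` test lists are plain text files of pytest node IDs. As a rough illustration only (this is not the project's actual loader, and the real format may carry extra annotations), such a list could be parsed like this:

```python
import os
import tempfile

def load_test_list(path):
    """Parse a QA test-list file.

    Assumed format (illustrative): one pytest node ID per line,
    with blank lines and '#' comments ignored.
    """
    tests = []
    with open(path) as fh:
        for raw in fh:
            entry = raw.split("#", 1)[0].strip()  # drop trailing comments
            if entry:
                tests.append(entry)
    return tests

# Demo with a throwaway file standing in for llm_function_full.txt
sample = (
    "# functional cases\n"
    "accuracy/test_cli_flow.py::TestLlama3_1_8B::test_auto_dtype\n"
    "\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write(sample)
    path = fh.name
print(load_test_list(path))  # one entry: the test_auto_dtype node ID
os.remove(path)
```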

### Performance Test Files

- `llm_perf_full.yml` - Main performance test configuration
- `llm_perf_cluster.yml` - Cluster-based performance tests
- `llm_perf_sanity.yml` - Performance sanity checks
- `llm_perf_nim.yml` - NIM-specific performance tests
- `llm_trt_integration_perf.yml` - Integration performance tests
- `llm_trt_integration_perf_sanity.yml` - Integration performance sanity checks

### Triton Backend Tests

- `llm_triton_integration.txt` - Triton backend integration tests

### Release-Specific Tests

- `llm_digits_func.txt` - Functional tests for the DIGITS release
- `llm_digits_perf.txt` - Performance tests for the DIGITS release

## Test Execution Schedule

QA tests are executed on a regular schedule:

- **Weekly**: Automated regression testing
- **Release**: Comprehensive validation before each release
- **On-demand**: Manual execution for specific validation needs

## Running Tests

### Manual Execution

To run specific test categories:

```bash
# Change to the defs folder
cd tests/integration/defs

# Run all fp8 functional tests
pytest --no-header -vs --test-list=../test_lists/qa/llm_function_full.txt -k fp8

# Run a single test case
pytest -vs accuracy/test_cli_flow.py::TestLlama3_1_8B::test_auto_dtype
```
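
The `-k fp8` flag above selects tests whose IDs contain the given keyword. A minimal stdlib-only sketch of that selection step (pytest's real `-k` additionally supports boolean expressions such as `fp8 and not bf16`):

```python
def select_tests(test_ids, keyword):
    """Keep test IDs containing the keyword -- a simplified analogue
    of pytest's -k substring matching."""
    return [t for t in test_ids if keyword in t]

tests = [
    "accuracy/test_cli_flow.py::TestLlama3_1_8B::test_auto_dtype",
    "accuracy/test_cli_flow.py::TestLlama3_1_8B::test_fp8",
]
print(select_tests(tests, "fp8"))  # only the test_fp8 case matches
```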

### Automated Execution

QA tests are typically executed through CI/CD pipelines, with appropriate test selection based on:

- Release requirements
- Hardware availability
- Test priority and scope

## Test Guidelines

### Adding New Test Cases

- **Primary Location**: For functional testing, new test cases should be added to `llm_function_full.txt` first
- **Categorization**: Test cases should be categorized based on their scope and execution time
- **Validation**: Ensure test cases are properly validated before adding them to any test list