mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-22 03:35:00 +08:00

l0_a10.yml	[TRTLLM-8214][feat] Support Qwen3 tool parser (#8216 )	2025-10-29 15:48:29 +08:00
l0_a30.yml	[TRTLLM-8682][chore] Remove auto_parallel module (#8329 )	2025-10-22 20:53:08 -04:00
l0_a100.yml	[TRTLLM-6780][fix] Add multimodal data to dummy requests during memory profiling (#7539 )	2025-10-16 17:49:22 +02:00
l0_b200.yml	[None][chore] ISOLATE some cases (#8690 )	2025-10-27 22:10:44 -04:00
l0_b300.yml	[None] [test] Add B300 cases to CI (#8056 )	2025-10-06 19:23:31 -07:00
l0_dgx_b200.yml	[None] [test] Add MNNVL AlltoAll tests to pre-merge (#8601 )	2025-10-27 21:39:44 +08:00
l0_dgx_b300.yml	[None][ci] move all llama4 test cases to post merge (#8387 )	2025-10-15 16:36:37 +08:00
l0_dgx_h100.yml	[TRTLLM-8431][doc] update public doc and example, add etcd auto-scaling tests (#8602 )	2025-10-28 17:04:53 -07:00
l0_dgx_h200.yml	[None][ci] move some time-consuming benchmark test cases to post merge (#8641 )	2025-10-26 22:47:17 -04:00
l0_gb200_multi_gpus.yml	[https://nvbugs/5516665 ][fix] Fix CUTLASS moe fake impl errors (#7714 )	2025-09-22 11:08:39 -07:00
l0_gb200_multi_nodes.yml	[TRTLLM-6741][fix] Add heuristics for lm head tp size when `enable_lm_head_tp_in_adp=True` (#7891 )	2025-09-30 09:24:35 +08:00
l0_gb202.yml	[None][feat] Update TRTLLM MoE MxFP4 cubins; autotune tileN (#8156 )	2025-10-23 09:14:18 +08:00
l0_gb203.yml	[TRTLLM-5277] chore: refine llmapi examples for 1.0 (part1) (#5431 )	2025-07-01 19:06:41 +08:00
l0_gb300_multi_gpus.yml	[None][chore] ISOLATE some cases (#8690 )	2025-10-27 22:10:44 -04:00
l0_gb300.yml	[None] [test] Add B300 cases to CI (#8056 )	2025-10-06 19:23:31 -07:00
l0_gh200.yml	[None][test] Update llm_models_root to improve path handling on BareMetal environment (#7876 )	2025-09-24 17:35:57 +08:00
l0_h100.yml	[None][ci] move some test cases from H100 to A10 (#8449 )	2025-10-20 01:58:34 -04:00
l0_l40s.yml	[https://nvbugs/5492250 ][fix] Remove isolated cases and unwaive cases (#8492 )	2025-10-20 07:40:07 -04:00
l0_perf.yml	[#7288 ][feat] Added AutoDeploy backend support to test_perf.py (#7588 )	2025-09-28 21:21:27 -07:00
l0_rtx_pro_6000.yml	[None][feat] Update TRTLLM MoE MxFP4 cubins; autotune tileN (#8156 )	2025-10-23 09:14:18 +08:00
l0_sanity_check.yml	[None][chore] Isolate several intermittent cases (#8408 )	2025-10-15 23:48:31 -07:00
perf_sanity_l0_dgx_b200.yml	[None][infra] Minor Update on Perf Sanity Testdb Files (#8607 )	2025-10-28 09:54:48 +08:00
perf_sanity_l0_dgx_b300.yml	[None][infra] Minor Update on Perf Sanity Testdb Files (#8607 )	2025-10-28 09:54:48 +08:00
README.md	Update (#2978 )	2025-03-23 16:39:35 +08:00

README.md

Description

This folder contains the test definitions that the trt-test-db tool consumes to select tests based on system specifications.

Installation

Install trt-test-db using the following command:

pip3 install --extra-index-url https://urm.nvidia.com/artifactory/api/pypi/sw-tensorrt-pypi/simple --ignore-installed trt-test-db==1.8.5+bc6df7

Test Definition

Test definitions are stored in YAML files located in ${TRT_LLM_ROOT}/tests/integration/test_lists/test-db/. These files define test conditions and the tests to be executed.

Example YAML Structure

version: 0.0.1
l0_e2e:
  - condition:
      terms:
        supports_fp8: true
      ranges:
        system_gpu_count:
          gte: 4
          lte: 4
      wildcards:
        gpu:
          - '*h100*'
        linux_distribution_name: ubuntu*
    tests:
      - examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-8b-enable_fp8]
      - examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-70b-enable_fp8]
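
To make the roles of terms, ranges, and wildcards concrete, here is a minimal Python sketch of how such a condition could be evaluated against a system description. It is not the trt-test-db implementation. The function name condition_matches, the case-insensitive glob matching, and the numeric coercion of range values are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Wildcard values may be a single pattern or a list of patterns."""
    return value if isinstance(value, list) else [value]


def condition_matches(condition, system):
    """Return True if `system` satisfies every clause of `condition`.

    Hypothetical semantics (assumed, not taken from trt-test-db):
      - terms:     exact equality
      - ranges:    inclusive numeric bounds via gte / lte
      - wildcards: case-insensitive glob match against any listed pattern
    """
    for key, expected in condition.get("terms", {}).items():
        if system.get(key) != expected:
            return False

    for key, bounds in condition.get("ranges", {}).items():
        if key not in system:
            return False
        value = float(system[key])  # system values may arrive as strings
        if "gte" in bounds and value < bounds["gte"]:
            return False
        if "lte" in bounds and value > bounds["lte"]:
            return False

    for key, patterns in condition.get("wildcards", {}).items():
        actual = str(system.get(key, "")).lower()
        if not any(fnmatchcase(actual, str(p).lower()) for p in _as_list(patterns)):
            return False

    return True


# The condition from the example YAML above, written as a Python dict.
condition = {
    "terms": {"supports_fp8": True},
    "ranges": {"system_gpu_count": {"gte": 4, "lte": 4}},
    "wildcards": {"gpu": ["*h100*"], "linux_distribution_name": "ubuntu*"},
}

# Two illustrative system descriptions (values are made up for the sketch).
h100_node = {"supports_fp8": True, "system_gpu_count": "4",
             "gpu": "H100", "linux_distribution_name": "ubuntu"}
a10_node = {"supports_fp8": False, "system_gpu_count": "1",
            "gpu": "A10", "linux_distribution_name": "ubuntu"}

print(condition_matches(condition, h100_node))  # True
print(condition_matches(condition, a10_node))   # False
```

Under these assumed semantics, the two listed tests would be selected only on a four-GPU, FP8-capable H100 system running Ubuntu.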

Generating Test Lists

Use trt-test-db to generate a test list based on the system configuration:

trt-test-db -d /TensorRT-LLM/src/tests/integration/test_lists/test-db \
            --context l0_e2e \
            --test-names \
            --output /TensorRT-LLM/src/l0_e2e.txt \
            --match-exact '{"chip":"ga102gl-a","compute_capability":"8.6","cpu":"x86_64","gpu":"A10","gpu_memory":"23028.0","host_mem_available_mib":"989937","host_mem_total_mib":"1031949","is_aarch64":false,"is_linux":true,"linux_distribution_name":"ubuntu","linux_version":"22.04","supports_fp8":false,"supports_int8":true,"supports_tf32":true,"sysname":"Linux","system_gpu_count":"1",...}'

This command generates a test list file (l0_e2e.txt) based on the specified context and system configuration.
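
Writing the --match-exact JSON by hand is error-prone, so one option is to build it from a dictionary. The sketch below only assembles the argument list and does not run the tool. The helper name build_trt_test_db_command is hypothetical, the flags are the ones shown in the command above, and the system description is a shortened subset of the fields in that example. A real invocation may need the full set.

```python
import json
import shlex


def build_trt_test_db_command(test_db_dir, context, output_path, system_spec):
    """Assemble the trt-test-db argument list shown in this README.

    Hypothetical helper: only the flags used in the README command are emitted.
    """
    # Compact JSON; Python booleans become JSON true/false.
    match_json = json.dumps(system_spec, separators=(",", ":"))
    return [
        "trt-test-db",
        "-d", test_db_dir,
        "--context", context,
        "--test-names",
        "--output", output_path,
        "--match-exact", match_json,
    ]


# Subset of the system fields from the README example (not a complete spec).
system_spec = {
    "gpu": "A10",
    "compute_capability": "8.6",
    "is_aarch64": False,
    "is_linux": True,
    "linux_distribution_name": "ubuntu",
    "supports_fp8": False,
    "system_gpu_count": "1",
}

argv = build_trt_test_db_command(
    "/TensorRT-LLM/src/tests/integration/test_lists/test-db",
    "l0_e2e",
    "/TensorRT-LLM/src/l0_e2e.txt",
    system_spec,
)

# shlex.join quotes the JSON so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

The resulting list can be passed directly to subprocess.run(argv, check=True), which avoids shell quoting issues altogether.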

Running Tests

Execute the tests using pytest with the generated test list:

pytest -v --test-list=/TensorRT-LLM/src/l0_e2e.txt --output-dir=/tmp/logs

This command runs the tests specified in the test list and outputs the results to the specified directory.

Additional Information

The --context parameter in the trt-test-db command selects which context (the top-level key in the YAML files, such as l0_e2e in the example above) to search.
The --match-exact parameter provides the system information, as a JSON object, that is used to filter tests against the conditions defined in the YAML files.
Modify the YAML files to add or update test conditions and test cases as needed.
For more detailed information on trt-test-db and pytest usage, refer to their respective documentation.