Description

This folder contains the test definitions consumed by the trt-test-db tool, which selects tests based on system specifications.

Installation

Install trt-test-db using the following command:

pip3 install --extra-index-url https://urm.nvidia.com/artifactory/api/pypi/sw-tensorrt-pypi/simple --ignore-installed trt-test-db==1.8.5+bc6df7
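
To confirm the installation (assuming a standard pip environment), you can check that the package is visible at the expected version:

pip3 show trt-test-db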

Test Definition

Test definitions are stored in YAML files located in ${TRT_LLM_ROOT}/tests/integration/test_lists/test-db/. These files define test conditions and the tests to be executed.

Example YAML Structure

version: 0.0.1
l0_e2e:
  - condition:
      terms:
        supports_fp8: true
      ranges:
        system_gpu_count:
          gte: 4
          lte: 4
      wildcards:
        gpu:
          - '*h100*'
        linux_distribution_name: ubuntu*
    tests:
      - examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-8b-enable_fp8]
      - examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-70b-enable_fp8]
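
As an illustrative sketch only (assuming the same schema and condition keys as above), an additional condition block under the same l0_e2e context could target a single-GPU A10 system without FP8 support:

  - condition:
      terms:
        supports_fp8: false
      ranges:
        system_gpu_count:
          gte: 1
          lte: 1
      wildcards:
        gpu:
          - '*a10*'
        linux_distribution_name: ubuntu*
    tests:
      # hypothetical test id, for illustration only
      - examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-8b]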

Generating Test Lists

Use trt-test-db to generate a test list based on the system configuration:

trt-test-db -d /TensorRT-LLM/src/tests/integration/test_lists/test-db \
            --context l0_e2e \
            --test-names \
            --output /TensorRT-LLM/src/l0_e2e.txt \
            --match-exact '{"chip":"ga102gl-a","compute_capability":"8.6","cpu":"x86_64","gpu":"A10","gpu_memory":"23028.0","host_mem_available_mib":"989937","host_mem_total_mib":"1031949","is_aarch64":false,"is_linux":true,"linux_distribution_name":"ubuntu","linux_version":"22.04","supports_fp8":false,"supports_int8":true,"supports_tf32":true,"sysname":"Linux","system_gpu_count":"1",...}'

This command generates a test list file (l0_e2e.txt) based on the specified context and system configuration.
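
For a system matching the example condition shown earlier, the generated l0_e2e.txt is expected to be a plain list of the selected test names, one per line, for example:

examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-8b-enable_fp8]
examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-70b-enable_fp8]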

Running Tests

Execute the tests using pytest with the generated test list:

pytest -v --test-list=/TensorRT-LLM/src/l0_e2e.txt --output-dir=/tmp/logs

This command runs the tests specified in the test list and outputs the results to the specified directory.
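
To rerun a single entry from the generated list while debugging, you can also pass its test id to pytest directly (assuming pytest is invoked from the directory containing the integration test definitions):

pytest -v "examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-8b-enable_fp8]"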

Additional Information

  • The --context parameter in the trt-test-db command selects which context (a top-level key such as l0_e2e) to look up in the YAML files.
  • The --match-exact parameter provides system information used to filter tests based on the conditions defined in the YAML files.
  • Modify the YAML files to add or update test conditions and test cases as needed.
  • For more detailed information on trt-test-db and pytest usage, refer to their respective documentation.