TensorRT-LLMs/tests/integration/test_lists/test-db
Daniel Cámpora df19430629
chore: Mass Integration 0.19 (#4255)
* fix: Fix/fused moe 0.19 (#3799)

* fix bug of stream init

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* fix bug

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

---------

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* fix: Add pre-download of checkpoint before benchmark. (#3772)

* Add pre-download of checkpoint before benchmark.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Add missing remote code flag.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Move from_pretrained to throughput benchmark.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Move download and use snapshot_download.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Removed trusted flag.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Fix benchmark command in iteration log test.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

---------

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* [https://nvbugspro.nvidia.com/bug/5241495][fix] CUDA Graph padding with overlap scheduler (#3839)

* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* fuse

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

---------

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* TRTLLM-4875 feat: Add version switcher to doc (#3871)

Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>

* waive a test (#3897)

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

* docs:fix https://nvbugs/5244616 by removing new invalid links. (#3939)

Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>

* fix: remote mpi session abort (#3884)

* fix remote mpi session

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

* fix

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

---------

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

* skip fp8 gemm for pre-hopper (#3931)

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* [https://nvbugspro.nvidia.com/bug/5247148][fix] Attention DP with overlap scheduler (#3975)

* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* update multigpu list

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* fix namings

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

---------

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* Doc: Fix H200 DeepSeek R1 perf doc (#4006)

* fix doc

Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>

* update perf number

Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>

---------

Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>

* Fix the perf regression caused by insufficient cache warmup. (#4042)

Force tuning up to 8192 sequence length for NVFP4 linear op. Also, make this runtime-selectable with UB enabled.

Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>

* doc: Update 0.19.0 release notes (#3976)

Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>

* Optimize the AutoTuner cache access code to reduce host code overhead. (#4060)

The NVFP4 Linear op is very sensitive to the host overhead.
This PR introduces customizable `find_nearest_profile` and `get_cache_key_specifc`, which allow users to override the default method for generating the cache key.

Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>

* Update switcher (#4098)

Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>

* doc: update release notes (#4108)

Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>

* docs:update 0.19 doc. (#4120)

Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>

* docs:add torch flow supported model list. (#4129)

Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>

* doc: Release V0.19 Perf Overview Update (#4166)

Signed-off-by: zpatel <22306219+zbpatel@users.noreply.github.com>

* Fix readme of autodeploy.

Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>

* Update tensorrt_llm/_torch/pyexecutor/llm_request.py

Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Daniel Cámpora <961215+dcampora@users.noreply.github.com>

* Revert mgmn worker node.

Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>

* Change to disable_overlap_scheduler.

Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>

---------

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: zpatel <22306219+zbpatel@users.noreply.github.com>
Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
Signed-off-by: Daniel Cámpora <961215+dcampora@users.noreply.github.com>
Co-authored-by: bhsueh_NV <11360707+byshiue@users.noreply.github.com>
Co-authored-by: Frank <3429989+FrankD412@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Co-authored-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Co-authored-by: Zac Patel <22306219+zbpatel@users.noreply.github.com>
2025-05-16 10:53:25 +02:00
l0_a10.yml chore: Remove deprecated Python runtime benchmark (#4171) 2025-05-14 18:41:05 +08:00
l0_a30.yml Move Triton backend to TRT-LLM main (#3549) 2025-05-16 07:15:23 +08:00
l0_a100.yml Move Triton backend to TRT-LLM main (#3549) 2025-05-16 07:15:23 +08:00
l0_b200.yml Move Triton backend to TRT-LLM main (#3549) 2025-05-16 07:15:23 +08:00
l0_dgx_h100.yml [feat] Enable chunked context for flashinfer (#4132) 2025-05-15 10:59:38 +08:00
l0_dgx_h200.yml [TRTLLM-5081] [test] Align parametrize_with_ids to the pytest behavior (#4090) 2025-05-13 07:41:51 +08:00
l0_gb202.yml fix: Fix FMHA-based MLA in the generation phase and add MLA unit test (#3863) 2025-04-29 09:09:43 +08:00
l0_gb203.yml chore: refactor llmapi e2e tests (#3803) 2025-05-05 07:37:24 +08:00
l0_gh200.yml chore: refactor llmapi e2e tests (#3803) 2025-05-05 07:37:24 +08:00
l0_h100.yml chore: Mass Integration 0.19 (#4255) 2025-05-16 10:53:25 +02:00
l0_l40s.yml chore: refactor llmapi e2e tests (#3803) 2025-05-05 07:37:24 +08:00
l0_perf.yml chore: Remove deprecated Python runtime benchmark (#4171) 2025-05-14 18:41:05 +08:00
l0_rtx_pro_6000.yml test: Added tests for Llama3.1-70B-BF16 on SM120 (#4198) 2025-05-14 11:57:49 -04:00
l0_sanity_check.yml [Infra] - Update the upstream PyTorch dependency to 2.7.0 (#4235) 2025-05-14 22:28:13 +08:00
README.md Update (#2978) 2025-03-23 16:39:35 +08:00

Description

This folder contains the test definitions that the trt-test-db tool consumes to select tests based on system specifications.

Installation

Install trt-test-db using the following command:

pip3 install --extra-index-url https://urm.nvidia.com/artifactory/api/pypi/sw-tensorrt-pypi/simple --ignore-installed trt-test-db==1.8.5+bc6df7

Test Definition

Test definitions are stored in YAML files located in ${TRT_LLM_ROOT}/tests/integration/test_lists/test-db/. These files define test conditions and the tests to be executed.

Example YAML Structure

version: 0.0.1
l0_e2e:
  - condition:
      terms:
        supports_fp8: true
      ranges:
        system_gpu_count:
          gte: 4
          lte: 4
      wildcards:
        gpu:
          - '*h100*'
        linux_distribution_name: ubuntu*
    tests:
      - examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-8b-enable_fp8]
      - examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-70b-enable_fp8]
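
To make the condition block above concrete, here is a minimal Python sketch of how such a condition could be evaluated against a system description. It is not the trt-test-db implementation. The matching rules are assumptions inferred from the key names: `terms` are compared for equality, `ranges` are inclusive numeric bounds, and `wildcards` are case-insensitive glob patterns where any one pattern may match. The function name `condition_matches` and both sample systems are hypothetical.

```python
from fnmatch import fnmatch


def condition_matches(condition, system):
    """Return True if `system` satisfies every clause of `condition`.

    Assumed semantics (not confirmed by the trt-test-db documentation):
    terms = exact equality, ranges = inclusive bounds, wildcards = globs.
    """
    # terms: the system value must equal the expected value exactly.
    for key, expected in condition.get("terms", {}).items():
        if system.get(key) != expected:
            return False

    # ranges: the system value must lie within the inclusive bounds.
    for key, bounds in condition.get("ranges", {}).items():
        if key not in system:
            return False
        value = float(system[key])
        if "gte" in bounds and value < bounds["gte"]:
            return False
        if "lte" in bounds and value > bounds["lte"]:
            return False

    # wildcards: a single pattern or a list; any matching pattern suffices.
    for key, patterns in condition.get("wildcards", {}).items():
        if isinstance(patterns, str):
            patterns = [patterns]
        value = str(system.get(key, "")).lower()
        if not any(fnmatch(value, pattern.lower()) for pattern in patterns):
            return False

    return True


# The condition from the example YAML, written as a Python dict.
condition = {
    "terms": {"supports_fp8": True},
    "ranges": {"system_gpu_count": {"gte": 4, "lte": 4}},
    "wildcards": {"gpu": ["*h100*"], "linux_distribution_name": "ubuntu*"},
}

# Hypothetical system descriptions, for illustration only.
h100_node = {
    "supports_fp8": True,
    "system_gpu_count": "4",
    "gpu": "H100",
    "linux_distribution_name": "ubuntu",
}
a10_node = {
    "supports_fp8": False,
    "system_gpu_count": "1",
    "gpu": "A10",
    "linux_distribution_name": "ubuntu",
}

print(condition_matches(condition, h100_node))
print(condition_matches(condition, a10_node))
```

Under these assumptions, the tests listed under `tests` would be selected only for the first system, because the second fails the `supports_fp8` term and the GPU count range.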

Generating Test Lists

Use trt-test-db to generate a test list based on the system configuration:

trt-test-db -d /TensorRT-LLM/src/tests/integration/test_lists/test-db \
            --context l0_e2e \
            --test-names \
            --output /TensorRT-LLM/src/l0_e2e.txt \
            --match-exact '{"chip":"ga102gl-a","compute_capability":"8.6","cpu":"x86_64","gpu":"A10","gpu_memory":"23028.0","host_mem_available_mib":"989937","host_mem_total_mib":"1031949","is_aarch64":false,"is_linux":true,"linux_distribution_name":"ubuntu","linux_version":"22.04","supports_fp8":false,"supports_int8":true,"supports_tf32":true,"sysname":"Linux","system_gpu_count":"1",...}'

This command generates a test list file (l0_e2e.txt) based on the specified context and system configuration.
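
Writing the `--match-exact` JSON by hand is error-prone, so the sketch below shows one way to assemble the same command from a Python dictionary. It uses only the flags shown above. The helper name `build_command` is hypothetical, and `system_info` holds just a subset of the fields from the example; whether trt-test-db accepts a partial description is not stated here, so treat the dictionary contents as a placeholder for the full system description.

```python
import json
import shlex

# A subset of the fields from the example above, for illustration only.
system_info = {
    "gpu": "A10",
    "compute_capability": "8.6",
    "linux_distribution_name": "ubuntu",
    "supports_fp8": False,
    "system_gpu_count": "1",
}


def build_command(test_db_dir, context, output, info):
    """Return the trt-test-db argument list for the given inputs."""
    return [
        "trt-test-db",
        "-d", test_db_dir,
        "--context", context,
        "--test-names",
        "--output", output,
        # json.dumps emits lowercase true/false, as in the example above.
        "--match-exact", json.dumps(info, separators=(",", ":")),
    ]


argv = build_command(
    "/TensorRT-LLM/src/tests/integration/test_lists/test-db",
    "l0_e2e",
    "/TensorRT-LLM/src/l0_e2e.txt",
    system_info,
)

# shlex.join quotes the JSON so the line can be pasted into a shell.
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run` as a list avoids shell quoting altogether, which is usually the safer choice when the JSON payload is long.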

Running Tests

Execute the tests using pytest with the generated test list:

pytest -v --test-list=/TensorRT-LLM/src/l0_e2e.txt --output-dir=/tmp/logs

This command runs the tests specified in the test list and outputs the results to the specified directory.

Additional Information

  • The --context parameter in the trt-test-db command specifies which context to search in the YAML files.
  • The --match-exact parameter provides system information used to filter tests based on the conditions defined in the YAML files.
  • Modify the YAML files to add or update test conditions and test cases as needed.
  • For more detailed information on trt-test-db and pytest usage, refer to their respective documentation.