TensorRT-LLM/tests/integration/test_lists/test-db

Files

l0_a10.yml
l0_a30.yml
l0_a100.yml
l0_b200.yml
l0_dgx_h100.yml
l0_dgx_h200.yml
l0_gb202.yml
l0_gb203.yml
l0_gh200.yml
l0_h100.yml
l0_l40s.yml
l0_perf.yml
l0_sanity_check.yml
README.md

Description

This folder contains the test definitions consumed by the trt-test-db tool, which selects tests based on system specifications.

Installation

Install trt-test-db using the following command:

pip3 install --extra-index-url https://urm.nvidia.com/artifactory/api/pypi/sw-tensorrt-pypi/simple --ignore-installed trt-test-db==1.8.5+bc6df7
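
To confirm the installation, a minimal standard-library check can be run. This is only a convenience sketch; it assumes the pip distribution and the console script are both named trt-test-db, as the commands in this document suggest.

# Sanity-check sketch: confirm the trt-test-db package is installed and its CLI is on PATH.
import shutil
from importlib.metadata import PackageNotFoundError, version

try:
    # Distribution name assumed to match the pip package name "trt-test-db".
    print("trt-test-db version:", version("trt-test-db"))
except PackageNotFoundError:
    print("trt-test-db is not installed")
print("CLI found at:", shutil.which("trt-test-db"))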

Test Definition

Test definitions are stored in YAML files located in ${TRT_LLM_ROOT}/tests/integration/test_lists/test-db/. These files define test conditions and the tests to be executed.

Example YAML Structure

version: 0.0.1
l0_e2e:
  - condition:
      terms:
        supports_fp8: true
      ranges:
        system_gpu_count:
          gte: 4
          lte: 4
      wildcards:
        gpu:
          - '*h100*'
        linux_distribution_name: ubuntu*
    tests:
      - examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-8b-enable_fp8]
      - examples/test_llama.py::test_llm_llama_v3_1_1node_multi_gpus[llama-3.1-70b-enable_fp8]
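
Reading the example, a condition combines three kinds of filters: terms require exact matches on reported system properties, ranges bound numeric values with gte/lte, and wildcards apply glob patterns. The Python sketch below illustrates that interpretation only; it is not the trt-test-db implementation, and the helper name condition_matches is invented for illustration.

# Illustrative sketch only -- NOT the trt-test-db implementation.
# Shows one plausible reading of how a `condition` block is matched
# against the system properties passed via --match-exact.
from fnmatch import fnmatch

def condition_matches(condition: dict, system: dict) -> bool:
    # terms: exact equality against the reported system property.
    for key, expected in condition.get("terms", {}).items():
        if system.get(key) != expected:
            return False
    # ranges: numeric bounds (gte/lte) on the reported value.
    for key, bounds in condition.get("ranges", {}).items():
        value = float(system.get(key, "nan"))
        if "gte" in bounds and not value >= bounds["gte"]:
            return False
        if "lte" in bounds and not value <= bounds["lte"]:
            return False
    # wildcards: glob-style patterns; a list means any pattern may match.
    for key, patterns in condition.get("wildcards", {}).items():
        if isinstance(patterns, str):
            patterns = [patterns]
        value = str(system.get(key, "")).lower()
        if not any(fnmatch(value, pattern.lower()) for pattern in patterns):
            return False
    return True

condition = {
    "terms": {"supports_fp8": True},
    "ranges": {"system_gpu_count": {"gte": 4, "lte": 4}},
    "wildcards": {"gpu": ["*h100*"], "linux_distribution_name": "ubuntu*"},
}
system = {"supports_fp8": True, "system_gpu_count": "4",
          "gpu": "H100-SXM", "linux_distribution_name": "ubuntu"}
print(condition_matches(condition, system))  # True

If every group of filters passes, the tests listed under that condition are included in the generated test list.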

Generating Test Lists

Use trt-test-db to generate a test list based on the system configuration:

trt-test-db -d /TensorRT-LLM/src/tests/integration/test_lists/test-db \
            --context l0_e2e \
            --test-names \
            --output /TensorRT-LLM/src/l0_e2e.txt \
            --match-exact '{"chip":"ga102gl-a","compute_capability":"8.6","cpu":"x86_64","gpu":"A10","gpu_memory":"23028.0","host_mem_available_mib":"989937","host_mem_total_mib":"1031949","is_aarch64":false,"is_linux":true,"linux_distribution_name":"ubuntu","linux_version":"22.04","supports_fp8":false,"supports_int8":true,"supports_tf32":true,"sysname":"Linux","system_gpu_count":"1",...}'

This command generates a test list file (l0_e2e.txt) based on the specified context and system configuration.
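
When generating lists from a script, the --match-exact payload can be assembled with json.dumps rather than hand-written. The sketch below reuses only the flags shown above; the system_info values are copied from the example and would normally be collected from the machine (for example via nvidia-smi), not hard-coded.

# Sketch: drive trt-test-db from Python, building the --match-exact JSON from a dict.
import json
import subprocess

# Values copied from the example above; collect them from the real system in practice.
system_info = {
    "gpu": "A10",
    "compute_capability": "8.6",
    "supports_fp8": False,
    "supports_int8": True,
    "system_gpu_count": "1",
    "linux_distribution_name": "ubuntu",
    "is_linux": True,
}

cmd = [
    "trt-test-db",
    "-d", "/TensorRT-LLM/src/tests/integration/test_lists/test-db",
    "--context", "l0_e2e",
    "--test-names",
    "--output", "/TensorRT-LLM/src/l0_e2e.txt",
    "--match-exact", json.dumps(system_info),
]
subprocess.run(cmd, check=True)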

Running Tests

Execute the tests using pytest with the generated test list:

pytest -v --test-list=/TensorRT-LLM/src/l0_e2e.txt --output-dir=/tmp/logs

This command runs the tests specified in the test list and outputs the results to the specified directory.

Additional Information

  • The --context parameter in the trt-test-db command specifies which context (a top-level key in the YAML files, such as l0_e2e) to search for.
  • The --match-exact parameter provides system information used to filter tests based on the conditions defined in the YAML files.
  • Modify the YAML files to add or update test conditions and test cases as needed.
  • For more detailed information on trt-test-db and pytest usage, refer to their respective documentation.