TensorRT-LLM/tests/integration/defs/perf
| File | Last change | Date |
| --- | --- | --- |
| __init__.py | Update (#2978) | 2025-03-23 |
| allowed_configs.py | Update (#2978) | 2025-03-23 |
| build.py | Update (#2978) | 2025-03-23 |
| data_export.py | Update (#2978) | 2025-03-23 |
| data.py | Update (#2978) | 2025-03-23 |
| gpu_clock_lock.py | Update (#2978) | 2025-03-23 |
| misc.py | Update (#2978) | 2025-03-23 |
| model_yaml_config.py | test: fix for perf test script issue (#4230) | 2025-05-13 |
| README.md | Update (#2978) | 2025-03-23 |
| sanity_perf_check.py | chore: Remove deprecated Python runtime benchmark (#4171) | 2025-05-14 |
| session_data_writer.py | Update (#2978) | 2025-03-23 |
| test_perf.py | [TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092) | 2025-05-14 |
| utils.py | test: add llama_3.2_1B model and fix for test lora script issue (#4139) | 2025-05-12 |

Sanity Perf Check Introduction

Background

The sanity perf check mechanism detects performance regressions in L0 testing. We maintain base_perf.csv, which holds the performance baselines for several models, and run sanity_perf_check.py to compare newly measured results against those baselines.
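The core of such a check can be sketched as a comparison of current metrics against the baseline CSV with a relative tolerance. The column names (`perf_case_name`, `perf_metric`), the 10% threshold, and the `check_perf` helper below are illustrative assumptions, not the actual sanity_perf_check.py implementation:

```python
import csv
import io

# Hypothetical relative tolerance: flag results more than 10% slower than baseline.
THRESHOLD = 0.10

def check_perf(baseline_csv: str, current: dict) -> list:
    """Compare current metrics (e.g. latency in ms, lower is better) to baseline rows.

    Returns a list of human-readable failure messages; an empty list means pass.
    """
    failures = []
    for row in csv.DictReader(io.StringIO(baseline_csv)):
        name = row["perf_case_name"]
        baseline = float(row["perf_metric"])
        if name not in current:
            # A baseline entry with no fresh measurement is itself suspicious.
            failures.append(f"{name}: missing from current results")
            continue
        if current[name] > baseline * (1 + THRESHOLD):
            failures.append(
                f"{name}: {current[name]:.1f} exceeds baseline "
                f"{baseline:.1f} by more than {THRESHOLD:.0%}")
    return failures

baseline = "perf_case_name,perf_metric\nllama_7b,120.0\ngpt_350m,35.0\n"
print(check_perf(baseline, {"llama_7b": 140.0, "gpt_350m": 34.0}))
```

In this sketch a regression simply produces a non-empty failure list, which a CI step can turn into a pipeline failure.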

Usage

There are four typical scenarios for the sanity perf check feature:

  1. The MR doesn't affect model performance: the perf check passes without exception.
  2. The MR adds a new model to the perf model list: the sanity check raises an exception, and the MR author needs to add the new model's baseline to base_perf.csv.
  3. The MR improves the performance of existing models: the MR author needs to refresh base_perf.csv with the new baselines.
  4. The MR introduces a performance regression: the MR author needs to fix the issue and rerun the pipeline.
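For scenarios 2 and 3 above, refreshing the baseline amounts to rewriting the affected rows of base_perf.csv with the newly measured numbers and appending rows for models that are new to the list. A minimal sketch, again using hypothetical column names rather than the real base_perf.csv schema:

```python
import csv
import io

def refresh_baseline(baseline_csv: str, new_results: dict) -> str:
    """Return a baseline CSV with metrics replaced by new measurements.

    Existing models get their perf_metric overwritten (scenario 3);
    models absent from the baseline get a fresh row appended (scenario 2).
    """
    rows = list(csv.DictReader(io.StringIO(baseline_csv)))
    seen = {row["perf_case_name"] for row in rows}
    for row in rows:
        name = row["perf_case_name"]
        if name in new_results:
            row["perf_metric"] = f"{new_results[name]:.1f}"
    for name, metric in new_results.items():
        if name not in seen:
            rows.append({"perf_case_name": name, "perf_metric": f"{metric:.1f}"})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["perf_case_name", "perf_metric"])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Regenerating the file this way keeps the update mechanical and reviewable: the MR diff on base_perf.csv shows exactly which baselines changed and by how much.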