mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

ruodil 793d0102d6 waive failed case in perf test, change default max_batch_size to 512 and write config.json to output log (#3656 ) Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com> Signed-off-by: Larry <197874197+LarryXFly@users.noreply.github.com> Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>		2025-04-22 14:53:21 +08:00
__init__.py	Update (#2978 )	2025-03-23 16:39:35 +08:00
allowed_configs.py	Update (#2978 )	2025-03-23 16:39:35 +08:00
build.py	Update (#2978 )	2025-03-23 16:39:35 +08:00
data_export.py	Update (#2978 )	2025-03-23 16:39:35 +08:00
data.py	Update (#2978 )	2025-03-23 16:39:35 +08:00
gpu_clock_lock.py	Update (#2978 )	2025-03-23 16:39:35 +08:00
misc.py	Update (#2978 )	2025-03-23 16:39:35 +08:00
model_yaml_config.py	tests: change qa perf test to trtllm-bench (#3619 )	2025-04-17 13:58:38 +08:00
README.md	Update (#2978 )	2025-03-23 16:39:35 +08:00
sanity_perf_check.py	Update (#2978 )	2025-03-23 16:39:35 +08:00
session_data_writer.py	Update (#2978 )	2025-03-23 16:39:35 +08:00
test_perf.py	waive failed case in perf test, change default max_batch_size to 512 and write config.json to output log (#3656 )	2025-04-22 14:53:21 +08:00
utils.py	waive failed case in perf test, change default max_batch_size to 512 and write config.json to output log (#3656 )	2025-04-22 14:53:21 +08:00

README.md

Sanity Perf Check Introduction

Background

The sanity perf check is the perf regression detection mechanism for L0 testing. base_perf.csv holds the perf baselines for several models, and sanity_perf_check.py uses it to detect perf regressions.

Usage

There are four typical scenarios for the sanity perf check feature:

1. The new MR does not affect the models' perf. The perf check passes without raising an exception.
2. The new MR adds a model to the perf model list. The sanity check raises an exception, and the MR author needs to add that model's perf to base_perf.csv.
3. The new MR improves the perf of existing models. The MR author needs to refresh base_perf.csv with the new baseline.
4. The new MR introduces a perf regression. The MR author needs to fix the issue and rerun the pipeline.
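
The four scenarios can be illustrated with a minimal Python sketch of a baseline comparison. This is not the logic of sanity_perf_check.py: the column names `model` and `throughput`, the 10% tolerance, the assumption that higher values are better, and the helper names `load_perf` and `classify` are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical relative tolerance; the real check may use a different rule.
TOLERANCE = 0.1


def load_perf(csv_text):
    """Parse CSV text with the assumed columns `model` and `throughput`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["model"]: float(row["throughput"]) for row in reader}


def classify(baseline, current, tolerance=TOLERANCE):
    """Map each measured model to one of the four scenarios.

    Assumes a higher-is-better metric such as throughput.
    """
    results = {}
    for model, value in current.items():
        if model not in baseline:
            # Scenario 2: new model, no baseline entry yet.
            results[model] = "missing_baseline"
            continue
        change = (value - baseline[model]) / baseline[model]
        if change < -tolerance:
            # Scenario 4: perf dropped beyond the tolerance.
            results[model] = "regression"
        elif change > tolerance:
            # Scenario 3: perf improved, so the baseline should be refreshed.
            results[model] = "improvement"
        else:
            # Scenario 1: perf is within the tolerance of the baseline.
            results[model] = "pass"
    return results


# Made-up numbers, for illustration only.
BASELINE_CSV = "model,throughput\nmodel_a,100.0\nmodel_b,200.0\nmodel_c,50.0\n"
CURRENT_CSV = (
    "model,throughput\n"
    "model_a,102.0\nmodel_b,150.0\nmodel_c,60.0\nmodel_d,75.0\n"
)

results = classify(load_perf(BASELINE_CSV), load_perf(CURRENT_CSV))
for model, status in sorted(results.items()):
    print(f"{model}: {status}")
```

In this sketch every status other than `pass` would require action from the MR author: adding a baseline entry, refreshing the baseline, or fixing the regression.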