TensorRT-LLMs/tests/integration/defs/perf

Sanity Perf Check Introduction

Background

The sanity perf check mechanism detects perf regressions in L0 testing. We maintain base_perf.csv, which contains the perf baselines for several models, and use sanity_perf_check.py to compare new results against those baselines and flag regressions.
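To illustrate the idea, the core of such a check can be sketched as below. This is not the actual implementation in sanity_perf_check.py; the column names (`perf_case_name`, `perf_metric`) and the 10% regression threshold are hypothetical placeholders:

```python
import csv

# Hypothetical threshold: fail if perf is >10% worse than the baseline.
REGRESSION_THRESHOLD = 0.10


def load_baseline(path):
    """Map case name -> baseline metric from a base_perf.csv-style file.

    Assumes a hypothetical two-column schema; the real file may differ.
    """
    with open(path, newline="") as f:
        return {row["perf_case_name"]: float(row["perf_metric"])
                for row in csv.DictReader(f)}


def check_perf(baseline, results, lower_is_better=True):
    """Return (new cases, regressed cases) for `results` vs. `baseline`."""
    new_cases, regressions = [], []
    for case, value in results.items():
        if case not in baseline:
            # No baseline entry yet: the author must add one to base_perf.csv.
            new_cases.append(case)
            continue
        base = baseline[case]
        delta = (value - base) / base if lower_is_better else (base - value) / base
        if delta > REGRESSION_THRESHOLD:
            # Perf is worse than the baseline beyond the allowed threshold.
            regressions.append(case)
    return new_cases, regressions
```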

Usage

There are four typical scenarios for the sanity perf check feature.

  1. The MR does not impact the models' perf: the perf check passes without raising an exception.
  2. The MR adds a new model to the perf model list: the sanity check raises an exception, and the MR author needs to add the new model's perf baseline to base_perf.csv.
  3. The MR improves an existing model's perf: the MR author needs to refresh base_perf.csv with the new baseline.
  4. The MR introduces a perf regression: the MR author needs to fix the issue and rerun the pipeline.