TensorRT-LLM/tests/integration/defs/perf
| File | Last change | Date |
| --- | --- | --- |
| __init__.py | Update (#2978) | 2025-03-23 |
| allowed_configs.py | Update (#2978) | 2025-03-23 |
| build.py | Update (#2978) | 2025-03-23 |
| data_export.py | Update (#2978) | 2025-03-23 |
| data.py | Update (#2978) | 2025-03-23 |
| gpu_clock_lock.py | Update (#2978) | 2025-03-23 |
| misc.py | Update (#2978) | 2025-03-23 |
| model_yaml_config.py | test: fix for perf test script issue (#4230) | 2025-05-13 |
| README.md | Update (#2978) | 2025-03-23 |
| sanity_perf_check.py | chore: Remove deprecated Python runtime benchmark (#4171) | 2025-05-14 |
| session_data_writer.py | Update (#2978) | 2025-03-23 |
| test_perf.py | [TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092) | 2025-05-14 |
| utils.py | test: add llama_3.2_1B model and fix for test lora script issue (#4139) | 2025-05-12 |

Sanity Perf Check Introduction

Background

The sanity perf check mechanism detects performance regressions in L0 testing. We maintain base_perf.csv, which holds the performance baselines for several models, and run sanity_perf_check.py to compare newly measured results against those baselines.
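The core of such a check can be sketched as a comparison of current metrics against the baseline CSV with a relative tolerance. The column names (`perf_case_name`, `perf_metric`), the 10% threshold, and the `check_perf` helper below are illustrative assumptions, not the actual sanity_perf_check.py implementation:

```python
import csv
import io

# Hypothetical relative tolerance: flag results more than 10% slower than baseline.
THRESHOLD = 0.10

def check_perf(baseline_csv: str, current: dict) -> list:
    """Compare current metrics (e.g. latency in ms, lower is better) to baseline rows.

    Returns a list of human-readable failure messages; an empty list means pass.
    """
    failures = []
    for row in csv.DictReader(io.StringIO(baseline_csv)):
        name = row["perf_case_name"]
        baseline = float(row["perf_metric"])
        if name not in current:
            # A baseline entry with no fresh measurement is itself suspicious.
            failures.append(f"{name}: missing from current results")
            continue
        if current[name] > baseline * (1 + THRESHOLD):
            failures.append(
                f"{name}: {current[name]:.1f} exceeds baseline "
                f"{baseline:.1f} by more than {THRESHOLD:.0%}")
    return failures

baseline = "perf_case_name,perf_metric\nllama_7b,120.0\ngpt_350m,35.0\n"
print(check_perf(baseline, {"llama_7b": 140.0, "gpt_350m": 34.0}))
```

In this sketch a regression simply produces a non-empty failure list, which a CI step can turn into a pipeline failure.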

Usage

There are four typical scenarios for the sanity perf check feature:

  1. The MR doesn't affect model performance: the perf check passes without exception.
  2. The MR adds a new model to the perf model list: the sanity check raises an exception, and the MR author needs to add the new model's baseline to base_perf.csv.
  3. The MR improves the performance of existing models: the MR author needs to refresh base_perf.csv with the new baselines.
  4. The MR introduces a performance regression: the MR author needs to fix the issue and rerun the pipeline.
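For scenarios 2 and 3 above, refreshing the baseline amounts to rewriting the affected rows of base_perf.csv with the newly measured numbers and appending rows for models that are new to the list. A minimal sketch, again using hypothetical column names rather than the real base_perf.csv schema:

```python
import csv
import io

def refresh_baseline(baseline_csv: str, new_results: dict) -> str:
    """Return a baseline CSV with metrics replaced by new measurements.

    Existing models get their perf_metric overwritten (scenario 3);
    models absent from the baseline get a fresh row appended (scenario 2).
    """
    rows = list(csv.DictReader(io.StringIO(baseline_csv)))
    seen = {row["perf_case_name"] for row in rows}
    for row in rows:
        name = row["perf_case_name"]
        if name in new_results:
            row["perf_metric"] = f"{new_results[name]:.1f}"
    for name, metric in new_results.items():
        if name not in seen:
            rows.append({"perf_case_name": name, "perf_metric": f"{metric:.1f}"})
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["perf_case_name", "perf_metric"])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Regenerating the file this way keeps the update mechanical and reviewable: the MR diff on base_perf.csv shows exactly which baselines changed and by how much.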