Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-01-14 06:27:45 +08:00
1.8 KiB
| network_name | perf_case_name | test_name | threshold | absolute_threshold | metric_type | perf_metric |
|---|---|---|---|---|---|---|
| llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192 | H100_PCIe-TensorRT-Perf-1/perf/test_perf.py::test_perf_metric_build_time[llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192] | test_perf_metric_build_time[llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192] | 0.1 | 30 | BUILD_TIME | 143.5976 |
| llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192 | H100_PCIe-TensorRT-Perf-1/perf/test_perf.py::test_perf_metric_inference_time[llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192] | test_perf_metric_inference_time[llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192] | 0.1 | 50 | INFERENCE_TIME | 106778.60992 |
| llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192 | H100_PCIe-TensorRT-Perf-1/perf/test_perf.py::test_perf_metric_seq_throughput[llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192] | test_perf_metric_seq_throughput[llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192] | -0.1 | 10 | SEQ_THROUGHPUT | 76.72174 |
| llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192 | H100_PCIe-TensorRT-Perf-1/perf/test_perf.py::test_perf_metric_token_throughput[llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192] | test_perf_metric_token_throughput[llama_v3.1_8b_instruct-bench-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192] | -0.1 | 10 | TOKEN_THROUGHPUT | 9820.38162 |
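The sign convention in the table suggests the direction of regression: time metrics (BUILD_TIME, INFERENCE_TIME) carry a positive `threshold`, while throughput metrics (SEQ_THROUGHPUT, TOKEN_THROUGHPUT) carry a negative one. Below is a minimal, hypothetical sketch of how such a baseline row might gate a perf regression; the function name and the exact tolerance semantics (relative `threshold` on the baseline `perf_metric`, with `absolute_threshold` as a minimum absolute slack) are assumptions, not the repository's actual check:

```python
# Hypothetical gate for one perf-baseline row. Assumptions (not from the
# source table): `threshold` is a relative tolerance on the baseline
# `perf_metric`, its sign encodes the regression direction, and
# `absolute_threshold` is a minimum absolute slack in the metric's units.

def within_baseline(measured: float, baseline: float,
                    threshold: float, absolute_threshold: float) -> bool:
    """Return True if `measured` stays within tolerance of `baseline`."""
    # Allowed drift: the larger of the relative and absolute slack.
    slack = max(abs(threshold) * baseline, absolute_threshold)
    if threshold >= 0:
        # Time-like metric: a regression means the value grew.
        return measured <= baseline + slack
    # Throughput-like metric: a regression means the value shrank.
    return measured >= baseline - slack

# Against the BUILD_TIME row (baseline 143.5976, threshold 0.1,
# absolute_threshold 30): slack = max(14.36, 30) = 30.
print(within_baseline(150.0, 143.5976, 0.1, 30))   # True: within slack
print(within_baseline(200.0, 143.5976, 0.1, 30))   # False: exceeds slack
```

Taking the maximum of the relative and absolute slack keeps small baselines (like the ~143 s build time) from tripping on noise while still scaling the tolerance for large baselines (like the ~106 778 ms inference time).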