TensorRT-LLMs/tests/integration/defs/perf/base_perf_pytorch.csv
Eran Geva c5d5af9e7f
[#8391][chore] removed llama and added deepseek to AutoDeploy's L0 perf test (#10585)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-01-11 16:31:24 -05:00

network_name,perf_case_name,test_name,threshold,absolute_threshold,metric_type,perf_metric,device_subtype
"llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192","H100_PCIe-PyTorch-Perf-1/perf/test_perf.py::test_perf_metric_inference_time[llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192]","test_perf_metric_inference_time[llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192]",0.20,5000,INFERENCE_TIME,109007.96
"llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192","H100_PCIe-PyTorch-Perf-1/perf/test_perf.py::test_perf_metric_seq_throughput[llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192]","test_perf_metric_seq_throughput[llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192]",-0.20,5,SEQ_THROUGHPUT,76.45
"llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192","H100_PCIe-PyTorch-Perf-1/perf/test_perf.py::test_perf_metric_token_throughput[llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192]","test_perf_metric_token_throughput[llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192]",-0.20,500,TOKEN_THROUGHPUT,9785.75
"llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192","H100_PCIe-PyTorch-Perf-1/perf/test_perf.py::test_perf_metric_kv_cache_size[llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192]","test_perf_metric_kv_cache_size[llama_v3.1_8b_instruct-bench-pytorch-float16-maxbs:512-maxnt:2048-input_output_len:128,128-reqs:8192]",0.20,2,KV_CACHE_SIZE,55.64
"deepseek_r1_distill_qwen_32b-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024","H100_PCIe-PyTorch-Perf-1/perf/test_perf.py::test_perf_metric_inference_time[deepseek_r1_distill_qwen_32b-subtype:H100_PCIe-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024]","test_perf_metric_inference_time[deepseek_r1_distill_qwen_32b-subtype:H100_PCIe-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024]",0.1,50,INFERENCE_TIME,1359184.5059,H100_PCIe
"deepseek_r1_distill_qwen_32b-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024","H100_PCIe-PyTorch-Perf-1/perf/test_perf.py::test_perf_metric_kv_cache_size[deepseek_r1_distill_qwen_32b-subtype:H100_PCIe-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024]","test_perf_metric_kv_cache_size[deepseek_r1_distill_qwen_32b-subtype:H100_PCIe-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024]",-0.1,50,KV_CACHE_SIZE,10.92,H100_PCIe
"deepseek_r1_distill_qwen_32b-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024","H100_PCIe-PyTorch-Perf-1/perf/test_perf.py::test_perf_metric_seq_throughput[deepseek_r1_distill_qwen_32b-subtype:H100_PCIe-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024]","test_perf_metric_seq_throughput[deepseek_r1_distill_qwen_32b-subtype:H100_PCIe-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024]",-0.1,10,SEQ_THROUGHPUT,0.3767,H100_PCIe
"deepseek_r1_distill_qwen_32b-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024","H100_PCIe-PyTorch-Perf-1/perf/test_perf.py::test_perf_metric_token_throughput[deepseek_r1_distill_qwen_32b-subtype:H100_PCIe-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024]","test_perf_metric_token_throughput[deepseek_r1_distill_qwen_32b-subtype:H100_PCIe-bench-_autodeploy-float16-maxbs:512-maxnt:2048-input_output_len:1024,1024]",-0.1,10,TOKEN_THROUGHPUT,385.7372,H100_PCIe
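
A baseline file with these eight columns can be read with Python's standard `csv` module. The sketch below is illustrative only: the shortened names in `SAMPLE` (`model_a`, `case_a`, `test_a`, and so on) are hypothetical stand-ins for the long test IDs, and `load_baselines` and `is_regression` are hypothetical helpers, not functions from `test_perf.py`. The comparison rule in `is_regression` is an assumption about how `threshold` and `absolute_threshold` might be combined; the source does not state the actual rule.

```python
import csv
import io

# Two rows in the same column layout as the baseline file. The names are
# shortened, hypothetical stand-ins; the numeric values mirror the llama rows.
SAMPLE = '''network_name,perf_case_name,test_name,threshold,absolute_threshold,metric_type,perf_metric,device_subtype
"model_a-input_output_len:128,128","case_a","test_a",0.20,5000,INFERENCE_TIME,109007.96,
"model_a-input_output_len:128,128","case_b","test_b",-0.20,5,SEQ_THROUGHPUT,76.45,
'''


def load_baselines(text):
    """Parse baseline CSV text into a dict keyed by (network_name, metric_type)."""
    baselines = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["network_name"], row["metric_type"])
        baselines[key] = {
            "threshold": float(row["threshold"]),
            "absolute_threshold": float(row["absolute_threshold"]),
            "perf_metric": float(row["perf_metric"]),
            # The column may be empty or missing on older rows.
            "device_subtype": row.get("device_subtype") or "",
        }
    return baselines


def is_regression(baseline, measured):
    """ASSUMED rule, not taken from the source.

    A positive threshold is read as "higher is worse" (e.g. time), a negative
    one as "lower is worse" (e.g. throughput). A regression is flagged only
    when both the relative and the absolute margins are exceeded.
    """
    ref = baseline["perf_metric"]
    relative_change = (measured - ref) / ref
    threshold = baseline["threshold"]
    absolute_threshold = baseline["absolute_threshold"]
    if threshold >= 0:
        return relative_change > threshold and (measured - ref) > absolute_threshold
    return relative_change < threshold and (ref - measured) > absolute_threshold


baselines = load_baselines(SAMPLE)
time_baseline = baselines[("model_a-input_output_len:128,128", "INFERENCE_TIME")]
print(is_regression(time_baseline, 140000.0))
```

The network names contain commas (`input_output_len:128,128`), so those fields have to be quoted in the CSV; `csv.DictReader` handles the quoting without extra configuration.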