TensorRT-LLMs/tests/integration/test_lists/qa
Venky d15ceae62e
test(perf): Extend the Llama-Nemotron-Nano-8B perf-integration-tests (pyt) (#4407)
* extend pyt nano tests perf coverage

Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>

* explicitly set maxnt for some cases

This is because the test harness defaults to no prefill chunking, which means the ISL specified is the true context length.
When left unspecified in the test harness, the `maxnt` passed down to `trtllm-bench` defaults to 2048.
This means `trtllm-bench` receives conflicting inputs whenever isl > 2048 but maxnt = 2048; hence `maxnt` is overridden to be consistent with the ISL for such cases.

Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>

---------

Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
2025-05-23 08:44:37 +08:00
.gitignore Update (#2978) 2025-03-23 16:39:35 +08:00
examples_test_list.txt tests: update api change from decoder to sampler in test (#4479) 2025-05-21 14:22:18 +08:00
llm_multinodes_function_test.txt test: FIX test_ptp_quickstart_advanced_deepseek_v3_2nodes_8gpus (#4283) 2025-05-15 15:57:44 +08:00
llm_release_gb20x.txt test: add qa test list for rtx5090 and rtx_pro_6000 (#4254) 2025-05-15 17:57:31 +08:00
llm_release_perf_multinode_test.txt chore: Mass integration of release/0.18 (#3421) 2025-04-16 10:03:29 +08:00
llm_release_rtx_pro_6000.txt [TRTLLM-4618][feat] Fix cutlass MoE GEMM fallback failure on FP8 + add e2e test for Mixtral 8x7B FP8 on RTX6000 Pro (SM120) (#4335) 2025-05-19 08:56:21 -07:00
llm_sanity_test.txt [TRTLLM-4932] Add CLI accuracy tests for Llama-3.3-70B-Instruct and LLM API BF16 variant (#4362) 2025-05-20 09:48:14 +08:00
trt_llm_integration_perf_sanity_test.yml [TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092) 2025-05-14 23:10:04 +02:00
trt_llm_integration_perf_test.yml [TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092) 2025-05-14 23:10:04 +02:00
trt_llm_release_perf_cluster_test.yml tests: skip writing prepare_dataset output to logs, and add llama_v3.1_8b_fp8, llama_v3.3_70b_fp8, llama_v3.1_405b_fp4 models (#3864) 2025-05-07 13:56:35 +08:00
trt_llm_release_perf_sanity_test.yml test: update test filter in perf test yml file to select cases by gpu name and add cases for RTX 6000 pro (#4282) 2025-05-20 10:58:05 +08:00
trt_llm_release_perf_test.yml test(perf): Extend the Llama-Nemotron-Nano-8B perf-integration-tests (pyt) (#4407) 2025-05-23 08:44:37 +08:00