TensorRT-LLMs/tests/integration/test_lists
Venky d15ceae62e
test(perf): Extend the Llama-Nemotron-Nano-8B perf-integration-tests (pyt) (#4407)
* extend pyt nano tests perf coverage

Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>

* explicitly set maxnt for some cases

This is because the test harness defaults to no prefill chunking, which means the isl specified is the true context length.
When left unspecified in the test harness, the `maxnt` passed down to `trtllm-bench` is 2048.
As a result, `trtllm-bench` receives conflicting inputs whenever isl > 2048 but maxnt = 2048; maxnt is therefore overridden to match the isl for such cases.
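The override logic amounts to the small rule sketched below (a minimal sketch, not the harness code; the `resolve_maxnt` helper and constant name are hypothetical, only the 2048 default and the isl/maxnt relationship come from the note above):

```python
# Hypothetical helper illustrating the maxnt override described above.
HARNESS_DEFAULT_MAXNT = 2048  # default passed to trtllm-bench when maxnt is unspecified


def resolve_maxnt(isl: int, maxnt: int | None = None) -> int:
    """Return a maxnt value that does not conflict with the requested isl."""
    if maxnt is None:
        maxnt = HARNESS_DEFAULT_MAXNT
    # With prefill chunking disabled, the whole context (isl tokens) must fit
    # into one step, so maxnt < isl would be a conflicting configuration.
    return max(maxnt, isl)


# Example: isl=5000 with the default maxnt=2048 would conflict, so the
# override raises maxnt to 5000; shorter isl keeps the 2048 default.
assert resolve_maxnt(5000) == 5000
assert resolve_maxnt(1024) == 2048
```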

Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>

---------

Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
2025-05-23 08:44:37 +08:00
dev        | Update (#2978) | 2025-03-23 16:39:35 +08:00
qa         | test(perf): Extend the Llama-Nemotron-Nano-8B perf-integration-tests (pyt) (#4407) | 2025-05-23 08:44:37 +08:00
test-db    | [TRTLLM-4618][feat] Fix cutlass MoE GEMM fallback failure on FP8 + add e2e test for Mixtral 8x7B FP8 on RTX6000 Pro (SM120) (#4335) | 2025-05-19 08:56:21 -07:00
waives.txt | [5234029][5226211] chore: Unwaive multimodal tests for Qwen model. (#4519) | 2025-05-23 08:04:56 +08:00