TensorRT-LLMs/tests/integration/test_lists/qa
Venky d15ceae62e
test(perf): Extend the Llama-Nemotron-Nano-8B perf-integration-tests (pyt) (#4407)
* extend pyt nano tests perf coverage

Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>

* explicitly set maxnt for some cases

This is because the test harness defaults to no prefill chunking, which means the ISL specified is the true context length.
When left unspecified in the test harness, the `maxnt` passed down to `trtllm-bench` defaults to 2048.
This means `trtllm-bench` receives conflicting inputs whenever isl > 2048 but maxnt = 2048; hence `maxnt` is overridden to be consistent with the ISL for such cases.

Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>

---------

Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
2025-05-23 08:44:37 +08:00
.gitignore Update (#2978) 2025-03-23 16:39:35 +08:00
examples_test_list.txt tests: update api change from decoder to sampler in test (#4479) 2025-05-21 14:22:18 +08:00
llm_multinodes_function_test.txt test: FIX test_ptp_quickstart_advanced_deepseek_v3_2nodes_8gpus (#4283) 2025-05-15 15:57:44 +08:00
llm_release_gb20x.txt test: add qa test list for rtx5090 and rtx_pro_6000 (#4254) 2025-05-15 17:57:31 +08:00
llm_release_perf_multinode_test.txt chore: Mass integration of release/0.18 (#3421) 2025-04-16 10:03:29 +08:00
llm_release_rtx_pro_6000.txt [TRTLLM-4618][feat] Fix cutlass MoE GEMM fallback failure on FP8 + add e2e test for Mixtral 8x7B FP8 on RTX6000 Pro (SM120) (#4335) 2025-05-19 08:56:21 -07:00
llm_sanity_test.txt [TRTLLM-4932] Add CLI accuracy tests for Llama-3.3-70B-Instruct and LLM API BF16 variant (#4362) 2025-05-20 09:48:14 +08:00
trt_llm_integration_perf_sanity_test.yml [TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092) 2025-05-14 23:10:04 +02:00
trt_llm_integration_perf_test.yml [TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092) 2025-05-14 23:10:04 +02:00
trt_llm_release_perf_cluster_test.yml tests: skip writing prepare_dataset output to logs, and add llama_v3.1_8b_fp8, llama_v3.3_70b_fp8, llama_v3.1_405b_fp4 models (#3864) 2025-05-07 13:56:35 +08:00
trt_llm_release_perf_sanity_test.yml test: update test filter in perf test yml file to select cases by gpu name and add cases for RTX 6000 pro (#4282) 2025-05-20 10:58:05 +08:00
trt_llm_release_perf_test.yml test(perf): Extend the Llama-Nemotron-Nano-8B perf-integration-tests (pyt) (#4407) 2025-05-23 08:44:37 +08:00