TensorRT-LLM/tests/integration/test_lists/qa
Latest commit: 0a8461d54c by Venky, 2025-05-21 10:46:48 -07:00
    test(perf): Pt.2 Add Llama-3_3-Nemotron-Super-49B-v1 integration-perf-tests (cpp) (#4499)
    add low concurrency perf tests
    Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
File listing (name, last commit, commit date):

.gitignore
    Update (#2978), 2025-03-23 16:39:35 +08:00
examples_test_list.txt
    tests: add qwene fp4 tests into QA test list & update sanity test list (#4478), 2025-05-21 16:52:02 +08:00
llm_multinodes_function_test.txt
    tests: add llama 3.3 70b 2 nodes tests (#4391), 2025-05-21 12:42:45 +08:00
llm_release_gb20x.txt
    test: add qa test list for rtx5090 and rtx_pro_6000 (#4254), 2025-05-15 17:57:31 +08:00
llm_release_perf_multinode_test.txt
    chore: Mass integration of release/0.18 (#3421), 2025-04-16 10:03:29 +08:00
llm_release_rtx_pro_6000.txt
    [TRTLLM-4618][feat] Fix cutlass MoE GEMM fallback failure on FP8 + add e2e test for Mixtral 8x7B FP8 on RTX6000 Pro (SM120) (#4335), 2025-05-19 08:56:21 -07:00
llm_sanity_test.txt
    tests: add qwene fp4 tests into QA test list & update sanity test list (#4478), 2025-05-21 16:52:02 +08:00
trt_llm_integration_perf_sanity_test.yml
    [TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092), 2025-05-14 23:10:04 +02:00
trt_llm_integration_perf_test.yml
    [TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092), 2025-05-14 23:10:04 +02:00
trt_llm_release_perf_cluster_test.yml
    tests: skip writing prepare_dataset output to logs, and add llama_v3.1_8b_fp8, llama_v3.3_70b_fp8, llama_v3.1_405b_fp4 models (#3864), 2025-05-07 13:56:35 +08:00
trt_llm_release_perf_sanity_test.yml
    test: update test filter in perf test yml file to select cases by gpu name and add cases for RTX 6000 pro (#4282), 2025-05-20 10:58:05 +08:00
trt_llm_release_perf_test.yml
    test(perf): Pt.2 Add Llama-3_3-Nemotron-Super-49B-v1 integration-perf-tests (cpp) (#4499), 2025-05-21 10:46:48 -07:00