TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-17 16:25:05 +08:00


Faraz 7656af1b57	2025-05-19 08:56:21 -07:00
[TRTLLM-4618][feat] Fix cutlass MoE GEMM fallback failure on FP8 + add e2e test for Mixtral 8x7B FP8 on RTX6000 Pro (SM120) (#4335)
* add mixtral7x8b fp8 test with fixed cutlass fp8 moe gemm
* update cutlass versions
* added internal cutlass with fix and docker update
* added mixtral to pro 6000
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
.gitignore	Update (#2978)	2025-03-23 16:39:35 +08:00
examples_test_list.txt	[https://nvbugs/5123103][fix] Fix torch compile for DeepSeekV3 (#3952)	2025-05-19 22:12:25 +08:00
llm_multinodes_function_test.txt	test: FIX test_ptp_quickstart_advanced_deepseek_v3_2nodes_8gpus (#4283)	2025-05-15 15:57:44 +08:00
llm_release_gb20x.txt	test: add qa test list for rtx5090 and rtx_pro_6000 (#4254)	2025-05-15 17:57:31 +08:00
llm_release_perf_multinode_test.txt	chore: Mass integration of release/0.18 (#3421)	2025-04-16 10:03:29 +08:00
llm_release_rtx_pro_6000.txt	[TRTLLM-4618][feat] Fix cutlass MoE GEMM fallback failure on FP8 + add e2e test for Mixtral 8x7B FP8 on RTX6000 Pro (SM120) (#4335)	2025-05-19 08:56:21 -07:00
llm_sanity_test.txt	[https://nvbugs/5123103][fix] Fix torch compile for DeepSeekV3 (#3952)	2025-05-19 22:12:25 +08:00
trt_llm_integration_perf_sanity_test.yml	[TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092)	2025-05-14 23:10:04 +02:00
trt_llm_integration_perf_test.yml	[TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092)	2025-05-14 23:10:04 +02:00
trt_llm_release_perf_cluster_test.yml	tests: skip writing prepare_dataset output to logs, and add llama_v3.1_8b_fp8, llama_v3.3_70b_fp8, llama_v3.1_405b_fp4 models (#3864)	2025-05-07 13:56:35 +08:00
trt_llm_release_perf_sanity_test.yml	test: fix for perf test script issue (#4230)	2025-05-13 10:29:20 +08:00
trt_llm_release_perf_test.yml	Extend the Llama-Nemotron-Nano-8B perf-integration-tests (cpp) (#4195)	2025-05-17 22:46:21 +08:00