From fb31f91e15860b9033582ab26e7e4d391cd2014c Mon Sep 17 00:00:00 2001
From: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Date: Fri, 9 May 2025 11:03:02 +0800
Subject: [PATCH] test: add qwen3 and disaggregated serving accuracy tests to qa test list (#4083)

Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
---
 tests/integration/test_lists/qa/examples_test_list.txt | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tests/integration/test_lists/qa/examples_test_list.txt b/tests/integration/test_lists/qa/examples_test_list.txt
index 0565af43c3..8ff4627a4c 100644
--- a/tests/integration/test_lists/qa/examples_test_list.txt
+++ b/tests/integration/test_lists/qa/examples_test_list.txt
@@ -445,6 +445,10 @@ accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_8gpus[tp8ep4---cuda
 accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_8gpus[tp8ep4-mtp_nextn=2--cuda_graph-overlap_scheduler]
 accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_8gpus[tp8ep8---cuda_graph-overlap_scheduler]
 accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_8gpus[tp8ep8-mtp_nextn=2--cuda_graph-overlap_scheduler]
+accuracy/test_llm_api_pytorch.py::TestQwen3_8B::test_fp8_block_scales[latency]
+accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B::test_fp8_block_scales[latency]
+accuracy/test_disaggregated_serving.py::TestLlama3_1_8B::test_auto_dtype[False]
+accuracy/test_disaggregated_serving.py::TestLlama3_1_8B::test_auto_dtype[True]
 test_e2e.py::test_benchmark_sanity[bert_base] # 127.18s
 test_e2e.py::test_benchmark_sanity[gpt_350m] # 64.06s