Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-14 06:27:45 +08:00)
test: add qwen3 and disaggregated serving accuracy tests to qa test list (#4083)
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
parent 5b61486d87
commit fb31f91e15
@@ -445,6 +445,10 @@ accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_8gpus[tp8ep4---cuda
 accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_8gpus[tp8ep4-mtp_nextn=2--cuda_graph-overlap_scheduler]
 accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_8gpus[tp8ep8---cuda_graph-overlap_scheduler]
 accuracy/test_llm_api_pytorch.py::TestDeepSeekR1::test_nvfp4_8gpus[tp8ep8-mtp_nextn=2--cuda_graph-overlap_scheduler]
+accuracy/test_llm_api_pytorch.py::TestQwen3_8B::test_fp8_block_scales[latency]
+accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B::test_fp8_block_scales[latency]
+accuracy/test_disaggregated_serving.py::TestLlama3_1_8B::test_auto_dtype[False]
+accuracy/test_disaggregated_serving.py::TestLlama3_1_8B::test_auto_dtype[True]
 
 test_e2e.py::test_benchmark_sanity[bert_base] # 127.18s
 test_e2e.py::test_benchmark_sanity[gpt_350m] # 64.06s