mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-02-05 02:31:33 +08:00)
[None][chore] unwaive qwen3 235B accuracy test (#10493)
Signed-off-by: linquanh <linquanh@nvidia.com>
parent bf7303c7f1
commit f91ea37a13
@@ -235,11 +235,11 @@ Qwen3/Qwen3-235B-A22B:
     accuracy: 86
   - quant_algo: NVFP4
     kv_cache_quant_algo: FP8
-    accuracy: 86
+    accuracy: 85.5
   - spec_dec_algo: Eagle
     quant_algo: NVFP4
     kv_cache_quant_algo: FP8
-    accuracy: 86
+    accuracy: 85.5
 Qwen3/Qwen3-Next-80B-A3B-Thinking:
   - accuracy: 86
 Qwen3/Qwen3-Next-80B-A3B-Instruct:
@@ -236,7 +236,6 @@ unittest/_torch/modeling/test_modeling_out_of_tree.py::TestOutOfTree::test_llm_a
 unittest/_torch/modeling/test_modeling_out_of_tree.py::TestOutOfTree::test_serve[True] SKIP (https://nvbugs/5739981)
 full:sm89/accuracy/test_disaggregated_serving.py::TestLlama3_1_8BInstruct::test_ctx_pp_gen_tp_asymmetric[MMLU-gen_tp=2-ctx_pp=2] SKIP (https://nvbugs/5596337)
 full:sm89/accuracy/test_disaggregated_serving.py::TestLlama3_1_8BInstruct::test_tp_pp_symmetric[MMLU-tp2pp2] SKIP (https://nvbugs/5596337)
-accuracy/test_llm_api_pytorch.py::TestQwen3_235B_A22B::test_nvfp4[latency_moe_trtllm] SKIP (https://nvbugs/5721672)
 accuracy/test_llm_api_pytorch.py::TestLlama3_1_8BInstruct::test_fp8_4gpus[tp4-fp8kv=True-attn_backend=FLASHINFER-torch_compile=True] SKIP (https://nvbugs/5741304)
 unittest/executor/test_rpc.py::TestRpcCorrectness::test_incremental_task_async SKIP (https://nvbugs/5741476)
 test_e2e.py::test_trtllm_bench_llmapi_launch[pytorch_backend-llama-v3-llama3-8b] SKIP (https://nvbugs/5744432)
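For readers unfamiliar with the waive list touched by the second hunk, each entry appears to follow the shape `<test id> SKIP (<bug link>)`, and unwaiving a test means deleting its line. The sketch below illustrates that reading only. The names `WaiveEntry`, `parse_waive_line`, and `unwaive` are hypothetical helpers, the treatment of `#` lines as comments is an assumption, and none of this is the parser the repository's test harness actually uses.

```python
import re
from typing import List, NamedTuple, Optional


class WaiveEntry(NamedTuple):
    """One waive-list line, split into its test id and optional reason."""
    test_id: str
    reason: Optional[str]


# Assumed line shape, inferred from the entries in this diff:
#   <test id> SKIP (<reason, typically a bug URL>)
_WAIVE_RE = re.compile(r"^(?P<test_id>.+?)\s+SKIP(?:\s+\((?P<reason>.*)\))?\s*$")


def parse_waive_line(line: str) -> Optional[WaiveEntry]:
    """Parse one line; return None for blank, comment, or unrecognized lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = _WAIVE_RE.match(line)
    if match is None:
        return None
    return WaiveEntry(match.group("test_id"), match.group("reason"))


def unwaive(lines: List[str], test_id: str) -> List[str]:
    """Return the lines with any waive entry for test_id removed."""
    kept = []
    for line in lines:
        entry = parse_waive_line(line)
        if entry is not None and entry.test_id == test_id:
            continue  # drop the waive entry so the test runs again
        kept.append(line)
    return kept
```

Calling `unwaive(lines, "accuracy/test_llm_api_pytorch.py::TestQwen3_235B_A22B::test_nvfp4[latency_moe_trtllm]")` on the seven lines of the hunk would leave the six remaining entries in their original order, matching the `-236,7 +236,6` line counts.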