[https://nvbugs/5747920][fix] Fix multimodal serve test (#11296)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
parent 8447a96c29
commit 36cb5f8c93
@@ -2,7 +2,7 @@
 aiperf profile \
     -m Qwen2.5-VL-3B-Instruct \
-    --tokenizer Qwen/Qwen2.5-VL-3B-Instruct \
+    --tokenizer ${AIPERF_TOKENIZER_PATH:-Qwen/Qwen2.5-VL-3B-Instruct} \
     --endpoint-type chat \
     --random-seed 123 \
     --image-width-mean 64 \
@@ -215,7 +215,6 @@
 full:sm89/accuracy/test_disaggregated_serving.py::TestLlama3_1_8BInstruct::test_tp_pp_symmetric[MMLU-tp2pp2] SKIP (https://nvbugs/5596337)
 unittest/executor/test_rpc.py::TestRpcCorrectness::test_incremental_task_async SKIP (https://nvbugs/5741476)
 test_e2e.py::test_trtllm_bench_llmapi_launch[pytorch_backend-llama-v3-llama3-8b] SKIP (https://nvbugs/5744432)
-test_e2e.py::test_trtllm_serve_multimodal_example SKIP (https://nvbugs/5747920)
 cpp/test_multi_gpu.py::TestDisagg::test_symmetric_executor[gpt-2proc-mpi_kvcache-90] SKIP (https://nvbugs/5755941)
 examples/test_granite.py::test_llm_granite[granite-3.0-1b-a400m-instruct-bfloat16] SKIP (https://nvbugs/5608979)
 examples/test_granite.py::test_llm_granite[granite-3.0-2b-instruct-bfloat16] SKIP (https://nvbugs/5608979)
@@ -61,12 +61,15 @@ def example_root():
 @pytest.mark.parametrize("exe, script",
                          [("python3", "openai_chat_client_for_multimodal.py"),
                           ("bash", "aiperf_client_for_multimodal.sh")])
-def test_trtllm_serve_examples(exe: str, script: str,
+def test_trtllm_serve_examples(exe: str, script: str, model_name: str,
                                server: RemoteOpenAIServer, example_root: str):
     client_script = os.path.join(example_root, script)
+    custom_env = os.environ.copy()
+    custom_env["AIPERF_TOKENIZER_PATH"] = get_model_path(model_name)
     # CalledProcessError will be raised if any errors occur
     subprocess.run([exe, client_script],
                    stdout=subprocess.PIPE,
                    stderr=subprocess.PIPE,
                    text=True,
-                   check=True)
+                   check=True,
+                   env=custom_env)