[Misc] Support local image encoding in benchmarks (#43843)

Signed-off-by: xiaoz <Sukra1@outlook.com>
2026-06-06 00:16:14 +00:00 · 2026-06-02 23:15:06 +08:00
parent 4d93bc35c9
commit 53fa09d085
3 changed files with 238 additions and 15 deletions
@@ -246,6 +246,12 @@ Every image listed in "image_files" is added to the request in the listed order

 The "image" shorthand accepts the same values as "image_files". The "image_url" field accepts either an OpenAI-style object with a "url" field or a URL string.

+By default, image references are sent to the serving endpoint as provided, with local image paths converted to `file://` URLs.
+
+If the benchmark client should load local and HTTP(S) images before sending requests, pass `--custom-ensure-client-side-data` to encode them as base64 data URLs on the client side.
+
+Existing `data:image/...` URLs are already self-contained and are kept unchanged.
+
 ```bash
 # need a model with vision capability here
 vllm serve Qwen/Qwen2-VL-7B-Instruct
@@ -253,13 +259,13 @@ vllm serve Qwen/Qwen2-VL-7B-Instruct

 ```bash
 # run benchmarking script
-vllm bench serve--save-result --save-detailed \
+vllm bench serve --save-result --save-detailed \
  --backend openai-chat \
  --model Qwen/Qwen2-VL-7B-Instruct \
  --endpoint /v1/chat/completions \
  --dataset-name custom_image \
  --dataset-path <path-to-your-image-data-jsonl> \
-  --allowed-local-media-path /path/to/image/folder
+  --custom-ensure-client-side-data
 ```

 Note that we need to use the `openai-chat` backend and `/v1/chat/completions` endpoint for multimodal inputs.
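
For reviewers, the three behaviors described in the added documentation (local paths become `file://` URLs by default, local images become base64 data URLs when client-side encoding is requested, and existing `data:image/...` URLs pass through unchanged) can be illustrated with a minimal sketch. This is not the code from this commit: `normalize_image_ref` and its `client_side` parameter are hypothetical names chosen for illustration, and the sketch deliberately leaves HTTP(S) references untouched rather than downloading them.

```python
import base64
import mimetypes
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def to_data_url(path: Path) -> str:
    """Read a local image and return it as a base64 data URL."""
    mime, _ = mimetypes.guess_type(path.name)
    mime = mime or "application/octet-stream"  # fallback for unknown extensions
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def normalize_image_ref(ref: str, client_side: bool = False) -> str:
    """Hypothetical helper mirroring the documented behavior (assumption)."""
    # Data URLs are already self-contained, so keep them unchanged.
    if ref.startswith("data:image/"):
        return ref
    # This sketch does not fetch remote images; it passes them through.
    if ref.startswith(("http://", "https://")):
        return ref
    # Accept either a file:// URL or a plain local path.
    if ref.startswith("file://"):
        path = Path(url2pathname(urlparse(ref).path))
    else:
        path = Path(ref)
    if client_side:
        return to_data_url(path)
    # Default: send a file:// URL that the server must be able to read.
    return path.resolve().as_uri()
```

With `client_side=False` the server has to be able to read the path itself; with `client_side=True` the request carries the image bytes, so the image does not need to be readable on the server.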