docs: fix tokenizer optimization typo (#44066)

Signed-off-by: chunyang.wen <chunyang.wen@gmail.com>
2026-06-06 00:16:14 +00:00 · 2026-06-05 17:12:49 +08:00
parent d98b8f371c
commit efc347f1b2
1 changed files with 1 additions and 1 deletions
@@ -296,7 +296,7 @@ llm = LLM(model="Qwen/Qwen3-8B")
 The `fastokens` Python package (>= 0.2.0) must be installed; if it isn't,
 vLLM raises a clear `ImportError` at tokenizer load. The override applies to
 any `--tokenizer-mode` that ends up loading an HF fast tokenizer (`hf`,
-`deepseek_v32`, `deepseek_v4`, `qwen_vl`, …). Modes that don't use the HF
+`deepseek_v32`, `deepseek_v4`, `qwen_vl`, …). Models that don't use the HF
 fast tokenizer (`mistral`, `grok2`, `kimi_audio`) ignore the flag.

 Tokenizer-bound workloads — long shared prefixes, bursty short prompts,