mirror of
https://github.com/vllm-project/vllm.git
synced 2026-06-06 00:16:14 +00:00
7493c51c55
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
NVIDIA Dynamo
NVIDIA Dynamo is an open-source framework for distributed LLM inference. It can run vLLM on Kubernetes with flexible serving architectures, for example aggregated or disaggregated serving, with an optional router and planner.
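To make the aggregated/disaggregated distinction concrete, here is a minimal conceptual sketch in Python. It is not Dynamo or vLLM code: `Request`, `prefill`, `decode`, `serve_aggregated`, and `serve_disaggregated` are hypothetical names, and the "KV cache" is just a list standing in for real model state. It assumes the usual meaning of the terms, where aggregated serving runs prompt processing (prefill) and token generation (decode) on the same worker, and disaggregated serving runs them on separate workers with the cache handed over in between.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical inference request (illustration only)."""
    prompt: str
    max_new_tokens: int


def prefill(prompt: str) -> list[str]:
    """Toy prefill: process the whole prompt once and return a stand-in KV cache."""
    return prompt.split()


def decode(kv_cache: list[str], max_new_tokens: int) -> list[str]:
    """Toy decode: emit placeholder tokens one at a time, extending the cache."""
    generated = []
    for i in range(max_new_tokens):
        token = f"tok{i}"
        kv_cache.append(token)
        generated.append(token)
    return generated


def serve_aggregated(req: Request) -> list[str]:
    """Aggregated: a single worker runs both prefill and decode."""
    kv_cache = prefill(req.prompt)
    return decode(kv_cache, req.max_new_tokens)


def serve_disaggregated(req: Request) -> list[str]:
    """Disaggregated: one worker prefills, another decodes.

    The list copy stands in for transferring the KV cache between workers.
    """
    kv_cache = prefill(req.prompt)   # would run on a prefill worker
    transferred = list(kv_cache)     # stand-in for the cache transfer
    return decode(transferred, req.max_new_tokens)  # would run on a decode worker


if __name__ == "__main__":
    req = Request(prompt="hello distributed inference", max_new_tokens=3)
    print(serve_aggregated(req))
    print(serve_disaggregated(req))
```

Both paths produce the same tokens; what changes is where each phase runs, which is why the two phases can be placed and scaled separately in a disaggregated deployment.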
For Kubernetes deployment instructions and examples (including vLLM), see the Deploying Dynamo on Kubernetes guide.
Background reading: InfoQ news coverage, "NVIDIA Dynamo simplifies Kubernetes deployment for LLM inference".