70 Commits

Author SHA1 Message Date
Nguyễn Thế Duy 3df1c7c43e [Docker] Non-root support for vllm-openai; add opt-in vllm-openai-nonroot target (#40275)
Signed-off-by: TheDuyIT <nduy250299@gmail.com>
Signed-off-by: dtnguyen <dtnguyen@nvidia.com>
Co-authored-by: Claude <noreply@anthropic.com>
2026-05-25 13:45:31 +08:00
Florian Woerner 997132911e [Doc] Fix typo in llm-d documentation link (#42397)
Signed-off-by: Florian Woerner <florian.woerner@onmyown.io>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2026-05-12 04:26:46 -07:00
Ethan Feng a43bc34baf [Docs] Update server entrypoint examples (#42077)
Signed-off-by: Ethan Feng <ethan.fengch@gmail.com>
2026-05-09 02:03:52 +00:00
wang.yuqi 1d694e78c9 [Examples][last/6] Resettle examples. (#41084)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-05-07 19:42:12 -07:00
wang.yuqi 51c1ee9b7c [Examples] Resettle Disaggregated examples. (#40759)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-05-06 01:20:38 -07:00
Honglin Cao 9c271f9403 [gRPC] Add standard gRPC health checking (grpc.health.v1) for Kubernetes native probes (#38016)
Signed-off-by: Honglin Cao <Caohonglin317@hotmail.com>
2026-04-22 21:31:00 +00:00
Yuichiro Utsumi a1746ff9ec [Doc] Clarify Helm chart location in deployment guide (#38328)
Signed-off-by: Yuichiro Utsumi <utsumi.yuichiro@fujitsu.com>
Signed-off-by: Yuichiro Utsumi <81412151+utsumi-fj@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-03-27 15:43:02 +08:00
Harry Mellor a0f44bb616 Allow markdownlint to run locally (#36398)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-08 20:05:24 -07:00
Paco Xu 7493c51c55 [Docs] add Dynamo/aibrix integration and kubeai/aks link (#32767)
Signed-off-by: Paco Xu <paco.xu@daocloud.io>
2026-03-05 17:39:50 +08:00
Chen 138c5fa186 [Docs] Add RunPod GPU deployment guide for vLLM (#34531)
Signed-off-by: lisperz <zhuchen200245@163.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-03-04 10:11:34 -08:00
vllmellm aaa2efbe98 [DOC] [ROCm] Update docker deployment doc (#33971)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2026-02-06 10:05:35 -08:00
Nathan Weinberg 58cb55e4de [Doc] Enhance documentation around CPU container images (#32286)
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2026-01-30 13:36:20 +00:00
Andrew Bennett f243abc92d Fix various typos found in docs (#32212)
Signed-off-by: Andrew Bennett <potatosaladx@meta.com>
2026-01-13 03:41:47 +00:00
Harry Mellor decc244767 [Docs] Use relative md links instead of absolute html links for cross referencing (#31494)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-12-29 13:33:44 +00:00
yitingdc b326598e97 add tip for VLLM_USE_PRECOMPILED arg to reduce docker build time (#31385)
Signed-off-by: yiting.jiang <yiting.jiang@daocloud.io>
2025-12-28 03:19:47 +00:00
Yuan Tang 0736f901e7 docs: Add llm-d integration to the website (#31234)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-12-23 20:27:22 +00:00
Didier Durand 1a55cfafcb [Doc]: fixing typos in various files (#30540)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Didier Durand <2927957+didier-durand@users.noreply.github.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-12-14 02:14:37 -08:00
Qidong Su 24429d5924 [Doc] Add instructions for building docker image on GB300 with CUDA13 (#30414)
Signed-off-by: Qidong Su <soodoshll@gmail.com>
2025-12-13 21:56:53 +00:00
Tiger Xu / Zhonghu Xu 60a66ea2dc [DOC]: Add kthena to integrations (#29931)
Signed-off-by: Zhonghu Xu <xuzhonghu@huawei.com>
2025-12-05 08:11:03 +00:00
Yuan Tang f716a15372 Update KServe guide link in documentation (#29258)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-11-24 14:40:05 +00:00
Cyrus Leung 9452863088 Revert "Revert #28875 (#29159)" (#29179)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-21 04:27:43 -08:00
Cyrus Leung 4d7231e774 Revert #28875 (#29159)
2025-11-21 01:40:17 -08:00
Qidong Su 698024ecce [Doc] update installation guide regarding aarch64+cuda pytorch build (#28875)
Signed-off-by: Qidong Su <soodoshll@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-11-20 19:40:25 -08:00
Didier Durand 09540cd918 [Doc]: fix typos in various files (#29010)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-11-19 04:56:21 -08:00
Harry Mellor 67187554dd [Docs] Enable some more markdown lint rules for the docs (#28731)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-14 18:39:19 +00:00
Harry Mellor 5f3cd7f7f2 [Docs] Update the name of Transformers backend -> Transformers modeling backend (#28725)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-14 16:34:14 +00:00
Fang Han da855b42d2 [Doc]: Make extraInit containers fully configurable in helm chart (#27497)
Signed-off-by: Fang Han <fhan0520@gmail.com>
2025-11-06 20:27:16 +00:00
yitingdc 31b55ffc62 use stringData in secret yaml to store huggingface token (#25685)
Signed-off-by: yiting.jiang <yiting.jiang@daocloud.io>
2025-10-30 00:47:36 -07:00
usberkeley 69f064062b Code quality improvements: version update, type annotation enhancement, and enum usage simplification (#27581)
Signed-off-by: Bradley <bradley.b.pitt@gmail.com>
2025-10-27 17:50:22 +00:00
Harry Mellor 483ea64611 [Docs] Replace all explicit anchors with real links (#27087)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-17 02:22:06 -07:00
Harry Mellor 4ffd6e8942 [Docs] Reduce custom syntax used in docs (#27009)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-16 20:05:34 -07:00
Kay Yan 02d709a6f1 [docs] standardize Hugging Face env var to HF_TOKEN (deprecates HUGGING_FACE_HUB_TOKEN) (#27020)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-10-16 15:31:02 +01:00
Cyrus Leung ef9676a1f1 [Doc] ruff format some Python examples (#26767)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-14 03:21:53 -07:00
abhisheksheth28 77c95f72f7 [Doc] add KAITO to integrations (#25521)
Signed-off-by: "Abhishek Sheth" <absheth@microsoft.com>
2025-10-06 17:30:03 +08:00
Aritra Roy Gosthipaty 59f30d0448 [Docs] Edit HF Inference Endpoints documentation (#26275)
Signed-off-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Signed-off-by: ariG23498 <aritra.born2fly@gmail.com>
2025-10-06 10:13:09 +01:00
Elieser Pereira f509a20846 [DOC] Update production-stack.md (#26177)
Signed-off-by: Elieser Pereira <elieser.pereiraa@gmail.com>
2025-10-05 21:32:48 +00:00
Cyrus Leung d00d652998 [CI/Build] Replace vllm.entrypoints.openai.api_server entrypoint with vllm serve command (#25967)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-10-02 10:04:57 -07:00
Sergio Paniego Blanco 099aaee536 Add Hugging Face Inference Endpoints guide to Deployment docs (#25886)
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-09-30 14:35:06 +00:00
Michael Yao 78818dd1b0 [Docs] Have a try to improve frameworks/streamlit.md (#24841)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-14 21:50:36 -07:00
Michael Yao d14c4ebf08 [Docs] Use 1-2-3 list for deploy steps in deployment/frameworks/ (#24633)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-11 01:50:12 -07:00
Michael Yao 85df8afdae [Docs] Revise frameworks/anything-llm.md (#24489)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-11 01:50:05 -07:00
Michael Yao c2a8b08fcd [Doc] Fix issues in integrations/llamastack.md (#24428)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-09-08 02:28:32 -07:00
Christian Berge 8bd5844989 correct LWS deployment yaml (#23104)
Signed-off-by: cberge908 <42270330+cberge908@users.noreply.github.com>
2025-09-02 12:04:59 +00:00
Didier Durand d99c3a4f7b [Doc]: fix typos in .md files (including those of #23751) (#23825)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
2025-08-28 04:38:19 -07:00
Didier Durand 47455c424f [Doc: ]fix various typos in multiple files (#23487)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-08-25 00:04:04 +00:00
Shiming Zhang 3aa8c10038 Fix missing quotes (#23242)
Signed-off-by: Shiming Zhang <wzshiming@hotmail.com>
2025-08-20 10:46:59 +00:00
Daniele d2aab336ad [CI/Build] get rid of unused VLLM_FA_CMAKE_GPU_ARCHES (#21599)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
2025-07-31 15:00:08 +08:00
Michael Goin fb58e3a651 [Docs] Update docker.md with HF_TOKEN, new model, and podman fix (#21856)
2025-07-29 19:45:41 -07:00
Harry Mellor ba5c5e5404 [Docs] Switch to better markdown linting pre-commit hook (#21851)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-07-29 19:45:08 -07:00
Michael Yao 260127ea54 [Docs] Add intro and fix 1-2-3 list in frameworks/open-webui.md (#19199)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
2025-07-16 06:11:38 -07:00