Turner Jabbour
|
0c96dd64fb
|
[ROCm] Bump fastsafetensors to v0.3.2 from PyPI, remove git source build (#43625)
Signed-off-by: Turner Jabbour <doubleujabbour@gmail.com>
|
2026-06-04 07:30:57 -07:00 |
|
Aakar Dwivedi
|
3fd9d2d357
|
[CPU][Zen] Route W8A8 and W4A16 linear inference through zentorch on AMD Zen CPUs (#41813)
Signed-off-by: R <Ganesh.R@amd.com>
Signed-off-by: Harshal Adhav <harshal.adhav@amd.com>
Signed-off-by: Aakar Dwivedi <aadwived@amd.com>
Co-authored-by: R <Ganesh.R@amd.com>
Co-authored-by: Harshal Adhav <harshal.adhav@amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-05-30 14:17:21 -05:00 |
|
Li, Jiang
|
3f6f508e14
|
[Bugfix][CPU] Remove invalid extra deps (#43977)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-05-29 22:02:09 +08:00 |
|
Mohammad Miadh Angkad
|
a970fb5a1a
|
Fix CuPy runtime deps and restore humming (#43530)
Signed-off-by: Mohammad Miadh Angkad <176301910+mmangkad@users.noreply.github.com>
|
2026-05-26 05:59:40 -07:00 |
|
Michael Goin
|
10d264a2b9
|
Revert "[Misc] add humming to dependencies" (#43492)
|
2026-05-23 14:21:13 -07:00 |
|
Li, Jiang
|
65b7a812a2
|
[CPU] Experimentally enable Triton and MRV2 (#43225)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-05-22 01:48:17 -07:00 |
|
Nick Hill
|
f2ace1d57d
|
[Frontend][RFC] Rust front-end integration (#40848)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Co-authored-by: Bugen Zhao <i@bugenzhao.com>
|
2026-05-21 12:24:48 +08:00 |
|
Chris Leonard
|
07aeaf9d4d
|
[6/n] Migrate activation kernels, gptq, gguf, non cutlass w8a8 to libtorch stable ABI (continued) (#42663)
Signed-off-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>
Signed-off-by: Chris Leonard <chleonar@redhat.com>
Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>
Co-authored-by: Shengqi Chen <harry-chen@outlook.com>
|
2026-05-20 00:18:12 -07:00 |
|
Jinzhen Lin
|
8200fbe1ac
|
[Misc] add humming to dependencies (#42540)
Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com>
|
2026-05-19 08:36:47 -07:00 |
|
Jiangyun Zhu
|
140dc2ec30
|
[Bugfix] Install nvidia-cutlass-dsl[cu13] extra on CUDA 13 platforms (#42438)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
|
2026-05-13 01:57:21 -07:00 |
|
pschlan-amd
|
39dff5ff39
|
Add VLLM_USE_SPINLOOP_EXT to use more efficient busy polling (#36517)
Signed-off-by: Patrick Schlangen <pschlan@amd.com>
|
2026-05-11 16:11:49 -07:00 |
|
lyd1992
|
100c7b65e7
|
[Platform] Fix RISC-V platform detection (lscpu parsing + non-NUMA meminfo) (#40427)
Signed-off-by: liuyudong <liuyudong@iscas.ac.cn>
|
2026-04-24 04:33:05 +00:00 |
|
Honglin Cao
|
9c271f9403
|
[gRPC] Add standard gRPC health checking (grpc.health.v1) for Kubernetes native probes (#38016)
Signed-off-by: Honglin Cao <Caohonglin317@hotmail.com>
|
2026-04-22 21:31:00 +00:00 |
|
Isotr0py
|
67eb6083e3
|
Revert "[Misc] Move pyav and soundfile to common requirements" (#40276)
Co-authored-by: Roger Wang <hey@rogerw.io>
|
2026-04-21 09:08:06 -07:00 |
|
Chinmay-Kulkarni-AMD
|
87518c3027
|
[ZenCPU] AMD Zen CPU Backend with supported dtypes via zentorch weekly (#39967)
Signed-off-by: Chinmay Kulkarni <Chinmay.Kulkarni@amd.com>
|
2026-04-18 06:22:37 +00:00 |
|
Isotr0py
|
617d1c2ff1
|
[Misc] Move pyav and soundfile to common requirements (#39997)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-04-16 08:52:37 -07:00 |
|
Isotr0py
|
82531edbfb
|
[Refactor] Remove resampy dependency (#39524)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-04-16 08:48:17 -07:00 |
|
Yanan Cao
|
edc3648966
|
[Kernel][Helion] Fix inductor fusion of Helion HOP (#39944)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-16 04:41:26 -07:00 |
|
Matthew Bonanni
|
c77e596e2e
|
[FlashAttention] Don't overwrite flash_attn_interface.py when installing precompiled (#39932)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
|
2026-04-15 16:43:15 -04:00 |
|
Michael Goin
|
eb4205fee5
|
[UX] Integrate DeepGEMM into vLLM wheel via CMake (#37980)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-04-08 18:56:32 -07:00 |
|
ibifrost
|
96b5004b71
|
[KVConnector] Support 3FS KVConnector (#37636)
Signed-off-by: wuchenxin <wuchenxin.wcx@alibaba-inc.com>
Signed-off-by: ibifrost <47308427+ibifrost@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
|
2026-04-07 15:46:00 +00:00 |
|
Robert Shaw
|
968ed02ace
|
[Quantization][Deprecation] Remove Petit NVFP4 (#32694)
Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
|
2026-04-05 00:07:45 +00:00 |
|
Yanan Cao
|
ecd5443dbc
|
Bump helion dependency from 0.3.2 to 0.3.3 (#38062)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-04-02 10:59:33 -07:00 |
|
Fadi Arafeh
|
34d317dcec
|
[CPU][UX][Perf] Enable tcmalloc by default (#37607)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
|
2026-03-25 20:39:57 +08:00 |
|
Isotr0py
|
c7f98b4d0a
|
[Frontend] Remove librosa from audio dependency (#37058)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-21 11:36:15 +08:00 |
|
Huanxing
|
6951fcd44f
|
[XPU] Automatically detect target platform as XPU in build. (#37634)
Signed-off-by: huanxing <huanxing.shen@intel.com>
|
2026-03-20 13:30:15 +08:00 |
|
mikaylagawarecki
|
8b10e4fb31
|
[1/n] Migrate permute_cols to libtorch stable ABI (#31509)
Signed-off-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>
|
2026-03-19 11:27:26 -04:00 |
|
elvischenv
|
d61d2b08e9
|
[Build] Fix API rate limit exceeded when using VLLM_USE_PRECOMPILED=1 (#36229)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-03-16 12:09:27 +00:00 |
|
Lalithnarayan C
|
7acaea634c
|
In-Tree AMD Zen CPU Backend via zentorch [1/N] (#35970)
Signed-off-by: Lalithnarayan C <Lalithnarayan.C@amd.com>
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Co-authored-by: Chinmay-Kulkarni-AMD <Chinmay.Kulkarni@amd.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-03-15 23:35:35 +00:00 |
|
Hari
|
a3e2e250f0
|
[Feature] Add Azure Blob Storage support for RunAI Model Streamer (#34614)
Signed-off-by: hasethuraman <hsethuraman@microsoft.com>
|
2026-03-15 19:38:21 +08:00 |
|
Isotr0py
|
6590a3ecda
|
[Frontend] Remove torchcodec from audio dependency (#37061)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-03-15 05:15:59 +00:00 |
|
arlo
|
8c29042bb9
|
[Feature] Add InstantTensor weight loader (#36139)
|
2026-03-14 18:05:23 +01:00 |
|
seanmamasde
|
84868e4793
|
[Bugfix][Frontend] Fix audio transcription for MP4, M4A, and WebM formats (#35109)
Signed-off-by: seanmamasde <seanmamasde@gmail.com>
|
2026-03-14 08:44:03 -07:00 |
|
Yanan Cao
|
236de72e49
|
[CI] Pin helion version (#37012)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-03-13 23:25:29 -04:00 |
|
Li, Jiang
|
092ace9e3a
|
[UX] Improve UX of CPU backend (#36968)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Signed-off-by: Li, Jiang <bigpyj64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2026-03-14 09:27:29 +08:00 |
|
Simo Lin
|
572c776bfb
|
build: update smg-grpc-servicer to use vllm extra (#36938)
Signed-off-by: Simo Lin <linsimo.mark@gmail.com>
|
2026-03-13 01:31:36 +00:00 |
|
Chang Su
|
507ddbe992
|
feat(grpc): extract gRPC servicer into smg-grpc-servicer package, add --grpc flag to vllm serve (#36169)
Signed-off-by: Chang Su <chang.s.su@oracle.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2026-03-10 03:29:59 -07:00 |
|
Andrii Skliar
|
5d199ac8f2
|
Support Audio Extraction from MP4 Video for Nemotron Nano VL (#35539)
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
Signed-off-by: Andrii Skliar <askliar@nvidia.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Andrii <askliar@nvidia.com>
Co-authored-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
Co-authored-by: Andrii Skliar <askliar@oci-nrt-cs-001-vscode-01.cm.cluster>
Co-authored-by: Andrii <askliar@nvidia.com>
Co-authored-by: root <root@pool0-03748.cm.cluster>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: root <root@pool0-02416.cm.cluster>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: root <root@pool0-04880.cm.cluster>
|
2026-03-03 23:20:33 -08:00 |
|
Lucas Wilkinson
|
8b5014d3dd
|
[Attention] FA4 integration (#32974)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2026-03-01 23:44:57 +00:00 |
|
Ma Jian
|
90805ff464
|
[CI/Build] CPU release supports both of AVX2 and AVX512 (#35466)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Co-authored-by: jiang1.li <jiang1.li@intel.com>
|
2026-02-28 04:35:21 +00:00 |
|
Sophie du Couédic
|
02acd16861
|
[Benchmarks] Plot benchmark timeline and requests statistics (#35220)
Signed-off-by: Sophie du Couédic <sop@zurich.ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2026-02-26 02:17:43 -08:00 |
|
Nick Hill
|
79504027ef
|
[Misc] Bump fastsafetensors version for latest fixes (#34273)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
|
2026-02-11 00:30:09 -08:00 |
|
emricksini-h
|
325ab6b0a8
|
[Feature] OTEL tracing during loading (#31162)
|
2026-02-05 16:59:28 -08:00 |
|
Michael Goin
|
d0cbac5827
|
[Dev UX] Add auto-detection for VLLM_PRECOMPILED_WHEEL_VARIANT during install (#32948)
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Shengqi Chen <i@harrychen.xyz>
|
2026-01-23 19:15:17 -08:00 |
|
Lucas Wilkinson
|
889722f3bf
|
[FlashMLA] Update FlashMLA to expose new arguments (#32810)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2026-01-21 22:02:39 -07:00 |
|
Yanan Cao
|
9d1e611f0e
|
[CI] Add Helion as an optional dependency (#32482)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
|
2026-01-19 19:09:56 +00:00 |
|
Isotr0py
|
cee7436a26
|
[Misc] Make scipy as optional audio/benchmark dependency (#32096)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-01-11 00:18:57 -08:00 |
|
TJian
|
7a05d2dc65
|
[CI] [ROCm] Fix tests/entrypoints/test_grpc_server.py on ROCm (#31970)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
|
2026-01-09 12:54:20 +08:00 |
|
Chang Su
|
791b2fc30a
|
[grpc] Support gRPC server entrypoint (#30190)
Signed-off-by: Chang Su <chang.s.su@oracle.com>
Signed-off-by: njhill <nickhill123@gmail.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: njhill <nickhill123@gmail.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
|
2026-01-07 23:24:46 -08:00 |
|
RickyChen / 陳昭儒
|
b3a2bdf1ac
|
[Feature] Add offline FastAPI documentation support for air-gapped environments (#30184)
Signed-off-by: rickychen-infinirc <ricky.chen@infinirc.com>
Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-12-29 16:22:39 +00:00 |
|