Isotr0py
|
1fd8bd02a4
|
[Docs] Replace broken video url in examples (#44159)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-06-01 06:01:10 +00:00 |
|
Jeffrey Wang
|
29d69332aa
|
[BugFix] Fix _has_module to verify native deps via trial import (#44035)
Signed-off-by: esmeetu <jasonailu87@gmail.com>
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: esmeetu <jasonailu87@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
|
2026-05-31 22:06:33 -07:00 |
|
Lucas Wilkinson
|
4721bb3aa4
|
[MRV2] Remove Eagle's dedicated CUDA graph pool (#44078)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
|
2026-05-31 22:00:33 -07:00 |
|
Umut Polat
|
f46e6be169
|
[Misc] Use VLLMValidationError consistently in chat completion and completion protocol validators (#36254)
Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com>
|
2026-06-01 04:04:11 +00:00 |
|
nightcityblade
|
8b8546da1c
|
docs: fix MLA attention docstring examples (#44118)
Co-authored-by: nightcityblade <nightcityblade@gmail.com>
|
2026-05-31 12:28:38 -07:00 |
|
Jee Jee Li
|
6bdabbad5b
|
[CI/Build] Enable Step3p7ForConditionalGeneration testing (#43956)
Signed-off-by: Jee Jee Li <jeejeelee@inferact.ai>
|
2026-05-31 05:16:12 +00:00 |
|
Aakar Dwivedi
|
3fd9d2d357
|
[CPU][Zen] Route W8A8 and W4A16 linear inference through zentorch on AMD Zen CPUs (#41813)
Signed-off-by: R <Ganesh.R@amd.com>
Signed-off-by: Harshal Adhav <harshal.adhav@amd.com>
Signed-off-by: Aakar Dwivedi <aadwived@amd.com>
Co-authored-by: R <Ganesh.R@amd.com>
Co-authored-by: Harshal Adhav <harshal.adhav@amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-05-30 14:17:21 -05:00 |
|
Woosuk Kwon
|
27fa5aa3b9
|
[MRV2] Support breakable CUDA graph (#44050)
Signed-off-by: Woosuk Kwon <woosuk@inferact.ai>
|
2026-05-30 09:40:52 -07:00 |
|
Wentao Ye
|
e1105064b2
|
[Bug] Fix gemma4 MTP IMA issue when TP>1, CUDA error: an illegal memory access was encountered (#43909)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2026-05-30 10:34:33 -04:00 |
|
Bugen Zhao
|
50c80d7923
|
[Governance] Add @BugenZhao as Rust frontend code owner (#44047)
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
|
2026-05-30 22:23:54 +08:00 |
|
Xiaoran
|
3becc5db40
|
[ROCm] Add attention sink support to AITer flash attention backend (#43817)
Signed-off-by: Xiaoran Chen <xiaoran@fb.com>
Co-authored-by: Xiaoran Chen <xiaoran@fb.com>
|
2026-05-30 18:13:18 +08:00 |
|
Lanze Liu
|
124fac10cb
|
[Bugfix] Fix RMSNorm kernels to multiply in weight's native dtype (#42379)
Signed-off-by: Lanze Liu <lanzetech@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2026-05-29 23:16:53 -07:00 |
|
Liangliang Ma
|
e9499996df
|
[BugFix][Platform] Fix import vllm.platforms.rocm error on non-CUDA test_gpt_oss.py (#43571)
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-05-29 23:16:49 -07:00 |
|
nemanjaudovic
|
c0056b19bf
|
[ROCm] cmake: support PYTORCH_FOUND_HIP for torch 2.13 native HIP language support (#43881)
Signed-off-by: nemanjaudovic <nudovic@amd.com>
Co-authored-by: Shengqi Chen <harry-chen@outlook.com>
|
2026-05-29 22:16:57 -07:00 |
|
Andreas Karatzas
|
ef8840adc7
|
[ROCm][CI] Fix failure in the Phi3V pooling test (#44028)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-05-30 12:14:37 +08:00 |
|
Flora Feng
|
1a096d8208
|
[Refactor] Remove dead current_tool_name_sent assignments from tool parsers (#43997)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-05-29 21:45:15 -04:00 |
|
Gagan Dhakrey
|
1e2ce5d11a
|
offload prompt_embeds decode in render_prompts_async to avoid blocking (#43792)
Signed-off-by: Gagan Dhakrey <gagandhakrey@gmail.com>
|
2026-05-30 01:36:34 +00:00 |
|
Jee Jee Li
|
559d6710bf
|
[PERF] MiniMax-M2 gate kernel (#38445)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: qianlihuang <91178480+qianlihuang@users.noreply.github.com>
Co-authored-by: Yiliu Dong <91178480+qianlihuang@users.noreply.github.com>
|
2026-05-29 18:28:34 -07:00 |
|
bnellnm
|
187457a952
|
Revert "[MoE Refactor] Migrate MoeWNA16Method quantization to MK orac… (#44033)
Signed-off-by: Bill Nell <bnell@redhat.com>
|
2026-05-29 16:45:29 -07:00 |
|
Kevin H. Luu
|
8fad266507
|
[CI] Fix smoke test step key to bypass block gate (#43974)
Signed-off-by: khluu <khluu000@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-05-29 16:28:32 -07:00 |
|
Flora Feng
|
8c6daf6e2f
|
[CI] Remove duplicate Harmony test coverage (#44023)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-05-29 22:52:46 +00:00 |
|
bnellnm
|
7b98f498cd
|
[MoE Refactor] Remove supports_expert_map (#43108)
Signed-off-by: Bill Nell <bnell@redhat.com>
|
2026-05-29 17:26:56 -04:00 |
|
bnellnm
|
106aa92f04
|
[MoE Refactor] Migrate MoeWNA16Method quantization to MK oracle (#42647)
Signed-off-by: Bill Nell <bnell@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-05-29 17:19:31 -04:00 |
|
yzong-rh
|
46409fd2a1
|
[Frontend] Clean up stop_token_ids override for Harmony (#44009)
Signed-off-by: Yifan Zong <yzong@redhat.com>
|
2026-05-29 13:28:06 -07:00 |
|
Tyler Michael Smith
|
38b864d81d
|
[Metrics] Exclude KV transfer tokens from iteration_tokens_total (#43346)
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-05-29 19:56:44 +00:00 |
|
Wentao Ye
|
5dbf1605a0
|
[Feature] SSL support for dp supervisor (#43688)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2026-05-29 19:28:12 +00:00 |
|
Kevin H. Luu
|
acbc203340
|
Add @khluu to CODEOWNERS (#44019)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
|
2026-05-29 12:24:29 -07:00 |
|
Flora Feng
|
6de08e8b46
|
[CI] Remove redundant test_chat_with_tool_reasoning.py (#44011)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-05-29 19:23:56 +00:00 |
|
Kevin H. Luu
|
6aabe221a5
|
[CI] Make Model Executor test hangs fail fast with a traceback (#43971)
Signed-off-by: khluu <khluu000@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
v0.22.1rc0
|
2026-05-29 11:58:25 -07:00 |
|
Wentao Ye
|
739096a028
|
[Bug] Fix torch device issue for MOE permute (#44005)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2026-05-29 18:55:00 +00:00 |
|
czhu-cohere
|
8b9deeec4b
|
[Bugfix] Fix Ray placement group allocation with grouped nodes (#43998)
Signed-off-by: <conway.zhu@cohere.com>
Signed-off-by: root <conway.zhu@cohere.com>
|
2026-05-29 12:51:05 -06:00 |
|
qizixi
|
d07ad0693b
|
[Bugfix] Use storage_block_size in KV cache reshape for compressed specs (DeepSeek V4) (#43988)
Signed-off-by: zixi-qi <zixi@inferact.ai>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
2026-05-29 11:14:25 -07:00 |
|
Ilya Markov
|
4aaba00f92
|
[EPLB] Make async EPLB default (#43219)
Signed-off-by: Markov Ilya <markovilya19@gmail.com>
Co-authored-by: Markov Ilya <markovilya19@gmail.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2026-05-29 18:07:16 +00:00 |
|
bnellnm
|
84b2a8a7e7
|
[MoE Refactor] WNA16 MoE backend selection into oracle module (#42553)
Signed-off-by: Bill Nell <bnell@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-05-29 13:11:17 -04:00 |
|
qizixi
|
4ff865c38e
|
[Bugfix] Disable allreduce_rms_fusion when pipeline_parallel_size > 1 (#43616)
Signed-off-by: zixi-qi <zixi@inferact.ai>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-05-29 22:57:43 +08:00 |
|
Taneem Ibrahim
|
5502c3b52d
|
[Misc] added unit tests for the core pooling methods (#43818)
Signed-off-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2026-05-29 14:40:31 +00:00 |
|
Chunyang Wen
|
f191d5630e
|
docs: clarify ITL acronym in optimization docs (#43922)
Signed-off-by: chunyang.wen <chunyang.wen@gmail.com>
|
2026-05-29 07:40:05 -07:00 |
|
Lucain
|
11dfa3169d
|
Add vLLM library info to Hugging Face Hub requests (#43857)
Signed-off-by: Wauplin <lucainp@gmail.com>
Signed-off-by: Lucain Pouget <lucain@huggingface.co>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-05-29 14:04:58 +00:00 |
|
Li, Jiang
|
3f6f508e14
|
[Bugfix][CPU] Remove invalid extra deps (#43977)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-05-29 22:02:09 +08:00 |
|
Harry Mellor
|
0585b5ba2e
|
Skip docs build if PR doesn't affect docs (#43972)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-05-29 12:09:52 +00:00 |
|
Thien Tran
|
d2889722ff
|
[Bugfix] Corrupted MLA + linear attention (#43961)
Signed-off-by: Thien Tran <gau.nernst@yahoo.com.sg>
|
2026-05-29 05:00:51 -07:00 |
|
frida-andersson
|
0b56815a24
|
[ROCm][Perf] DSv3.2 MI355X TP4 decode-step orchestration cleanup (3 micro-opts) (#42982)
Signed-off-by: Frida Andersson <fanderss@amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
|
2026-05-29 04:26:57 -07:00 |
|
MHYangAMD
|
ab12aab127
|
[Bugfix] [ROCm] [DSV4] Fix AITER MXFP4 MoE weight loading and shuffle… (#42595)
Co-authored-by: MHYangAMD <MHYangAMD@users.noreply.github.com>
|
2026-05-29 04:08:33 -07:00 |
|
JartX
|
0cff0741ff
|
[Kernel][ROCm] Native W4A16 kernel for AMD RDNA3 (gfx1100) — fp16 + bf16 (#41394)
Signed-off-by: JartX <sagformas@epdcenter.es>
|
2026-05-29 11:04:40 +00:00 |
|
Joaquín Mondéjar
|
60a7a2214f
|
[Bugfix] Fix Step3 pipeline parallel KeyError for residual tensor (#37622)
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2026-05-29 03:04:02 -07:00 |
|
Nicolò Lucchesi
|
7ebc0ec104
|
[CI] Nixl+SimpleCPUOffloadingConnector unit tests (#43871)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2026-05-29 02:40:42 -07:00 |
|
Qiming Zhang
|
e8b5199973
|
[XPU] support MTP of gdn attention (#43565)
Signed-off-by: mayuyuace <qiming1.zhang@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-05-29 17:10:24 +08:00 |
|
Simon Danielsson
|
b7fb747d8d
|
[CI][ROCm] Don't skip MoRI-IO Connector tests (#43703)
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
|
2026-05-29 17:06:23 +08:00 |
|
Kunshang Ji
|
30c6289b8e
|
[XPU] fix xpu install document triton-xpu version (#43947)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-05-29 02:05:12 -07:00 |
|
Andreas Karatzas
|
ff990d0d32
|
[ROCm][CI] Fix AITER unified attention for encoder-decoder cross-attention (#43945)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-05-29 16:43:39 +08:00 |
|