Flora Feng
|
e67063826b
|
[CI] Add missing vllm/parser/ CI trigger and fix test_parse.py (#44352)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-06-02 21:05:19 -07:00 |
|
Andreas Karatzas
|
53b88d1dfc
|
[CI] Reject out-of-vocabulary before they reach the GPU logprob path (#44042)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-06-02 22:27:52 -05:00 |
|
JartX
|
7b476c8f14
|
[ROCm][CI] Skip fp8 reload tests on gfx90a (MI250) (#44369)
Signed-off-by: JartX <sagformas@epdcenter.es>
|
2026-06-02 22:27:14 -05:00 |
|
JartX
|
4454a18695
|
[ROCm][CI] Fix stale wvSplitK GEMM fallback test for N=5 (#44368)
Signed-off-by: JartX <sagformas@epdcenter.es>
|
2026-06-02 22:00:25 -05:00 |
|
Siddharth Bedekar
|
0917a009d3
|
Fix sparse NCCL weight transfer test construction (#44345)
Signed-off-by: Siddharth Bedekar <bedeksid@gmail.com>
|
2026-06-02 21:51:21 +00:00 |
|
SeongJun Lee
|
3099de3617
|
[Kernel][MoE] Add GELU_TANH to CPU, CUTLASS, and WNA16 MoE backends (#42027)
Signed-off-by: lesj0610 <lesj0610@users.noreply.github.com>
Co-authored-by: lesj0610 <lesj0610@users.noreply.github.com>
|
2026-06-02 17:12:08 -04:00 |
|
Nick Hill
|
e15f20258b
|
[ModelRunnerV2] Avoid pipeline parallel bubbles (#42187)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
|
2026-06-02 14:02:01 -07:00 |
|
Yifan Qiao
|
e9e08c49b9
|
[Bugfix] Cache the EAGLE/MTP lookahead block in the SWA prefix-cache mask (#44082)
Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>
|
2026-06-02 12:21:07 -07:00 |
|
Nick Hill
|
da107a59e5
|
[MRV2] Also enable MRV2 for Llama and Mistral dense models (#43458)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: yewentao256 <zhyanwentao@126.com>
|
2026-06-02 11:18:46 -07:00 |
|
Chauncey
|
ed9a7526b6
|
[Anthropic] Support system role messages inside messages array (#44283)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Co-authored-by: Aleksandar Yanakiev <alexander.yanakiev@discretestack.com>
Co-authored-by: Ang Kah Min, Kelvin <syraxius@hotmail.com>
|
2026-06-02 18:13:54 +00:00 |
|
Flora Feng
|
478b49ddec
|
[Refactor] Remove dead code from parser infrastructure (#44279)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-06-02 12:08:27 -04:00 |
|
Nick Hill
|
cab5c9a2a9
|
[Core] Move max_concurrent_batches to VllmConfig (#44274)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
|
2026-06-02 08:57:25 -07:00 |
|
XiaoZ
|
53fa09d085
|
[Misc] Support local image encoding in benchmarks (#43843)
Signed-off-by: xiaoz <Sukra1@outlook.com>
|
2026-06-02 15:15:06 +00:00 |
|
王金旭
|
0bdfd5eb84
|
[Bugfix] Vendor MiniCPMV/MiniCPMO processors to unblock Transformers v5 (#44282)
Signed-off-by: guanwei-wu <b08901019@ntu.edu.tw>
Signed-off-by: wjinxu <1299461899@qq.com>
Co-authored-by: guanwei-wu <b08901019@ntu.edu.tw>
Co-authored-by: Cursor <cursoragent@cursor.com>
|
2026-06-02 07:14:38 -07:00 |
|
gruner
|
654bd2bca4
|
[Bugfix] Sync block_size from EngineCore to frontend for hybrid Mamba… (#42967)
Signed-off-by: Amit Gruner <agruner@crusoe.ai>
Co-authored-by: Amit Gruner <agruner@crusoe.ai>
Co-authored-by: Jiangyun Zhu <riverclouds.zhu@qq.com>
|
2026-06-02 13:41:00 +00:00 |
|
wang.yuqi
|
b623f7ea95
|
[Frontend] Consolidate dev entrypoints. (#44170)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-06-02 06:30:21 -07:00 |
|
Shreyas Kulkarni
|
0eeba5eec1
|
Fix DFlash prefix cache corruption due to missing lookahead block (#42971)
Signed-off-by: Shreyas Kulkarni <shreyas.gp269@gmail.com>
|
2026-06-02 12:06:33 +00:00 |
|
Ronen Schaffer
|
2a2b5ca791
|
[KV Offload] Add on_schedule_end() hook to separate step lifecycle from event draining (#44206)
Signed-off-by: Ronen Schaffer <ronen.schaffer@ibm.com>
|
2026-06-02 13:42:52 +03:00 |
|
Isotr0py
|
f8e9c56d15
|
[Multimodal] Automatically select registered video loader for VLM (#44126)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2026-06-02 09:09:47 +00:00 |
|
alberto
|
e30313220c
|
[Parser] Migrate ResponsesParser to unified Parser interface (#42977)
Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
|
2026-06-02 08:50:05 +00:00 |
|
omerpaz95
|
d247a9dc13
|
[EC Connector] Non blocking EC Connector lookup (#41627)
Signed-off-by: omerpaz95 <omerpaz95@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2026-06-02 08:48:25 +00:00 |
|
Yifan Qiao
|
7c37096620
|
[Core][Refactor]: thread scheduler_block_size into KVCacheManager and KVCacheCoordinator (#44165)
Signed-off-by: Yifan Qiao <yifanqiao@inferact.ai>
|
2026-06-02 01:14:44 -07:00 |
|
Fadi Arafeh
|
0b25cf4419
|
[CPU][Perf] Enable fused kernels for GDN's gated delta rules (#43534)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
|
2026-06-02 08:00:48 +00:00 |
|
Flora Feng
|
68dafcca75
|
[Refactor] Unify reasoning + tool-call parsing behind Parser.parse() (#44267)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-06-02 15:11:42 +08:00 |
|
JooHo Lee
|
a045c7425f
|
[MM][CG] Profile encoder CUDA graph pool memory (#41714)
Signed-off-by: JooHo Lee <jooho414@gmail.com>
|
2026-06-02 12:27:34 +08:00 |
|
Or Ozeri
|
480fadab1b
|
[BugFix][kv_offload]: Prevent offloading stale sliding window blocks (#42959)
Signed-off-by: Or Ozeri <oro@il.ibm.com>
|
2026-06-02 05:59:48 +03:00 |
|
Andreas Karatzas
|
54d0c36fff
|
[CI] Stabilize OpenAI schema fuzzing for malformed structural tags (#44131)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-06-01 19:56:15 -07:00 |
|
Flora Feng
|
9affc17a05
|
[Refactor] Move unstreamed tool-arg flush from serving layer to parser (#44017)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-06-02 10:37:43 +08:00 |
|
Dao007forever
|
d68f0b220e
|
[Bugfix][Mooncake] Release GPU pin on failed store in MooncakeStoreConnector (#43742)
Signed-off-by: Dao Le <Dao007forever@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-06-01 18:29:18 -07:00 |
|
JartX
|
48c0d13e65
|
[ROCm][CI] Skip unbacked dynamic shapes tests on PyTorch < 2.11 (#44256)
Signed-off-by: JartX <sagformas@epdcenter.es>
|
2026-06-01 19:09:01 -05:00 |
|
Nick Hill
|
e4cbc4385d
|
[Test][BugFix] Fix double-BOS in PD+specdec acceptance test (#44234)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
|
2026-06-01 14:31:12 -07:00 |
|
Nick Hill
|
6f8b40a23f
|
[BugFix][CI] Fix added _has_module tests (#44248)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
|
2026-06-01 14:23:12 -07:00 |
|
Siddharth Bedekar
|
266b9d9c64
|
[Frontend][Core] Add sparse NCCL weight transfer support for in-place updates (#40096)
Signed-off-by: Siddharth Bedekar <bedeksid@gmail.com>
Co-authored-by: OpenAI Codex <codex@openai.com>
|
2026-06-01 15:37:30 -04:00 |
|
Andreas Karatzas
|
fd9e91d7e4
|
[ROCm][CI] Fix and stabilize EAGLE3 acceptance tests (#41294)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>
|
2026-06-01 12:40:01 -05:00 |
|
Madeesh Kannan
|
023808c23d
|
[Feature] Add support for JetBrains' Mellum v2 code generation model (#43992)
Signed-off-by: Madeesh Kannan <madeeswaran.kannan@jetbrains.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-06-01 10:11:35 -04:00 |
|
Chaojun Zhang
|
bd0aecdc08
|
[XPU][CI] Fix test_audio_in_video flake by using module-scoped server fixture (#44146)
Signed-off-by: Chaojun Zhang <chaojun.zhang@intel.com>
|
2026-06-01 11:21:36 +00:00 |
|
wang.yuqi
|
0910f7e0e1
|
[Frontend] Resettle generative scoring entrypoint. (#44153)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-06-01 07:54:59 +00:00 |
|
Jeffrey Wang
|
29d69332aa
|
[BugFix] Fix _has_module to verify native deps via trial import (#44035)
Signed-off-by: esmeetu <jasonailu87@gmail.com>
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: esmeetu <jasonailu87@gmail.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
|
2026-05-31 22:06:33 -07:00 |
|
Umut Polat
|
f46e6be169
|
[Misc] Use VLLMValidationError consistently in chat completion and completion protocol validators (#36254)
Signed-off-by: umut-polat <52835619+umut-polat@users.noreply.github.com>
|
2026-06-01 04:04:11 +00:00 |
|
Jee Jee Li
|
6bdabbad5b
|
[CI/Build] Enable Step3p7ForConditionalGeneration testing (#43956)
Signed-off-by: Jee Jee Li <jeejeelee@inferact.ai>
|
2026-05-31 05:16:12 +00:00 |
|
Liangliang Ma
|
e9499996df
|
[BugFix][Platform] Fix import vllm.platforms.rocm error on non-CUDA test_gpt_oss.py (#43571)
Signed-off-by: Ma, Liangliang <liangliang.ma@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-05-29 23:16:49 -07:00 |
|
Andreas Karatzas
|
ef8840adc7
|
[ROCm][CI] Fix failure in the Phi3V pooling test (#44028)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-05-30 12:14:37 +08:00 |
|
Jee Jee Li
|
559d6710bf
|
[PERF] MiniMax-M2 gate kernel (#38445)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: qianlihuang <91178480+qianlihuang@users.noreply.github.com>
Co-authored-by: Yiliu Dong <91178480+qianlihuang@users.noreply.github.com>
|
2026-05-29 18:28:34 -07:00 |
|
Flora Feng
|
8c6daf6e2f
|
[CI] Remove duplicate Harmony test coverage (#44023)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-05-29 22:52:46 +00:00 |
|
bnellnm
|
7b98f498cd
|
[MoE Refactor] Remove supports_expert_map (#43108)
Signed-off-by: Bill Nell <bnell@redhat.com>
|
2026-05-29 17:26:56 -04:00 |
|
Wentao Ye
|
5dbf1605a0
|
[Feature] SSL support for dp supervisor (#43688)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2026-05-29 19:28:12 +00:00 |
|
Flora Feng
|
6de08e8b46
|
[CI] Remove redundant test_chat_with_tool_reasoning.py (#44011)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-05-29 19:23:56 +00:00 |
|
Ilya Markov
|
4aaba00f92
|
[EPLB] Make async EPLB default (#43219)
Signed-off-by: Markov Ilya <markovilya19@gmail.com>
Co-authored-by: Markov Ilya <markovilya19@gmail.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2026-05-29 18:07:16 +00:00 |
|
Taneem Ibrahim
|
5502c3b52d
|
[Misc] added unit tests for the core pooling methods (#43818)
Signed-off-by: Taneem Ibrahim <taneem.ibrahim@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
|
2026-05-29 14:40:31 +00:00 |
|
Lucain
|
11dfa3169d
|
Add vLLM library info to Hugging Face Hub requests (#43857)
Signed-off-by: Wauplin <lucainp@gmail.com>
Signed-off-by: Lucain Pouget <lucain@huggingface.co>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-05-29 14:04:58 +00:00 |
|