Li, Jiang
|
c505cd93ef
|
[CI/Build] Disable CPU-Compatibility Tests (#44605)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-06-05 13:14:26 +08:00 |
|
zofia
|
063ce98fb7
|
[XPU][MoE] support block_fp8_moe on xpu (#42139)
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
Signed-off-by: zofia <110436990+zufangzhu@users.noreply.github.com>
|
2026-06-05 08:36:58 +08:00 |
|
wang.yuqi
|
d01d0b4646
|
[Frontend] Consolidate online serving utils. (#44479)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-06-04 06:49:31 +00:00 |
|
JartX
|
5b2a2beade
|
[ROCm][CI] Move Model Executor test step from MI250 to MI300 (gfx942) (#44370)
Signed-off-by: JartX <sagformas@epdcenter.es>
|
2026-06-03 12:23:51 -05:00 |
|
Andreas Karatzas
|
87954eb50e
|
[ROCm][CI] Optimize ROCm Docker build: registry cache, DeepEP, and ci-bake script (#36949)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-06-02 23:43:07 -07:00 |
|
Flora Feng
|
e67063826b
|
[CI] Add missing vllm/parser/ CI trigger and fix test_parse.py (#44352)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-06-02 21:05:19 -07:00 |
|
Daoyuan Li
|
bd98e97557
|
[Misc] Remove dead VLLM_RPC_TIMEOUT env var and fix profiling doc that references it (#44128)
Signed-off-by: Daoyuan Li <94409450+DaoyuanLi2816@users.noreply.github.com>
|
2026-06-03 00:22:10 +00:00 |
|
Nick Hill
|
e15f20258b
|
[ModelRunnerV2] Avoid pipeline parallel bubbles (#42187)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
|
2026-06-02 14:02:01 -07:00 |
|
wang.yuqi
|
b623f7ea95
|
[Frontend] Consolidate dev entrypoints. (#44170)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-06-02 06:30:21 -07:00 |
|
Fadi Arafeh
|
0b25cf4419
|
[CPU][Perf] Enable fused kernels for GDN's gated delta rules (#43534)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
|
2026-06-02 08:00:48 +00:00 |
|
Alec
|
816cc73a9b
|
[Bugfix][CI] Normalize NIXL connector CUDA wheel installs (#44266)
Signed-off-by: Alec Flowers <aflowers@nvidia.com>
|
2026-06-01 19:34:05 -07:00 |
|
wang.yuqi
|
0910f7e0e1
|
[Frontend] Resettle generative scoring entrypoint. (#44153)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-06-01 07:54:59 +00:00 |
|
Kevin H. Luu
|
8fad266507
|
[CI] Fix smoke test step key to bypass block gate (#43974)
Signed-off-by: khluu <khluu000@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-05-29 16:28:32 -07:00 |
|
Flora Feng
|
6de08e8b46
|
[CI] Remove redundant test_chat_with_tool_reasoning.py (#44011)
Signed-off-by: sfeng33 <4florafeng@gmail.com>
|
2026-05-29 19:23:56 +00:00 |
|
Kevin H. Luu
|
6aabe221a5
|
[CI] Make Model Executor test hangs fail fast with a traceback (#43971)
Signed-off-by: khluu <khluu000@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-05-29 11:58:25 -07:00 |
|
Ilya Markov
|
4aaba00f92
|
[EPLB] Make async EPLB default (#43219)
Signed-off-by: Markov Ilya <markovilya19@gmail.com>
Co-authored-by: Markov Ilya <markovilya19@gmail.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
|
2026-05-29 18:07:16 +00:00 |
|
Tianmu Li
|
94d3f4d205
|
[CPU Backend] CPU top-k and top-p sampling kernels using Triton (#43633)
Signed-off-by: Li, Tianmu <tianmu.li@intel.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2026-05-29 15:02:39 +08:00 |
|
Kevin H. Luu
|
648c3ebee6
|
[CI] Separate non-root smoke tests from image build step (#43712)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-05-28 23:34:16 -07:00 |
|
yzong-rh
|
325a1ec4fb
|
[CI] Enable prefix caching in BFCL benchmark (#43925)
Signed-off-by: Yifan Zong <yzong@redhat.com>
|
2026-05-28 23:36:31 +00:00 |
|
Michael Goin
|
03f03f9630
|
Refactor output filename handling in ci-fetch-log.sh (#43901)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
|
2026-05-28 14:20:12 -07:00 |
|
Micah Williamson
|
1b5437cec8
|
[ROCm] Bump ROCm to 7.2.3 (#43136)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
|
2026-05-28 09:42:43 -07:00 |
|
Li, Jiang
|
20d69d100a
|
[CPU] Migrate cpu_awq into awq_marlin (#43841)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-05-28 22:36:31 +08:00 |
|
Andreas Karatzas
|
a9bc0ad8e4
|
[ROCm][CI] Move workload from MI300 to MI325 (#43824)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-05-28 03:31:29 -07:00 |
|
Andreas Karatzas
|
33e94fc3ad
|
[ROCm][CI] Stabilize Cargo cache and pre-test image checks (#43815)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-05-28 11:24:44 +08:00 |
|
Harry Mellor
|
2616f67faa
|
Remove Transformers forward/backward compatibility tests (#43785)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2026-05-27 12:46:36 -07:00 |
|
Luciano Martins
|
dede691c95
|
[Bugfix] Split attention groups by num_heads_q for spec-decode drafts (#43543)
Signed-off-by: Luciano Martins <lucianommartins@users.noreply.github.com>
Co-authored-by: Luciano Martins <lucianommartins@users.noreply.github.com>
|
2026-05-27 00:11:01 +00:00 |
|
Kevin H. Luu
|
e19b9b1045
|
[ci] Add arm64 ci image (#41303)
Signed-off-by: khluu <khluu000@gmail.com>
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-05-26 14:38:09 -07:00 |
|
Kevin H. Luu
|
49b4882779
|
[CI] Soft-fail AMD entrypoints mirror tests (#43709)
Signed-off-by: Kevin Luu <kevin@inferact.ai>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-05-26 13:08:48 -07:00 |
|
Yongye Zhu
|
6ab6ffb428
|
[Feat][DSV4] Fuse q pad into deepseek v4 fused kernel (#43162)
|
2026-05-26 05:12:54 -10:00 |
|
Andreas Karatzas
|
445ded18c1
|
[ROCm][CI] Extend ROCm quick reduce coverage (#40990)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-05-26 21:57:13 +08:00 |
|
Nguyễn Thế Duy
|
3df1c7c43e
|
[Docker] Non-root support for vllm-openai; add opt-in vllm-openai-nonroot target (#40275)
Signed-off-by: TheDuyIT <nduy250299@gmail.com>
Signed-off-by: dtnguyen <dtnguyen@nvidia.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-05-25 13:45:31 +08:00 |
|
Andreas Karatzas
|
2a7d5b7324
|
[ROCm][CI] Remove benchmarks test group and shard long test groups (#41669)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-05-23 23:31:46 +08:00 |
|
Jakub Zakrzewski
|
5bb8d2767a
|
[Kernel] Batch invariant NVFP4 linear using cutlass (#39912)
Signed-off-by: Jakub Zakrzewski <jzakrzewski@nvidia.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Yongye Zhu <zyy1102000@gmail.com>
|
2026-05-23 09:41:12 -04:00 |
|
sychen52
|
fb21d8b4f9
|
Add NVFP4 MOE support for Deepseek V4. (#42209)
Signed-off-by: Shiyang Chen <shiychen@nvidia.com>
|
2026-05-22 07:21:51 -07:00 |
|
Li, Jiang
|
65b7a812a2
|
[CPU] Experimentally enable Triton and MRV2 (#43225)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-05-22 01:48:17 -07:00 |
|
Bugen Zhao
|
39910f2b25
|
[Rust Frontend] Move code from vllm-frontend-rs (#43283)
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Eric Curtin <eric.curtin@docker.com>
Signed-off-by: Dev-X25874 <283057883+Dev-X25874@users.noreply.github.com>
Signed-off-by: Will.hou <1205157517@qq.com>
Signed-off-by: Will.hou <willamhou@ceresman.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Co-authored-by: Eric Curtin <eric.curtin@docker.com>
Co-authored-by: Dev-X25874 <283057883+Dev-X25874@users.noreply.github.com>
Co-authored-by: Will.hou <1205157517@qq.com>
Co-authored-by: Will.hou <willamhou@ceresman.com>
Please see https://github.com/Inferact/vllm-frontend-rs for full original commit history.
|
2026-05-21 17:21:48 -07:00 |
|
xiangdong
|
5ecd8e9c70
|
[XPU][CI] Fix Docker image pull-to-run race in Intel GPU CI (#43266)
Signed-off-by: zengxian <xiangdong.zeng@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-05-21 10:41:38 +00:00 |
|
Nick Hill
|
f2ace1d57d
|
[Frontend][RFC] Rust front-end integration (#40848)
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Co-authored-by: Bugen Zhao <i@bugenzhao.com>
|
2026-05-21 12:24:48 +08:00 |
|
Louie Tsai
|
5d041cc1fe
|
update GPU json file based on h200 recipes (#43262)
Signed-off-by: louie-tsai <louie.tsai@intel.com>
|
2026-05-21 03:57:48 +00:00 |
|
xiangdong
|
6f21558da1
|
[XPU][CI] Add 2 server model test files in Intel GPU CI (#42499)
Signed-off-by: zengxian <xiangdong.zeng@intel.com>
|
2026-05-20 16:54:58 +08:00 |
|
Kevin H. Luu
|
85959567c3
|
[ci] Revert model executor test back to L4 (#43188)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
|
2026-05-19 23:01:41 -07:00 |
|
Kevin H. Luu
|
a65093c1a3
|
[ci] Move language models tests (hybrid) back to L4 (#43129)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
|
2026-05-19 11:51:34 -07:00 |
|
zhanqiuhu
|
129019f334
|
[CI] Add MTP + PD disagg test for Qwen3.5 (#42677)
Signed-off-by: ZhanqiuHu <zhu@redhat.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
|
2026-05-19 11:44:33 +02:00 |
|
wang.yuqi
|
301d986473
|
[Frontend] Consolidate beam search by BeamSearchMixin. (#42946)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-05-19 07:37:40 +00:00 |
|
Kevin H. Luu
|
6e889b582b
|
[ci] Route 28 gpu_1_queue tests to h200_35gb queue (#43030)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-05-18 21:58:36 -07:00 |
|
Kunshang Ji
|
36dcaf25d8
|
[XPU] add gptq(int4) support (#37844)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-05-19 11:17:09 +08:00 |
|
xiangdong
|
2e40faf08b
|
[XPU][CI] Temporarily skip test_moe_lora_align_block_size_mixed_base_and_lora[1] in Intel GPU CI (#42954)
Signed-off-by: zengxian <xiangdong.zeng@intel.com>
|
2026-05-18 20:34:48 +08:00 |
|
Yuwen Zhou
|
88a860d754
|
[CPU] Add MXFP4 W4A16 MoE support (#41922)
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Signed-off-by: Yuwen Zhou <yuwen.zhou@intel.com>
|
2026-05-18 03:04:45 -07:00 |
|
wenjun liu
|
c38bed4248
|
delete xpu ci (#42582)
Signed-off-by: wenjun.liu <wenjun.liu@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2026-05-18 16:36:45 +08:00 |
|
Jiangyun Zhu
|
8a56da3845
|
[Experimental] Breakable CUDA graph (#42304)
Signed-off-by: zjy0516 <riverclouds.zhu@qq.com>
|
2026-05-16 22:04:12 +08:00 |
|