Kevin H. Luu
|
f653761252
|
[CI] Route part of B200 jobs to b200-k8s (#41453)
Signed-off-by: khluu <khluu000@gmail.com>
Co-authored-by: OpenAI Codex <noreply@openai.com>
|
2026-05-05 19:00:30 -07:00 |
|
Andreas Karatzas
|
4a8ae26e53
|
[ROCm][CI] Use vLLM generation defaults for DeepSeek prefetch-offload eval (#41575)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-05-06 01:08:12 +00:00 |
|
Kevin H. Luu
|
1333864408
|
[CI] Automate Docker Hub release image publishing (#40415)
Signed-off-by: khluu <khluu000@gmail.com>
|
2026-05-06 00:15:23 +00:00 |
|
Artem Perevedentsev
|
8b9ea2f881
|
[Feature] Add Triton kernel JIT compilation monitor for inference (#40137)
Signed-off-by: Artem Perevedentsev <aperevedents@nvidia.com>
Co-authored-by: Jiangyun Zhu <riverclouds.zhu@qq.com>
|
2026-05-05 14:08:57 +04:00 |
|
Gregory Shtrasberg
|
e724b0ea8d
|
[ROCm] ROCm7.2.2 + profiler fix + AITER 0.1.12.post2 (#41386)
Signed-off-by: Rohan138 <rohanpotdar138@gmail.com>
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Co-authored-by: Rohan138 <rohanpotdar138@gmail.com>
|
2026-05-04 13:07:19 -05:00 |
|
Michael Goin
|
4f7309fcc0
|
[CI] Add ci-fetch-log.sh helper for Buildkite job logs (#41517)
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-05-02 15:23:59 -07:00 |
|
Michael Goin
|
cfd2573f23
|
[Build] Switch CUDA 13.0 wheel builds to PyTorch manylinux_2_28 base (#41416)
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-05-02 05:51:28 -07:00 |
|
Michael Goin
|
3ccc1ff495
|
[Eval][CI] Add basic mrcr eval to tests/evals/ (#40164)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-05-01 12:00:38 -04:00 |
|
vllmellm
|
529c671e80
|
[ROCm][FEAT] AITER Fused Allreduce + RMSNorm (#37646)
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: Rita Brugarolas Brufau <rita.brugarolasbrufau@amd.com>
Signed-off-by: junkang1991 <junkangchow@gmail.com>
Co-authored-by: Rita Brugarolas <Rita.BrugarolasBrufau@amd.com>
Co-authored-by: junkang1991 <junkangchow@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
|
2026-05-01 23:07:18 +08:00 |
|
Stefano Castagnetta
|
92a7c121b6
|
[CI] Add MTP coverage: Qwen3.5 correctness + no-sync spec decode (#40472)
Signed-off-by: Stefano Castagnetta <scastagnetta@nvidia.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2026-04-30 12:24:09 -07:00 |
|
Chenxi Qian
|
54146a9bf9
|
[Bugfix] correct h matrix layout in chunk_kda output kernel (#40956)
Signed-off-by: ChenxiQian <chenxi.qian.cq@outlook.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2026-04-30 16:22:41 +08:00 |
|
Kevin H. Luu
|
0ab67c0222
|
[CI] Add key field to all test_areas pipeline steps (#41201)
Signed-off-by: khluu <khluu000@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-04-29 16:59:16 -07:00 |
|
Rishi Puri
|
ccfb620c62
|
Create tests/distributed/test_mnnvl_alltoall.py (#35241)
Signed-off-by: Rishi Puri <riship@nvidia.com>
Signed-off-by: Claude <claude@anthropic.com>
Signed-off-by: Stefano Castagnetta <scastagnetta@nvidia.com>
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Stefano Castagnetta <scastagnetta@nvidia.com>
|
2026-04-29 21:56:56 +00:00 |
|
yzong-rh
|
93da1fe97a
|
[CI] Add temperature to bfcl eval, default greedy (#41059)
Signed-off-by: Yifan Zong <yzong@redhat.com>
|
2026-04-29 14:01:57 -07:00 |
|
Artem Perevedentsev
|
b92ef9ec5a
|
[Perf] Enable FlashInfer top-k/top-p sampler by default (#40376)
Signed-off-by: Artem Perevedentsev <aperevedents@nvidia.com>
|
2026-04-29 19:10:34 +04:00 |
|
Alec
|
3f1a4bb639
|
build: embed image provenance metadata in vLLM containers (#40653)
Signed-off-by: Alec Flowers <aflowers@nvidia.com>
Co-authored-by: OpenAI Codex <codex@openai.com>
|
2026-04-29 03:07:41 -07:00 |
|
haosdent
|
ef70057ca7
|
[CI][CPU] Split CPU-Distributed Tests into per-scenario labels (#41203)
Signed-off-by: haosdent <haosdent@gmail.com>
|
2026-04-29 01:28:45 -07:00 |
|
Shengqi Chen
|
e48cb85185
|
[CI/Build] Auto-detect manylinux ABI tag for nightly wheels (#41149)
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Co-authored-by: Claude <noreply@anthropic.com>
|
2026-04-29 00:37:14 -07:00 |
|
wang.yuqi
|
a8208e6a81
|
[Examples] Resettle features examples. (#40995)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-04-28 00:33:41 -07:00 |
|
Kunshang Ji
|
407b34be26
|
[xpu] bump up vllm-xpu-kernel v0.1.7 (#41019)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-04-28 08:04:54 +08:00 |
|
wang.yuqi
|
8d8062d0a7
|
[Examples] Resettle generate examples. (#36464)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
|
2026-04-27 07:48:37 +00:00 |
|
ojhaanshika
|
592ae6805c
|
Cutlass W4A16 (Machete) Tests (#35450)
Signed-off-by: Anshika Ojha <anshikao@nvidia.com>
|
2026-04-27 05:15:29 +00:00 |
|
Dmitry Tokarev
|
6dec49f27e
|
[Build] Bump CUDA to 13.0.2 to match PyTorch 2.11.0 (#40669)
Signed-off-by: Dmitry Tokarev <dtokarev@nvidia.com>
|
2026-04-24 10:27:11 +00:00 |
|
Shanshan Shen
|
b5587e1013
|
[CI/Build] Add e2e test for ViT CUDA graph (#40780)
Signed-off-by: shen-shanshan <467638484@qq.com>
|
2026-04-24 18:12:14 +08:00 |
|
xiangdong
|
01acf96c6f
|
[XPU][CI] Fix Docker cleanup races on Intel CI runners (#40761)
Signed-off-by: zengxian <xiangdong.zeng@intel.com>
|
2026-04-24 14:08:45 +08:00 |
|
Nicolò Lucchesi
|
8824f50f1f
|
[CI] Split disaggregated tests into own test-area (#40623)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2026-04-23 23:20:12 +08:00 |
|
xiangdong
|
01cb41dcf5
|
[XPU][CI]Temporary disable 3 cases on Intel GPU in CI (#40683)
Signed-off-by: zengxian <xiangdong.zeng@intel.com>
|
2026-04-23 21:42:22 +08:00 |
|
Shengqi Chen
|
3ed5231c6a
|
[Build] Switch default CUDA to 13.0, update CUDA architecture lists, clean up stale build-args (#39878)
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-23 15:51:28 +08:00 |
|
Rishi Puri
|
9f39b380d0
|
[Bugfix] Fix spec decode test failures on Blackwell (SM100+) (#39546)
Signed-off-by: Stefano Castagnetta <scastagnetta@nvidia.com>
Signed-off-by: Rishi Puri <puririshi98@berkeley.edu>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Stefano Castagnetta <scastagnetta@nvidia.com>
Co-authored-by: Matthew Bonanni <mbonanni@redhat.com>
Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>
|
2026-04-21 18:21:19 +00:00 |
|
xiangdong
|
b2a5518679
|
[XPU][CI] Add misc, engine and lora cases on Intel GPU in CI (#39887)
Signed-off-by: zengxian <xiangdong.zeng@intel.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
|
2026-04-21 22:30:46 +08:00 |
|
Sage Moore
|
def8f52200
|
[CI][EPLB] Add Async EPLB end-to-end integration test to CI (#40168)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2026-04-20 10:22:54 -04:00 |
|
Andreas Karatzas
|
a943839e9a
|
[ROCm][CI] Introducing new MI300 nodes (#39531)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-04-20 16:09:58 +08:00 |
|
Kevin H. Luu
|
629d45eacb
|
[ci] Make ecr authenticate non blocking (#40305)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
|
2026-04-19 15:37:53 -07:00 |
|
Michael Goin
|
a8bffaa133
|
[Kernel] Add MXFP4 W4A4 CUTLASS MoE kernel for SM100 (#37463)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2026-04-17 16:42:32 -07:00 |
|
Ryan Rock
|
58da4ee047
|
[AMD][CI] Update DeepEP branch (#38396)
Signed-off-by: Ryan Rock <ryan.rock@amd.com>
|
2026-04-17 14:30:20 -05:00 |
|
Li, Jiang
|
d02421a7db
|
[CPU] Refactor CPU affinity and memory management (#39781)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-04-17 21:01:08 +08:00 |
|
Sumanth R Hegde
|
adf9bb3c57
|
[CI] Add weight transfer tests to CI (#39821)
Signed-off-by: SumanthRH <sumanthrh99@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
|
2026-04-16 15:51:45 -04:00 |
|
Li, Jiang
|
324a3d2bd8
|
[CI/Build] Improve stability of CPU tests (#39966)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2026-04-16 21:50:36 +08:00 |
|
Yanan Cao
|
edc3648966
|
[Kernel][Helion] Fix inductor fusion of Helion HOP (#39944)
Signed-off-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-16 04:41:26 -07:00 |
|
Fadi Arafeh
|
445b7093fd
|
[perf][cpu] Accelerate BF16 GELU with LUT impl on Arm CPUs (#37469)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2026-04-15 22:26:17 -07:00 |
|
Harry Mellor
|
03f8d3a548
|
Update to transformers v5 (#30566)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: khluu <khluu000@gmail.com>
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: khluu <khluu000@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: jiang1.li <jiang1.li@intel.com>
|
2026-04-15 16:29:15 -07:00 |
|
zhanqiuhu
|
0b790a2501
|
[Speculative Decoding] Add DFlash speculators config parsing (#38300)
Signed-off-by: Zhanqiu Hu <zhu@redhat.com>
|
2026-04-15 16:22:15 -04:00 |
|
Kevin H. Luu
|
102d51c9f3
|
[CI] Only build release Docker images when NIGHTLY=1 (#39882)
Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
2026-04-15 19:01:13 +00:00 |
|
Monishver
|
21e5a9f48e
|
Bug/test eagle dp v2 (#39838)
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
|
2026-04-15 17:48:12 +00:00 |
|
Vibhav Agarwal
|
f4b42df048
|
[Attention Backend] TurboQuant: 2-bit KV cache compression with 4x capacity (#38479)
Signed-off-by: vibhavagarwal5 <vibhavagarwal5@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Xinyu Chen <xinyu1.chen@intel.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
2026-04-14 19:57:13 -07:00 |
|
Andrey Talman
|
b569620f72
|
[CI] Add PyTorch nightly build and test pipeline (#37226)
Signed-off-by: atalman <atalman@fb.com>
|
2026-04-14 17:13:24 -07:00 |
|
Wentao Ye
|
2ad1029233
|
[Bug] Fix batch invariance nvfp4 support (#39820)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
|
2026-04-14 17:08:17 -04:00 |
|
bnellnm
|
e1e318af01
|
[MoE Refactor] Remove MoE DP chunking (#39107)
Signed-off-by: Bill Nell <bnell@redhat.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
|
2026-04-14 09:48:05 -04:00 |
|
Monishver
|
8213e8f880
|
Bug/test eagle dp v0 (#38938)
Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Nicolò Lucchesi <nlucches@redhat.com>
|
2026-04-13 20:50:08 +00:00 |
|
Andreas Karatzas
|
4e4ad41d11
|
[ROCm][CI] Removed stale tests and extended acceptance test (#39651)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
|
2026-04-13 10:40:26 +08:00 |
|