Mirror of https://github.com/vllm-project/vllm.git (synced 2026-06-06 00:16:14 +00:00)
0f0ef3417b
With K/V packed into a single contiguous region per block, the NIXL and
Mooncake transfer paths register one region per layer and coalesce block
transfers instead of emitting separate K/V halves.

Update the unit tests to match: detect the 4D blocks-first layout, expect
one entry per tensor, and expect coalesced (non-split) block transfers.

Co-authored-by: Claude
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
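
For illustration only, the difference between split and coalesced block transfers can be sketched as descriptor arithmetic. This is not the NIXL or Mooncake connector code: the function names, the `(offset, length)` descriptor tuple, and the byte sizes are hypothetical, and the sketch assumes the old layout stored all K halves followed by all V halves, while the packed layout stores each block's K and V back to back.

```python
from typing import List, Tuple

# Hypothetical descriptor: (byte offset into the layer's region, byte length).
Desc = Tuple[int, int]


def split_kv_descs(base: int, num_blocks: int, half_bytes: int,
                   block_ids: List[int]) -> List[Desc]:
    """K/V-first layout (assumed): all K halves, then all V halves.

    Each block needs two descriptors, one per half.
    """
    k_base = base
    v_base = base + num_blocks * half_bytes
    descs: List[Desc] = []
    for b in block_ids:
        descs.append((k_base + b * half_bytes, half_bytes))  # K half
        descs.append((v_base + b * half_bytes, half_bytes))  # V half
    return descs


def packed_kv_descs(base: int, half_bytes: int,
                    block_ids: List[int]) -> List[Desc]:
    """Blocks-first layout (assumed): K and V of a block are contiguous.

    Each block needs a single descriptor covering both halves.
    """
    block_bytes = 2 * half_bytes
    return [(base + b * block_bytes, block_bytes) for b in block_ids]


# Two blocks out of four, 100 bytes per K or V half (made-up numbers).
split = split_kv_descs(base=0, num_blocks=4, half_bytes=100, block_ids=[1, 3])
packed = packed_kv_descs(base=0, half_bytes=100, block_ids=[1, 3])
```

Under these assumptions both variants move the same number of bytes, but the packed layout needs half as many descriptors, which is the kind of change the updated tests check for.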