obscura/vllm
Mirror of https://github.com/vllm-project/vllm.git (synced 2026-06-06 00:16:14 +00:00)
Branch: main
Path: vllm/csrc/quantization
Chris Leonard  56aff0dd15  [10/n] Migrate cuda_view and silu_and_mul_per_block_quant kernels to torch stable ABI. (#44334)  2026-06-04 20:14:43 -07:00
..
machete  [NVIDIA] Bugfix NVFP4 DGX Spark and RTX50 (#38423)  2026-03-30 09:36:18 -07:00
marlin  [Kernel] Marlin MoE: include SM 12.x in default arch list (#40923)  2026-05-28 15:30:26 +08:00
w8a8  [9/n] Migrate attention and cache kernels to torch stable ABI (continued) (#43717)  2026-05-29 04:44:45 +00:00
activation_kernels.cu  [Bugfix]fix output Nan/Inf in marlin if dtype=float16 (#33972)  2026-03-27 16:36:08 -07:00
utils.cuh  [6/n] Migrate activation kernels, gptq, gguf, non cutlass w8a8 to libtorch stable ABI (continued) (#42663)  2026-05-20 00:18:12 -07:00