obscura/vllm
mirror of https://github.com/vllm-project/vllm.git synced 2026-06-06 00:16:14 +00:00
aa6fb8a329bf092a3fd29cfb399aa22feed7f5a1
vllm/csrc/quantization
Chris Leonard 56aff0dd15 [10/n] Migrate cuda_view and silu_and_mul_per_block_quant kernels to torch stable ABI. (#44334)
2026-06-04 20:14:43 -07:00
machete
[NVIDIA] Bugfix NVFP4 DGX Spark and RTX50 (#38423)
2026-03-30 09:36:18 -07:00
marlin
[Kernel] Marlin MoE: include SM 12.x in default arch list (#40923)
2026-05-28 15:30:26 +08:00
w8a8
[9/n] Migrate attention and cache kernels to torch stable ABI (continued) (#43717)
2026-05-29 04:44:45 +00:00
activation_kernels.cu
[Bugfix]fix output Nan/Inf in marlin if dtype=float16 (#33972)
2026-03-27 16:36:08 -07:00
utils.cuh
[6/n] Migrate activation kernels, gptq, gguf, non cutlass w8a8 to libtorch stable ABI (continued) (#42663)
2026-05-20 00:18:12 -07:00