vllm/csrc/quantization at aa6fb8a329bf092a3fd29cfb399aa22feed7f5a1

obscura/vllm

mirror of https://github.com/vllm-project/vllm.git synced 2026-06-06 00:16:14 +00:00

Chris Leonard 56aff0dd15 [10/n] Migrate cuda_view and silu_and_mul_per_block_quant kernels to torch stable ABI. (#44334)

2026-06-04 20:14:43 -07:00

[NVIDIA] Bugfix NVFP4 DGX Spark and RTX50 (#38423)

2026-03-30 09:36:18 -07:00

[Kernel] Marlin MoE: include SM 12.x in default arch list (#40923)

2026-05-28 15:30:26 +08:00

[9/n] Migrate attention and cache kernels to torch stable ABI (continued) (#43717)

2026-05-29 04:44:45 +00:00

activation_kernels.cu

[Bugfix] Fix output NaN/Inf in marlin if dtype=float16 (#33972)

2026-03-27 16:36:08 -07:00

utils.cuh

[6/n] Migrate activation kernels, gptq, gguf, non cutlass w8a8 to libtorch stable ABI (continued) (#42663)

2026-05-20 00:18:12 -07:00