Aman Gupta 8bece2eb20 CUDA: use mmvq for mul-mat-id for small batch sizes (#18958)
* CUDA: use mmvq for mul-mat-id for small batch sizes

* add mmvq too

* Fix perf issue on Ampere. Use mmvf mm-id only for non-NVIDIA GPUs

* templatize multi_token_path
2026-02-03 23:31:23 +08:00