llama.cpp/src
Piotr Wilkin (ilintar) a5251ca11d Optimization: Qwen3 next autoregressive pass (#17996)
* It's Qwen3 Next, the lean mean token generation machine!

* Apply patches from thread

* Remove recurrent version, only keep chunked and autoregressive

* Remove unnecessary conts and asserts

* Remove more extra conts and asserts

* Cleanup masking
2025-12-16 11:59:53 +01:00