Files
llama.cpp/src/models
Aman Gupta 12e5d99078 mtp: use inp_out_ids for skipping logit computation (#23433)
When doing a follow-up decode for the draft model, the logit computation always ran even though it is not required.
2026-05-21 15:23:14 +08:00
..
2026-05-19 15:32:58 +03:00