mirror of
https://github.com/ggml-org/llama.cpp.git
synced 2026-06-30 16:20:20 +00:00
15d2b46b4d
Store the last computed graph and reuse it when possible. Also, do not return a response from GRAPH_COMPUTE; assume it always completes successfully. If it does not, the server closes the connection. This saves a network round trip to the server.
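
The client side of this change can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `GraphClient`, `FakeTransport`, `Message`, and the `Command::GraphRecompute` name are all hypothetical, and the graph is modeled as an opaque byte buffer. It assumes the server keeps its own copy of the last graph it received, so a short "recompute" command is enough when the graph has not changed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical command set; these are NOT the actual RPC command names.
enum class Command { GraphCompute, GraphRecompute };

struct Message {
    Command cmd;
    std::vector<uint8_t> payload; // serialized graph; empty for GraphRecompute
};

// Stand-in for the socket: records what would have been written to the wire.
struct FakeTransport {
    std::vector<Message> sent;
    std::size_t bytes_sent = 0;

    void send(const Message & m) {
        sent.push_back(m);
        bytes_sent += 1 + m.payload.size(); // 1 byte for the command id (assumed)
    }
};

class GraphClient {
public:
    explicit GraphClient(FakeTransport & t) : transport_(t) {}

    // Fire-and-forget: nothing is read back after sending. A failure on the
    // server would surface later as a closed connection (not modeled here).
    void compute(const std::vector<uint8_t> & graph) {
        assert(!graph.empty());
        if (has_last_ && graph == last_graph_) {
            // Same graph as last time: ask the server to reuse its stored copy.
            transport_.send({Command::GraphRecompute, {}});
            return;
        }
        // New or changed graph: remember it and send it in full.
        last_graph_ = graph;
        has_last_   = true;
        transport_.send({Command::GraphCompute, graph});
    }

private:
    FakeTransport &      transport_;
    std::vector<uint8_t> last_graph_;
    bool                 has_last_ = false;
};
```

In this sketch, calling `compute` twice with the same buffer sends the full graph once and then only a one-byte command, and neither call waits for a reply. The trade-off is that the caller learns about a failed computation only when a later operation finds the connection closed.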