* feat: Implement synchronous request termination in batch manager
- Added `terminateRequestSync` method to `TrtEncoderModel` and `TrtGptModelInflightBatching` for handling request termination in the next `forwardSync` call.
- Updated existing request termination logic to utilize the new synchronous method, ensuring generated tokens are cleared appropriately.
- Enhanced logging for clarity in token management during request processing.
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* fixup! feat: Implement synchronous request termination in batch manager
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* fix: MockedModelCancelRequest
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* fixup! feat: Implement synchronous request termination in batch manager
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* fix: terminate with timeout
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* fixup! feat: Implement synchronous request termination in batch manager
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* docs: Update doc string for allottedTimeMs
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
---------
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>