* add space in test output
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* perf: reduce executor lock scope
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* refactor: Move TokenRangeRetentionConfig implementation to cpp file
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* fix: Improve finished steps handling for external draft tokens
- Fixed a bug where the whole finished steps tensor was being zeroes instead of the slices.
- Replaced the creation of a temporary tensor for finished steps with a direct slice from the input tensor, improving efficiency and readability.
- Updated the tensor management logic to streamline the process of setting zero values for finished steps during batch processing.
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* chore: Clean up includes
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
---------
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* Reapply "refactor: Replace DecoderFinishedEvent with CudaEvent in decoder clas…" (#3183)
This reverts commit 75495730bc.
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* fixup! Reapply "refactor: Replace DecoderFinishedEvent with CudaEvent in decoder clas…" (#3183)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
---------
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
- Updated the `forwardAsync` method in `GptDecoderBatched` and `iGptDecoderBatched` to return `CudaEvent` instead of `DecoderFinishedEventPtr`, simplifying event handling.
- Removed the `DecoderFinishedEvent` class and its associated usage across various files, streamlining the codebase.
- Adjusted related methods and Python bindings to accommodate the new event structure, ensuring compatibility and maintaining functionality.
These changes enhance the clarity and efficiency of the decoding process in the batch manager.
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>