TensorRT-LLMs/cpp/tensorrt_llm
Robin Kobus 3ee4332fb1
refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078)
- Updated the `forwardAsync` method in `GptDecoderBatched` and `iGptDecoderBatched` to return `CudaEvent` instead of `DecoderFinishedEventPtr`, simplifying event handling.
- Removed the `DecoderFinishedEvent` class and its associated usage across various files, streamlining the codebase.
- Adjusted related methods and Python bindings to accommodate the new event structure, ensuring compatibility and maintaining functionality.

These changes simplify event handling in the batch manager's decoding path.

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-03-28 14:50:52 +08:00
| Name | Last commit | Date |
| --- | --- | --- |
| batch_manager | refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078) | 2025-03-28 14:50:52 +08:00 |
| common | fix: fix for cp > kvHeadNum (#3002) | 2025-03-26 12:39:02 +08:00 |
| cutlass_extensions/include/cutlass_extensions | feat: Update cutlass (#2981) | 2025-03-26 22:36:27 +08:00 |
| executor | feat: Add BW measurement (#3070) | 2025-03-28 10:53:00 +08:00 |
| executor_worker | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| kernels | refactor: Improve decoder finalize function (#3077) | 2025-03-28 14:33:59 +08:00 |
| layers | v1.2 (#3082) | 2025-03-26 23:31:29 +08:00 |
| plugins | fix: fix for cp > kvHeadNum (#3002) | 2025-03-26 12:39:02 +08:00 |
| pybind | refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078) | 2025-03-28 14:50:52 +08:00 |
| runtime | refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078) | 2025-03-28 14:50:52 +08:00 |
| thop | v1.2 (#3082) | 2025-03-26 23:31:29 +08:00 |
| CMakeLists.txt | Update TensorRT-LLM (#2936) | 2025-03-18 21:25:19 +08:00 |