TensorRT-LLMs/cpp/tensorrt_llm/runtime
Robin Kobus 3ee4332fb1
refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078)
- Updated the `forwardAsync` method in `GptDecoderBatched` and `iGptDecoderBatched` to return `CudaEvent` instead of `DecoderFinishedEventPtr`, simplifying event handling.
- Removed the `DecoderFinishedEvent` class and its associated usage across various files, streamlining the codebase.
- Adjusted related methods and Python bindings to accommodate the new event structure, ensuring compatibility and maintaining functionality.

These changes enhance the clarity and efficiency of the decoding process in the batch manager.

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-03-28 14:50:52 +08:00
utils Update TensorRT-LLM (#2820) 2025-02-25 21:21:49 +08:00
bufferManager.cpp Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
bufferView.h Update TensorRT-LLM (#2783) 2025-02-13 18:40:22 +08:00
CMakeLists.txt Update (#2978) 2025-03-23 16:39:35 +08:00
cudaMemPool.cpp Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
cudaMemPool.h Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
decoderState.cpp Update (#2978) 2025-03-23 16:39:35 +08:00
decodingLayerWorkspace.cpp Update TensorRT-LLM (#2184) 2024-09-03 12:14:23 +02:00
decodingLayerWorkspace.h Update TensorRT-LLM (#2436) 2024-11-12 15:27:49 +08:00
decodingOutput.cpp Update (#2978) 2025-03-23 16:39:35 +08:00
eagleBuffers.cpp Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
explicitDraftTokensBuffers.cpp Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
explicitDraftTokensModule.h Update TensorRT-LLM (#1763) 2024-06-11 16:59:02 +08:00
generationConfig.cpp Update TensorRT-LLM (#2110) 2024-08-13 22:34:33 +08:00
generationConfig.h Update TensorRT-LLM (#2110) 2024-08-13 22:34:33 +08:00
gptDecoder.cpp Update TensorRT-LLM (#2820) 2025-02-25 21:21:49 +08:00
gptDecoderBatched.cpp refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078) 2025-03-28 14:50:52 +08:00
gptJsonConfig.cpp Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
gptSession.cpp refactor: Remove speculative decoding parameters from stateful decoders (#3024) 2025-03-26 20:16:26 +08:00
iBuffer.cpp Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
ipcNvlsMemory.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
ipcSocket.cpp Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
ipcSocket.h Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
ipcUtils.cpp Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
iTensor.cpp Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
jsonSerialization.h Update TensorRT-LLM (#2436) 2024-11-12 15:27:49 +08:00
layerProfiler.cpp Update TensorRT-LLM (#1554) 2024-05-07 23:34:28 +08:00
layerProfiler.h Update TensorRT-LLM (#1554) 2024-05-07 23:34:28 +08:00
lookaheadBuffers.cpp Update TensorRT-LLM (#2820) 2025-02-25 21:21:49 +08:00
loraCache.cpp Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
loraManager.cpp Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
loraManager.h Update TensorRT-LLM (#2413) 2024-11-05 16:27:06 +08:00
loraModule.cpp Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
loraUtils.cpp Update TensorRT-LLM (#2820) 2025-02-25 21:21:49 +08:00
loraUtils.h Update TensorRT-LLM (#2820) 2025-02-25 21:21:49 +08:00
memoryCounters.cpp Update TensorRT-LLM (#2110) 2024-08-13 22:34:33 +08:00
ncclCommunicator.cpp Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
ncclCommunicator.h Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
promptTuningParams.cpp Update TensorRT-LLM (#1598) 2024-05-14 16:43:41 +08:00
rnnStateBuffers.cpp open source 7f370deb0090d885d7518c2b146399ba3933c004 (#2273) 2024-09-30 13:51:19 +02:00
rnnStateBuffers.h Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
runtimeBuffers.cpp Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
runtimeBuffers.h Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
runtimeKernels.cu Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
runtimeKernels.h Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
statefulGptDecoder.cpp refactor: Improve decoder finalize function (#3077) 2025-03-28 14:33:59 +08:00
statefulGptDecoder.h refactor: Remove speculative decoding parameters from stateful decoders (#3024) 2025-03-26 20:16:26 +08:00
statefulGptDecoderBatched.cpp refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078) 2025-03-28 14:50:52 +08:00
tensorView.h Update TensorRT-LLM (#1793) 2024-06-18 18:18:23 +08:00
tllmBuffers.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
tllmBuffers.h Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
tllmLogger.cpp Update TensorRT-LLM (#787) 2024-01-02 17:54:32 +08:00
tllmRuntime.cpp Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
tllmRuntime.h Update TensorRT-LLM (#2783) 2025-02-13 18:40:22 +08:00
torch.h Update TensorRT-LLM (#2110) 2024-08-13 22:34:33 +08:00
torchUtils.h Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
torchView.h Update TensorRT-LLM (#1168) 2024-02-27 17:37:34 +08:00
transformerBuffers.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
transformerBuffers.h Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
workerPool.cpp Update TensorRT-LLM (#2156) 2024-08-27 18:20:59 +08:00
workerPool.h Update TensorRT-LLM (#2156) 2024-08-27 18:20:59 +08:00
worldConfig.cpp Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00