TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Robin Kobus 3ee4332fb1 refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078)	2025-03-28 14:50:52 +08:00

- Updated the `forwardAsync` method in `GptDecoderBatched` and `iGptDecoderBatched` to return `CudaEvent` instead of `DecoderFinishedEventPtr`, simplifying event handling.
- Removed the `DecoderFinishedEvent` class and its associated usage across various files, streamlining the codebase.
- Adjusted related methods and Python bindings to accommodate the new event structure, ensuring compatibility and maintaining functionality.

These changes enhance the clarity and efficiency of the decoding process in the batch manager.

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

batch_manager	refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078)	2025-03-28 14:50:52 +08:00
common	fix: fix for cp > kvHeadNum (#3002)	2025-03-26 12:39:02 +08:00
cutlass_extensions/include/cutlass_extensions	feat: Update cutlass (#2981)	2025-03-26 22:36:27 +08:00
executor	feat: Add BW measurement (#3070)	2025-03-28 10:53:00 +08:00
executor_worker	Update TensorRT-LLM (#2792)	2025-02-18 21:27:39 +08:00
kernels	refactor: Improve decoder finalize function (#3077)	2025-03-28 14:33:59 +08:00
layers	v1.2 (#3082)	2025-03-26 23:31:29 +08:00
plugins	fix: fix for cp > kvHeadNum (#3002)	2025-03-26 12:39:02 +08:00
pybind	refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078)	2025-03-28 14:50:52 +08:00
runtime	refactor: Replace DecoderFinishedEvent with CudaEvent in decoder classes (#3078)	2025-03-28 14:50:52 +08:00
thop	v1.2 (#3082)	2025-03-26 23:31:29 +08:00
CMakeLists.txt	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00