TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Robin Kobus 45134d7095 refactor: Improve decoder finalize function (#3077)		2025-03-28 14:33:59 +08:00

* refactor: Update gatherTree function to accept CUDA stream parameter

  This commit modifies the gatherTree function signature to include a runtime::CudaStream parameter, enhancing flexibility in stream management. Additionally, it removes unnecessary buffer manager parameters and stream handling from the function, streamlining the code. The finalize method in GptDecoderBatched is also updated to reflect these changes, improving clarity and maintainability in the decoding process.

  Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* refactor: Update GptDecoderBatched finalize

  This commit refactors the GptDecoderBatched class to improve method signatures and reduce code complexity:
  - Modified finalize method to accept DecoderState as a parameter
  - Updated method signatures to work with the new DecoderState approach
  - Improved code organization and readability

  The changes continue the ongoing refactoring to centralize decoder state management and simplify the decoder implementation.

  Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

---------

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
allReduce	Update (#2978)	2025-03-23 16:39:35 +08:00
cudaCoreGemm	Update TensorRT-LLM (#2755)	2025-02-11 03:01:00 +00:00
fused_gated_gemm	Update TensorRT-LLM (#2755)	2025-02-11 03:01:00 +00:00
sampling	Update TensorRT-LLM (#2873)	2025-03-11 21:13:42 +08:00
smoothQuant	Update TensorRT-LLM (#2755)	2025-02-11 03:01:00 +00:00
weightOnly	Update TensorRT-LLM (#2755)	2025-02-11 03:01:00 +00:00
banRepeatNGramsKernelsTest.cpp	Update TensorRT-LLM (#2755)	2025-02-11 03:01:00 +00:00
CMakeLists.txt	Update (#2978)	2025-03-23 16:39:35 +08:00
decodingKernelTest.cpp	refactor: Improve decoder finalize function (#3077)	2025-03-28 14:33:59 +08:00
logitsBitmaskTest.cpp	Update TensorRT-LLM (#2755)	2025-02-11 03:01:00 +00:00
mixtureOfExpertsTest.cu	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
ropeTest.cu	Update TensorRT-LLM (#2873)	2025-03-11 21:13:42 +08:00
shiftKCacheKernelTest.cu	Update TensorRT-LLM (#2755)	2025-02-11 03:01:00 +00:00
stopCriteriaKernelsTest.cpp	Update TensorRT-LLM (#2755)	2025-02-11 03:01:00 +00:00