TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Robin Kobus e12e7a753d	2025-04-05 07:42:35 +08:00
refactor: Expose DecoderState via bindings and integrate in TRTLLMDecoder (#3139)

* refactor: Expose DecoderState via bindings and integrate in TRTLLMDecoder
  - Introduced a new `DecoderState` class in the C++ bindings, encapsulating key functionalities for managing decoding state.
  - Adjusted the Python `TRTLLMDecoder` to access properties from `decoder_state`, ensuring consistency and clarity in the decoding process.
  These changes streamline the decoder's architecture and enhance maintainability.
* chore: Remove unused new_tokens from DecoderState bindings

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

batch_manager	feat: Support PeftCacheManager in Torch (#3186)	2025-04-04 12:38:08 +08:00
common	Update TensorRT-LLM (#2820)	2025-02-25 21:21:49 +08:00
executor	feat: Add BW measurement (#3070)	2025-03-28 10:53:00 +08:00
runtime	refactor: Expose DecoderState via bindings and integrate in TRTLLMDecoder (#3139)	2025-04-05 07:42:35 +08:00
userbuffers	Update TensorRT-LLM (#2783)	2025-02-13 18:40:22 +08:00
bindings.cpp	feat: Support PeftCacheManager in Torch (#3186)	2025-04-04 12:38:08 +08:00
CMakeLists.txt	Update TensorRT-LLM (#2849)	2025-03-04 18:44:00 +08:00