TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-22 19:52:38 +08:00


Robin Kobus e12e7a753d refactor: Expose DecoderState via bindings and integrate in TRTLLMDecoder (#3139)	2025-04-05 07:42:35 +08:00

* refactor: Expose DecoderState via bindings and integrate in TRTLLMDecoder
  - Introduced a new `DecoderState` class in the C++ bindings, encapsulating key functionalities for managing decoding state.
  - Adjusted the Python `TRTLLMDecoder` to access properties from `decoder_state`, ensuring consistency and clarity in the decoding process.
  These changes streamline the decoder's architecture and enhance maintainability.
  Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
* chore: Remove unused new_tokens from DecoderState bindings
  Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
---------
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
batch_manager	Reapply "refactor: Replace DecoderFinishedEvent with CudaEvent in decoder clas…" (#3183) (#3195)	2025-04-04 15:56:28 +02:00
common	feat: no-cache attention in PyTorch workflow (#3085)	2025-04-05 01:54:32 +08:00
cutlass_extensions/include/cutlass_extensions	feat: Update cutlass (#2981)	2025-03-26 22:36:27 +08:00
executor	feat: Variable-Beam-Width-Search (VBWS) Part2 (#3133)	2025-04-02 12:31:28 +08:00
executor_worker	Update TensorRT-LLM (#2792)	2025-02-18 21:27:39 +08:00
kernels	feat: no-cache attention in PyTorch workflow (#3085)	2025-04-05 01:54:32 +08:00
layers	feat: Variable-Beam-Width-Search (VBWS) Part2 (#3133)	2025-04-02 12:31:28 +08:00
plugins	update FP4 quantize layout (#3045)	2025-04-03 13:13:54 -04:00
pybind	refactor: Expose DecoderState via bindings and integrate in TRTLLMDecoder (#3139)	2025-04-05 07:42:35 +08:00
runtime	Reapply "refactor: Replace DecoderFinishedEvent with CudaEvent in decoder clas…" (#3183) (#3195)	2025-04-04 15:56:28 +02:00
thop	feat: no-cache attention in PyTorch workflow (#3085)	2025-04-05 01:54:32 +08:00
CMakeLists.txt	[feat] open source fp8_blockscale_gemm (#3071)	2025-04-02 12:12:52 +08:00