TensorRT-LLMs/cpp/tensorrt_llm/batch_manager
NVShreyas 6c1862fb33
[TRTLLM-10197][chore] Refactor to setup for RNN cache transceiver (#10957)
Signed-off-by: Shreyas Misra <shreyasm@nvidia.com>
2026-01-27 12:23:02 -08:00
utils [TRTLLM-6106][feat] Add support for KVCache transfer from KVCache reuse path (#6348) 2025-09-27 19:29:30 -04:00
allocateKvCache.cpp [https://nvbugs/5627710][fix] Fix synchronization bugs in KvCacheTransferManager that can cause corrupted blocks (#9056) 2025-12-02 09:10:21 -06:00
assignReqSeqSlots.cpp [https://nvbugs/5394392][fix] Enlarge scheduler capacity under disagg bs == 1 (#6537) 2025-08-15 09:52:06 -07:00
baseTransBuffer.cpp [TRTLLM-10197][chore] Refactor to setup for RNN cache transceiver (#10957) 2026-01-27 12:23:02 -08:00
baseTransBuffer.h [TRTLLM-10197][chore] Refactor to setup for RNN cache transceiver (#10957) 2026-01-27 12:23:02 -08:00
cacheFormatter.cpp [TRTLLM-9465][fix] Swap TP-CP grouping order (#10350) 2026-01-05 20:08:03 +08:00
cacheFormatter.h [TRTLLM-8540][feat] Add support for disagg in DSv3.2 (#8735) 2025-11-12 08:21:11 -08:00
cacheTransBuffer.cpp [TRTLLM-10197][chore] Refactor to setup for RNN cache transceiver (#10957) 2026-01-27 12:23:02 -08:00
cacheTransBuffer.h [TRTLLM-10197][chore] Refactor to setup for RNN cache transceiver (#10957) 2026-01-27 12:23:02 -08:00
cacheTransceiver.cpp [None][chore] Async Transfer Manager (#9891) 2026-01-20 12:12:47 -05:00
capacityScheduler.cpp [https://nvbugs/5677746][fix] Use first PP rank's schedule result in other PP ranks to fix PP hang (#9659) 2025-12-08 18:43:52 -08:00
CMakeLists.txt [TRTLLM-10197][chore] Refactor to setup for RNN cache transceiver (#10957) 2026-01-27 12:23:02 -08:00
contextProgress.cpp
createNewDecoderRequests.cpp [None] [refactor] Minor cleanup and improvements (#7619) 2025-10-03 11:40:06 +02:00
dataTransceiver.cpp [https://nvbugs/5702786][fix] Fix race conditions in KV cache communication during unexpected termination (#10076) 2025-12-23 14:09:51 +02:00
dataTransceiver.h [None][feat] add detailed KV cache transfer time breakdown (#8521) 2025-10-29 10:11:09 +08:00
decoderBuffers.cpp refactor: Enhanced handling of decoder requests and logits within the batch manager (#6055) 2025-07-18 12:12:08 +02:00
encoderBuffers.cpp
encoderBuffers.h
evictionPolicy.cpp [TLLM-6777][feature] Support SWA KV cache reuse OOW block detach (#7922) 2025-10-13 09:18:12 -07:00
guidedDecoder.cpp [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00
handleContextLogits.cpp [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00
handleGenerationLogits.cpp [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00
kvCacheEventManager.cpp [TRTLLM-9601][feat] Expose mmKeys for multimodal to integrate with dynamo. (#9604) 2025-12-15 08:42:30 +08:00
kvCacheManager.cpp [None][fix] Bugfix/mtp with async scheduler (#10941) 2026-01-24 07:19:54 -05:00
kvCacheManagerV2Utils.cpp [TRTLLM-7738][feat] Adding implementation of KVCacheManagerV2 (#10736) 2026-01-24 04:48:39 -05:00
kvCacheManagerV2Utils.cu [TRTLLM-7738][feat] Adding implementation of KVCacheManagerV2 (#10736) 2026-01-24 04:48:39 -05:00
kvCacheManagerV2Utils.h [TRTLLM-7738][feat] Adding implementation of KVCacheManagerV2 (#10736) 2026-01-24 04:48:39 -05:00
kvCacheTransferManager.cpp [None][feat] update trtllm-gen nvfp4 kernels with better performance (#9510) 2025-12-03 21:35:49 +08:00
llmRequest.cpp [TRTLLM-9527][feat] change context params and disagg params (step3) (#10495) 2026-01-27 16:34:17 +08:00
logitsPostProcessor.cpp [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00
loraBuffers.cpp
loraBuffers.h
makeDecodingBatchInputOutput.cpp [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00
medusaBuffers.cpp
microBatchScheduler.cpp
mlaCacheFormatter.cpp [TRTLLM-9465][fix] Swap TP-CP grouping order (#10350) 2026-01-05 20:08:03 +08:00
mlaCacheFormatter.h [TRTLLM-8540][feat] Add support for disagg in DSv3.2 (#8735) 2025-11-12 08:21:11 -08:00
pauseRequests.cpp [TRTLLM-909][feat] Overlap context chunks in pipeline parallel mode (#9308) 2025-11-25 22:11:51 +01:00
peftCacheManager.cpp [https://nvbugs/5322131][feat] Multi-LoRA serving with CUDA Graph (#8279) 2026-01-22 14:01:18 +01:00
promptTuningBuffers.cpp
rnnStateBuffers.cpp
rnnStateBuffers.h
rnnStateManager.cpp
runtimeBuffers.cpp
scheduledBlocksManager.h
sequenceSlotManager.cpp
transformerBuffers.cpp
trtEncoderModel.cpp
trtEncoderModel.h
trtGptModel.h
trtGptModelFactory.h
trtGptModelInflightBatching.cpp [None][chore] Async Transfer Manager (#9891) 2026-01-20 12:12:47 -05:00
trtGptModelInflightBatching.h [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00
updateDecoderBuffers.cpp