TensorRT-LLMs/cpp/include/tensorrt_llm/batch_manager
Yan Chunwei 0c26059703
chore: Cleanup deprecated APIs from LLM-API (part 1/2) (#3732)
* beam_width and max_new_token

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

* remove beam_width

* remove min_length

* remove return_num_sequences

2025-05-07 13:20:25 +08:00
allocateKvCache.h Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
assignReqSeqSlots.h Update TensorRT-LLM (#2436) 2024-11-12 15:27:49 +08:00
cacheTransceiver.h cacheTransceiver buffer manager (#3798) 2025-04-27 11:48:15 +08:00
capacityScheduler.h feat: allocate minimal blocks per window size (#3028) 2025-04-17 16:04:57 +08:00
common.h open source 4dbf696ae9b74a26829d120b67ab8443d70c8e58 (#2297) 2024-10-08 12:19:19 +02:00
contextProgress.h Update TensorRT-LLM (#2413) 2024-11-05 16:27:06 +08:00
createNewDecoderRequests.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
decoderBuffers.h refactor: Introduce MpiTag enumeration and update MPI function signatures (#3893) 2025-05-04 13:24:29 +02:00
evictionPolicy.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
generateRequestOptions.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
guidedDecoder.h Update TensorRT-LLM (#2532) 2024-12-04 21:16:56 +08:00
handleContextLogits.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
handleGenerationLogits.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
kvCacheConfig.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
kvCacheEventManager.h Update TensorRT-LLM (#2436) 2024-11-12 15:27:49 +08:00
kvCacheManager.h cacheTransceiver buffer manager (#3798) 2025-04-27 11:48:15 +08:00
kvCacheTransferManager.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
kvCacheUtils.h feat: allocate minimal blocks per window size (#3028) 2025-04-17 16:04:57 +08:00
llmRequest.h chore: Cleanup deprecated APIs from LLM-API (part 1/2) (#3732) 2025-05-07 13:20:25 +08:00
logitsPostProcessor.h Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
makeDecodingBatchInputOutput.h [TRTLLM-3429] feat: Overlap scheduling in C++ runtime (#3625) 2025-05-06 15:06:46 +02:00
medusaBuffers.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
microBatchScheduler.h [TRTLLM-3429] feat: Overlap scheduling in C++ runtime (#3625) 2025-05-06 15:06:46 +02:00
pauseRequests.h Update TensorRT-LLM (#2532) 2024-12-04 21:16:56 +08:00
peftCacheManager.h Update TensorRT-LLM (#2783) 2025-02-13 18:40:22 +08:00
peftCacheManagerConfig.h Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
promptTuningBuffers.h feat: Offloading Multimodal embedding table to CPU in Chunked Prefill Mode (#3380) 2025-04-21 14:31:01 +08:00
rnnStateManager.h Update TensorRT-LLM (#2413) 2024-11-05 16:27:06 +08:00
runtimeBuffers.h [TRTLLM-3429] feat: Overlap scheduling in C++ runtime (#3625) 2025-05-06 15:06:46 +02:00
sequenceSlotManager.h Update TensorRT-LLM (#2413) 2024-11-05 16:27:06 +08:00
transformerBuffers.h Feat: Variable-Beam-Width-Search (VBWS) part3 (#3338) 2025-04-08 23:51:27 +08:00
trtGptModelOptionalParams.h [TRTLLM-3429] feat: Overlap scheduling in C++ runtime (#3625) 2025-05-06 15:06:46 +02:00
updateDecoderBuffers.h fix: Fix C++ decoder synchronization in PyTorch (#3106) 2025-04-23 23:55:27 +08:00