TensorRT-LLMs/cpp/tensorrt_llm/batch_manager/utils
Robin Kobus d39bcb6b40
[nvbugs/5274894] fix: Moving finished context requests to generation (#4576)

- Unfinished chunked context requests appear at the end of the context requests vector.
- Replaced std::find_if with std::partition to find the correct position from which to move finished context requests to generation.
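
The idea behind the change can be illustrated with a minimal sketch. It is not the code from inflightBatchingUtils.cpp: the `Request` struct, its `contextFinished` flag, and the function `moveFinishedContextRequests` are hypothetical stand-ins, and the choice to group unfinished requests at the front is an assumption made for the example. A search with std::find_if for the first finished request only works if every request after that position is also finished. When unfinished chunked requests sit at the end of the vector, that assumption fails. std::partition instead regroups the vector and returns the boundary between the two groups.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a request; not the TensorRT-LLM request type.
struct Request
{
    int id;
    bool contextFinished; // false while a chunked context request still has chunks left
};

// Moves all finished context requests into generationRequests and
// returns how many were moved. Illustrative only.
std::size_t moveFinishedContextRequests(
    std::vector<Request>& contextRequests, std::vector<Request>& generationRequests)
{
    // Reorder so that unfinished requests come first. The returned iterator
    // points at the first finished request (or end() if there is none).
    // Note: std::partition does not preserve relative order within a group.
    auto firstFinished = std::partition(contextRequests.begin(), contextRequests.end(),
        [](Request const& r) { return !r.contextFinished; });

    auto const numMoved
        = static_cast<std::size_t>(std::distance(firstFinished, contextRequests.end()));

    // Hand the finished tail over to generation, then drop it from context.
    generationRequests.insert(generationRequests.end(), std::make_move_iterator(firstFinished),
        std::make_move_iterator(contextRequests.end()));
    contextRequests.erase(firstFinished, contextRequests.end());

    // Post-condition: only unfinished requests remain in the context vector.
    assert(std::none_of(contextRequests.begin(), contextRequests.end(),
        [](Request const& r) { return r.contextFinished; }));
    return numMoved;
}
```

If the relative order of requests within each group matters, std::stable_partition would be the alternative, at the cost of possible extra allocation.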

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-05-22 17:49:40 +02:00
debugUtils.cpp Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
debugUtils.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
inflightBatchingUtils.cpp [nvbugs/5274894] fix: Moving finished context requests to generation (#4576) 2025-05-22 17:49:40 +02:00
inflightBatchingUtils.h Feat: Variable-Beam-Width-Search (VBWS) part4 (#3979) 2025-05-12 22:32:29 +02:00
logitsThread.cpp refactor: Introduce MpiTag enumeration and update MPI function signatures (#3893) 2025-05-04 13:24:29 +02:00
logitsThread.h refactor: Introduce MpiTag enumeration and update MPI function signatures (#3893) 2025-05-04 13:24:29 +02:00
staticThreadPool.cpp Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
staticThreadPool.h Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00