TensorRT-LLMs/cpp/tensorrt_llm/pybind
Dom Brown dbd9a83b0d
feat: Integrate GPUDirect Storage (GDS) into Executor API (#3582)

Squash of several dev commits

Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
2025-04-18 15:59:21 +08:00
batch_manager feat: allocate minimal blocks per window size (#3028) 2025-04-17 16:04:57 +08:00
common feat: support abort disconnected requests (#3214) 2025-04-07 16:14:58 +08:00
executor feat: Integrate GPUDirect Storage (GDS) into Executor API (#3582) 2025-04-18 15:59:21 +08:00
runtime refactor: batch slot management in decoder classes (#3300) 2025-04-13 05:05:13 +08:00
userbuffers feat: Introduce UB allocator for pytorch flow (#3257) 2025-04-08 18:39:49 +08:00
bindings.cpp feat: Integrate GPUDirect Storage (GDS) into Executor API (#3582) 2025-04-18 15:59:21 +08:00
CMakeLists.txt Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00