TensorRT-LLMs/cpp/tensorrt_llm/pybind
Patrice Castonguay 9b0f45298f
[None][feat] Have ability to cancel disagg request if KV cache resource are exhausted (#9155)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-11-18 20:59:17 -05:00
batch_manager [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00
common [None][feat] Add Request specific exception (#6931) 2025-09-04 18:43:42 -04:00
executor [None][feat] Have ability to cancel disagg request if KV cache resource are exhausted (#9155) 2025-11-18 20:59:17 -05:00
process_group [TRTLLM-7349][feat] Adding new orchestrator type -- ray (#7520) 2025-10-04 08:12:24 +08:00
runtime [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00
testing fix: Improve chunking test and skip empty kernel calls (#5710) 2025-07-04 09:08:15 +02:00
thop [None] [feat] Use triton kernels for RocketKV prediction module (#8682) 2025-11-13 18:51:09 -08:00
userbuffers [TRTLLM-7028][feat] Enable guided decoding with speculative decoding (part 2: one-model engine) (#6948) 2025-09-03 15:16:11 -07:00
bindings.cpp [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00
CMakeLists.txt [None][refactor] decoding inputs, part 2 (#5799) 2025-11-18 14:38:51 +01:00