TensorRT-LLMs/tensorrt_llm/serve
pcastonguay ae5671644a
feat: Disaggregated router class (#3584)
* Add draft scheduler class

Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>

* Refactor the design

Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>

* feat: Introduce router class for disaggregated server

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

* Add unit tests for router class

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

* Adding tests for disagg_utils

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

* Fixing missing import

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

* Fixing disagg integration tests

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

* Addressing MR review comments

Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

---------

Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
2025-04-19 00:34:12 +08:00
__init__.py              Update TensorRT-LLM (#2820)                                2025-02-25 21:21:49 +08:00
openai_disagg_server.py  feat: Disaggregated router class (#3584)                   2025-04-19 00:34:12 +08:00
openai_protocol.py       feat: Add support of chat completion in PD (#2985)         2025-04-11 17:53:28 +08:00
openai_server.py         test: add kv cache event tests for disagg workers (#3602)  2025-04-18 18:30:19 +08:00
postprocess_handlers.py  chore: Unify Python NVTX call (#3450)                      2025-04-15 23:25:36 +08:00
router.py                feat: Disaggregated router class (#3584)                   2025-04-19 00:34:12 +08:00