Kaiyu Xie
|
b4e5df0ee0
|
Breaking change: perf: Enable scheduling overlap by default (#4174)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-05-15 14:27:36 +08:00 |
|
Chuang Zhu
|
09a28becae
|
fix cache buffer (#3942)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-05-07 09:49:44 +08:00 |
|
Shunkangz
|
ea050084ad
|
feat: Add support of chat completion in PD (#2985)
* Add support of chat completion in PD
Add support of include_usage in PD
Reformat
* Remove redundant code
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
* Refactor code
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
* Add chat completion test
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
* Refactor code
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
---------
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-04-11 17:53:28 +08:00 |
|
Chuang Zhu
|
f3237e52ed
|
update readme for disaggregated (#3323)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-04-07 21:29:15 +08:00 |
|
pcastonguay
|
b763051ba4
|
chore: Refactor disaggregated serving scripts (#3073)
* chore: Refactor to reduce duplicated code in disagg server, reuse trtllm-serve
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
* Updating README, removing launch script
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
* Fixing integration tests
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
* Adding scripts to populate urls section of disagg config based on SLURM env vars
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
---------
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-04-03 14:55:05 -04:00 |
|
Kaiyu Xie
|
3aa6b11d13
|
Update TensorRT-LLM (#2936)
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
|
2025-03-18 21:25:19 +08:00 |
|
Kaiyu Xie
|
9b931c0f63
|
Update TensorRT-LLM (#2873)
|
2025-03-11 21:13:42 +08:00 |
|
Kaiyu Xie
|
77d7fe1eb2
|
Update TensorRT-LLM (#2849)
* Update TensorRT-LLM
---------
Co-authored-by: aotman <chenhangatm@gmail.com>
|
2025-03-04 18:44:00 +08:00 |
|
Kaiyu Xie
|
ab5b19e027
|
Update TensorRT-LLM (#2820)
|
2025-02-25 21:21:49 +08:00 |
|