mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-02-10 13:03:34 +08:00
* Add support of chat completion in PD
  Add support of include_usage in PD
  Reformat
* Remove redundant code
  Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
* Refactor code
  Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
* Add chat completion test
  Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
* Refactor code
  Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>

---------

Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>

| Name |
|---|
| .. |
| test_configs |
| sanity_check.sh |
| test_disaggregated_single_gpu.py |
| test_disaggregated.py |