mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-14 06:27:45 +08:00
* add deepseek-r1 reasoning parser Signed-off-by: pansicheng <sicheng.pan.chn@gmail.com> * fix test Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com> --------- Signed-off-by: pansicheng <sicheng.pan.chn@gmail.com> Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com> Co-authored-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com> |
||
|---|---|---|
| .. | ||
| curl_chat_client_for_multimodal.sh | ||
| curl_chat_client.sh | ||
| curl_completion_client.sh | ||
| deepseek_r1_reasoning_parser.sh | ||
| genai_perf_client.sh | ||
| openai_chat_client_for_multimodal.py | ||
| openai_chat_client.py | ||
| openai_completion_client.py | ||
| README.md | ||
| requirements.txt | ||
Online Serving Examples with trtllm-serve
We provide a CLI command, trtllm-serve, to launch a FastAPI server compatible with OpenAI APIs, here are some client examples to query the server, you can check the source code here or refer to the command documentation and examples for detailed information and usage guidelines.