mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-14 06:27:45 +08:00
* fix bug 5277113. Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com> * fix bug 5277113 and 5278517. Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com> --------- Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com> |
||
|---|---|---|
| .. | ||
| curl_chat_client_for_multimodal.sh | ||
| curl_chat_client.sh | ||
| curl_completion_client.sh | ||
| deepseek_r1_reasoning_parser.sh | ||
| genai_perf_client.sh | ||
| openai_chat_client_for_multimodal.py | ||
| openai_chat_client.py | ||
| openai_completion_client.py | ||
| README.md | ||
| requirements.txt | ||
Online Serving Examples with trtllm-serve
We provide a CLI command, trtllm-serve, to launch a FastAPI server compatible with OpenAI APIs, here are some client examples to query the server, you can check the source code here or refer to the command documentation and examples for detailed information and usage guidelines.