mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

History

dominicshanshan 404fbe9b32 [https://nvbugs/5277113 ][fix]genai-perf API change stress test (#4300 ) * fix bug 5277113. Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com> * fix bug 5277113 and 5278517. Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com> --------- Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>		2025-05-15 14:12:34 +08:00
..
curl_chat_client_for_multimodal.sh	feat: trtllm-serve multimodal support (#3590 )	2025-04-19 05:01:28 +08:00
curl_chat_client.sh	feat: trtllm-serve multimodal support (#3590 )	2025-04-19 05:01:28 +08:00
curl_completion_client.sh	feat: trtllm-serve multimodal support (#3590 )	2025-04-19 05:01:28 +08:00
deepseek_r1_reasoning_parser.sh	feat: add deepseek-r1 reasoning parser to trtllm-serve (#3354 )	2025-05-06 08:13:04 +08:00
genai_perf_client.sh	[https://nvbugs/5277113 ][fix]genai-perf API change stress test (#4300 )	2025-05-15 14:12:34 +08:00
openai_chat_client_for_multimodal.py	feat: trtllm-serve multimodal support (#3590 )	2025-04-19 05:01:28 +08:00
openai_chat_client.py	doc: refactor trtllm-serve examples and doc (#3187 )	2025-04-04 11:40:43 +08:00
openai_completion_client.py	doc: refactor trtllm-serve examples and doc (#3187 )	2025-04-04 11:40:43 +08:00
README.md	doc: refactor trtllm-serve examples and doc (#3187 )	2025-04-04 11:40:43 +08:00
requirements.txt	doc: add genai-perf benchmark & slurm multi-node for trtllm-serve doc (#3407 )	2025-04-16 00:11:58 +08:00

README.md

Online Serving Examples with `trtllm-serve`

We provide a CLI command, trtllm-serve, to launch a FastAPI server compatible with OpenAI APIs, here are some client examples to query the server, you can check the source code here or refer to the command documentation and examples for detailed information and usage guidelines.

README.md

Online Serving Examples with trtllm-serve

Online Serving Examples with `trtllm-serve`