
Online Serving Examples with trtllm-serve

We provide a CLI command, trtllm-serve, that launches a FastAPI server compatible with the OpenAI API. This directory contains client examples for querying the server:

curl_chat_client.sh
curl_chat_client_for_multimodal.sh
curl_completion_client.sh
deepseek_r1_reasoning_parser.sh
genai_perf_client.sh
genai_perf_client_for_multimodal.sh
openai_chat_client.py
openai_chat_client_for_multimodal.py
openai_completion_client.py
openai_completion_client_for_lora.py
openai_completion_client_json_schema.py

Dependencies for the Python clients are listed in requirements.txt. You can check the client source code in this directory, or refer to the trtllm-serve command documentation and examples for detailed information and usage guidelines.
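As a quick illustration, below is a minimal sketch of querying the server with the official openai Python package, in the same style as the openai_chat_client.py example. It assumes the server was launched with trtllm-serve on the default localhost:8000 endpoint and that TinyLlama/TinyLlama-1.1B-Chat-v1.0 is the served model; both the endpoint and the model name are example values to substitute with your own. The api_key value is a placeholder, since the server does not require authentication by default.

```python
# Minimal sketch: query a trtllm-serve instance through its OpenAI-compatible API.
# Assumes the server was launched with, e.g.:
#   trtllm-serve TinyLlama/TinyLlama-1.1B-Chat-v1.0
# and is listening on the default http://localhost:8000 endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="tensorrt_llm",  # placeholder; the key is not checked by default
)

# Send a chat completion request to the /v1/chat/completions route.
response = client.chat.completions.create(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # substitute the model you launched
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Where is New York?"},
    ],
    max_tokens=32,
)
print(response.choices[0].message.content)
```

The curl scripts in this directory exercise the same routes directly over HTTP, so the shell and Python clients are interchangeable ways of hitting the same endpoints.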