TensorRT-LLMs/openai_responses_client.py at 2d45b482e084e1efa179e1a554bf263eec7952e2

kanshan/TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

JunyiXu-nv af899d2fe7

[TRTLLM-9860][doc] Add docs and examples for Responses API (#9946)

Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>

2025-12-14 21:46:13 -08:00

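### :title OpenAI Responses Client

from openai import OpenAI

# Point the client at the local OpenAI-compatible endpoint used by this example.
# The OpenAI client requires an api_key value; "tensorrt_llm" is the one used here.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="tensorrt_llm",
)

# Send one text prompt through the Responses API, capping the reply at 20 tokens.
response = client.responses.create(
    model="TinyLlama-1.1B-Chat-v1.0",
    input="Where is New York?",
    max_output_tokens=20,
)
# Print the full response object, including metadata and output items.
print(response)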
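
The script above prints the whole response object. As a minimal sketch of pulling out only the generated text, the helper below walks `response.output` and joins every `output_text` part. It assumes the server returns the standard Responses API shape (message items holding typed content parts). The name `collect_output_text` and the `fake_response` stand-in are hypothetical and not part of the original example. The stand-in is there so the sketch runs without a server.

```python
from types import SimpleNamespace


def collect_output_text(response):
    """Join the text of every `output_text` part found in `response.output`."""
    chunks = []
    for item in getattr(response, "output", None) or []:
        if getattr(item, "type", None) != "message":
            continue  # skip non-message items such as reasoning or tool calls
        for part in getattr(item, "content", None) or []:
            if getattr(part, "type", None) == "output_text":
                chunks.append(part.text)
    return "".join(chunks)


# Hypothetical stand-in shaped like a Responses API result (assumption, not real server output).
fake_response = SimpleNamespace(
    output=[
        SimpleNamespace(type="reasoning", content=None),
        SimpleNamespace(
            type="message",
            content=[SimpleNamespace(type="output_text", text="New York is in the USA.")],
        ),
    ]
)
print(collect_output_text(fake_response))
```

With the client from the example, the call would be `collect_output_text(response)`. The `openai` SDK's response object also offers an `output_text` convenience property, which should give the same string when the server returns standard message items.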