TensorRT-LLMs/curl_completion_client.sh at 90a28b917fd3e9df101a431bc49af5fc2fc715bd

kanshan/TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Pengyun Lin f25c7cefb4

doc: refactor trtllm-serve examples and doc (#3187)

Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>

2025-04-04 11:40:43 +08:00

11 lines

259 B

Bash


#!/usr/bin/env bash
# Send a text-completion request to the server at localhost:8000.
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "TinyLlama-1.1B-Chat-v1.0",
        "prompt": "Where is New York?",
        "max_tokens": 16,
        "temperature": 0
    }'