TensorRT-LLMs/examples/serve/deepseek_r1_reasoning_parser.sh
nv-guomingz 03e38c9087
chore: update trtllm-serve usage doc by removing the backend parameter when torch is used as the backend. (#6419)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-07-30 11:11:06 -04:00


#!/usr/bin/env bash
trtllm-serve \
    deepseek-ai/DeepSeek-R1 \
    --host localhost --port 8000 \
    --max_batch_size 161 --max_num_tokens 1160 \
    --tp_size 8 --ep_size 8 --pp_size 1 \
    --kv_cache_free_gpu_memory_fraction 0.95 \
    --extra_llm_api_options ./extra-llm-api-config.yml \
    --reasoning_parser deepseek-r1
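
Once the server above is up, it exposes an OpenAI-compatible HTTP API on the configured host and port. The sketch below is a minimal request against that API; it assumes the server is already running and reachable at `localhost:8000`, and the prompt text is purely illustrative. With `--reasoning_parser deepseek-r1` enabled, the model's chain-of-thought is separated from the final answer in the response.

```shell
# Minimal sketch: query the OpenAI-compatible chat completions endpoint
# served by trtllm-serve (assumes the server launched above is running).
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepseek-ai/DeepSeek-R1",
        "messages": [
          {"role": "user", "content": "Which is larger, 9.11 or 9.8?"}
        ],
        "max_tokens": 256
      }'
```

The response is a standard chat-completion JSON object; with the reasoning parser active, the parsed-out reasoning trace is returned alongside the final answer content rather than interleaved with it.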