TensorRT-LLMs/examples/disaggregated/slurm/simple_example/gen_extra-llm-api-config.yaml

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00
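# KV cache transfer settings for the generation server in disaggregated serving.
cache_transceiver_config:
  backend: UCX                # transport used to move KV cache between workers
  max_tokens_in_buffer: 2048  # capacity of the KV cache transfer buffer, in tokens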