# Recommended LLM API Configuration Settings

This directory contains recommended LLM API performance settings for popular models. They can be used out of the box with `trtllm-serve` via the `--config` CLI flag, or adjusted to your specific use case.
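A file passed via `--config` is a YAML document of LLM API options. As a minimal sketch only (the option names and the model/config paths below are illustrative assumptions, not taken from this directory):

```yaml
# Hypothetical LLM API options file; adjust values to your model and hardware.
kv_cache_config:
  enable_block_reuse: true   # example: reuse KV cache blocks across requests
```

It would then be supplied to the server as, for example, `trtllm-serve <model> --config <path/to/config.yaml>`.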

For model-specific deployment guides, please refer to the official documentation.