TensorRT-LLMs/docs/source/performance/performance-tuning-guide/index.rst
2025-02-13 18:40:22 +08:00

Performance Tuning Guide
========================

.. include:: introduction.md
   :parser: myst_parser.sphinx_

.. toctree::
   :maxdepth: 1

   benchmarking-default-performance
   useful-build-time-flags
   tuning-max-batch-size-and-max-num-tokens
   deciding-model-sharding-strategy
   fp8-quantization
   useful-runtime-flags