Performance Tuning Guide
========================

.. include:: introduction.md
   :parser: myst_parser.sphinx_

.. toctree::
   :maxdepth: 1

   benchmarking-default-performance
   useful-build-time-flags
   tuning-max-batch-size-and-max-num-tokens
   deciding-model-sharding-strategy
   fp8-quantization
   useful-runtime-flags