TensorRT-LLMs/docs/source/index.rst
dhansen-nvidia 80235e53cf [None][feat] Add documentation on configuring CPU affinity in TRT-LLM (#10678)
Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com>
Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2026-02-15 19:57:03 +08:00

134 lines
2.9 KiB
ReStructuredText

.. TensorRT LLM documentation master file, created by
sphinx-quickstart on Wed Sep 20 08:35:21 2023.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Welcome to TensorRT LLM's Documentation!
========================================
.. toctree::
:maxdepth: 2
:caption: Getting Started
:name: Getting Started
overview.md
quick-start-guide.md
installation/index.rst
supported-hardware.md
.. toctree::
:maxdepth: 2
:caption: Deployment Guide
:name: Deployment Guide
examples/llm_api_examples.rst
examples/trtllm_serve_examples
examples/dynamo_k8s_example.rst
deployment-guide/index.rst
deployment-guide/configuring-cpu-affinity.md
.. toctree::
:maxdepth: 2
:caption: Models
:name: Models
models/supported-models.md
models/adding-new-model.md
.. toctree::
:maxdepth: 2
:caption: CLI Reference
:name: CLI Reference
commands/trtllm-bench
commands/trtllm-eval
commands/trtllm-serve/index
.. toctree::
:maxdepth: 2
:caption: API Reference
llm-api/index.md
llm-api/reference.rst
.. toctree::
:maxdepth: 2
:caption: Features
features/feature-combination-matrix.md
features/attention.md
features/disagg-serving.md
features/kvcache.md
features/long-sequence.md
features/lora.md
features/multi-modality.md
features/visual-generation.md
features/overlap-scheduler.md
features/paged-attention-ifb-scheduler.md
features/parallel-strategy.md
features/quantization.md
features/sampling.md
features/additional-outputs.md
features/guided-decoding.md
features/speculative-decoding.md
features/checkpoint-loading.md
features/auto_deploy/auto-deploy.md
features/ray-orchestrator.md
features/torch_compile_and_piecewise_cuda_graph.md
features/helix.md
features/kv-cache-connector.md
features/sparse-attention.md
.. toctree::
:maxdepth: 2
:caption: Developer Guide
developer-guide/overview.md
developer-guide/perf-analysis.md
developer-guide/perf-benchmarking.md
developer-guide/ci-overview.md
developer-guide/dev-containers.md
developer-guide/api-change.md
developer-guide/kv-transfer.md
.. toctree::
:maxdepth: 2
:caption: Blogs
:glob:
blogs/tech_blog/*
blogs/Best_perf_practice_on_DeepSeek-R1_in_TensorRT-LLM.md
blogs/H200launch.md
blogs/XQA-kernel.md
blogs/H100vsA100.md
.. toctree::
:maxdepth: 2
:caption: Quick Links
Releases <https://github.com/NVIDIA/TensorRT-LLM/releases>
Github Code <https://github.com/NVIDIA/TensorRT-LLM>
Roadmap <https://github.com/NVIDIA/TensorRT-LLM/issues?q=is%3Aissue%20state%3Aopen%20label%3Aroadmap>
.. toctree::
:maxdepth: 2
:caption: Use TensorRT Engine
:hidden:
legacy/tensorrt_quickstart.md
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`