mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-13 22:18:36 +08:00
* Update TensorRT-LLM --------- Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com> Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
89 lines
1.8 KiB
ReStructuredText
89 lines
1.8 KiB
ReStructuredText
.. TensorRT-LLM documentation master file, created by
|
|
sphinx-quickstart on Wed Sep 20 08:35:21 2023.
|
|
You can adapt this file completely to your liking, but it should at least
|
|
contain the root `toctree` directive.
|
|
|
|
Welcome to TensorRT-LLM's documentation!
|
|
========================================
|
|
|
|
.. toctree::
|
|
:maxdepth: 1
|
|
:caption: Contents:
|
|
|
|
architecture.md
|
|
gpt_runtime.md
|
|
batch_manager.md
|
|
inference_request.md
|
|
gpt_attention.md
|
|
precision.md
|
|
build_from_source.md
|
|
performance.md
|
|
2023-05-19-how-to-debug.md
|
|
2023-05-17-how-to-add-a-new-model.md
|
|
graph-rewriting.md
|
|
memory.md
|
|
new_workflow.md
|
|
lora.md
|
|
perf_best_practices.md
|
|
performance_analysis.md
|
|
|
|
|
|
Python API
|
|
----------
|
|
|
|
- :doc:`tensorrt_llm.layers <python-api/tensorrt_llm.layers>`
|
|
- :doc:`tensorrt_llm.functional <python-api/tensorrt_llm.functional>`
|
|
- :doc:`tensorrt_llm.models <python-api/tensorrt_llm.models>`
|
|
- :doc:`tensorrt_llm.plugin <python-api/tensorrt_llm.plugin>`
|
|
- :doc:`tensorrt_llm.quantization <python-api/tensorrt_llm.quantization>`
|
|
- :doc:`tensorrt_llm.runtime <python-api/tensorrt_llm.runtime>`
|
|
|
|
|
|
.. toctree::
|
|
:maxdepth: 2
|
|
:caption: Python API
|
|
:hidden:
|
|
|
|
python-api/tensorrt_llm.layers
|
|
python-api/tensorrt_llm.functional
|
|
python-api/tensorrt_llm.models
|
|
python-api/tensorrt_llm.plugin
|
|
python-api/tensorrt_llm.quantization
|
|
python-api/tensorrt_llm.runtime
|
|
|
|
|
|
C++ API
|
|
---------
|
|
|
|
- :doc:`cpp/runtime <_cpp_gen/runtime>`
|
|
|
|
|
|
.. toctree::
|
|
:maxdepth: 2
|
|
:caption: C++ API
|
|
:hidden:
|
|
|
|
_cpp_gen/runtime
|
|
|
|
|
|
Indices and tables
|
|
==================
|
|
|
|
* :ref:`genindex`
|
|
* :ref:`modindex`
|
|
* :ref:`search`
|
|
|
|
|
|
Blogs
|
|
----------
|
|
|
|
.. toctree::
|
|
:maxdepth: 2
|
|
:caption: Blogs
|
|
:hidden:
|
|
|
|
blogs/H100vsA100.md
|
|
blogs/H200launch.md
|
|
blogs/Falcon180B-H200.md
|
|
blogs/quantization-in-TRT-LLM.md
|