=========================
LLM Examples Introduction
=========================

Here is a simple example that shows how to use the LLM API with TinyLlama.

.. literalinclude:: ../../../examples/llm-api/quickstart_example.py
   :language: python
   :linenos:

The LLM API supports both offline and online usage. See more examples of the LLM API here:

.. toctree::
   :maxdepth: 1
   :caption: LLM API Examples

   llm_inference_async
   llm_inference_kv_events
   llm_inference_customize
   llm_lookahead_decoding
   llm_medusa_decoding
   llm_guided_decoding
   llm_logits_processor
   llm_quantization
   llm_inference
   llm_multilora
   llm_inference_async_streaming
   llm_inference_distributed
   llm_eagle_decoding
   llm_auto_parallel
   llm_mgmn_llm_distributed
   llm_mgmn_trtllm_bench
   llm_mgmn_trtllm_serve

For more details on how to fully utilize this API, check out:

* `Common customizations <customization.html>`_
* `LLM API Reference <../llm-api/index.html>`_