mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Yan Chunwei 612c26be22 [None][doc] add legacy section for tensorrt engine (#6724)

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>

2025-09-01 11:02:31 +08:00

LLM API with TensorRT Engine

A simple inference example with TinyLlama using the LLM API:

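As a minimal sketch of what such an example might look like, the snippet below loads TinyLlama through the TensorRT-engine flavour of the LLM API and generates completions for a few prompts. The Hugging Face model ID, the prompts, the sampling values, and the `format_result` helper are illustrative assumptions, not taken from this page. The `tensorrt_llm` imports are kept inside `main()` so the module can be imported on a machine without the library or a GPU.

```python
# Illustrative sketch only: model ID, prompts and sampling values are assumptions.
MODEL = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed Hugging Face model ID

PROMPTS = [
    "Hello, my name is",
    "The capital of France is",
    "The future of AI is",
]


def format_result(prompt: str, text: str) -> str:
    """Hypothetical helper: render one prompt/completion pair for printing."""
    return f"Prompt: {prompt!r}, Generated text: {text!r}"


def main() -> None:
    # Imported lazily: these require TensorRT-LLM and a supported NVIDIA GPU.
    from tensorrt_llm import SamplingParams
    from tensorrt_llm._tensorrt_engine import LLM  # TensorRT-engine backend

    # Accepts a Hugging Face model name or a local checkpoint path;
    # the engine is built on first use.
    llm = LLM(model=MODEL)

    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

    # generate() returns one result per prompt, in order.
    for output in llm.generate(PROMPTS, sampling_params):
        print(format_result(output.prompt, output.outputs[0].text))
```

Calling `main()` on a machine with TensorRT-LLM installed should print one line per prompt. Because sampling is stochastic at a non-zero temperature, the generated text will differ between runs.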

For more advanced usage, including distributed inference, multimodal models, and speculative decoding, refer to this README.