# LLM API with TensorRT Engine

A simple inference example with TinyLlama using the LLM API:

```{literalinclude} ../../../examples/llm-api/_tensorrt_engine/quickstart_example.py
:language: python
:linenos:
```

For more advanced usage, including distributed inference, multimodal models, and speculative decoding, see this [README](../../../examples/llm-api/README.md).