mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-14 06:27:45 +08:00
Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com> Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com> Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
1.5 KiB
PyTorch Backend
Note:
This feature is currently in beta, and the related API is subject to change in future versions.
To improve usability and developer efficiency, TensorRT-LLM provides a new backend based on PyTorch.
The PyTorch backend of TensorRT-LLM is available in version 0.17 and later. You can try it by importing tensorrt_llm._torch.
Quick Start
The following example shows how to use the tensorrt_llm.LLM API with a Llama model.
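The example file referenced here is not reproduced on this page, so the following is only a minimal sketch of what such a quick start might look like. It assumes that the LLM class can be imported from tensorrt_llm._torch (the module named above), that SamplingParams is exported from tensorrt_llm, and that the model identifier is a placeholder for any Llama-family checkpoint you have access to. The helper generate_texts is a hypothetical name introduced for illustration.

```python
def generate_texts(llm, prompts, sampling_params):
    """Run generation and return a list of (prompt, generated_text) pairs.

    `llm` is expected to expose generate(prompts, sampling_params) and to
    return objects with `.prompt` and `.outputs[0].text`, as the LLM API does.
    """
    outputs = llm.generate(prompts, sampling_params)
    return [(out.prompt, out.outputs[0].text) for out in outputs]


def main():
    # Imported inside main() so that the helper above can be used or tested
    # without TensorRT-LLM installed.
    # Assumption: the PyTorch-backend LLM class lives in tensorrt_llm._torch.
    from tensorrt_llm import SamplingParams
    from tensorrt_llm._torch import LLM

    prompts = [
        "Hello, my name is",
        "The capital of France is",
        "The future of AI is",
    ]
    # Limit the number of generated tokens per prompt.
    sampling_params = SamplingParams(max_tokens=32)

    # Assumption: placeholder model name; substitute your own Llama checkpoint.
    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

    for prompt, text in generate_texts(llm, prompts, sampling_params):
        print(f"Prompt: {prompt!r}, Generated text: {text!r}")
```

Calling main() requires a GPU machine with TensorRT-LLM 0.17 or later installed. Because the API is in beta, the import path and constructor arguments may differ between releases.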
Features
Developer Guide
Key Components
Known Issues
- The PyTorch backend on SBSA is incompatible with bare-metal environments such as Ubuntu 24.04. Use the PyTorch NGC Container for optimal support on SBSA platforms.