mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-14 06:27:45 +08:00
PyTorch Backend
Note:
This feature is currently in beta, and the related API is subject to change in future versions.
To improve usability and developer efficiency, TensorRT-LLM provides a new backend based on PyTorch.
The PyTorch backend of TensorRT-LLM is available in version 0.17 and later. You can try it by importing tensorrt_llm._torch.
Quick Start
Here is a simple example that shows how to use the tensorrt_llm.LLM API with a Llama model.
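The included example file did not survive on this page, so the following is a minimal sketch of what such a quick start could look like, not the official example. It assumes that LLM is importable from tensorrt_llm._torch (as in the releases this page describes; newer releases may expose it elsewhere) and that SamplingParams is importable from tensorrt_llm. The model name, the prompts, and the helper names format_result and run_quickstart are placeholders chosen for illustration.

```python
# Hypothetical quick-start sketch for the PyTorch backend (beta API, may change).

# Placeholder prompts, chosen only for illustration.
PROMPTS = [
    "Hello, my name is",
    "The capital of France is",
]


def format_result(index, output):
    """Format one generation result as a single printable line.

    `output` is expected to expose `.prompt` and `.outputs[0].text`,
    which is the shape of the objects returned by `LLM.generate`.
    """
    prompt = output.prompt
    generated_text = output.outputs[0].text
    return f"[{index}] Prompt: {prompt!r}, Generated text: {generated_text!r}"


def run_quickstart(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0", max_tokens=32):
    """Load a Llama-family model with the PyTorch backend and generate text.

    The default model name is an assumption (any supported Llama checkpoint
    or Hugging Face model ID should work in its place).
    """
    # Imported lazily so this file can be imported without TensorRT-LLM installed.
    from tensorrt_llm import SamplingParams
    from tensorrt_llm._torch import LLM  # assumed import path for the beta backend

    llm = LLM(model=model)
    sampling_params = SamplingParams(max_tokens=max_tokens)

    outputs = llm.generate(PROMPTS, sampling_params)
    for i, output in enumerate(outputs):
        print(format_result(i, output))
```

Calling run_quickstart() on a machine with a supported GPU and TensorRT-LLM 0.17 or later installed should print one line per prompt. The imports are deferred into the function only so that the formatting helper can be exercised without the library present.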
Features
Developer Guide
Key Components
Known Issues
- The PyTorch backend on SBSA is incompatible with bare-metal environments such as Ubuntu 24.04. Please use the PyTorch NGC Container for optimal support on SBSA platforms.