mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-01-14 06:27:45 +08:00
* Update TensorRT-LLM --------- Co-authored-by: Marks101 <markus.schnoes@gmx.de> Co-authored-by: lkm2835 <lkm2835@gmail.com>
(windows)=
Installing on Windows
The Windows release of TensorRT-LLM is currently in beta.
We recommend checking out the [v0.10.0 tag](https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.10.0) for the most stable experience.
Prerequisites
- Clone this repository using Git for Windows.
- Install the dependencies one of two ways:
  - Install all dependencies together.
    - Run the provided PowerShell script `setup_env.ps1` located under the `/windows/` folder, which installs Python and CUDA 12.4.1 automatically with default settings. Run PowerShell as Administrator to use the script.

      `./setup_env.ps1 [-skipCUDA] [-skipPython]`
    - Close and re-open any existing PowerShell or Git Bash windows so they pick up the new `Path` modified by the `setup_env.ps1` script above.
  - Install the dependencies one at a time.
    - Install Python 3.10.
      - Select Add python.exe to PATH at the start of the installation. The installation may only add the `python` command, but not the `python3` command.
      - Navigate to the installation path `%USERPROFILE%\AppData\Local\Programs\Python\Python310` (`AppData` is a hidden folder) and copy `python.exe` to `python3.exe`.
    - Install CUDA 12.4.1 Toolkit. Use the Express Installation option. Installation may require a restart.
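The manual prerequisites above can be checked from a script. The following is a minimal sketch, not part of the TensorRT-LLM repository: the helper names `ensure_python3_alias` and `check_prerequisites` are hypothetical, and the `nvcc` check assumes the CUDA Toolkit installer has added its `bin` directory to `Path`.

```python
import shutil
import sys
from pathlib import Path


def ensure_python3_alias(install_dir):
    """Copy python.exe to python3.exe in install_dir if the alias is missing.

    Returns the alias path, or None when python.exe is not found there.
    """
    install_dir = Path(install_dir)
    source = install_dir / "python.exe"
    alias = install_dir / "python3.exe"
    if not source.is_file():
        return None
    if not alias.exists():
        shutil.copy2(source, alias)
    return alias


def check_prerequisites():
    """Report which prerequisites are visible from the current shell."""
    return {
        # True only when this interpreter is Python 3.10.
        "python_3_10": sys.version_info[:2] == (3, 10),
        "python3_on_path": shutil.which("python3") is not None,
        # Assumption: the CUDA Toolkit installer puts nvcc on Path.
        "nvcc_on_path": shutil.which("nvcc") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If `python3_on_path` is reported missing, calling `ensure_python3_alias` with the Python 3.10 installation path mentioned above performs the same copy as the manual step. Open a new shell afterwards so the check sees the updated `Path`.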
Steps
- Install TensorRT-LLM.
If you have an existing TensorRT installation (from older versions of tensorrt_llm), please execute
pip uninstall -y tensorrt tensorrt_libs tensorrt_bindings
pip uninstall -y nvidia-cublas-cu12 nvidia-cuda-nvrtc-cu12 nvidia-cuda-runtime-cu12 nvidia-cudnn-cu12
before installing TensorRT-LLM with the following command.
pip install tensorrt_llm==0.10.0 --extra-index-url https://pypi.nvidia.com
Run the following command to verify that your TensorRT-LLM installation is working properly.
python -c "import tensorrt_llm; print(tensorrt_llm._utils.trt_version())"
- Build the model.
- Deploy the model.
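The install and verification commands in step 1 can also be driven from one script. This is a rough sketch under stated assumptions: the function names and the `--run` flag are hypothetical and exist only in this example, while the package lists, the pinned version, and the `tensorrt_llm._utils.trt_version()` call are taken from the commands above. By default it only prints the pip commands; nothing is installed unless `--run` is passed.

```python
import importlib
import subprocess
import sys

# Package lists copied from the uninstall commands in step 1.
OLD_TENSORRT_PACKAGES = ["tensorrt", "tensorrt_libs", "tensorrt_bindings"]
OLD_CUDA_PACKAGES = [
    "nvidia-cublas-cu12",
    "nvidia-cuda-nvrtc-cu12",
    "nvidia-cuda-runtime-cu12",
    "nvidia-cudnn-cu12",
]


def build_pip_commands(version="0.10.0", clean_old=True):
    """Return the pip invocations from step 1 as argument lists, in order."""
    pip = [sys.executable, "-m", "pip"]
    commands = []
    if clean_old:
        commands.append(pip + ["uninstall", "-y"] + OLD_TENSORRT_PACKAGES)
        commands.append(pip + ["uninstall", "-y"] + OLD_CUDA_PACKAGES)
    commands.append(
        pip + ["install", f"tensorrt_llm=={version}",
               "--extra-index-url", "https://pypi.nvidia.com"]
    )
    return commands


def verify_install():
    """Import tensorrt_llm and return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module("tensorrt_llm")
        return True, str(module._utils.trt_version())
    except Exception as exc:  # missing package, DLL load failure, etc.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    dry_run = "--run" not in sys.argv  # hypothetical flag for this sketch
    for command in build_pip_commands():
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
    ok, detail = verify_install()
    print("TensorRT version:" if ok else "Import failed:", detail)
```

Invoking pip as `sys.executable -m pip` keeps the install tied to the interpreter that runs the script, which helps when both `python` and `python3` are on `Path`.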