(linux)=
# Installing on Linux
1. Retrieve and launch the docker container (optional).

   You can pre-install the environment using the NVIDIA Container Toolkit to avoid manual environment configuration.

   ```bash
   # Obtain and start the basic docker image environment (optional).
   docker run --rm --runtime=nvidia --gpus all --entrypoint /bin/bash -it nvidia/cuda:12.4.0-devel-ubuntu22.04
   ```
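   As a quick sanity check (an addition to the original steps), you can confirm inside the container that the GPUs are visible before installing anything:

   ```bash
   # Verify that the NVIDIA driver and GPUs are exposed to the container.
   # If this fails, revisit the NVIDIA Container Toolkit setup on the host.
   nvidia-smi
   ```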
2. Install TensorRT-LLM.

   ```bash
   # Install dependencies; TensorRT-LLM requires Python 3.10.
   apt-get update && apt-get -y install python3.10 python3-pip openmpi-bin libopenmpi-dev git git-lfs

   # Install the latest preview version (corresponding to the main branch) of TensorRT-LLM.
   # If you want to install the stable version (corresponding to the release branch),
   # remove the `--pre` option.
   pip3 install tensorrt_llm -U --pre --extra-index-url https://pypi.nvidia.com

   # Check installation
   python3 -c "import tensorrt_llm"
   ```

   Please note that TensorRT-LLM depends on TensorRT. In earlier versions that shipped with TensorRT 8, upgrading to a new version may require explicitly running `pip uninstall tensorrt` to remove the old version first.
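   To see exactly which build was installed (for example, to distinguish a `--pre` preview from a stable release), you can print the package version. A minimal check, assuming the package exposes the standard `__version__` attribute:

   ```bash
   # Print the installed TensorRT-LLM version string.
   python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)"
   ```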
3. Install the requirements for running the example.

   ```bash
   git clone https://github.com/NVIDIA/TensorRT-LLM.git
   cd TensorRT-LLM
   pip install -r examples/bloom/requirements.txt
   git lfs install
   ```
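   The `git lfs install` step matters because example model weights are distributed via Git LFS. As an illustration (the exact model and target path are assumptions for this sketch, not mandated by this guide), the BLOOM example might fetch a small checkpoint like this:

   ```bash
   # Hypothetical example: download a small BLOOM checkpoint from Hugging Face.
   # Git LFS must be set up beforehand so the weight files are fetched in full.
   git clone https://huggingface.co/bigscience/bloom-560m ./bloom/560M
   ```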
Beyond local execution, you can also use the NVIDIA Triton Inference Server to create a production-ready deployment of your LLM, as described in the blog post Optimizing Inference on Large Language Models with NVIDIA TensorRT-LLM.