Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-14 06:27:45 +08:00)

doc: Document the docker release image on NGC (#4705)

Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>

Parent: 971d16a2ee
Commit: f3fba4cc63
@@ -29,7 +29,7 @@ where `x.xx.x` is the version of the TensorRT-LLM container to use. This command
 NVIDIA NGC registry, sets up the local user's account within the container, and launches it with full GPU support. The
 local source code of TensorRT-LLM will be mounted inside the container at the path `/code/tensorrt_llm` for seamless
 integration. Ensure that the image version matches the version of TensorRT-LLM in your current local git branch. Not
-specifying an `IMAGE_TAG` will attempt to resolve this automatically, but the not every intermediate release might be
+specifying an `IMAGE_TAG` will attempt to resolve this automatically, but not every intermediate release might be
 accompanied by development container. In that case, use the latest version preceding the version of your development
 branch.
docker/release.md (new file, 57 lines):

@@ -0,0 +1,57 @@
# Description

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support
state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to
create Python and C++ runtimes that orchestrate the inference execution in a performant way.

# Overview

## TensorRT-LLM Release Container

The TensorRT-LLM Release container provides a pre-built environment for running TensorRT-LLM.

Visit the [official GitHub repository](https://github.com/NVIDIA/TensorRT-LLM) for more details.

### Running TensorRT-LLM Using Docker

A typical command to launch the container is:

```bash
docker run --rm -it --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --gpus=all \
    nvcr.io/nvidia/tensorrt-llm/release:x.xx.x
```

where `x.xx.x` is the version of the TensorRT-LLM container to use. As a sanity check, run the following command:

```bash
python3 -c "import tensorrt_llm"
```

This command will print the TensorRT-LLM version if everything is working correctly. After verification, you can explore
and try the example scripts included in `/app/tensorrt_llm/examples`.
Alternatively, if you have already cloned the TensorRT-LLM repository, you can use the following convenient command to
run the container:

```bash
make -C docker ngc-release_run LOCAL_USER=1 DOCKER_PULL=1 IMAGE_TAG=x.xx.x
```

This command pulls the specified container from the NVIDIA NGC registry, sets up the local user's account within the
container, and launches it with full GPU support.
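Both invocations above resolve to the same fully qualified image reference; as a small illustration, it can be assembled from the version string like this (hypothetical helper, not part of the repository's Makefile):

```python
REGISTRY = "nvcr.io/nvidia/tensorrt-llm"  # NGC registry path used in the commands above

def release_image(version: str) -> str:
    """Build the fully qualified NGC reference for a TensorRT-LLM release container."""
    return f"{REGISTRY}/release:{version}"

print(release_image("0.20.0"))  # → nvcr.io/nvidia/tensorrt-llm/release:0.20.0
```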
For comprehensive information about TensorRT-LLM, including documentation, source code, examples, and installation
guidelines, visit the following official resources:

- [TensorRT-LLM GitHub Repository](https://github.com/NVIDIA/TensorRT-LLM)
- [TensorRT-LLM Online Documentation](https://nvidia.github.io/TensorRT-LLM/latest/index.html)

### Security CVEs

To review known CVEs on this image, refer to the Security Scanning tab on this page.

### License

By pulling and using the container, you accept the terms and conditions of
this [End User License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/)
and [Product-Specific Terms](https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/).