Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-02-03 01:31:30 +08:00)
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Latest commit: Test; Update. Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>

| Name |
|---|
| 0.18.2 |
| 0.19.0rc0 |
| 0.20.0rc0 |
| latest |
| .nojekyll |
| index.html |