TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-03 01:31:30 +08:00

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Kaiyu Xie 1b55286e1b doc: support main page redirection (#3870) (squashed commits: Test, Update; Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>)	2025-04-25 10:50:50 -07:00
0.18.2	Add multi-version documents (#3861)	2025-04-25 10:35:43 -07:00
0.19.0rc0	Add multi-version documents (#3861)	2025-04-25 10:35:43 -07:00
0.20.0rc0	Add multi-version documents (#3861)	2025-04-25 10:35:43 -07:00
latest	Add multi-version documents (#3861)	2025-04-25 10:35:43 -07:00
.nojekyll	update gh-pages (#3403)	2025-04-09 14:14:17 +08:00
index.html	doc: support main page redirection (#3870)	2025-04-25 10:50:50 -07:00