Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-01-14 06:27:45 +08:00.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
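The sketch below illustrates how this Python API is typically driven, assuming a recent `tensorrt_llm` release that exposes the high-level `LLM` entry point; the model name and prompts are illustrative placeholders, not part of this repository.

```python
# Minimal sketch of the high-level TensorRT-LLM Python API (LLM API).
# Assumes `tensorrt_llm` is installed on a machine with a supported NVIDIA GPU;
# the checkpoint name and prompts are illustrative placeholders.
from tensorrt_llm import LLM, SamplingParams

def main():
    prompts = ["Hello, my name is", "The capital of France is"]
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

    # Constructing the LLM object builds (or loads) the optimized TensorRT
    # engine for the named checkpoint behind the scenes.
    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

    # Run inference on the built engine and print the generated continuations.
    for output in llm.generate(prompts, sampling_params):
        print(f"{output.prompt!r} -> {output.outputs[0].text!r}")

if __name__ == "__main__":
    main()
```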
| Name |
|---|
| _cpp_gen |
| _downloads/ea8faa5e98124e92f96b66dc586fb429 |
| _sources |
| _static |
| advanced |
| architecture |
| blogs |
| installation |
| performance |
| python-api |
| reference |
| .nojekyll |
| executor.html |
| genindex.html |
| index.html |
| kv_cache_reuse.html |
| objects.inv |
| overview.html |
| quick-start-guide.html |
| release-notes.html |
| search.html |
| searchindex.js |
| speculative_decoding.html |