TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
_cpp_gen update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
_downloads update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
_modules update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
_sources update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
_static update gh-pages (#2168) 2024-08-30 13:09:14 +08:00
advanced update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
architecture update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
blogs update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
commands update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
installation update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
llm-api update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
llm-api-examples update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
performance update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
python-api update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
reference update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
.nojekyll gh-pages for release/0.5.0 2023-10-19 12:25:48 +00:00
executor.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
genindex.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
index.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
key-features.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
kv_cache_reuse.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
objects.inv update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
overview.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
py-modindex.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
quick-start-guide.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
release-notes.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
search.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
searchindex.js update gh-pages (#2271) 2024-09-30 17:25:23 +08:00
speculative_decoding.html update gh-pages (#2271) 2024-09-30 17:25:23 +08:00