Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-14 06:27:45 +08:00)
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
_cpp_gen/
_modules/
_sources/
_static/
blogs/
python-api/
.nojekyll
2023-05-17-how-to-add-a-new-model.html
2023-05-19-how-to-debug.html
architecture.html
batch_manager.html
build_from_source.html
genindex.html
gpt_attention.html
gpt_runtime.html
graph-rewriting.html
index.html
inference_request.html
lora.html
memory.html
new_workflow.html
objects.inv
perf_best_practices.html
performance_analysis.html
performance.html
precision.html
py-modindex.html
search.html
searchindex.js