TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00


rakib-hasan ff3b741045 feat: adding multimodal (only image for now) support in trtllm-bench (#3490)	2025-04-18 07:06:16 +08:00
* feat: adding multimodal (only image for now) support in trtllm-bench
* fix: add in load_dataset() calls to maintain the v2.19.2 behavior
* re-adding prompt_token_ids and using that for prompt_len
* updating the datasets version in examples as well
* api changes are not needed
* moving datasets requirement and removing a missed api change
* addressing review comments
* refactoring the quickstart example
Signed-off-by: Rakib Hasan <rhasan@nvidia.com>
_templates	Update TensorRT-LLM (#1725)	2024-06-04 20:26:32 +08:00
advanced	disagg perf tune doc (#3516)	2025-04-14 16:05:06 +08:00
architecture	Update TensorRT-LLM (#2562)	2024-12-11 00:31:05 -08:00
blogs	Update README and add benchmarking blog for DeepSeek-R1 (#3232)	2025-04-10 17:00:49 +08:00
commands	doc: add genai-perf benchmark & slurm multi-node for trtllm-serve doc (#3407)	2025-04-16 00:11:58 +08:00
dev-on-cloud	doc: add doc about development on cloud or runpod (#3194)	2025-04-02 18:10:56 +08:00
examples	doc: refactor trtllm-serve examples and doc (#3187)	2025-04-04 11:40:43 +08:00
installation	relax the limitation of setuptools (#2992)	2025-03-24 13:36:10 +08:00
llm-api	Update TensorRT-LLM (#2755)	2025-02-11 03:01:00 +00:00
media	L4 added to readme (#3301)	2025-04-06 19:09:28 +08:00
performance	feat: adding multimodal (only image for now) support in trtllm-bench (#3490)	2025-04-18 07:06:16 +08:00
python-api	Update TensorRT-LLM (#1492)	2024-04-24 14:44:22 +08:00
reference	chore: Mass integration of release/0.18 (#3421)	2025-04-16 10:03:29 +08:00
torch	feat: no-cache attention in PyTorch workflow (#3085)	2025-04-05 01:54:32 +08:00
conf.py	doc: add genai-perf benchmark & slurm multi-node for trtllm-serve doc (#3407)	2025-04-16 00:11:58 +08:00
helper.py	doc: refactor trtllm-serve examples and doc (#3187)	2025-04-04 11:40:43 +08:00
index.rst	doc: refactor trtllm-serve examples and doc (#3187)	2025-04-04 11:40:43 +08:00
key-features.md	Update TensorRT-LLM (#2562)	2024-12-11 00:31:05 -08:00
overview.md	chore: Mass integration of release/0.18 (#3421)	2025-04-16 10:03:29 +08:00
quick-start-guide.md	Update (#2978)	2025-03-23 16:39:35 +08:00
release-notes.md	chore: Mass integration of release/0.18 (#3421)	2025-04-16 10:03:29 +08:00
torch.md	Update TensorRT-LLM (#2820)	2025-02-25 21:21:49 +08:00