TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Venky bb02d86b54 test(perf): Add some `Llama-3_3-Nemotron-Super-49B-v1` integration-perf-tests (TRT flow, trtllm-bench) (#4128)	2025-05-19 12:00:48 -07:00

* changes to run llama-v3.3-nemotron-super-49b
* yapf
* address review comments pt 1
* re-add cpp super tests

Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>

__init__.py	Update TensorRT-LLM	2024-08-20 18:55:15 +08:00
build.py	test(perf): Add some `Llama-3_3-Nemotron-Super-49B-v1` integration-perf-tests (TRT flow, trtllm-bench) (#4128)	2025-05-19 12:00:48 -07:00
dataclasses.py	feat: adding multimodal (only image for now) support in trtllm-bench (#3490)	2025-04-18 07:06:16 +08:00
tuning.py	Update TensorRT-LLM (#2849)	2025-03-04 18:44:00 +08:00
utils.py	Update TensorRT-LLM (#2532)	2024-12-04 21:16:56 +08:00