TensorRT-LLMs/tensorrt_llm/bench/benchmark
Suyog Gupta 047f2b234d
perf: [AutoDeploy] Enable AutoDeploy as a backend in trtllm-bench (#3041)
* Enable AutoDeploy as a backend in trtllm-bench
* update how caches are resized
* fix: files permission from 100755 to 100644
* some comments
* lint
* lint
* lint
* lint
* Fix function name
* refactor
* Remove spurious change
* Add cursor generated doc strings
* re-enable ad test
* some perf cleanup
* debug ci
* ensure that overlap scheduler is enabled
* Reorder the tests

---------

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-26 14:33:14 -07:00
utils            perf: [AutoDeploy] Enable AutoDeploy as a backend in trtllm-bench (#3041)    2025-03-26 14:33:14 -07:00
__init__.py      Update TensorRT-LLM (#2389)                                                  2024-10-29 22:24:38 +08:00
low_latency.py   Update TensorRT-LLM (#2873)                                                  2025-03-11 21:13:42 +08:00
throughput.py    perf: [AutoDeploy] Enable AutoDeploy as a backend in trtllm-bench (#3041)    2025-03-26 14:33:14 -07:00