Commit Graph

6 Commits

Author SHA1 Message Date
kris1025
6a3a921284
[TRTLLM-6685][feat] Add speculative metrics for trt llm bench (#6476)
Signed-off-by: linquanh <linquanh@nvidia.com>
2025-08-04 15:22:57 -07:00
Frank
cf15efa15e
[TRTLLM-4883][fix]: Update output speed calculation. (#3923)
* Update gen tps calculation.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Add back output speed for comparison.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Fix issue with f-string.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Fix some spacing.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Replace output speed with per-request genphase tput.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Add gen TPS breakdown.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

* Update some tagging.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>

---------

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-04-29 11:04:12 +08:00
Frank
f8a4cc0629
perf: Add total token throughput metric. (#3212)
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-04-05 13:17:59 +08:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM (#2936)
* Update TensorRT-LLM

---------

Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00
Kaiyu Xie
77d7fe1eb2
Update TensorRT-LLM (#2849)
* Update TensorRT-LLM

---------

Co-authored-by: aotman <chenhangatm@gmail.com>
2025-03-04 18:44:00 +08:00
Kaiyu Xie
385626572d
Update TensorRT-LLM (#2502)
* Update TensorRT-LLM

---------

Co-authored-by: 岑灿 <yunyi.hyy@alibaba-inc.com>
2024-11-26 16:51:34 +08:00