TensorRT-LLMs/tensorrt_llm/bench/benchmark
Frank 1e317c98c6
[feat]: Allow for a settable end-of-sequence/padding token in max throughput benchmark. (#3776)
* Move world options to a different group for clarity.
* Add eos_id option.

Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-05-01 09:42:46 +08:00
utils Add smart router for moe (#3641) 2025-04-23 12:21:59 +08:00
__init__.py Update TensorRT-LLM (#2389) 2024-10-29 22:24:38 +08:00
low_latency.py chore: refactor the LlmArgs with Pydantic and migrate remaining pybinding configs to python (#3025) 2025-04-05 13:31:48 +08:00
throughput.py [feat]: Allow for a settable end-of-sequence/padding token in max throughput benchmark. (#3776) 2025-05-01 09:42:46 +08:00