TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-08 20:21:48 +08:00

Jhao-Ting Chen 92d90fa29a [None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018) Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>	2025-12-23 11:41:31 -06:00
disaggregated_serving
gpt
inflight_batcher_llm	[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018)	2025-12-23 11:41:31 -06:00
llmapi/tensorrt_llm
multimodal	[https://nvbugs/5606136][ci] Remove tests for deprecating triton multimodal models. (#8926)	2025-11-06 17:58:42 -08:00
tests
whisper/whisper_bls