TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-13 22:18:36 +08:00

Jhao-Ting Chen 92d90fa29a [None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018) Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>	2025-12-23 11:41:31 -06:00
client	[Chore] Replace MODEL_CACHE_DIR with LLM_MODELS_ROOT and unwaive triton_server/test_triton.py::test_gpt_ib[gpt-ib] (#5859)	2025-07-21 15:46:37 -07:00
cmake	Move Triton backend to TRT-LLM main (#3549)	2025-05-16 07:15:23 +08:00
scripts	[None][fix] Fix build of tritonbuild/tritonrelease image (#7003)	2025-09-01 11:02:31 +08:00
src	[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018)	2025-12-23 11:41:31 -06:00
tests	Move Triton backend to TRT-LLM main (#3549)	2025-05-16 07:15:23 +08:00
CMakeLists.txt	[TRTLLM-9197][infra] Move thirdparty stuff to it's own listfile (#8986)	2025-11-20 16:44:23 -08:00