TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-13 22:18:36 +08:00

Jhao-Ting Chen 92d90fa29a [None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018) Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>		2025-12-23 11:41:31 -06:00
disaggregated_serving	[None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850)	2025-09-25 21:02:35 +08:00
gpt	Move Triton backend to TRT-LLM main (#3549)	2025-05-16 07:15:23 +08:00
inflight_batcher_llm	[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018)	2025-12-23 11:41:31 -06:00
llmapi/tensorrt_llm	feat: Add support for Triton request cancellation (#5898)	2025-07-15 20:52:43 -04:00
multimodal	[https://nvbugs/5606136][ci] Remove tests for deprecating triton multimodal models. (#8926)	2025-11-06 17:58:42 -08:00
tests	[https://nvbugs/5394409][feat] Support Mistral Small 3.1 multimodal in Triton Backend (#6714)	2025-08-21 18:08:38 +02:00
whisper/whisper_bls	Move Triton backend to TRT-LLM main (#3549)	2025-05-16 07:15:23 +08:00