TensorRT-LLMs/triton_backend/all_models
Jhao-Ting Chen 92d90fa29a
[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-23 11:41:31 -06:00
disaggregated_serving [None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850) 2025-09-25 21:02:35 +08:00
gpt Move Triton backend to TRT-LLM main (#3549) 2025-05-16 07:15:23 +08:00
inflight_batcher_llm [None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018) 2025-12-23 11:41:31 -06:00
llmapi/tensorrt_llm feat: Add support for Triton request cancellation (#5898) 2025-07-15 20:52:43 -04:00
multimodal [https://nvbugs/5606136][ci] Remove tests for deprecating triton multimodal models. (#8926) 2025-11-06 17:58:42 -08:00
tests [https://nvbugs/5394409][feat] Support Mistral Small 3.1 multimodal in Triton Backend (#6714) 2025-08-21 18:08:38 +02:00
whisper/whisper_bls Move Triton backend to TRT-LLM main (#3549) 2025-05-16 07:15:23 +08:00