TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-13 22:18:36 +08:00

Jhao-Ting Chen 92d90fa29a [None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018) Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>	2025-12-23 11:41:31 -06:00
all_models	[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018)	2025-12-23 11:41:31 -06:00
ci	[TRI-332] [fix] Fix L0_backend_trtllm (#9282)	2025-11-20 18:55:37 -08:00
inflight_batcher_llm	[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018)	2025-12-23 11:41:31 -06:00
scripts	[nvbug/5308432] fix: extend triton exit time for test_llava (#5971)	2025-07-12 12:56:37 +09:00
tools	[None][chroe] Rename TensorRT-LLM to TensorRT LLM for source code. (#7851)	2025-09-25 21:02:35 +08:00
requirements.txt	[TRTLLM-8310][feat] Add Qwen3-VL-MoE (#9689)	2025-12-15 20:05:20 -08:00