TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-19 01:05:12 +08:00

Jhao-Ting Chen 92d90fa29a [None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018) Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>	2025-12-23 11:41:31 -06:00
rcca/bug_4323566
__init__.py
build_engines.py	[https://nvbugs/5394409][feat] Support Mistral Small 3.1 multimodal in Triton Backend (#6714)	2025-08-21 18:08:38 +02:00
build_model.sh
common.py	[https://nvbugs/5394409][feat] Support Mistral Small 3.1 multimodal in Triton Backend (#6714)	2025-08-21 18:08:38 +02:00
conftest.py	[TRTLLM-5950][infra] Removing remaining turtle keywords from the code base (#7086)	2025-09-07 14:26:18 +08:00
local_venv.py
runner_interface.py
test_list_parser.py	[None][feat] add waive by sm version (#8928)	2025-11-05 19:20:43 -08:00
test_triton_llm.py	[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018)	2025-12-23 11:41:31 -06:00
test_triton_memleak.py
test_triton_multi_node.py
test_triton_rcca.py
test_triton.py	[TRTLLM-6224][infra] Upgrade dependencies to DLFW 25.06 and CUDA 12.9.1 (#5678)	2025-08-03 11:18:59 +08:00
test.sh	[https://nvbugs/5394409][feat] Support Mistral Small 3.1 multimodal in Triton Backend (#6714)	2025-08-21 18:08:38 +02:00
trt_test_alternative.py