TensorRT-LLM/cpp/include/tensorrt_llm

Latest commit: 1dc49b266e by Jiayu Chang, 2026-01-22 14:01:18 +01:00
[https://nvbugs/5322131][feat] Multi-LoRA serving with CUDA Graph (#8279)
Signed-off-by: Jiayu Chang <jiayuc@nvidia.com>
Directory       Latest commit                                                                                                   Date
batch_manager   [https://nvbugs/5322131][feat] Multi-LoRA serving with CUDA Graph (#8279)                                       2026-01-22 14:01:18 +01:00
common          [None][fix] Introduce inline namespace to avoid symbol collision (#9541)                                        2025-12-12 23:32:15 +08:00
deep_gemm       [None][chroe] Rename TensorRT-LLM to TensorRT LLM for source code. (#7851)                                      2025-09-25 21:02:35 +08:00
executor        [TRTLLM-10059][feat] Use global unique id as disagg request id (#10187)                                         2026-01-21 22:52:34 -05:00
kernels         [None][fix] Introduce inline namespace to avoid symbol collision (#9541)                                        2025-12-12 23:32:15 +08:00
layers          [None][feat] Support ignored prompt length for penalties via new sampling config parameter (#8127)              2025-10-27 13:12:31 -04:00
plugins/api     Update TensorRT-LLM (#2532)                                                                                     2024-12-04 21:16:56 +08:00
runtime         [TRTLLM-9465][fix] Swap TP-CP grouping order (#10350)                                                           2026-01-05 20:08:03 +08:00