TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Ziyi Xiong de472828b9 [TRTLLM-6637][feat] Resolve KV cache divergence issue (#6628) Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>		2025-08-09 23:15:04 +08:00
batch_manager	[TRTLLM-6637][feat] Resolve KV cache divergence issue (#6628)	2025-08-09 23:15:04 +08:00
common	[None] [feat] Add model gpt-oss (#6645)	2025-08-07 03:04:18 -04:00
deep_gemm	fix: fix license bug (#5200)	2025-06-13 18:58:15 +08:00
executor	[TRTLLM-6881][feat] Include attention dp rank info with KV cache events (#6563)	2025-08-07 14:17:07 +02:00
kernels	fix: compatibility with CUDA < 12.9 on `__CUDA_ARCH_SPECIFIC__` macro (#5917)	2025-07-28 16:02:26 +08:00
layers	v1.2 (#3082)	2025-03-26 23:31:29 +08:00
plugins/api	Update TensorRT-LLM (#2532)	2024-12-04 21:16:56 +08:00
runtime	[TRTLLM-6785][feat] BREAKING CHANGE Enable TRTLLM sampler by default (#6216)	2025-08-07 22:19:37 -04:00