TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

xavier-nvidia b6013da198 Fix GEMM+AR fusion on blackwell (#5563) Signed-off-by: xsimmons <xsimmons@nvidia.com>		2025-07-09 08:48:47 +08:00
batch_manager	feat: Optimize TRTLLM Sampler perf single beam single step (#5550)	2025-07-07 15:44:47 +02:00
common	[TRTLLM-4987][feat] Partial support of context logits in TRTLLMSampler (#4538)	2025-06-01 03:32:43 +08:00
executor	feat: KV events for sliding window attention (#5580)	2025-07-05 06:05:20 +08:00
runtime	refactor: decoding inputs (#5679)	2025-07-06 08:21:02 +02:00
testing	fix: Improve chunking test and skip empty kernel calls (#5710)	2025-07-04 09:08:15 +02:00
userbuffers	fix: Move all casters to customCasters. (#3945)	2025-05-02 19:08:28 +08:00
bindings.cpp	refactor: remove batch_manager::KvCacheConfig and use executor::KvCacheConfig instead (#5384)	2025-06-26 19:45:52 +08:00
CMakeLists.txt	Fix GEMM+AR fusion on blackwell (#5563)	2025-07-09 08:48:47 +08:00