TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

qixiang-99 ca7b6ec8d8 Feat/pytorch vswa kvcachemanager (#5151) Signed-off-by: qixiang-99 <203170375+qixiang-99@users.noreply.github.com>	2025-07-02 15:58:00 +08:00
batch_manager	Feat/pytorch vswa kvcachemanager (#5151)	2025-07-02 15:58:00 +08:00
common	[TRTLLM-4987][feat] Partial support of context logits in TRTLLMSampler (#4538)	2025-06-01 03:32:43 +08:00
executor	[TRTLLM-6104] feat: add request_perf_metrics to LLMAPI (#5497)	2025-06-27 17:03:05 +02:00
runtime	refactor: decoder state setup (#5093)	2025-06-30 11:09:43 +02:00
testing	refactor: Move ModelSpec to core library (#3980)	2025-05-04 01:39:09 +08:00
userbuffers	fix: Move all casters to customCasters. (#3945)	2025-05-02 19:08:28 +08:00
bindings.cpp	refactor: remove batch_manager::KvCacheConfig and use executor::KvCacheConfig instead (#5384)	2025-06-26 19:45:52 +08:00
CMakeLists.txt	feat: large-scale EP(part 2: MoE Load Balancer - core utilities) (#4384)	2025-05-20 17:53:48 +08:00