TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-01 08:41:13 +08:00

Robin Kobus 8dfa31c71d refactor: remove batch_manager::KvCacheConfig and use executor::KvCacheConfig instead (#5384) Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>		2025-06-26 19:45:52 +08:00
all_models	[TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default (#5312)	2025-06-20 03:01:10 +08:00
ci	Move Triton backend to TRT-LLM main (#3549)	2025-05-16 07:15:23 +08:00
inflight_batcher_llm	refactor: remove batch_manager::KvCacheConfig and use executor::KvCacheConfig instead (#5384)	2025-06-26 19:45:52 +08:00
scripts	feat: add multi-node support for Triton with pytorch backend (#5172)	2025-06-13 13:27:58 -07:00
tools	[nvbug 5283506] fix: Fix spec decode triton test (#4845)	2025-06-09 08:40:17 -04:00
requirements.txt	Move Triton backend to TRT-LLM main (#3549)	2025-05-16 07:15:23 +08:00