Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-02-04 18:21:52 +08:00
KVCacheManagerV2 is a new Python-based implementation of the KV cache manager, featuring a cleaner API, better abstraction, and better code quality, without the accumulated legacy.

Signed-off-by: Yao Yao <lowsfer@users.noreply.github.com>

| Name |
|---|
| kv_cache_manager_v2 |
| memory_pools |
| processor_wrapper |
| __init__.py |
| enc_dec_model_runner.py |
| generation.py |
| kv_cache_manager.py |
| medusa_utils.py |
| model_runner_cpp.py |
| model_runner.py |
| multimodal_model_runner.py |
| redrafter_utils.py |
| session.py |