TensorRT-LLM/tensorrt_llm/_torch/attention_backend
Latest commit: ea3e0eea51 by sunnyqgg — [TRTLLM-7954][feat] Target model KV cache rellocation (#8421)
Signed-off-by: qgai <qgai@nvidia.com>
2025-10-23 09:36:50 +08:00
Name                 Last commit                                                                                        Date
sparse/              [TRTLLM-7954][feat] Target model KV cache rellocation (#8421)                                      2025-10-23 09:36:50 +08:00
__init__.py          [TRTLLM-8536][feat] Add the sparse attention framework and one use case--RocketKV support (#8086)  2025-10-14 08:23:16 -07:00
flashinfer.py        [None][chore] Mass integration of release/1.0 - 3rd (#7519)                                        2025-09-08 14:03:04 +08:00
interface.py         [TRTLLM-7954][feat] Target model KV cache rellocation (#8421)                                      2025-10-23 09:36:50 +08:00
star_flashinfer.py   Remove dummy forward path (#3669)                                                                  2025-04-18 16:17:50 +08:00
trtllm.py            [TRTLLM-7954][feat] Target model KV cache rellocation (#8421)                                      2025-10-23 09:36:50 +08:00
utils.py             [TRTLLM-8536][feat] Add the sparse attention framework and one use case--RocketKV support (#8086)  2025-10-14 08:23:16 -07:00
vanilla.py           [TRTLLM-8536][feat] Add the sparse attention framework and one use case--RocketKV support (#8086)  2025-10-14 08:23:16 -07:00