TensorRT-LLMs/tensorrt_llm/_torch/attention_backend
HuiGao-NV 97674c3114
[TRTLLM-8690][feat] add more tensors to share buffers (#8691)
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-11-03 21:08:01 -08:00
sparse [TRTLLM-8690][feat] add more tensors to share buffers (#8691) 2025-11-03 21:08:01 -08:00
__init__.py [TRTLLM-8536][feat] Add the sparse attention framework and one use case--RocketKV support (#8086) 2025-10-14 08:23:16 -07:00
flashinfer.py [TRTLLM-8690][feat] add more tensors to share buffers (#8691) 2025-11-03 21:08:01 -08:00
interface.py [TRTLLM-8690][feat] add more tensors to share buffers (#8691) 2025-11-03 21:08:01 -08:00
star_flashinfer.py Remove dummy forward path (#3669) 2025-04-18 16:17:50 +08:00
trtllm.py [TRTLLM-8690][feat] add more tensors to share buffers (#8691) 2025-11-03 21:08:01 -08:00
utils.py [TRTLLM-8535][feat] Support DeepSeek V3.2 with FP8 + BF16 KV cache/NVFP4 + BF16 KV cache (#8405) 2025-10-24 13:40:41 -04:00
vanilla.py [TRTLLM-8536][feat] Add the sparse attention framework and one use case--RocketKV support (#8086) 2025-10-14 08:23:16 -07:00