TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

qsang-nv 0f42a24f45 [None][feat] Fix attention sink load in xqa (#8836) Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>	2025-11-03 09:39:45 +08:00
fmha_v2	[None][feat] Add fmha_v2 kernel for head_dim=80 and sm=100 to support VLM (#8392)	2025-10-17 19:42:47 +08:00
xqa	[None][feat] Fix attention sink load in xqa (#8836)	2025-11-03 09:39:45 +08:00