Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-02-16 15:55:08 +08:00)
[None][chore] AutoDeploy: Set nanov3 and superv3 configs to use flashinfer ssm (#11183)
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
parent d90a8e5700
commit 767b8dcab3
@@ -45,3 +45,5 @@ transforms:
   fuse_mamba_a_log:
     stage: post_load_fusion
     enabled: true
+  insert_cached_ssm_attention:
+    backend: flashinfer_ssm
@@ -44,3 +44,5 @@ transforms:
   fuse_mamba_a_log:
     stage: post_load_fusion
     enabled: true
+  insert_cached_ssm_attention:
+    backend: flashinfer_ssm