Fix AttributeError on s_aux=None in flash_attention_forward #45589

Merged: ArthurZucker merged 1 commit into huggingface:main from jamesbraza:fix/s-aux-none-guard-45588 on Apr 23, 2026

Conversation

@jamesbraza
Contributor

`flash_attention_forward` unconditionally called `s_aux.to(query.dtype)`,
which crashed with `AttributeError: 'NoneType' object has no attribute 'to'`
for models that don't use attention sinks (e.g. Gemma). Mirrors the parallel
guard added in huggingface#40434 for `flash_paged.py`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
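
For readers skimming the thread, the failure and the guard described above can be sketched in isolation. This is an illustrative sketch only, not the merged diff: `FakeTensor`, `cast_s_aux_unguarded`, and `cast_s_aux_guarded` are hypothetical names, and `FakeTensor` is a stand-in that imitates only the `.to(dtype)` call so the snippet runs without PyTorch.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for a tensor: it only tracks a dtype name."""

    dtype: str

    def to(self, dtype: str) -> "FakeTensor":
        # Imitates a dtype cast by returning a copy with the new dtype.
        return replace(self, dtype=dtype)


def cast_s_aux_unguarded(query: FakeTensor, s_aux: Optional[FakeTensor]) -> FakeTensor:
    # Pattern described as the bug: raises AttributeError when s_aux is None.
    return s_aux.to(query.dtype)


def cast_s_aux_guarded(query: FakeTensor, s_aux: Optional[FakeTensor]) -> Optional[FakeTensor]:
    # Guarded pattern: cast only when attention sinks are actually provided.
    return s_aux.to(query.dtype) if s_aux is not None else None


if __name__ == "__main__":
    query = FakeTensor(dtype="bfloat16")
    print(cast_s_aux_guarded(query, None))
    print(cast_s_aux_guarded(query, FakeTensor(dtype="float32")))
    try:
        cast_s_aux_unguarded(query, None)
    except AttributeError as err:
        print(f"unguarded call failed: {err}")
```

With the guard, sink-less models pass `None` through untouched, while models that do supply sinks still get them cast to the query dtype.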
@vasqu added the `for patch` label Apr 23, 2026
Collaborator

@ArthurZucker left a comment

Yep let's patch asap

@ArthurZucker merged commit a804c74 into huggingface:main Apr 23, 2026
25 checks passed
@jamesbraza deleted the fix/s-aux-none-guard-45588 branch April 23, 2026 07:41
ArthurZucker pushed a commit that referenced this pull request Apr 23, 2026
…5589)

Guard `s_aux` cast in `flash_attention_forward` for sink-less models

`flash_attention_forward` unconditionally called `s_aux.to(query.dtype)`,
which crashed with `AttributeError: 'NoneType' object has no attribute 'to'`
for models that don't use attention sinks (e.g. Gemma). Mirrors the parallel
guard added in #40434 for `flash_paged.py`.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tarekziade pushed a commit that referenced this pull request Apr 23, 2026

Labels

for patch: Tag issues / labels that should be included in the next patch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

integrations/flash_attention.py crashes with AttributeError on s_aux=None for sink-less models

4 participants