
Simplify GQA conditions in sdpa_attention.py #41699

Merged
vasqu merged 1 commit into huggingface:main from justinchuby:patch-3
Oct 17, 2025

Conversation

@justinchuby
Contributor

Removed unnecessary checks for key being a torch.fx.Proxy in GQA conditions because fx tracing is no longer supported, and torch.export supports enable_gqa.
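
For context, a minimal sketch of what the simplified helper could look like after dropping the Proxy check, assuming the three conditions quoted in the diff below (cuda or Ascend NPU device, torch >= 2.5, no attention mask). The version-gate expression and the `("cuda", "npu")` device check are illustrative assumptions, not the repository's exact code:

```python
from typing import Optional

import torch

# Hypothetical stand-in for transformers' internal torch version gate.
_TORCH_GE_2_5 = tuple(int(x) for x in torch.__version__.split(".")[:2]) >= (2, 5)


def use_gqa_in_sdpa(attention_mask: Optional[torch.Tensor], key: torch.Tensor) -> bool:
    # enable_gqa is only worthwhile when SDPA can pick the flash kernel:
    # torch >= 2.5, a cuda or Ascend NPU device, and no attention mask
    # (a mask forces SDPA to fall back to the math kernel). The old
    # torch.fx.Proxy check on `key` is gone since fx tracing support
    # was removed.
    return (
        _TORCH_GE_2_5
        and attention_mask is None
        and key.device.type in ("cuda", "npu")
    )
```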

@vasqu (Contributor) left a comment


Perfect, thank you! Makes more sense, especially since we removed fx support explicitly in #41683.

@vasqu enabled auto-merge (squash) October 17, 2025 16:27
@@ -32,13 +32,11 @@ def use_gqa_in_sdpa(attention_mask: Optional[torch.Tensor], key: torch.Tensor) -> bool:
# 1. cuda or Ascend NPU
# - torch version >= 2.5
# - attention_mask is None (otherwise it will fall back to the math kernel)
@justinchuby (Contributor, Author)

Is it required that attention_mask is None? Just wanting to check if the situation has changed.

@vasqu (Contributor)

Yup, at least on older versions it is! It was only recently relaxed.

@vasqu (Contributor), Oct 17, 2025

See the description here: https://docs.pytorch.org/docs/2.5/generated/torch.nn.functional.scaled_dot_product_attention.html (under GQA: only the flash and math kernels support it, and flash only works without a mask).

We want to avoid the math kernel.
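
For readers following along, a hedged sketch of the `enable_gqa` path that torch.export supports (available since torch 2.5; shapes are made up for illustration). With `attn_mask=None` the flash kernel stays eligible on CUDA, and the smaller number of KV heads is broadcast across the query heads internally, with no manual repeat of K/V:

```python
import torch
import torch.nn.functional as F

# Toy GQA shapes: 8 query heads sharing 2 KV heads.
q = torch.randn(1, 8, 16, 64)  # (batch, num_heads, seq_len, head_dim)
k = torch.randn(1, 2, 16, 64)  # (batch, num_kv_heads, seq_len, head_dim)
v = torch.randn(1, 2, 16, 64)

# attn_mask stays None so the flash kernel remains eligible; passing a
# mask would push the GQA case onto the slower math kernel.
out = F.scaled_dot_product_attention(q, k, v, attn_mask=None, enable_gqa=True)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```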

@vasqu merged commit 347a0f9 into huggingface:main Oct 17, 2025
22 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ngazagna-qc pushed a commit to ngazagna-qc/transformers that referenced this pull request Oct 23, 2025
Removed unnecessary checks for key being a torch.fx.Proxy in GQA conditions because fx tracing is no longer supported, and torch.export supports enable_gqa.
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026
Removed unnecessary checks for key being a torch.fx.Proxy in GQA conditions because fx tracing is no longer supported, and torch.export supports enable_gqa.