Fix new FA2 if is_causal is passed explicitly #35390

Merged
Cyrilvallez merged 3 commits into main from gpt2-fix on Dec 22, 2024

@Cyrilvallez Cyrilvallez commented Dec 22, 2024

What does this PR do?

In GPT2 we have to pass is_causal explicitly for SDPA, but this makes the argument occur twice on the FA2 path, as highlighted in #35380. This PR fixes it.
It also reverts the reshape to a simple view for simplicity (the two are technically equivalent here, as reshape calls view whenever possible).
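
To illustrate the kind of failure described above, here is a minimal pure-Python sketch. It is not the code from this PR: `inner_fa2`, `fa2_wrapper_buggy`, and `fa2_wrapper_fixed` are hypothetical stand-ins, and popping the key from `kwargs` is only one possible way to avoid the duplicate. It assumes a wrapper that sets `is_causal` itself while also forwarding the caller's keyword arguments.

```python
def inner_fa2(query, key, value, is_causal=True, **kwargs):
    """Hypothetical stand-in for the low-level FA2 call."""
    return {"is_causal": is_causal, "extra": sorted(kwargs)}


def fa2_wrapper_buggy(query, key, value, **kwargs):
    # The wrapper sets is_causal itself AND forwards kwargs unchanged.
    # If the caller also passed is_causal, the keyword arrives twice and
    # Python raises TypeError ("got multiple values for keyword argument").
    return inner_fa2(query, key, value, is_causal=True, **kwargs)


def fa2_wrapper_fixed(query, key, value, **kwargs):
    # One possible guard (an assumption, not necessarily the actual patch):
    # take the key out of kwargs first so it is supplied exactly once.
    is_causal = kwargs.pop("is_causal", True)
    return inner_fa2(query, key, value, is_causal=is_causal, **kwargs)


def raises_type_error(fn, *args, **kwargs):
    """Return True if calling fn(*args, **kwargs) raises TypeError."""
    try:
        fn(*args, **kwargs)
    except TypeError:
        return True
    return False
```

With these definitions, `fa2_wrapper_buggy("q", "k", "v", is_causal=True)` fails at call time, while `fa2_wrapper_fixed` accepts the same arguments.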

@Cyrilvallez
Member Author

cc @ArthurZucker, this should be included in the coming release!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ArthurZucker ArthurZucker left a comment

Okay thanks

@Cyrilvallez Cyrilvallez merged commit 05260a1 into main Dec 22, 2024
@Cyrilvallez Cyrilvallez deleted the gpt2-fix branch December 22, 2024 19:00
@vasqu
Contributor

vasqu commented Dec 22, 2024

@Cyrilvallez not sure whether view is going to cause issues: non-contiguous tensors will result in an error (which really is the only reason to use reshape instead), so calling contiguous first is safer imo.
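
The concern about non-contiguous tensors can be reproduced with a small PyTorch sketch. The tensor shapes and the `try_view` helper are illustrative assumptions and are not taken from the GPT2 code. The sketch shows `view` failing on a transposed tensor, while `reshape` and `contiguous().view(...)` both succeed and agree.

```python
import torch

# A contiguous tensor and a transposed (non-contiguous) view of it.
x = torch.arange(24).reshape(2, 3, 4)
y = x.transpose(1, 2)  # shape (2, 4, 3), strides no longer row-major


def try_view(t, shape):
    """Return t.view(shape), or None if view raises RuntimeError."""
    try:
        return t.view(shape)
    except RuntimeError:
        return None


# view cannot merge the last two dims of y without copying, so it fails.
failed_view = try_view(y, (2, 12))

# reshape falls back to a copy; contiguous() + view does the same explicitly.
via_reshape = y.reshape(2, 12)
via_contiguous = y.contiguous().view(2, 12)

# On an already contiguous tensor, reshape returns a view of the same storage.
shares_storage = x.reshape(2, 12).data_ptr() == x.data_ptr()
```

If the tensor is already contiguous at the point where it is flattened, `view` and `reshape` behave identically; the difference only shows up on inputs like `y`.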

@Cyrilvallez
Member Author

We were using 'view' without 'contiguous' before the refactors, so this should be fine!

@vasqu
Contributor

vasqu commented Dec 22, 2024

Sounds good! Was just overly cautious then :)

@Cyrilvallez
Member Author

No worries, thanks for double-checking!
