Kernels flash attn #39474
Merged
46 commits
cd4c7cb
use partial to wrap around `transformers` utils!
ArthurZucker 005f482
try to refactor?
ArthurZucker 1b834a4
revert one wrong change
ArthurZucker d93f366
just a nit
ArthurZucker 2b7d411
push
ArthurZucker affba20
reverter watever was wrong!
ArthurZucker 1959eb2
some nits
ArthurZucker 888cd40
fixes when there is no attention mask
ArthurZucker 8f5e62b
Merge branch 'main' of github.com:huggingface/transformers into kerne…
ArthurZucker 5a7ae11
bring the licence back
ArthurZucker c57673b
some fixes
ArthurZucker 7d69d83
nit
ArthurZucker 7e94910
Merge branch 'kernels-flash-attn' of github.com:huggingface/transform…
ArthurZucker 112e2a6
style
ArthurZucker 501aa7e
remove prints
ArthurZucker 04088be
correct dtype
ArthurZucker b1e104b
fa flags for testing
vasqu 7087e7b
update
ArthurZucker cc58aca
Merge branch 'main' into kernels-flash-attn
ArthurZucker 6a2996a
use paged attention if requested!
ArthurZucker 8ddc525
Merge branch 'kernels-flash-attn' of github.com:huggingface/transform…
ArthurZucker a586294
updates
ArthurZucker 57842f5
a clone was needed, not sure why
ArthurZucker 43b7f32
automatically create cu seq lens when input is flash, this at least m…
ArthurZucker 12bad1b
simplify and improve?
ArthurZucker c0b600a
flash attention is kinda broken on recent cuda version so allow the o…
ArthurZucker 5c64874
Merge branch 'main' into kernels-flash-attn
ArthurZucker 11e5000
fix!
ArthurZucker 1c07350
protect kernels import
ArthurZucker cdaa1eb
update
ArthurZucker 767d585
properly parse generation config being passed
ArthurZucker 10f866e
Merge branch 'kernels-flash-attn' of github.com:huggingface/transform…
ArthurZucker c75c539
revert and update
ArthurZucker a2f3126
add two tests
ArthurZucker 63b01c3
Merge branch 'main' of github.com:huggingface/transformers into kerne…
ArthurZucker 85829d7
some fixes
ArthurZucker 56981a5
fix test FA2
ArthurZucker b3f7a49
takes comment into account
ArthurZucker 21e07f7
fixup
ArthurZucker a8b7ec6
revert changes
ArthurZucker f111d33
revert the clone, it is only needed because the metal kernel is not d…
ArthurZucker cd98c1f
[docs] update attention implementation and cache docs (#39547)
zucchini-nlp f457a08
fix mps on our side for now
ArthurZucker 38d241b
Update src/transformers/integrations/flash_paged.py
ArthurZucker cb58187
Merge branches 'main' and 'kernels-flash-attn' of github.com:huggingf…
ArthurZucker c0f4f09
no qa
ArthurZucker