Pull requests: Dao-AILab/flash-attention
[ROCM] Add support with Infinity Cache (LLC) awareness for improved performance
#2147 opened Jan 7, 2026 by tianwyan
[CUTE][SM100] Fix backward gqa on sm100 post mask-mod semantic change
#2146 opened Jan 7, 2026 by drisspg
Add FLASH_ATTENTION_FORCE_NON_STABLE_API option to allow building on NVidia Pytorch 25.09 image
#2140 opened Jan 5, 2026 by jp-gr
[Cute] API update to include FlexAttention parameters
#2138 opened Jan 5, 2026 by reubenconducts
Fix AMD Triton backend crash when dropout != 0 and return_attn_probs …
#2111 opened Dec 30, 2025 by Logiquo
[Cute,Fwd,Sm100] distributed offset calculation for paged KV
#2104 opened Dec 28, 2025 by timmy-feng
[Cute,Fwd,Sm80] Fix fwd + add pack_gqa/local/learnable_sink
#2103 opened Dec 28, 2025 by GWS0428
[Cute,Fwd,SM100] Add softmax precision control parameters: rescale_threshold and disable_e2e
#2067 opened Dec 15, 2025 by ssxuwinter