[Gemma2] Support FA2 softcapping #31887

Merged
ArthurZucker merged 3 commits into main from gemma-fa2 on Jul 11, 2024
Conversation

@ArthurZucker
Collaborator

What does this PR do?

Adds support for the new FA2 softcapping following Dao-AILab/flash-attention#1025
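For context, "softcapping" here refers to the tanh-based squashing Gemma 2 applies to attention logits before the softmax, which this PR lets flash-attention 2.6.0 handle inside the fused kernel. A minimal NumPy sketch of the operation itself (an illustrative standalone function, not the transformers or flash-attn implementation):

```python
import numpy as np

def softcap(logits: np.ndarray, cap: float) -> np.ndarray:
    """Soft-cap logits into the open interval (-cap, cap) via tanh.

    This is the elementwise transform cap * tanh(logits / cap): it is
    roughly the identity for |logits| << cap and saturates near +/-cap,
    preventing attention scores from growing without bound.
    """
    return cap * np.tanh(logits / cap)

# Large scores are squashed toward the cap instead of exploding.
scores = np.array([-100.0, -1.0, 0.0, 1.0, 100.0])
capped = softcap(scores, cap=50.0)
```

With FA2 >= 2.6.0 this transform can run inside the fused attention kernel instead of requiring the eager (unfused) path.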

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@LysandreJik left a comment


OK, looks good to me! 2.6.0 was released 3 hours ago, let's go

Contributor

@amyeroberts left a comment


LGTM - thanks for adding!

@ArthurZucker ArthurZucker merged commit f4ec7a2 into main Jul 11, 2024
@ArthurZucker ArthurZucker deleted the gemma-fa2 branch July 11, 2024 09:57
ArthurZucker added a commit that referenced this pull request Jul 11, 2024
* Support softcapping

* strictly greater than

* update
@ShadowTeamCN

Good to see this. Can we use it for model fine-tuning, or is it just for inference? Google recommends fine-tuning in 'eager' mode.

@ArthurZucker
Collaborator Author

Now you can use it for fine-tuning as well, if you have the correct version of FA2. Not sure if fine-tuning "requires" it, though.
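The softcap argument landed in flash-attention 2.6.0 (per the linked Dao-AILab PR), so gating on the installed version is the kind of check involved. A hypothetical helper for illustration (the function name and logic are assumptions, not transformers code):

```python
def supports_fa2_softcap(version_str: str) -> bool:
    """Return True if this flash-attn version string is >= 2.6.0,
    the release that added the softcap argument to the FA2 kernels.
    """
    # Compare only the numeric major.minor components; good enough
    # for plain "X.Y.Z"-style version strings.
    major, minor, *_ = (int(p) for p in version_str.split(".")[:3])
    return (major, minor) >= (2, 6)
```

For example, `supports_fa2_softcap("2.5.9")` is False while `supports_fa2_softcap("2.6.0")` is True.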

@heartkilla

Great! Any plans for sdpa support as well?

@ArthurZucker
Collaborator Author

SDPA is a bit more complicated; we need to use flex attention, and I did not have time to implement it. Do you want to open a PR?

@hiyouga
Contributor

hiyouga commented Jul 13, 2024

Hi @ArthurZucker, should we also add the sliding window and soft-capping to flash_attn_func?

```python
else:
    attn_output = flash_attn_func(
        query_states, key_states, value_states, dropout, softmax_scale=softmax_scale, causal=causal
    )
```

just like

```python
else:
    attn_output = flash_attn_func(
        query_states,
        key_states,
        value_states,
        dropout,
        softmax_scale=softmax_scale,
        causal=causal,
        window_size=(self.config.sliding_window, self.config.sliding_window),
    )
```

@ArthurZucker
Collaborator Author

ArthurZucker commented Jul 15, 2024

It should be on main: https://github.com/huggingface/transformers/blob/main/src/transformers/models/gemma2/modeling_gemma2.py#L361; we updated the whole FA2 integration.

On the release branch it was there, AFAIK.

@hiyouga
Contributor

hiyouga commented Jul 15, 2024

Got it, thanks for replying!
