Mamba & RecurrentGemma: enable strict signature by gante · Pull Request #31549 · huggingface/transformers

gante · 2024-06-22T11:27:46Z

What does this PR do?

Mamba accepts **kwargs, so attention_mask can be passed without raising an error. Many users therefore assume it behaves like other models and supports left-padding.

RecurrentGemma also accepts **kwargs, but only so that generate does not crash.

This PR enables a strict signature on Mamba and RecurrentGemma.
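
The difference can be illustrated with a minimal sketch in plain Python. The two functions below are hypothetical stand-ins, not the actual Mamba or RecurrentGemma forward methods. They only show how a `**kwargs` catch-all silently discards `attention_mask`, while a strict signature fails loudly.

```python
def permissive_forward(input_ids, use_cache=None, **kwargs):
    # Hypothetical stand-in: extra keyword arguments are accepted and ignored.
    return {"input_ids": input_ids, "ignored": sorted(kwargs)}


def strict_forward(input_ids, use_cache=None):
    # Hypothetical stand-in: only the declared arguments are accepted.
    return {"input_ids": input_ids}


# A left-padded batch, written as nested lists instead of tensors.
batch = {"input_ids": [[0, 0, 5, 7]], "attention_mask": [[0, 0, 1, 1]]}

# The padding mask is swallowed without any warning.
permissive_result = permissive_forward(**batch)

# The strict version surfaces the problem as a TypeError.
try:
    strict_forward(**batch)
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)
```

With the strict signature, a caller who passes a padding mask learns immediately that the model does not consume it.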

gante · 2024-06-22T11:30:29Z

        use_cache: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
-        **kwargs,  # `attention_mask` is passed by the tokenizer and we don't want it


Alternatively, we can accept attention_mask and raise an exception when it is neither None nor all ones.
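
A rough sketch of that alternative, reading the suggestion as "accept `None` or an all-ones mask, reject anything else". The helper name `check_attention_mask` is hypothetical, and nested lists of 0/1 stand in for a tensor to keep the example dependency-free. This is not the approach the PR takes.

```python
def check_attention_mask(attention_mask=None):
    """Hypothetical helper: reject masks that contain padding positions."""
    if attention_mask is None:
        return
    # Any value other than 1 marks a padded position the model cannot handle.
    if not all(value == 1 for row in attention_mask for value in row):
        raise ValueError(
            "attention_mask contains padding, which this model does not support"
        )


check_attention_mask(None)                # no mask: accepted
check_attention_mask([[1, 1, 1, 1]])      # all ones: accepted

try:
    check_attention_mask([[0, 0, 1, 1]])  # left-padded: rejected
    rejected = False
except ValueError:
    rejected = True
```

Compared with the strict signature, this variant keeps existing call sites working for unpadded inputs and only fails when padding is actually present.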

HuggingFaceDocBuilderDev · 2024-06-22T11:53:19Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

Let's googoogogogogo 🚀

ArthurZucker · 2024-06-27T10:38:20Z

+            model_inputs.update({"output_attentions": output_attentions} if output_attentions else {})
+            model_inputs.update({"output_hidden_states": output_hidden_states} if output_hidden_states else {})


yesssss I think I have a PR open where I did this! Finally!

amyeroberts · 2024-06-27T10:58:37Z

        use_cache: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
-        **kwargs,  # `attention_mask` is passed by the tokenizer and we don't want it


Removing this will break FSDP :( See #31161

@amyeroberts I had a look and it should be fine: this PR removes **kwargs from the model class (e.g. MambaModel), while the FSDP PR ensures there are **kwargs in the decoder layers (e.g. FalconDecoderLayer).

We can see on main that the models themselves don't have **kwargs, even after the FSDP fix (e.g. llama) 🤗
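
The split described above can be sketched as follows. The class names are hypothetical and the layers operate on plain lists; the sketch only shows where `**kwargs` lives (layer level) and where it does not (model level), and it does not reproduce any FSDP mechanics.

```python
class HypotheticalDecoderLayer:
    def forward(self, hidden_states, **kwargs):
        # Layer level: tolerate extra keyword arguments from callers or wrappers.
        return [h + 1 for h in hidden_states]


class HypotheticalModel:
    def __init__(self, num_layers=2):
        self.layers = [HypotheticalDecoderLayer() for _ in range(num_layers)]

    def forward(self, hidden_states, use_cache=None):
        # Model level: strict signature, no **kwargs.
        for layer in self.layers:
            hidden_states = layer.forward(hidden_states, use_cache=use_cache)
        return hidden_states


model = HypotheticalModel()
output = model.forward([0, 1])

# Unknown arguments are rejected at the model level only.
try:
    model.forward([0, 1], attention_mask=[1, 1])
    model_rejects_mask = False
except TypeError:
    model_rejects_mask = True
```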

enable strict signature

a25e037

gante requested a review from ArthurZucker June 22, 2024 11:27

this should not have been deleted

20b49b5

gante commented Jun 22, 2024


gante changed the title ~~Mamba: enable strict signature~~ Mamba & RecurrentGemma: enable strict signature Jun 22, 2024

recurrent_gemma too

1358011

ArthurZucker approved these changes Jun 27, 2024


amyeroberts reviewed Jun 27, 2024


gante merged commit 594c161 into huggingface:main Jul 8, 2024

gante deleted the mamba_strict_signature branch July 8, 2024 14:48

ArthurZucker mentioned this pull request Aug 9, 2024

Google RecurrentGemma Models don't work in Transformers 4.43 anymore #32549

Closed

4 tasks

manueldeprada mentioned this pull request Apr 21, 2025

Fixes #37219 : RecurrentGemma crashes for inputs longer than sliding window length #37613

Merged

5 tasks


gante merged 3 commits into huggingface:main from gante:mamba_strict_signature
