Fix: StaticCache & inputs_embeds #32932
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
ArthurZucker
left a comment
Thanks, two nits but good otherwise. Do we take the max of num_beams and num_return_sequences because they stem from beams?
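A minimal sketch of the expansion logic the question refers to (illustrative only; the helper name is hypothetical and this is not the actual transformers code): beam search expands the batch by num_beams while sampling expands it by num_return_sequences, so taking the max covers both decoding modes.

```python
# Hypothetical illustration, not the transformers implementation: the cache's
# batch dimension must cover the expanded batch, and the expansion factor
# depends on the decoding mode, so max() is a safe upper bound for both.
def expanded_batch_size(batch_size: int, num_beams: int, num_return_sequences: int) -> int:
    # beam search: batch_size * num_beams rows are tracked at once;
    # sampling:    batch_size * num_return_sequences rows are tracked at once
    return batch_size * max(num_beams, num_return_sequences)
```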
gante
left a comment
Thank you for taking care of gemma 2 🤗
        else ()
    )
-    all_generative_model_classes = ()
+    all_generative_model_classes = (Gemma2ForCausalLM,) if is_torch_available() else ()
This was removed because it was failing too many tests
Yes, I skipped those that shouldn't be triggered due to the model-specific cache and fixed the other failing ones
    def test_generate_from_inputs_embeds_with_static_cache(self):
        pass

    def _check_attentions_for_generate(
Let's add the reason for the overwrite at the top of the fn as a comment, here and on the other functions that need an overwrite! That way, we immediately know why the function needs to exist :)
(I see that you added a few comments below, like HybridCache has a fixed length for key/values; moving it to the top suffices)
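A minimal sketch of the suggested pattern (hypothetical signature; the reason text is taken from the comments in this thread, not from the actual test code):

```python
# Hypothetical sketch of the review suggestion, not the actual test override.
def _check_attentions_for_generate(self, *args, **kwargs):
    # Overwritten because Gemma2 uses HybridCache, which has a fixed length
    # for key/values, so the default attention-shape checks don't apply.
    ...
```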
force-pushed from 4dd1494 to fce9e7e
Hi, I ran into similar errors as in #32911, will this PR get merged?
Yes, merging now, should be ready
What does this PR do?
Fixes #32911. Enables generation with StaticCache and inputs_embeds; previously it was failing due to an incorrect calculation of max_cache_length. Added a test for that, and added tests for Gemma2ForCausalLM. Some things to note:
- Gemma2 currently cannot be used with StaticCache. It can with some small changes but imo we shouldn't
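As a hedged illustration of the bug described above (the helper and its arguments are assumptions for this sketch, not the actual transformers code): when only inputs_embeds is passed, the prompt length has to be read from the embeddings tensor, otherwise the static cache is sized too small.

```python
import torch

# Hypothetical sketch of the fix, not the actual transformers code: derive
# the prompt length from inputs_embeds when input_ids is absent, so the
# cache length covers the prompt plus the newly generated tokens.
def compute_max_cache_length(max_new_tokens, input_ids=None, inputs_embeds=None):
    if inputs_embeds is not None:
        input_len = inputs_embeds.shape[1]  # (batch, seq_len, hidden) -> seq_len
    else:
        input_len = input_ids.shape[1]      # (batch, seq_len) -> seq_len
    return input_len + max_new_tokens

# Example: prompt provided as embeddings only
emb = torch.randn(1, 7, 16)
assert compute_max_cache_length(max_new_tokens=5, inputs_embeds=emb) == 12
```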