
🚨 [v5] remove deprecated cache tuple input support #41405

Closed
gante wants to merge 7 commits into huggingface:main from gante:rm_cache_tuple_input_support

Conversation

gante (Contributor) commented on Oct 7, 2025

What does this PR do?

This PR:

  • Removes the last remnants (🤞) of tuple cache input support, scheduled for removal in v4.58 (see the sketch after this list)
  • Removes other minor items scheduled for removal in v4.58, caught in the same search
  • Updates a few incorrect past_key_values type hints
  • ⚠️ EDIT: adds cache initialization in forward to models that were missing it
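
For context, a minimal sketch of the deprecated tuple-cache pattern being removed, assuming `DynamicCache.from_legacy_cache` as the conversion shim the library exposed during the deprecation window (shapes and layer count are illustrative):

```python
import torch
from transformers import DynamicCache

# Legacy format: one (key, value) tensor pair per layer, each of shape
# (batch, num_heads, seq_len, head_dim) -- values here are illustrative.
legacy_cache = tuple(
    (torch.zeros(1, 8, 4, 64), torch.zeros(1, 8, 4, 64)) for _ in range(2)
)

# Pre-v5, model forwards silently converted such tuples via the shim below;
# after this PR, callers must construct and pass a `Cache` instance themselves.
cache = DynamicCache.from_legacy_cache(legacy_cache)
print(cache.get_seq_length())  # 4
```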

@gante gante requested review from vasqu and zucchini-nlp October 7, 2025 10:25
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

gante (Contributor, Author) commented on Oct 7, 2025

CI is failing for models that don't have cache initialization in their forward pass; working on it.

zucchini-nlp (Member) left a comment


Great

Comment thread: src/transformers/generation/utils.py (Outdated)
- if requires_cross_attention_cache and not isinstance(model_kwargs[cache_name], EncoderDecoderCache):
+ if (
+     requires_cross_attention_cache
+     and cache_name in model_kwargs
zucchini-nlp (Member)


I guess this is for special models where the cache name isn't "past_key_values". Is it not going to fail later in the model if we leave the non-EncoderDecoderCache cache?

gante (Contributor, Author)


This branch, which converts a decoder-only cache into an encoder-decoder cache, is meant to kick in when:

  • the model is encoder-decoder
  • the user has specified that they want a specific cache implementation

I'll replace cache_name with "past_key_values" to make it clearer that only models with the default cache name will use this branch.
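
For illustration, a hedged sketch of the resulting branch with the default name hard-coded (`requires_cross_attention_cache` and `model_kwargs` come from the surrounding generation utilities; the helper name is hypothetical and this is a sketch of the intent, not the exact diff):

```python
from transformers import DynamicCache, EncoderDecoderCache

def _ensure_encoder_decoder_cache(model_kwargs: dict, requires_cross_attention_cache: bool) -> None:
    # Hypothetical helper mirroring the branch above: only the default
    # "past_key_values" entry is wrapped; custom cache names are skipped.
    if (
        requires_cross_attention_cache
        and "past_key_values" in model_kwargs
        and not isinstance(model_kwargs["past_key_values"], EncoderDecoderCache)
    ):
        model_kwargs["past_key_values"] = EncoderDecoderCache(
            model_kwargs["past_key_values"],  # user-provided self-attention cache
            DynamicCache(),                   # fresh cross-attention cache
        )
```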

else:
    use_cache = False

- return_legacy_cache = False
zucchini-nlp (Member)


IMO we should prepare the cache from scratch if it is None, similar to other models. Same for other BERT-like models.

gante (Contributor, Author)


Yes, that was indeed missing 👍 adding it in this PR.
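
A minimal sketch of the kind of initialization being added, assuming the usual forward arguments (the concrete cache class varies per model, and the helper below is hypothetical):

```python
from typing import Optional

from transformers import Cache, DynamicCache, EncoderDecoderCache

def _init_cache_if_needed(
    past_key_values: Optional[Cache], use_cache: bool, is_encoder_decoder: bool
) -> Optional[Cache]:
    # Hypothetical helper: build a cache when the caller enabled caching but
    # passed none, replacing the removed legacy-tuple fallback.
    if use_cache and past_key_values is None:
        if is_encoder_decoder:
            past_key_values = EncoderDecoderCache(DynamicCache(), DynamicCache())
        else:
            past_key_values = DynamicCache()
    return past_key_values
```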


self._vision_feature_layer = kwargs.get("vision_feature_layer", -1)

@property
zucchini-nlp (Member)


Let's delete line 126 as well; it is not used in modeling.

github-actions (bot) commented on Oct 7, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: aimv2, autoformer, bark, bart, bert, bert_generation, big_bird, bigbird_pegasus, biogpt, blenderbot, blenderbot_small, blip, bloom, blt, bridgetower, camembert

vasqu (Contributor) left a comment


Don't have any major comments, but we should sync with #41378 (which I noticed halfway through here).

Seems like a bit of duplicated effort atm, so it's hard to say how to proceed without proper comms with @Cyrilvallez here. Otherwise LGTM (once the CI failures are fixed).

  encoder_hidden_states: Optional[torch.Tensor] = None,
  encoder_attention_mask: Optional[torch.Tensor] = None,
- past_key_values: Optional[Union[list[torch.FloatTensor], Cache]] = None,
+ past_key_values: Optional[Cache] = None,
vasqu (Contributor)


Finally 🙏

Comment on lines +401 to 402
if isinstance(past_key_values, DynamicCache):
    past_key_values = EncoderDecoderCache(past_key_values, DynamicCache(config=self.config))
vasqu (Contributor)


Not super important, but we still need this if generation checks for cross-attention cache necessity and converts if necessary (also slightly changed in this PR for other cache names, I guess). It might be that this model needs this custom logic; I'm not familiar with the model here.

Relevant lines in main:

if requires_cross_attention_cache and not isinstance(model_kwargs[cache_name], EncoderDecoderCache):
    model_kwargs[cache_name] = EncoderDecoderCache(
        model_kwargs[cache_name],  # self-attention cache
        DynamicCache(**dynamic_cache_kwargs),  # cross-attention cache
    )

)
use_cache = False

- return_legacy_cache = False
vasqu (Contributor)


We have a lot of deletions like this one across a lot of models. Do all of them need a cache init, or is it more model-dependent? (Sorry for the mess here on the dependencies in case we need them all 😢)

gante (Contributor, Author) commented on Oct 7, 2025

@vasqu indeed, duplicated work. I'm going to sync with @Cyrilvallez.

gante (Contributor, Author) commented on Oct 7, 2025

Closing in favor of #41378, which is more complete!

gante closed this on Oct 7, 2025