Fix 'T5GemmaConfig' object has no attribute 'num_hidden_layers' #40454

Closed
lewtun wants to merge 13 commits into main from fix-t5-gemma-config

Conversation

@lewtun
Member

@lewtun lewtun commented Aug 26, 2025

What does this PR do?

Fixes AttributeError: 'T5GemmaConfig' object has no attribute 'num_hidden_layers' when training. I wasn't sure if this kind of issue is typically unit tested, so I'm happy to add a regression test if you can point me to where it should go :)

Minimal repro script below:

echo -e "Question: Why is the sky blue? Answer:" | transformers run --task text2text-generation --model google/t5gemma-b-b-ul2 --device 0

Stack trace on main (commit 58cebc848baa0af2e4ff159fb11504d94179f376):

Traceback (most recent call last):
  File "/fsx/lewis/git/hf/transformers/transformers/bin/transformers", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/fsx/lewis/git/hf/transformers/src/transformers/commands/transformers_cli.py", line 59, in main
    service.run()
  File "/fsx/lewis/git/hf/transformers/src/transformers/commands/run.py", line 99, in run
    output = nlp(**entry) if self._reader.is_multi_columns else nlp(entry)
                                                                ^^^^^^^^^^
  File "/fsx/lewis/git/hf/transformers/src/transformers/pipelines/text2text_generation.py", line 191, in __call__
    result = super().__call__(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/transformers/src/transformers/pipelines/base.py", line 1467, in __call__
    return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/transformers/src/transformers/pipelines/base.py", line 1474, in run_single
    model_outputs = self.forward(model_inputs, **forward_params)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/transformers/src/transformers/pipelines/base.py", line 1374, in forward
    model_outputs = self._forward(model_inputs, **forward_params)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/transformers/src/transformers/pipelines/text2text_generation.py", line 220, in _forward
    output_ids = self.model.generate(**model_inputs, **generate_kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/transformers/transformers/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/transformers/src/transformers/generation/utils.py", line 2399, in generate
    self._prepare_cache_for_generation(
  File "/fsx/lewis/git/hf/transformers/src/transformers/generation/utils.py", line 2007, in _prepare_cache_for_generation
    else EncoderDecoderCache(DynamicCache(**dynamic_cache_kwargs), DynamicCache(**dynamic_cache_kwargs))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/transformers/src/transformers/cache_utils.py", line 1019, in __init__
    for _ in range(config.num_hidden_layers)
                   ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fsx/lewis/git/hf/transformers/src/transformers/configuration_utils.py", line 207, in __getattribute__
    return super().__getattribute__(key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'T5GemmaConfig' object has no attribute 'num_hidden_layers'
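For context, here is my reading of where the lookup goes wrong, as a minimal sketch (assuming, as the trace suggests, that the google/t5gemma-b-b-ul2 checkpoint keeps its layer counts on nested encoder/decoder sub-configs rather than on the top-level config):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("google/t5gemma-b-b-ul2")

# T5Gemma nests its sub-configs, so the layer counts live one level down.
print(config.encoder.num_hidden_layers)  # fine
print(config.decoder.num_hidden_layers)  # fine

# generate() builds DynamicCache(**{"config": config}) with the top-level config,
# which is what triggers the AttributeError in cache_utils.py above.
print(config.num_hidden_layers)  # AttributeError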

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@lewtun lewtun requested a review from gante August 26, 2025 11:58
@lewtun
Member Author

lewtun commented Aug 26, 2025

Ah I see my change led to some failed tests - let me investigate

Comment thread src/transformers/models/t5gemma/modeling_t5gemma.py Outdated
@lewtun lewtun marked this pull request as draft August 26, 2025 13:49
@lewtun lewtun removed the request for review from gante August 26, 2025 13:49
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@lewtun lewtun mentioned this pull request Aug 27, 2025
4 tasks
@lewtun lewtun force-pushed the fix-t5-gemma-config branch from a18531d to 725d8f3 Compare August 27, 2025 08:10
@lewtun lewtun changed the title Fix T5Gemma config Fix 'T5GemmaConfig' object has no attribute 'num_hidden_layers' Aug 27, 2025
@lewtun lewtun marked this pull request as ready for review August 27, 2025 09:24
@lewtun lewtun requested a review from ArthurZucker August 27, 2025 09:24
@gante
Contributor

gante commented Aug 27, 2025

@lewtun I think this is an issue with the EncoderDecoderCache initialization, rather than with T5Gemma -- #40277 corrected a few things for encoder-decoder models, but I missed the lines listed in your stack trace!

In a nutshell, the EncoderDecoderCache initialization in L2003-2007 in utils.py needs to follow the same pattern as in _get_cache, and pull the right encoder/decoder sub-configs before initializing the corresponding DynamicCache.

Do you want to have a go at it, or would you rather have me fix it? 🤗
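Roughly, something along these lines (just a sketch of the idea, not the actual diff -- the get_text_config(decoder=True) call and the config.encoder fallback are assumptions here):

from transformers.cache_utils import DynamicCache, EncoderDecoderCache

def _build_encoder_decoder_cache(self):
    # Sketch: resolve each sub-config before building its DynamicCache,
    # mirroring what _get_cache already does for encoder-decoder models.
    decoder_config = self.config.get_text_config(decoder=True)
    # Assumed fallback: use the top-level config when there is no nested encoder.
    encoder_config = getattr(self.config, "encoder", self.config)
    return EncoderDecoderCache(
        DynamicCache(config=decoder_config),  # self-attention cache
        DynamicCache(config=encoder_config),  # cross-attention cache
    )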

@lewtun
Member Author

lewtun commented Aug 27, 2025

> @lewtun I think this is an issue with the EncoderDecoderCache initialization, rather than with T5Gemma -- #40277 corrected a few things for encoder-decoder models, but I missed the lines listed in your stack trace!
>
> In a nutshell, the EncoderDecoderCache initialization in L2003-2007 in utils.py needs to follow the same pattern as in _get_cache, and pull the right encoder/decoder sub-configs before initializing the corresponding DynamicCache.
>
> Do you want to have a go at it, or would you rather have me fix it? 🤗

Thanks for the context! I can have a go at it :)

@lewtun lewtun force-pushed the fix-t5-gemma-config branch from 1aef8f6 to ef13c59 Compare August 27, 2025 09:50
Contributor

@gante gante left a comment

A few nits to trim unnecessary logic :D

Comment thread src/transformers/generation/utils.py Outdated
Comment on lines +2012 to +2016
# Access the encoder/decoder sub-configs directly for models like T5Gemma
if hasattr(self.config, "decoder") and hasattr(self.config, "encoder"):
    decoder_cache_kwargs["config"] = self.config.decoder
    encoder_cache_kwargs["config"] = self.config.encoder
else:
Contributor

@gante gante Aug 27, 2025

This shouldn't be needed -- if self.config.get_text_config(...) is not working properly, then it means it is missing logic :)

(I think get_text_config() is missing "encoder" in the encoder_possible_text_config_names list)

Member Author

Thanks for the tip, I tried adding "encoder" to encoder_possible_text_config_names but then hit this constraint when get_text_config() is called with default values:

if len(valid_text_config_names) > 1:
    raise ValueError(
        f"Multiple valid text configs were found in the model config: {valid_text_config_names}. In this "
        "case, using `get_text_config()` would be ambiguous. Please specify the desired text config directly, "
        "e.g. `text_config = config.sub_config_name`"
    )

I have now tried to override the get_text_config() method for T5Gemma specifically, but let me know if you think the base method should be made to work out of the box instead
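For reference, the override I'm experimenting with has roughly this shape (sketch only -- the real T5GemmaConfig has many more fields, and this assumes the base signature is get_text_config(decoder=False)):

from transformers import PretrainedConfig

class T5GemmaConfig(PretrainedConfig):
    ...

    def get_text_config(self, decoder=False):
        # Both sub-configs are valid "text" configs, which is why the base-class
        # lookup raises the ambiguity error above; pick one explicitly instead.
        return self.decoder if decoder else self.encoder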

Comment thread src/transformers/generation/utils.py Outdated
Comment on lines +2007 to +2009
# For encoder-decoder models, we need to use separate configs for encoder and decoder
decoder_cache_kwargs = dynamic_cache_kwargs.copy()
encoder_cache_kwargs = dynamic_cache_kwargs.copy()
Contributor

dynamic_cache_kwargs only contains a config which we know for sure we won't use in this branch -- we don't need to copy the object, we can create a new dictionary
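i.e. something along these lines (a sketch that reuses the get_text_config calls discussed above, not necessarily the final wording):

# Build fresh kwargs per sub-config instead of copying dynamic_cache_kwargs,
# since the top-level config is never reused in this branch.
decoder_cache_kwargs = {"config": self.config.get_text_config(decoder=True)}
encoder_cache_kwargs = {"config": self.config.get_text_config()}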

Comment thread src/transformers/generation/utils.py Outdated
decoder_cache_kwargs = dynamic_cache_kwargs.copy()
encoder_cache_kwargs = dynamic_cache_kwargs.copy()

if "config" in dynamic_cache_kwargs:
Contributor

this is always true in this branch

Comment thread src/transformers/generation/utils.py Outdated
Comment thread src/transformers/models/t5gemma/configuration_t5gemma.py
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: t5gemma

@gante
Contributor

gante commented Aug 29, 2025

(see #40553)

@gante
Contributor

gante commented Sep 1, 2025

closing in favor of #40553
