Fix AttributeError: add num_hidden_layers property to T5GemmaConfig#40920

Closed
avchauzov wants to merge 2 commits into huggingface:main from avchauzov:fix-t5gemma-num-hidden-layers

Conversation

@avchauzov
Contributor

What does this PR do?

This PR fixes an AttributeError that occurs when using T5Gemma models with Seq2SeqTrainer. The issue arises because DynamicCache in cache_utils.py expects config.num_hidden_layers to exist, but T5GemmaConfig doesn't have this attribute.

Changes:

  • Added num_hidden_layers property to T5GemmaConfig that returns encoder.num_hidden_layers
  • Added test to verify the property works correctly and is read-only

Problem: When training T5Gemma models with Seq2SeqTrainer, the code fails with:
AttributeError: 'T5GemmaConfig' object has no attribute 'num_hidden_layers'

Solution: Follow the pattern used by other encoder-decoder models (ProphetNet, Funnel) by adding a computed property that returns the encoder's layer count.

Fixes #40901
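The pattern described above can be sketched as follows. This is a minimal standalone illustration, not the actual transformers source; `EncoderConfig` and the simplified `T5GemmaConfig` here are stand-ins for the real nested config classes:

```python
class EncoderConfig:
    """Simplified stand-in for the nested encoder sub-config."""

    def __init__(self, num_hidden_layers: int):
        self.num_hidden_layers = num_hidden_layers


class T5GemmaConfig:
    """Simplified stand-in: exposes the encoder's layer count read-only."""

    def __init__(self, encoder: EncoderConfig):
        self.encoder = encoder

    @property
    def num_hidden_layers(self) -> int:
        # Callers such as DynamicCache expect `config.num_hidden_layers`;
        # delegate to the encoder sub-config instead of storing a duplicate.
        return self.encoder.num_hidden_layers

    @num_hidden_layers.setter
    def num_hidden_layers(self, value):
        # Setting the top-level attribute would be ambiguous for an
        # encoder-decoder config, so reject it explicitly.
        raise NotImplementedError(
            "Set `encoder.num_hidden_layers` and `decoder.num_hidden_layers` separately."
        )


config = T5GemmaConfig(EncoderConfig(num_hidden_layers=12))
print(config.num_hidden_layers)  # -> 12
```

A read-only property keeps the encoder sub-config as the single source of truth while satisfying callers that probe the flat attribute.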

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker @SunMarc

(T5Gemma model configuration and Seq2SeqTrainer integration)

- Add num_hidden_layers property that returns encoder.num_hidden_layers
- Fixes issue with Seq2SeqTrainer and DynamicCache expecting config.num_hidden_layers
- Resolves #40901
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: t5gemma

Member

@SunMarc SunMarc left a comment


Thanks! Left a comment and pinging @gante for the generation part.

Comment on lines +619 to +620
encoder_num_hidden_layers=self.model_tester.encoder_num_hidden_layers,
decoder_num_hidden_layers=self.model_tester.decoder_num_hidden_layers,
Member


not sure what you are trying to do here as it doesn't exist

Comment on lines +193 to +194
encoder_num_hidden_layers=self.encoder_num_hidden_layers,
decoder_num_hidden_layers=self.decoder_num_hidden_layers,
Member


same here

Comment on lines +330 to +341
@property
def num_hidden_layers(self) -> int:
    """Number of hidden layers in the encoder."""
    return self.encoder.num_hidden_layers

@num_hidden_layers.setter
def num_hidden_layers(self, value):
    """Set number of hidden layers in the encoder."""
    raise NotImplementedError(
        "This model does not support setting `num_hidden_layers`. "
        "Please set `encoder.num_hidden_layers` and `decoder.num_hidden_layers` separately."
    )
Member


not sure if we want this, maybe it will be better to just fix the generation to get the right attribute ?

@SunMarc SunMarc requested a review from gante September 17, 2025 16:29
Contributor

@gante gante left a comment


Thank you for opening the PR @avchauzov, but the fix is already addressed in #40939 🤗

The actual fix consists of removing a bad overwrite and then fixing a few subtle bugs :)

@SunMarc
Member

SunMarc commented Sep 18, 2025

Nice !

@avchauzov
Contributor Author

@gante Thank you! I'm closing this PR then!

@avchauzov avchauzov closed this Sep 21, 2025
@avchauzov avchauzov deleted the fix-t5gemma-num-hidden-layers branch September 21, 2025 00:46

Development

Successfully merging this pull request may close these issues.

Cannot fine-tune T5Gemma with Seq2SeqTrainer

3 participants