deleted references of self.vocab_size and self.type_vocab_size for multiple models [TF implementation] #21164

Merged
gante merged 28 commits into huggingface:main from susnato:resize_token_embeddings_models on Jan 20, 2023
Conversation

susnato (Contributor) commented Jan 18, 2023

What does this PR do?

This PR removes references to self.vocab_size and self.type_vocab_size for these models [TensorFlow implementation]: bert, albert, lxmert, electra, tapas, convbert, layoutlm, gpt2, camembert, clip, ctrl, deberta, deberta_v2, distilbert, esm, funnel, gptj, groupvit, longformer, mobilebert, mpnet, openai, rembert, roberta, roberta_prelayernorm, roformer, xlm_roberta, xlnet.

Who can review?

@gante

susnato (Contributor, Author) commented Jan 18, 2023

Hi @gante, as you said here, I made the changes for src/transformers/models/albert/modeling_tf_albert.py; could you please check them? If they look OK, I will push the changes for the other models.

The failed ci/circleci: check_repository_consistency test is due to the fact that I have only changed the embeddings for the albert model, and they differ from TFBertEmbeddings; the test will pass once I change those too.
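For illustration, the change follows the pattern below (a minimal sketch with hypothetical EmbeddingsBefore/EmbeddingsAfter names, not the exact diff from this PR; the real layers define more attributes and build more weights):

import tensorflow as tf

# Before: config values are copied into instance attributes at init time,
# so a later change to config.vocab_size is never seen by the layer.
class EmbeddingsBefore(tf.keras.layers.Layer):
    def __init__(self, config, **kwargs):
        super().__init__(**kwargs)
        self.vocab_size = config.vocab_size          # snapshot; goes stale
        self.embedding_size = config.embedding_size

    def build(self, input_shape):
        self.weight = self.add_weight(
            name="word_embeddings",
            shape=[self.vocab_size, self.embedding_size],
        )
        super().build(input_shape)

# After: keep a reference to the (mutable) config and read attributes on demand.
class EmbeddingsAfter(tf.keras.layers.Layer):
    def __init__(self, config, **kwargs):
        super().__init__(**kwargs)
        self.config = config                         # shared reference; stays current

    def build(self, input_shape):
        self.weight = self.add_weight(
            name="word_embeddings",
            shape=[self.config.vocab_size, self.config.embedding_size],
        )
        super().build(input_shape)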

HuggingFaceDocBuilderDev commented Jan 18, 2023

The documentation is not available anymore as the PR was closed or merged.

gante (Contributor) left a comment

@susnato LGTM, this is precisely what I wanted to do!

gante (Contributor) left a comment

If you have the patience to do it, removing the other attributes that are set from the config (like self.type_vocab_size) would also be great :) That way there will be greater similarity between PT and TF 🧡

A Collaborator left a comment

Yes, that would be great!

susnato (Contributor, Author) left a comment

@gante
OK, I will remove self.type_vocab_size. By the way, there seem to be three more such lines:

self.embedding_size = config.embedding_size
self.max_position_embeddings = config.max_position_embeddings
self.initializer_range = config.initializer_range

I don't think we need to change them for this purpose, right?

gante (Contributor) left a comment

It's not needed, but it would be a nice refactor :) Don't worry about it if you are not interested in making the change; we can pick it up afterwards. The vocab_size is the most important one for now, as it is causing bugs.

gante (Contributor) commented Jan 18, 2023

@susnato before moving forward, let's make sure our transformers master agrees with the set of changes we are about to make :)


@sgugger this PR contains a change I want to apply across TF models, and @susnato is kindly doing the bulk of the work. In essence, some config attributes are not constant throughout the model's life, like config.vocab_size, and our TF implementation stores them as immutable class attributes (e.g. self.vocab_size = config.vocab_size). PT doesn't have this problem, since it simply stores self.config = config in the layers, which benefits from the updates the mutable config dictionary may receive elsewhere.

The proposed change is to make the TF implementation closer to the PT implementation and store self.config in the layers, as opposed to individual configuration parameters. This also solves the bug that triggered this discussion, where the vocabulary size was not being correctly updated, causing exceptions.
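To make the failure mode concrete, here is a minimal, self-contained sketch (hypothetical MyConfig, StaleLayer, and LiveLayer names; this is not the transformers code):

class MyConfig:
    def __init__(self, vocab_size=30000):
        self.vocab_size = vocab_size

class StaleLayer:
    def __init__(self, config):
        self.vocab_size = config.vocab_size   # copied once; never sees later updates

class LiveLayer:
    def __init__(self, config):
        self.config = config                  # shared reference; always current

config = MyConfig()
stale, live = StaleLayer(config), LiveLayer(config)

config.vocab_size = 30522  # e.g. what a token-embedding resize does to the config
print(stale.vocab_size)          # 30000: stale value, can cause out-of-range lookups
print(live.config.vocab_size)    # 30522: reflects the update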

Let us know if you are okay with us making this change over most model architectures 🚀

sgugger (Collaborator) left a comment

I have always wondered why TensorFlow models did not store the config. Just pinging @Rocketknight1 to also have a look in case there is some arcane TF behavior behind this that would break something, but otherwise I'm all for this change!

A Collaborator left a comment

Yes, that would be great!

susnato force-pushed the resize_token_embeddings_models branch from 3772de9 to 0f75e19 on January 20, 2023 at 05:32
susnato changed the title from "deleted references of self.vocab_size only for albert" to "deleted references of self.vocab_size for bert, albert, lxmert, electra, tapas, convbert, layoutlm" on Jan 20, 2023
susnato changed the title to "deleted references of self.vocab_size for bert, albert, lxmert, electra, tapas, convbert, layoutlm, gpt2" on Jan 20, 2023
susnato changed the title to "deleted references of self.vocab_size for bert, albert, lxmert, electra, tapas, convbert, layoutlm, gpt2 [TF]" on Jan 20, 2023
susnato changed the title to "deleted references of self.vocab_size for bert, albert, lxmert, electra, tapas, convbert, layoutlm, gpt2, camembert, clip, ctrl, deberta, deberta_v2, distilbert, esm, funnel, gptj, groupvit [TF]" on Jan 20, 2023
gante (Contributor) commented Jan 20, 2023

@susnato you might need to run make fixup locally to automatically format the code and make our CI happy

susnato changed the title to "deleted references of self.vocab_size for bert, albert, lxmert, electra, tapas, convbert, layoutlm, gpt2, camembert, clip, ctrl, deberta, deberta_v2, distilbert, esm, funnel, gptj, groupvit, longformer, mobilebert, mpnet, openai, rembert, roberta, roberta_prelayernorm, roformer, xlm_roberta, xlnet [TF]" on Jan 20, 2023
susnato (Contributor, Author) commented Jan 20, 2023

Hi @gante, I added all the models I found that have self.vocab_size, removed the references to self.vocab_size and self.type_vocab_size, and all the tests pass! Could you please check it?

gante (Contributor) left a comment

LGTM 👍

Can we edit the PR title to a shorter one before merging? 😅

susnato changed the title to "deleted references of self.vocab_size and self.type_vocab_size for multiple models [TF implementation]" on Jan 20, 2023
susnato (Contributor, Author) commented Jan 20, 2023

> LGTM 👍
> Can we edit the PR title to a shorter one before merging? 😅

@gante Done!

gante (Contributor) commented Jan 20, 2023

Awesome!

Thank you for all the work you've put into fixing this, @susnato 🤗

@gante gante merged commit 202d686 into huggingface:main Jan 20, 2023