
Support Roberta on accelerate#19850

Closed

younesbelkada wants to merge 2 commits into huggingface:main from younesbelkada:add_roberta_accelerate

Conversation

@younesbelkada
Contributor

What does this PR do?

This PR adds support for the Roberta model family with accelerate! This aims to enable int8 quantization for these models.
Before merging, I have noticed a few nits that I need to fix and discuss. I am unsure whether these fixes should live here or in accelerate.

1- If the model has the attribute _keys_to_ignore_on_save, it seems that it does not get properly initialized by accelerate (but I might be missing something here). AFAIK, all models that have accelerate support so far have at most the attribute _keys_to_ignore_on_load_missing, but not _keys_to_ignore_on_save. Therefore, when the base model gets saved in the accelerate test, these parameters do not get saved in the state_dict. I had to come up with a modification in the _load_pretrained_model function to randomly initialize these parameters, since they are ignored by _load_state_dict_into_meta_model. This post-processing trick happens here.
2- I therefore had to change the accelerate tests. Since the parameters listed in _keys_to_ignore_on_save are initialized randomly, I propose to check the logits compatibility between the base model and the accelerate model only for the attention outputs and not the lm_head output. These modifications happen here - maybe this change could live in the super class?
3- Last nit: in the accelerate tests, it is better not to override the variable inputs_dict, since inside the main loop we can switch from a xxxForMultipleChoice to a xxxForQuestionAnswering model. The variable is modified by the class method _prepare_for_class only when the model is a MODEL_FOR_MULTIPLE_CHOICE model, and it is not reset to the correct inputs_dict afterwards.
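The re-initialization described in point 1 can be sketched roughly as follows. This is a minimal standalone illustration, not the actual _load_pretrained_model code: the helper name, the init scheme, and the CPU placement are my assumptions.

```python
import torch
import torch.nn as nn


def reinit_ignored_keys(model: nn.Module, state_dict: dict) -> None:
    """Randomly (re-)initialize parameters that were skipped at load time.

    Hypothetical helper: keys listed in ``_keys_to_ignore_on_save`` are
    absent from the saved checkpoint, so a meta-device loading path never
    materializes them. Here we replace them with freshly initialized
    parameters so the loaded model is usable.
    """
    ignore = getattr(model, "_keys_to_ignore_on_save", None) or []
    for name, param in model.named_parameters():
        if name in ignore and name not in state_dict:
            # Locate the owning submodule and swap in a new parameter.
            module_path, _, attr = name.rpartition(".")
            module = model.get_submodule(module_path) if module_path else model
            new_param = nn.Parameter(torch.empty_like(param, device="cpu"))
            nn.init.normal_(new_param, mean=0.0, std=0.02)  # assumed init
            setattr(module, attr, new_param)
```

For example, a model whose bias is listed in _keys_to_ignore_on_save and missing from the checkpoint would get a fresh, randomly initialized bias of the same shape.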

I also need to fix the slow test for the lilt model, where I am getting ValueError: embeddings.position_ids doesn't have any device set.
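For context on that error: RoBERTa-style embeddings register position_ids as a non-persistent buffer, so it is part of the module but never appears in the checkpoint, and device-placement logic that only walks the state_dict can miss it. A minimal illustration with a toy module (not the actual Lilt code):

```python
import torch
import torch.nn as nn


class ToyEmbeddings(nn.Module):
    def __init__(self):
        super().__init__()
        self.word_embeddings = nn.Embedding(10, 4)
        # Non-persistent buffer: lives on the module, but is
        # excluded from state_dict(), i.e. from saved checkpoints.
        self.register_buffer(
            "position_ids", torch.arange(10).unsqueeze(0), persistent=False
        )


emb = ToyEmbeddings()
print("position_ids" in dict(emb.named_buffers()))  # True
print("position_ids" in emb.state_dict())           # False
```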

cc @sgugger

model = LiltModel.from_pretrained(model_name)
self.assertIsNotNone(model)

@require_accelerate
Contributor Author


These tests could go in the super class, but I am not sure.
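A rough sketch of the kind of check meant here: compare a hidden state that both models share instead of the lm_head logits, since parameters in _keys_to_ignore_on_save are randomly re-initialized at load time. The helper name and tolerances below are my assumptions, not the actual test code.

```python
import torch


def assert_encoder_outputs_close(base_hidden, quant_hidden, atol=1e-3, rtol=1e-3):
    # Compare an intermediate hidden state rather than the lm_head output:
    # the randomly re-initialized head parameters would never match between
    # the base model and the accelerate-loaded model.
    torch.testing.assert_close(quant_hidden, base_hidden, atol=atol, rtol=rtol)


# Usage with dummy tensors standing in for the two models' hidden states:
h = torch.randn(1, 8, 16)
assert_encoder_outputs_close(h, h.clone())
```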

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Oct 24, 2022

The documentation is not available anymore as the PR was closed or merged.

@younesbelkada younesbelkada marked this pull request as draft October 24, 2022 18:31
@younesbelkada
Contributor Author

Closing in favor of #19906
