
Support Roberta on accelerate#19850

Closed

younesbelkada wants to merge 2 commits into huggingface:main from younesbelkada:add_roberta_accelerate

Conversation

@younesbelkada
Contributor

What does this PR do?

This PR adds support for the Roberta model family with accelerate! This aims to enable int8 quantization for these models.
Before merging, I have noticed a few nits that I need to fix and discuss. I am unsure whether these fixes should live here or in accelerate.

1- If the model has the attribute _keys_to_ignore_on_save, it seems that it does not get properly initialized by accelerate (but I might be missing something here). AFAIK, all models that have accelerate support so far have at most the attribute _keys_to_ignore_on_load_missing, but not _keys_to_ignore_on_save. Therefore, when the base model gets saved in the accelerate test, these parameters do not get saved in the state_dict. I had to come up with a modification in the _load_pretrained_model function to randomly initialize these parameters, since they are ignored by _load_state_dict_into_meta_model. This post-processing trick happens here.
2- I therefore had to change the accelerate tests. Since the parameters listed in _keys_to_ignore_on_save are initialized randomly, I propose to check the logits compatibility between the base model and the accelerate model only for the attention outputs and not the lm_head output. These modifications happen here - maybe this change could live in the super class?
3- Last nit: in the accelerate tests, it is better not to override the variable inputs_dict, since inside the main loop we can switch from a xxxForMultipleChoice to a xxxForQuestionAnswering model. The variable is modified by the class method _prepare_for_class only when the model is a MODEL_FOR_MULTIPLE_CHOICE model, and it is not reset to the correct inputs_dict afterwards.
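The re-initialization described in point 1 can be sketched roughly as follows. This is a minimal standalone illustration, not the actual _load_pretrained_model code: the helper name, the init scheme, and the CPU placement are my assumptions.

```python
import torch
import torch.nn as nn


def reinit_ignored_keys(model: nn.Module, state_dict: dict) -> None:
    """Randomly (re-)initialize parameters that were skipped at load time.

    Hypothetical helper: keys listed in ``_keys_to_ignore_on_save`` are
    absent from the saved checkpoint, so a meta-device loading path never
    materializes them. Here we replace them with freshly initialized
    parameters so the loaded model is usable.
    """
    ignore = getattr(model, "_keys_to_ignore_on_save", None) or []
    for name, param in model.named_parameters():
        if name in ignore and name not in state_dict:
            # Locate the owning submodule and swap in a new parameter.
            module_path, _, attr = name.rpartition(".")
            module = model.get_submodule(module_path) if module_path else model
            new_param = nn.Parameter(torch.empty_like(param, device="cpu"))
            nn.init.normal_(new_param, mean=0.0, std=0.02)  # assumed init
            setattr(module, attr, new_param)
```

For example, a model whose bias is listed in _keys_to_ignore_on_save and missing from the checkpoint would get a fresh, randomly initialized bias of the same shape.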

I also need to fix the slow test for the lilt model, where I am getting ValueError: embeddings.position_ids doesn't have any device set.
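For context on that error: RoBERTa-style embeddings register position_ids as a non-persistent buffer, so it is part of the module but never appears in the checkpoint, and device-placement logic that only walks the state_dict can miss it. A minimal illustration with a toy module (not the actual Lilt code):

```python
import torch
import torch.nn as nn


class ToyEmbeddings(nn.Module):
    def __init__(self):
        super().__init__()
        self.word_embeddings = nn.Embedding(10, 4)
        # Non-persistent buffer: lives on the module, but is
        # excluded from state_dict(), i.e. from saved checkpoints.
        self.register_buffer(
            "position_ids", torch.arange(10).unsqueeze(0), persistent=False
        )


emb = ToyEmbeddings()
print("position_ids" in dict(emb.named_buffers()))  # True
print("position_ids" in emb.state_dict())           # False
```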

cc @sgugger

model = LiltModel.from_pretrained(model_name)
self.assertIsNotNone(model)

@require_accelerate
Contributor Author


These tests could go in the super class, but I am not sure.
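A rough sketch of the kind of check meant here: compare a hidden state that both models share instead of the lm_head logits, since parameters in _keys_to_ignore_on_save are randomly re-initialized at load time. The helper name and tolerances below are my assumptions, not the actual test code.

```python
import torch


def assert_encoder_outputs_close(base_hidden, quant_hidden, atol=1e-3, rtol=1e-3):
    # Compare an intermediate hidden state rather than the lm_head output:
    # the randomly re-initialized head parameters would never match between
    # the base model and the accelerate-loaded model.
    torch.testing.assert_close(quant_hidden, base_hidden, atol=atol, rtol=rtol)


# Usage with dummy tensors standing in for the two models' hidden states:
h = torch.randn(1, 8, 16)
assert_encoder_outputs_close(h, h.clone())
```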

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Oct 24, 2022

The documentation is not available anymore as the PR was closed or merged.

@younesbelkada younesbelkada marked this pull request as draft October 24, 2022 18:31
@younesbelkada
Contributor Author

Closing in favor of #19906
