Support Roberta on accelerate #19850
Closed
younesbelkada wants to merge 2 commits into huggingface:main from
Conversation
younesbelkada commented Oct 24, 2022
```python
model = LiltModel.from_pretrained(model_name)
self.assertIsNotNone(model)

@require_accelerate
```
Contributor (Author)
These tests could go on the super class, but I am not sure.
The documentation is not available anymore as the PR was closed or merged.
Contributor (Author)
Closing in favor of #19906
What does this PR do?
This PR adds `Roberta` model family support with `accelerate`! This aims to support `int8` quantization for these models. Before merging I have noticed a few nits that I need to fix and discuss. I am unsure whether these fixes should be here or in `accelerate`.

1- If the model has the attribute `_keys_to_ignore_on_save`, it seems that it does not get properly initialized by `accelerate` (but I might be missing something here). AFAIK all models that have `accelerate` support for now have at most the attribute `_keys_to_ignore_on_load_missing`, but not `_keys_to_ignore_on_save`. Therefore, when the base model gets saved in the `accelerate` test, these parameters do not get saved in the `state_dict`. I had to come up with a modification in the `_load_pretrained_model` function to randomly initialize these parameters, since they are ignored by `_load_state_dict_into_meta_model`. This post-processing trick happens here.

2- Therefore I had to change the `accelerate` tests. Since the parameters assigned in `_keys_to_ignore_on_save` are initialized randomly, I propose to check the logits compatibility between the base model and the accelerate model only for the attention outputs and not the `lm_head` output. These modifications happen here. Maybe this modification could happen in the super class?

3- Last nit: in the `accelerate` tests, it is better not to override the variable `inputs_dict`, since inside the main loop we can switch from an `xxxForMultipleChoice` to an `xxxForQuestionAnswering` model. The variable gets modified by the class function `_prepare_for_class` only if the model is a `MODEL_FOR_MULTIPLE_CHOICE` model, and it is not reset to the correct `inputs_dict` afterwards.

I also need to fix the slow test for the `lilt` model, where I am getting `ValueError: embeddings.position_ids doesn't have any device set`.

cc @sgugger
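The random re-initialization described in point 1 could be sketched as below. This is a simplified, hypothetical version: `reinit_ignored_params` is an illustrative name, not the actual helper; the real change lives inside `_load_pretrained_model` in transformers.

```python
# Hypothetical sketch (point 1): parameters matching
# _keys_to_ignore_on_save never make it into the saved state_dict, so
# after loading they are given a fresh random init instead of being
# left on the meta device. Not the actual transformers implementation.
import re
import torch
from torch import nn

def reinit_ignored_params(model: nn.Module, keys_to_ignore_on_save):
    """Randomly re-initialize parameters skipped at save time."""
    if not keys_to_ignore_on_save:
        return []
    patterns = [re.compile(k) for k in keys_to_ignore_on_save]
    reinitialized = []
    for name, param in model.named_parameters():
        if any(p.search(name) for p in patterns):
            with torch.no_grad():
                # std=0.02 mirrors the usual BERT-style initializer_range
                param.normal_(mean=0.0, std=0.02)
            reinitialized.append(name)
    return reinitialized
```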
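The relaxed test comparison from point 2 could look like the following sketch, where `outputs_match_on_base` is a hypothetical helper: only the shared base-model hidden states are compared, never the randomly re-initialized `lm_head` logits.

```python
# Hypothetical sketch (point 2): compare only the base-model hidden
# states between the vanilla model and the accelerate-loaded model.
# The lm_head weights listed in _keys_to_ignore_on_save are random,
# so their logits are deliberately not compared.
import torch

def outputs_match_on_base(base_hidden, accelerate_hidden, atol=1e-5):
    # base_hidden / accelerate_hidden: last_hidden_state tensors from
    # the two models run on the same inputs
    return torch.allclose(base_hidden, accelerate_hidden, atol=atol)
```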
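Point 3 can be illustrated with a minimal, self-contained sketch (`prepare_for_class` is a hypothetical stand-in for `_prepare_for_class`): deep-copying `inputs_dict` per model class keeps the multiple-choice expansion from leaking into the next iteration of the loop.

```python
# Hypothetical sketch (point 3): copy inputs_dict per model class
# instead of mutating one shared dict inside the loop.
import copy

def prepare_for_class(inputs_dict, model_class_name):
    # illustrative stand-in for _prepare_for_class: only multiple-choice
    # models get their inputs expanded along a num_choices dimension
    if "ForMultipleChoice" in model_class_name:
        inputs_dict["input_ids"] = [inputs_dict["input_ids"]] * 4
    return inputs_dict

all_model_classes = ["RobertaForMultipleChoice", "RobertaForQuestionAnswering"]
base_inputs = {"input_ids": [101, 2023, 102]}
shapes = {}
for model_class in all_model_classes:
    # deep-copy so the multiple-choice expansion does not persist
    inputs = prepare_for_class(copy.deepcopy(base_inputs), model_class)
    shapes[model_class] = len(inputs["input_ids"])
```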