accelerate support for RoBERTa family #19906
Conversation
Added `accelerate` support for:
- `RoBERTa`
- `data2vec_text`
- `Lilt`
- `Luke`
- `XLM-RoBERTa`

Fixes:
- small bug in `test_modeling_common`
sgugger left a comment:
Thanks, this looks like a way better fix!
```python
# To tie those two weights if they get disconnected (on TPU or when the bias is resized)
self.bias = self.decoder.bias
# For accelerate compatibility and to not break backward compatibility
if self.decoder.bias.device == torch.device("meta"):
```
The test will probably break if PyTorch is < 1.9, so we need a safer way to check whether the device is meta (can be in a util if the check ends up being long).
I propose a fix here, 05da693
I am not sure whether `device.type` can be retrieved on PyTorch < 1.9, but I think it is something I have seen used in `accelerate`.
Here is a quick try on torch == 1.7.1!

```python
>>> torch.__version__
'1.7.1'
>>> vec = torch.randn(1, 1)
>>> vec.device
device(type='cpu')
>>> vec.device.type
'cpu'
```
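A minimal sketch of the kind of helper suggested above (the function name is hypothetical, not the one from commit 05da693): comparing the `device.type` string avoids constructing `torch.device("meta")`, which can raise on PyTorch versions that predate the meta device.

```python
import torch

def is_on_meta_device(tensor: torch.Tensor) -> bool:
    # Compare the device type string instead of building
    # torch.device("meta"), which can fail on PyTorch < 1.9
    # where the meta device does not exist yet.
    return tensor.device.type == "meta"
```

With such a helper, the guard above would read `if is_on_meta_device(self.decoder.bias):`.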
```python
config_class = RobertaConfig
base_model_prefix = "roberta"
supports_gradient_checkpointing = True
_no_split_modules = []
```
We don't even need the base block?
Yes, for some models (`roberta`, `lilt`), passing an empty list was sufficient. I guess the `accelerate` tests are still run, since the condition only checks whether the list is `None`:
```python
if model_class._no_split_modules is None:
    continue
```
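A minimal illustration of that gating, with hypothetical class names: `None` (the default) skips the common `accelerate` tests, while an empty list opts the model in without naming any module to keep unsplit.

```python
class ModelWithoutSupport:
    _no_split_modules = None  # default: common accelerate tests are skipped

class ModelWithSupport:
    _no_split_modules = []  # empty list: tests run, no module is kept unsplit

for model_class in (ModelWithoutSupport, ModelWithSupport):
    if model_class._no_split_modules is None:
        print(f"{model_class.__name__}: accelerate tests skipped")
    else:
        print(f"{model_class.__name__}: accelerate tests run")
```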
What does this PR do?

This PR adds `accelerate` support for:

- `RoBERTa`
- `data2vec_text`
- `Lilt`
- `Luke`
- `XLM-RoBERTa`
- `CamemBERT`
- `LongFormer`

This way, any of the models above can be loaded in 8-bit using `load_in_8bit=True` (see the sketch below). Since these models copy the same `xxxLMHead` from `RoBERTa`, I had to change the copied modules too; happy to break this PR down into several smaller PRs as well.

This PR also fixes a small bug in the `accelerate` tests where the variable `input_dict` is overridden by `xxForMultipleChoice` models.

Can also confirm all slow tests pass (single + multiple GPUs).
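A minimal usage sketch of what this enables, assuming `bitsandbytes` and `accelerate` are installed (the checkpoint and task head here are illustrative, not taken from the PR):

```python
from transformers import AutoModelForMaskedLM

# Load a RoBERTa-family checkpoint with its linear layers quantized to 8-bit;
# load_in_8bit requires a device_map so accelerate can place the modules.
model = AutoModelForMaskedLM.from_pretrained(
    "roberta-base",
    device_map="auto",
    load_in_8bit=True,
)
```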
cc @sgugger @ydshieh