🚀 Feature request
Add a `parallelize` method to GPT-Neo models so we can fine-tune them with model parallelism across several less expensive GPUs.
Motivation
I want to fine-tune a GPT-Neo model using model parallelism so I can do it on less expensive GPUs. This is not yet implemented for GPT-Neo, and since higher-end GPUs are very expensive, it would be preferable to distribute the model across several cheaper GPUs rather than rely on a single expensive one. It would also let us train with larger batches, which can have a big impact on model fitting.
I would be very glad if you could implement this; I think it would enable fine-tuning special-purpose GPT-Neo language models.
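For reference, GPT-2 in transformers already exposes `model.parallelize(device_map)`, where `device_map` maps each GPU index to the list of transformer-block indices it should host; the request is essentially to mirror that for GPT-Neo. As a minimal sketch (the helper name `make_device_map` is hypothetical, not part of the library), here is how a balanced device map could be built for, say, the 24 layers of GPT-Neo 1.3B spread over 4 GPUs:

```python
def make_device_map(n_layers: int, n_gpus: int) -> dict:
    """Split transformer block indices evenly across GPUs.

    Returns a dict like {0: [0, 1, ...], 1: [...], ...} in the format
    expected by transformers' parallelize(device_map) on GPT-2.
    """
    per_gpu, extra = divmod(n_layers, n_gpus)
    device_map = {}
    start = 0
    for gpu in range(n_gpus):
        # Give the first `extra` GPUs one additional layer each
        count = per_gpu + (1 if gpu < extra else 0)
        device_map[gpu] = list(range(start, start + count))
        start += count
    return device_map

# 24 layers over 4 GPUs -> 6 layers per device
print(make_device_map(24, 4))
# → {0: [0, 1, 2, 3, 4, 5], 1: [6, 7, 8, 9, 10, 11],
#    2: [12, 13, 14, 15, 16, 17], 3: [18, 19, 20, 21, 22, 23]}
```

With a `parallelize` method on GPT-Neo, this map could then be passed as `model.parallelize(make_device_map(24, 4))`, analogous to the GPT-2 API.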