Add parallelize method to GPT-neo models #11054

@anamtamais

Description

🚀 Feature request

Add a `parallelize` method to GPT-Neo models, so we can fine-tune them with model parallelism on less expensive GPUs.
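To make the request concrete, here is a hedged sketch of how the API might look, modeled on the `parallelize(device_map)` method that already exists for GPT-2 in transformers. The helper function, the layer count (24, matching the 1.3B checkpoint), and the 4-GPU split are illustrative assumptions, not part of any existing GPT-Neo API.

```python
# Sketch of the requested API, modeled on GPT-2's parallelize(device_map).
# The layer count and GPU count below are illustrative assumptions.

def make_device_map(n_layers, n_gpus):
    """Assign transformer blocks to GPUs in contiguous chunks, mirroring
    the {gpu_id: [layer indices]} format GPT-2's parallelize() accepts."""
    per_gpu = -(-n_layers // n_gpus)  # ceiling division
    return {
        gpu: list(range(gpu * per_gpu, min((gpu + 1) * per_gpu, n_layers)))
        for gpu in range(n_gpus)
    }

# 24 transformer blocks spread over 4 GPUs, 6 blocks each.
device_map = make_device_map(24, 4)

# With the feature implemented, fine-tuning could then look like this
# (assumed API, mirroring GPT-2's; not yet available for GPT-Neo):
#
#   from transformers import GPTNeoForCausalLM
#   model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")
#   model.parallelize(device_map)  # place each block on its assigned GPU
```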

Motivation

I want to fine-tune a GPT-Neo model with model parallelism so I can use less expensive GPUs. This is not yet implemented, and since higher-end GPUs are very expensive, it would be better to distribute the model across several cheaper GPUs than to rely on a single costly one. It would also let us train with larger batches, which can have a big impact on model fitting.

I would be very glad if you could implement this; I think it would enable fine-tuning of special-purpose GPT-Neo language models.
