Hi,
I really like the text_to_image training script and have had really good results with it. Now I want to train a model with multiple GPUs.
My current understanding:
I want to use a data parallel setup, so each GPU gets its own batches and the gradients are synced at the end of each step. This should not result in a significant increase in training time per step (right?).
So far, when training with 2 GPUs, the training time is doubled compared to a single-GPU setup. The loss converges faster because of the larger effective batch size, but the overall throughput per hour does not increase that way.
Please correct me if I am wrong.
If I am correct, how do I enable the data parallel (DP) setup in accelerate, or should I fall back to torch and its DataParallel wrapper?
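For reference, here is a minimal sketch of how I would expect a multi-GPU data parallel launch with accelerate to look; the GPU count, the script name (`train_text_to_image.py`), and the script arguments are assumptions about my setup, not something I have confirmed:

```bash
# One-time interactive setup: choose multi-GPU when prompted,
# or skip this and pass the flags directly to `accelerate launch`.
accelerate config

# Launch the training script on 2 GPUs of this machine.
# --multi_gpu enables the distributed data parallel setup,
# --num_processes should match the number of GPUs.
accelerate launch --multi_gpu --num_processes 2 \
  train_text_to_image.py \
  --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
  --train_data_dir="./my_dataset" \
  --train_batch_size=4
```

Is this the intended way to get the data parallel behavior, or is additional configuration needed?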