Description
While investigating #2851 (an error message when using push_to_hub during training), I realized that most of the training scripts rely on Repository to push the training output to the Hub at the end of the training (train_text_to_image_flax, train_instruct_pix2pix, train_dreambooth,...).
I guess this is due to some copy-pasting from another script (and that's fine since it works). However, I think it would be better to use upload_folder instead of Repository. The differences I see are:
- upload_folder is expected to be ~1.3x to 1.6x faster than git push.
- If the repo already exists, there is no need to clone it locally before starting the training. With upload_folder, the "clone" step is avoided.
- upload_folder does not have a sense of "local repo". This is a problem if we want users to have a local copy of the repo, with its commit history, at the end of their training. With upload_folder we would still have the artifacts locally, but as a normal folder rather than a git repo.
- upload_folder is more "user-friendly": it shows a progress bar, whereas "git push" runs in the background without any logs.
- Repository has the advantage of being able to run in the background. However, since the scripts only push once at the end of the training, having a "blocking" step is not a problem.
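To make the comparison concrete, here is a minimal sketch of what the end-of-training step could look like with upload_folder. The helper name `push_training_output`, its parameters, and the ignore patterns are assumptions for illustration, not code taken from the existing scripts. It relies only on `HfApi.create_repo` and `HfApi.upload_folder` from huggingface_hub.

```python
def push_training_output(output_dir, repo_id, api=None, commit_message="End of training"):
    """Upload the content of `output_dir` to the Hub in a single commit.

    Hypothetical helper: `api` is expected to behave like `huggingface_hub.HfApi`.
    """
    if api is None:
        # Imported lazily so the helper can be exercised with a stand-in object.
        from huggingface_hub import HfApi

        api = HfApi()

    # No local clone is needed: the repo is created if missing, reused otherwise.
    api.create_repo(repo_id, exist_ok=True)

    # Blocking call; the whole folder is uploaded over HTTP as one commit.
    return api.upload_folder(
        repo_id=repo_id,
        folder_path=str(output_dir),
        commit_message=commit_message,
        # Assumed naming for intermediate checkpoints that should stay local.
        ignore_patterns=["step_*", "epoch_*"],
    )
```

A script would call it once after saving the final weights, for example `push_training_output(args.output_dir, repo_id)`, where `repo_id` is something like `"username/model-name"`. The output directory stays a plain folder, which is the trade-off described in the list above.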
@patrickvonplaten @pcuenca @sayakpaul what's your opinion on that? If you also think it makes sense, I'd be glad to open a PR. Just let me know if there are any specifics I need to know.
Note: about #2851 itself and if we keep Repository, I think it would be best to use repo.push_to_hub(..., blocking=True) in all the scripts in order to properly block the main thread at the end of the training.
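If Repository is kept, the change suggested in the note amounts to something like the sketch below. The wrapper name `push_at_end_of_training` is hypothetical, and `repo` is assumed to be a `huggingface_hub.Repository` instance from a version of the library that still ships that class.

```python
def push_at_end_of_training(repo, commit_message="End of training"):
    """Commit and push everything in the local clone, waiting for the push to finish.

    Hypothetical wrapper: `repo` is assumed to be a `huggingface_hub.Repository`.
    """
    # blocking=True makes the call return only once `git push` has completed,
    # so the script cannot exit while the upload is still running.
    return repo.push_to_hub(
        commit_message=commit_message,
        blocking=True,
        auto_lfs_prune=True,
    )
```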