Dreambooth qol improvements (generating samples, device selection, multi res, checkpoint conversion) #2030

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Closed

subpanic wants to merge 9 commits into huggingface:main from subpanic:dreambooth_qol_improvements

subpanic commented Jan 18, 2023

Some additional features and quality of life improvements for the dream booth script. Opening a PR in case some/all of these are worth merging. I believe all additions should be non-destructive for existing script users assuming autocast isn't a problem (but it might be, so worth investigating).

Support for optionally generating sample image outputs for each checkpoint
Supporting for optionally automatically converting each checkpoint to a diffusers model
Support for optionally allowing arbitrary image resolutions in training data (sample generation will still be set by the resolution argument)
Support for optionally setting a specific device to train on

Some questions around areas I'm not 100% on for broader audience:

In order to support training on a specific device (instead of restricting purely to accelerator config), I explicitly set_device for torch and directly access the accelerator.state and set the device. Unsure if there's a cleaner way to do this
Generating sample images I was consistently running into type mismatch issues either with calling the pipeline for inference or with calling the unet in the training loop. (tensor mismatch for float32 vs float16 type issues). I resolved this by enabling the autocast context around those two calls with torch.cuda.amp.autocast(enabled=True):. I have not noticed any performance or obvious training issues due to this but if someone knows why this is a bad idea it would be greatly appreciated if they could help work out how to avoid the type mismatch issues without autocrats

subpanic added 9 commits

December 31, 2022 15:54


          add preview image support alongside checkpointing for dreambooth.

aee5096


          Merge branch 'main' into dreambooth_previews

223fbc7


          Adding support for sample image outputs during training for each chec…

48dc919

…kpoint interval.


          Updating command args for sample generation

2d04837


          adding dreambooth helper scripts

8540e2b


          Merge branch 'main' into dreambooth_previews

b684bbb


          Cleanup & adding support for multires training and device selection

519aa0a


          Merge branch 'main' into dreambooth_qol_improvements

8336e20


          Additional cleanup.

4086ed0

HuggingFaceDocBuilderDev commented Jan 18, 2023

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

patrickvonplaten requested a review from patil-suraj

January 22, 2023 19:56

Contributor

patrickvonplaten commented Jan 22, 2023

cc @patil-suraj

patrickvonplaten assigned patil-suraj

patil-suraj reviewed

View reviewed changes

Contributor

patil-suraj left a comment •

edited

Loading

Thanks a lot for the PR @subpanic , I left a few comments below.

Note that the goal of the example scripts is to have simple scripts that cover the most common use cases and are easier to read and modify if users wants to change something. For additional features like multires we encourage users to take the script and adjust on their own, so as to keep the example here simpler. Hope you understand.

Support for optionally generating sample image outputs for each checkpoint.

That's a good point, @patrickvonplaten is working on adding logging to all training examples. @patrickvonplaten maybe we could add an argument like validation_steps so users could sync checkpointing and generating.

examples/dreambooth/train_dreambooth.py

    
                          raise ValueError(f"Instance {self.instance_data_root} images root doesn't exists.")

                      self.instance_images_path = list(Path(instance_data_root).iterdir())

                      self.instance_images_path = [path for path in self.instance_images_path if path.suffix in [".jpg", ".png", ".jpeg"]]

Contributor

patil-suraj Jan 23, 2023

Good idea!

examples/dreambooth/train_dreambooth.py

Comment on lines +473 to +475

    
                  gpu = int(args.device.split(":")[1]) if args.device != "cuda" else "cuda"

                  if torch.cuda.is_available():

                      torch.cuda.set_device(gpu)

Contributor

patil-suraj Jan 23, 2023

don't think we should have this here, this will break multi-gpu. We can set the device using CUDA_VISIBLE_DEVICES env variable. Which is also recommended by torch, cf https://pytorch.org/docs/stable/generated/torch.cuda.set_device.html

examples/dreambooth/train_dreambooth.py

    
                              # Predict the noise residual

                              model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample

                              with torch.cuda.amp.autocast(enabled=True):

Contributor

patil-suraj Jan 23, 2023

When we pass --mixed_precision="fp16" accelerate automatically uses autocast; we should not hardcode this here.

examples/dreambooth/train_dreambooth.py

Comment on lines +856 to +903

    
                                      # Also generate and save sample images if specified

                                      if args.samples_per_checkpoint > 0:

                                          # Make sure any data leftover from previous interim pipeline is cleared

                                          if torch.cuda.is_available():

                                              torch.cuda.empty_cache()

                                          # Load current training state into a new diffusion pipeline to generate samples

                                          pipeline = DiffusionPipeline.from_pretrained(

                                              args.pretrained_model_name_or_path,

                                              unet=accelerator.unwrap_model(unet),

                                              text_encoder=accelerator.unwrap_model(text_encoder),

                                              torch_dtype=weight_dtype

                                          ).to(accelerator.device)

                                          if args.convert_checkpoints:

                                              convertedOutPath = os.path.join(args.output_dir, "converted_checkpoints", f"checkpoint-{global_step}")

                                              logger.info(f"Converting checkpoint-{global_step} to diffusers model at {convertedOutPath}")

                                              pipeline.save_pretrained(convertedOutPath)

                                          logger.info(f"Generating {args.samples_per_checkpoint} samples at step {global_step} with prompt: {args.sample_prompt}")

                                          # Allow a statically set or random seed for for generated samples

                                          sampleSeed = args.sample_seed if args.sample_seed != -1 else torch.Generator(accelerator.device).seed()

                                          sampleGenerator = torch.Generator(device=accelerator.device)

                                          for sampleIdx in range(args.samples_per_checkpoint):

                                              sampleGenerator.manual_seed(sampleSeed)

                                              # Generate samples

                                              with torch.cuda.amp.autocast(enabled=True):

                                                  images = pipeline(

                                                      prompt=args.sample_prompt, 

                                                      num_images_per_prompt=1,

                                                      num_inference_steps=args.sample_steps,

                                                      generator=sampleGenerator,

                                                      width=args.resolution,

                                                      height=args.resolution).images

                                              # Save samples to 'samples' folder in output directory

                                              for i, image in enumerate(images):

                                                  hash_image = hashlib.sha1(image.tobytes()).hexdigest()

                                                  os.makedirs(os.path.join(args.output_dir, "samples"), exist_ok=True)

                                                  image_filename = os.path.join(args.output_dir, "samples", f"step-{global_step}_seed-{sampleSeed}_loss-{loss.detach().item()}_hash-{hash_image}.png")

                                                  image.save(image_filename)

                                              sampleSeed += 1

                                          # Remove interim pipeline reference

                                          del pipeline

Contributor

patil-suraj Jan 23, 2023

This can be simplified a lot , we have an example for logging, cf https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py#L902

Contributor

patrickvonplaten Jan 23, 2023

Hey @subpanic, could you maybe try to use the mechanism of https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py#L902 here for generation? It should be a bit easier than what we currently have :-)

patrickvonplaten reviewed

View reviewed changes

examples/dreambooth/train_dreambooth.py

    
                      logging_dir=logging_dir,

                  )

                  # Accelerator device target is managed by an AcceleratorState object, grabbing

Contributor

patrickvonplaten Jan 23, 2023

Let's not do this here. accelerate is responsible for device handling :-)

patrickvonplaten reviewed

View reviewed changes

examples/dreambooth/train_dreambooth.py

    
                          ]

                      )

                      if multires_enabled:    

                          self.image_transforms = transforms.Compose(

Contributor

patrickvonplaten Jan 23, 2023

@patil-suraj what do you think regarding multires?

Contributor

patil-suraj Jan 23, 2023

Not really in favour of this. I think better to keep the script simpler.

patrickvonplaten reviewed

View reviewed changes

examples/dreambooth/train_dreambooth.py

    
                  parser.add_argument("--convert_checkpoints", action="store_true", help="Auto-convert checkpoints to an inference ready structure")

                  parser.add_argument("--multires", action="store_true", help="Disables dataset image transforms. Allows training on image datasets of arbitrary resolutions")

                  parser.add_argument("--device", type=str, default="cuda", help="Set the device to use for training (cuda, cuda:0, cuda:1, etc.).")

Contributor

patrickvonplaten Jan 23, 2023

Let's not add this. accelerate should handle this :-)

patrickvonplaten reviewed

View reviewed changes

examples/dreambooth/train_dreambooth.py

    
                  parser.add_argument("--sample_steps", type=int, default=40, help="Number of steps for generating sample images.")

                  parser.add_argument("--sample_prompt", type=str, default=None, help="Prompt to use for sample image generation.")

                  parser.add_argument("--sample_seed", type=int, default=-1, help="Seed for the per-checkpoint sample image generation. -1 to select random seed")

                  parser.add_argument("--convert_checkpoints", action="store_true", help="Auto-convert checkpoints to an inference ready structure")

Contributor

patrickvonplaten Jan 23, 2023

Let's also not add this. We will handle this soon directly in accelerate as well, see: #2048

patrickvonplaten reviewed

View reviewed changes

examples/dreambooth/train_dreambooth.py

    
                      "--enable_xformers_memory_efficient_attention", action="store_true", help="Whether or not to use xformers."

                  )

                  parser.add_argument("--samples_per_checkpoint", type=int, default=0, help="Whether or not to save samples for every checkpoint specified by --checkpointing_steps.")

                  parser.add_argument("--sample_steps", type=int, default=40, help="Number of steps for generating sample images.")

Contributor

patrickvonplaten Jan 23, 2023

Can we instead maybe use:

diffusers/examples/dreambooth/train_dreambooth_lora.py

Line 171 in d75ad93

"--validation_epochs",

?

patrickvonplaten reviewed

View reviewed changes

examples/dreambooth/train_dreambooth.py

    
                  parser.add_argument(

                      "--enable_xformers_memory_efficient_attention", action="store_true", help="Whether or not to use xformers."

                  )

                  parser.add_argument("--samples_per_checkpoint", type=int, default=0, help="Whether or not to save samples for every checkpoint specified by --checkpointing_steps.")

Contributor

patrickvonplaten Jan 23, 2023

Hmm can we maybe align this with

diffusers/examples/dreambooth/train_dreambooth_lora.py

Line 171 in d75ad93

"--validation_epochs",

and not have this argument?

patrickvonplaten reviewed

View reviewed changes

examples/dreambooth/train_dreambooth.py

    
                  )

                  parser.add_argument("--samples_per_checkpoint", type=int, default=0, help="Whether or not to save samples for every checkpoint specified by --checkpointing_steps.")

                  parser.add_argument("--sample_steps", type=int, default=40, help="Number of steps for generating sample images.")

                  parser.add_argument("--sample_prompt", type=str, default=None, help="Prompt to use for sample image generation.")

Contributor

patrickvonplaten Jan 23, 2023

Could we call this:

diffusers/examples/dreambooth/train_dreambooth_lora.py

Line 159 in d75ad93

"--validation_prompt",

maybe?

patrickvonplaten reviewed

View reviewed changes

examples/dreambooth/train_dreambooth.py

    
                  parser.add_argument("--samples_per_checkpoint", type=int, default=0, help="Whether or not to save samples for every checkpoint specified by --checkpointing_steps.")

                  parser.add_argument("--sample_steps", type=int, default=40, help="Number of steps for generating sample images.")

                  parser.add_argument("--sample_prompt", type=str, default=None, help="Prompt to use for sample image generation.")

                  parser.add_argument("--sample_seed", type=int, default=-1, help="Seed for the per-checkpoint sample image generation. -1 to select random seed")

Contributor

patrickvonplaten Jan 23, 2023

Suggested change

      
                parser.add_argument("--sample_seed", type=int, default=-1, help="Seed for the per-checkpoint sample image generation. -1 to select random seed")
          
                parser.add_argument("--validation_seed", type=int, default=-1, help="Seed for the per-checkpoint sample image generation. -1 to select random seed")

patrickvonplaten reviewed

View reviewed changes

examples/dreambooth/train_dreambooth.py

    
                  parser.add_argument("--sample_prompt", type=str, default=None, help="Prompt to use for sample image generation.")

                  parser.add_argument("--sample_seed", type=int, default=-1, help="Seed for the per-checkpoint sample image generation. -1 to select random seed")

                  parser.add_argument("--convert_checkpoints", action="store_true", help="Auto-convert checkpoints to an inference ready structure")

                  parser.add_argument("--multires", action="store_true", help="Disables dataset image transforms. Allows training on image datasets of arbitrary resolutions")

Contributor

patrickvonplaten Jan 23, 2023

Looks good to me!

patrickvonplaten reviewed

View reviewed changes

Contributor

patrickvonplaten left a comment

Thanks a lot for the PR. It would be great if we could align things here a bit with: https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py#L159 if possible :-)

LucasSloan mentioned this pull request

[WIP] Sample images when checkpointing. #2157

Closed

Contributor

github-actions bot commented Feb 25, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot added the stale label

github-actions bot closed this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale