
Conversation

dg845 (Collaborator) commented Jan 1, 2024

What does this PR do?

This PR makes further improvements to the LCM-(LoRA) distillation scripts.

Changelist (illustrative sketches of the new options follow the list):

  1. For all scripts, make the interpolation function used when rescaling images configurable via the --interpolation_type argument.
  2. For the LCM-LoRA scripts, make the lora_alpha (which controls LoRA scaling) and lora_dropout parameters configurable via the --lora_alpha and --lora_dropout arguments, respectively.
  3. For the LCM-LoRA scripts, make the LoRA target modules configurable via the --lora_target_modules argument.
  4. For all scripts, make the VAE encoding batch size configurable via the --vae_encode_batch_size argument.
  5. For all scripts, refactor the scalings_for_boundary_conditions function to use its timestep_scaling argument (instead of hardcoding the factor to 10.0) and make the scaling factor configurable via --timestep_scaling_factor.
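
For illustration, a minimal sketch of how these flags might be wired up with argparse (the flag names follow the changelist; the defaults shown are assumptions, not necessarily the scripts' actual values):

    import argparse

    parser = argparse.ArgumentParser(description="LCM(-LoRA) distillation options (sketch)")
    # Item 1: interpolation function used when rescaling images.
    parser.add_argument("--interpolation_type", type=str, default="bilinear")
    # Items 2 and 3: LoRA hyperparameters for the LCM-LoRA scripts.
    parser.add_argument("--lora_alpha", type=float, default=64)
    parser.add_argument("--lora_dropout", type=float, default=0.0)
    parser.add_argument(
        "--lora_target_modules",
        type=str,
        default=None,
        help="Comma-separated list of module names to apply LoRA to.",
    )
    # Item 4: batch size used when encoding images with the VAE.
    parser.add_argument("--vae_encode_batch_size", type=int, default=32)
    # Item 5: scaling factor applied to timesteps in the boundary-condition scalings.
    parser.add_argument("--timestep_scaling_factor", type=float, default=10.0)

    args = parser.parse_args()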
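
Item 5 refers to the boundary-condition scalings c_skip and c_out from the consistency-model formulation; a sketch of the refactored function (the sigma_data default is an assumption):

    def scalings_for_boundary_conditions(timestep, sigma_data=0.5, timestep_scaling=10.0):
        # Scale the timestep by the configurable factor rather than a hardcoded 10.0.
        scaled_timestep = timestep_scaling * timestep
        # Boundary conditions ensure the model reduces to the identity at t = 0.
        c_skip = sigma_data**2 / (scaled_timestep**2 + sigma_data**2)
        c_out = scaled_timestep / (scaled_timestep**2 + sigma_data**2) ** 0.5
        return c_skip, c_out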

Follow-up to #5778.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten
@sayakpaul
@patil-suraj

HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

sayakpaul (Member) left a comment

LGTM, thanks!

  • Let's also support interpolation mode in the non-WDS script?
  • Might make sense to add resolve_interpolation_mode() in training_utils.py, as I imagine it being used in multiple scripts. (A sketch of such a helper follows.)
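
A minimal sketch of what such a helper could look like, mapping a CLI string to a torchvision InterpolationMode (the exact set of supported names here is an assumption):

    from torchvision import transforms

    def resolve_interpolation_mode(interpolation_type: str):
        # Map a string such as "bilinear" to the corresponding
        # torchvision transforms.InterpolationMode member.
        modes = {
            "bilinear": transforms.InterpolationMode.BILINEAR,
            "bicubic": transforms.InterpolationMode.BICUBIC,
            "nearest": transforms.InterpolationMode.NEAREST,
            "lanczos": transforms.InterpolationMode.LANCZOS,
        }
        if interpolation_type not in modes:
            raise ValueError(
                f"Unsupported interpolation type {interpolation_type}; "
                f"choose one of {sorted(modes)}."
            )
        return modes[interpolation_type]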

patrickvonplaten (Contributor) commented

@patil-suraj can you check here?

patil-suraj (Contributor) left a comment

Very cool, lgtm!

@sayakpaul sayakpaul merged commit f3d1333 into huggingface:main Jan 5, 2024
sayakpaul (Member) commented Jan 5, 2024

Thanks a lot, @dg845! The scripts are in a really good spot now!

@dg845 dg845 deleted the lcm-scripts-improve branch January 5, 2024 02:25
dg845 added a commit to dg845/diffusers that referenced this pull request Jan 7, 2024

    import numpy as np
    import torch
    from torchvision import transforms
A collaborator commented on this snippet:

We can't have torchvision as a required dependency.
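
One common way to address this is to guard the import so torchvision stays optional; a minimal sketch (the helper name is hypothetical and not necessarily the pattern the PR adopted):

    # Guard the import so torchvision remains an optional dependency.
    try:
        from torchvision import transforms  # noqa: F401

        _torchvision_available = True
    except ImportError:
        _torchvision_available = False

    def require_torchvision():
        # Fail with a clear message only when torchvision is actually needed.
        if not _torchvision_available:
            raise ImportError(
                "This feature requires torchvision; install it with `pip install torchvision`."
            )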

AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
* Make WDS pipeline interpolation type configurable.

* Make the VAE encoding batch size configurable.

* Make lora_alpha and lora_dropout configurable for LCM LoRA scripts.

* Generalize scalings_for_boundary_conditions function and make the timestep scaling configurable.

* Make LoRA target modules configurable for LCM-LoRA scripts.

* Move resolve_interpolation_mode to src/diffusers/training_utils.py and make interpolation type configurable in non-WDS script.

* apply suggestions from review
bigmover commented

Is TCD distillation training supported by Diffusers?
