
Conversation

@ayushtues

@ayushtues ayushtues commented May 22, 2023

Adding Unet model and conversion scripts, part of huggingface#3492

To-do

  • Add building blocks needed for the original repo Unet implementation
  • Conversion script for weights into diffusers + load pretrained weights into diffusers
  • Check if forward passes into pretrained model give same results as original repo

jongwooo and others added 30 commits May 16, 2023 12:51
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add stable diffusion tensorrt img2img pipeline

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* update docstrings

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
* refactor controlnet and add img2img and inpaint

* First draft to get pipelines to work

* make style

* Fix more

* Fix more

* More tests

* Fix more

* Make inpainting work

* make style and more tests

* Apply suggestions from code review

* up

* make style

* Fix imports

* Fix more

* Fix more

* Improve examples

* add test

* Make sure import is correctly deprecated

* Make sure everything works in compile mode

* make sure authorship is correctly attributed
* Add DPM-Solver Multistep Inverse Scheduler

* Add draft tests for DiffEdit

* Add inverse sde-dpmsolver steps to tune image diversity from inverted latents

* Fix tests

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
fix tiled vae blend extent range
Small update to "Next steps" section:

- PyTorch 2 is recommended.
- Updated improvement figures.
…e#3298)

* Update pipeline_if_superresolution.py

Allow arbitrary aspect ratio in IFSuperResolutionPipeline by using the input image shape

* IFSuperResolutionPipeline: allow the user to override the height and width through the arguments

* update IFSuperResolutionPipeline width/height doc string to match StableDiffusionInpaintPipeline conventions

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
…gingface#3424)

* Added explanation of 'strength' parameter

* Added get_timesteps function which relies on new strength parameter

* Added `strength` parameter which defaults to 1.

* Swapped ordering so `noise_timestep` can be calculated before masking the image

this is required when you aren't applying 100% noise to the masked region, e.g. strength < 1.

* Added strength to check_inputs, throws error if out of range

* Changed `prepare_latents` to initialise latents w.r.t strength

inspired from the stable diffusion img2img pipeline, init latents are initialised by converting the init image into a VAE latent and adding noise (based upon the strength parameter passed in), e.g. random when strength = 1, or the init image at strength = 0.

* WIP: Added a unit test for the new strength parameter in the StableDiffusionInpaintingPipeline

still need to add correct regression values

* Created a is_strength_max to initialise from pure random noise

* Updated unit tests w.r.t new strength parameter + fixed new strength unit test

* renamed parameter to avoid confusion with variable of same name

* Updated regression values for new strength test - now passes

* removed 'copied from' comment as this method is now different and divergent from the cpy

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Ensure backwards compatibility for prepare_mask_and_masked_image

created a return_image boolean and initialised to false

* Ensure backwards compatibility for prepare_latents

* Fixed copy check typo

* Fixes w.r.t backward compatibility changes

* make style

* keep function argument ordering same for backwards compatibility in callees with copied from statements

* make fix-copies

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>
…s partially downloaded (huggingface#3448)

Added bugfix using f-strings.
…grad=False) (huggingface#3404)

* gradient checkpointing bug fix

* bug fix; changes for reviews

* reformat

* reformat

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Make dreambooth lora more robust to orig unet

* up
…y're unnecessary) (huggingface#3463)

Release large tensors in attention (as soon as they're no longer required). Reduces peak VRAM by nearly 2 GB for 1024x1024 (even after slicing), and the savings scale up with image size.
add min snr to text2img lora training script
* add inpaint lora scale support

* add inpaint lora scale test

---------

Co-authored-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>
* Correct from_ckpt

* make style
* dreambooth docs torch.compile note

* Update examples/dreambooth/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* add textual inversion inference to docs

* add to toctree

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* distributed inference

* move to inference section

* apply feedback

* update with split_between_processes

* apply feedback
…lattened indices (huggingface#3479)

explicit view kernel size as number elements in flattened indices
* Remove ONNX tests from PR.

They are already a part of push_tests.yml.

* Remove mps tests from PRs.

They are already performed on push.

* Fix workflow name for fast push tests.

* Extract mps tests to a workflow.

For better control/filtering.

* Remove --extra-index-url from mps tests

* Increase tolerance of mps test

This test passes in my Mac (Ventura 13.3) but fails in the CI hardware
(Ventura 13.2). I ran the local tests following the same steps that
exist in the CI workflow.

* Temporarily run mps tests on pr

So we can test.

* Revert "Temporarily run mps tests on pr"

Tests passed, go back to running on push.
…ocessor2_0` (huggingface#3457)

* add: debugging to enabling memory efficient processing

* add: better warning message.
add note on local directory path.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
@ayushtues
Author

ayushtues commented May 23, 2023

@dg845 Was able to convert the Unet into diffusers, load pretrained checkpoints, and get the same outputs from forward passes:

ckpt : https://huggingface.co/ayushtues/consistency_models/tree/main

```python
from diffusers import UNet2DModel

unet = UNet2DModel.from_pretrained("ayushtues/consistency_models", subfolder="diffusers_cd_imagenet64_l2")
```

@ayushtues
Author

ayushtues commented May 23, 2023

Had to add two new blocks to unet_2d_blocks - AttnDownsampleBlock2D & AttnUpsampleBlock2D - because I needed a resnet-based down/upsampler along with attention, and the existing AttnDownBlock2D & AttnUpBlock2D only use a simple conv downsampler.

Open to suggestions on how to design this better.
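For illustration, a minimal, self-contained sketch of the block layout being described - resnet/attention pairs followed by a resnet-based downsampler instead of a plain conv. All names here (`ToyResnetBlock`, `ToyAttnDownsampleBlock`, etc.) are hypothetical stand-ins, not the actual diffusers implementation:

```python
import torch
from torch import nn


class ToyResnetBlock(nn.Module):
    """Two convs with a residual shortcut; optionally strided for downsampling."""

    def __init__(self, channels, down=False):
        super().__init__()
        stride = 2 if down else 1
        self.conv1 = nn.Conv2d(channels, channels, 3, stride=stride, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        # The shortcut must match the downsampled resolution when striding.
        self.skip = nn.Conv2d(channels, channels, 1, stride=stride) if down else nn.Identity()

    def forward(self, x):
        return self.conv2(torch.relu(self.conv1(x))) + self.skip(x)


class ToySelfAttention(nn.Module):
    """Spatial self-attention over flattened feature-map positions."""

    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)  # (b, h*w, c)
        out, _ = self.attn(seq, seq, seq)
        return out.transpose(1, 2).reshape(b, c, h, w) + x


class ToyAttnDownsampleBlock(nn.Module):
    """Resnet/attention pairs, then a resnet-based (not plain conv) downsampler."""

    def __init__(self, channels, num_layers=2):
        super().__init__()
        self.resnets = nn.ModuleList(ToyResnetBlock(channels) for _ in range(num_layers))
        self.attentions = nn.ModuleList(ToySelfAttention(channels) for _ in range(num_layers))
        self.downsampler = ToyResnetBlock(channels, down=True)

    def forward(self, x):
        for resnet, attn in zip(self.resnets, self.attentions):
            x = attn(resnet(x))
        return self.downsampler(x)
```

The only structural difference from the existing AttnDownBlock2D pattern is the last line of `__init__`: the downsampler is itself a resnet block rather than a single strided conv.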

@ayushtues ayushtues changed the title [WIP] Add Unet blocks for consistency models [WIP] Add Unet for consistency models May 23, 2023
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented May 23, 2023

The documentation is not available anymore as the PR was closed or merged.

@dg845
Owner

dg845 commented May 24, 2023

Sorry, didn't get a chance to look at this until now.

> Had to add two new blocks to unet_2d_blocks - AttnDownsampleBlock2D & AttnUpsampleBlock2D - because I needed a resnet-based down/upsampler along with attention, and the existing AttnDownBlock2D & AttnUpBlock2D only use a simple conv downsampler.
>
> Open to suggestions on how to design this better.

The new blocks look fine to me. I guess I'm not the most knowledgeable about unet_2d_blocks so it might be worth it to ask the maintainers directly about the design.

Also, I have a quick question: does the conversion script support class-conditional models? I see for example in the consistency model scripts that there are some examples of training and sampling from them.

@dg845
Owner

dg845 commented May 24, 2023

I took a closer look at the unet block implementations and it looks like SimpleCrossAttnDownBlock2D/SimpleCrossAttnUpBlock2D are pretty similar to AttnDownsampleBlock2D/AttnUpsampleBlock2D in having a ResnetBlock2D/Attention/ResnetBlock2D downsampler/upsampler pattern. Might it be possible to leverage some or all of the SimpleCrossAttnDownBlock2D/SimpleCrossAttnUpBlock2D implementation?

@ayushtues
Author

ayushtues commented May 25, 2023

Sorry, I was blocked by GitHub for a while (supposedly a VS Code extension was generating infinite PRs 🙈).

> Also, I have a quick question: does the conversion script support class-conditional models? I see for example in the consistency model scripts that there are some examples of training and sampling from them.

The current UNet2DModel already supports class conditioning: https://github.com/ayushtues/diffusers/blob/consistency_unet/src/diffusers/models/unet_2d.py#L223, and in fact I tested this while making sure the forward passes are the same.
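For reference, a minimal sketch of how class conditioning typically enters the model: a learned class embedding is added to the timestep embedding before the down/up blocks consume it. The sizes (`num_classes`, `embed_dim`) and names here are illustrative, not the actual UNet2DModel internals:

```python
import torch
from torch import nn

num_classes, embed_dim = 1000, 32
class_embedding = nn.Embedding(num_classes, embed_dim)

timestep_emb = torch.randn(4, embed_dim)       # per-sample timestep embedding
class_labels = torch.tensor([0, 7, 42, 999])   # one class label per sample

# The class embedding is simply summed into the timestep embedding, so the
# rest of the UNet is conditioned on both without any architectural change.
emb = timestep_emb + class_embedding(class_labels)
```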

@ayushtues
Author

ayushtues commented May 25, 2023

> I took a closer look at the unet block implementations and it looks like SimpleCrossAttnDownBlock2D/SimpleCrossAttnUpBlock2D are pretty similar to AttnDownsampleBlock2D/AttnUpsampleBlock2D in having a ResnetBlock2D/Attention/ResnetBlock2D downsampler/upsampler pattern. Might it be possible to leverage some or all of the SimpleCrossAttnDownBlock2D/SimpleCrossAttnUpBlock2D implementation?

These blocks use AttnAddedKVProcessor, which has a different Attention implementation, and it would have been non-trivial to convert into what we need without changing the attention processor. Also, it would be better to keep the cross-attention implementations separate from simple self-attention imo.

Also, the new blocks are simply a copy of AttnDownBlock2D/AttnUpBlock2D, with self.downsamplers/self.upsamplers being resnets instead of simple down/upsamplers.

Alternatively, we could parameterize AttnDownBlock2D to use a resnet downsampler based on an argument.
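A hedged sketch of what that parameterization could look like - a constructor argument selecting the downsampler type. The argument name `downsample_type` and the toy classes below are hypothetical, not the actual diffusers API:

```python
import torch
from torch import nn


class ToyConvDownsample(nn.Module):
    """The existing simple strided-conv downsampler."""

    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, stride=2, padding=1)

    def forward(self, x):
        return self.conv(x)


class ToyResnetDownsample(nn.Module):
    """A resnet-style downsampler: strided conv branch plus a strided shortcut."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.skip = nn.Conv2d(channels, channels, 1, stride=2)

    def forward(self, x):
        return self.conv2(torch.relu(self.conv1(x))) + self.skip(x)


class ToyAttnDownBlock(nn.Module):
    def __init__(self, channels, downsample_type="conv"):
        super().__init__()
        # resnets/attentions elided; only the downsampler choice is shown.
        if downsample_type == "resnet":
            self.downsampler = ToyResnetDownsample(channels)
        else:
            self.downsampler = ToyConvDownsample(channels)

    def forward(self, x):
        return self.downsampler(x)
```

With this shape, the consistency-model UNet could request `downsample_type="resnet"` while all existing configs keep the default conv behaviour, avoiding a near-duplicate block class.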

@dg845
Owner

dg845 commented May 25, 2023

Makes sense, looks good to me then :).

@dg845 dg845 merged commit d137d11 into dg845:consistency-models-pipeline May 25, 2023
dg845 pushed a commit that referenced this pull request Sep 19, 2023
* Implement `CustomDiffusionAttnProcessor2_0`

* Doc-strings and type annotations for `CustomDiffusionAttnProcessor2_0`. (#1)

* Update attnprocessor.md

* Update attention_processor.py

* Interops for `CustomDiffusionAttnProcessor2_0`.

* Formatted `attention_processor.py`.

* Formatted doc-string in `attention_processor.py`

* Conditional CustomDiffusion2_0 for training example.

* Remove unnecessary reference impl in comments.

* Fix `save_attn_procs`.