[Kandinsky] Add combined pipelines / Fix cpu model offload / Fix inpainting #4207
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

nice - thank you @patrickvonplaten
williamberman
left a comment
nice! makes sense to me
```python
mask[mask >= 0.5] = 1
mask = torch.from_numpy(mask)

mask = 1 - mask
```
@yiyixuxu we need to make a breaking change here because Kandinsky can't accept a different mask format than SD.
Let's make sure in the future that our APIs are always exactly aligned: we cannot have one pipeline where black pixels represent the mask and another where white pixels represent the mask.
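For anyone migrating existing Kandinsky inpainting code, the fix amounts to a single mask inversion. A minimal NumPy sketch (the mask shape and region follow the docs example elsewhere in this PR; variable names are illustrative):

```python
import numpy as np

# New convention (aligned with Stable Diffusion): white (1) pixels are repainted,
# black (0) pixels are preserved.
mask = np.zeros((768, 768), dtype=np.float32)
mask[:250, 250:-250] = 1  # repaint the area above the cat's head

# A mask written for the old Kandinsky convention (black pixels repainted)
# is migrated by inverting it once:
old_style_mask = 1 - mask
new_style_mask = 1 - old_style_mask
assert np.array_equal(new_style_mask, mask)
```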
Doing tests & docs tomorrow
```diff
 mask = np.zeros((768, 768), dtype=np.float32)
 # Let's mask out an area above the cat's head
-mask[:250, 250:-250] = 0
+mask[:250, 250:-250] = 1
```
Let's maybe add a "Tip warning" here to make sure users are clearly informed about this, through and through. Maybe something like:

<Tip warning={true}>

Note that the above change was introduced in the following pull request: https://github.com/huggingface/diffusers/pull/4207. So, if you're using a source installation of `diffusers` or the latest release, you should update your inpainting code to follow the above.

</Tip>

WDYT?
Also the comment in the PR description is wrong (just in case we want to copy/paste it somewhere)
```python
AUTO_TEXT2IMAGE_DECODER_PIPELINES_MAPPING = OrderedDict(
    [
        ("kandinsky", KandinskyPipeline),
        ("kandinsky22", KandinskyV22Pipeline),
    ]
)
AUTO_IMAGE2IMAGE_DECODER_PIPELINES_MAPPING = OrderedDict(
    [
        ("kandinsky", KandinskyImg2ImgPipeline),
        ("kandinsky22", KandinskyV22Img2ImgPipeline),
    ]
)
```
Should we potentially add similar auto-classes for Stable unCLIP as well (follow-up)?
Yeah I'd say follow-up
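As a toy illustration of how such an ordered mapping can resolve a pipeline class from a model name, using a longest-key-match (this is a sketch, not the actual diffusers implementation; the class bodies are hypothetical stand-ins):

```python
from collections import OrderedDict

# Hypothetical stand-ins for the real pipeline classes.
class KandinskyPipeline: ...
class KandinskyV22Pipeline: ...

AUTO_TEXT2IMAGE_DECODER_PIPELINES_MAPPING = OrderedDict(
    [
        ("kandinsky", KandinskyPipeline),
        ("kandinsky22", KandinskyV22Pipeline),
    ]
)

def resolve(model_name: str):
    """Pick the mapping entry whose key is the longest substring of model_name."""
    best = None
    for key, cls in AUTO_TEXT2IMAGE_DECODER_PIPELINES_MAPPING.items():
        if key in model_name and (best is None or len(key) > len(best[0])):
            best = (key, cls)
    return best[1] if best else None
```

The longest-match rule matters because `"kandinsky"` is a prefix of `"kandinsky22"`, so a naive first-match lookup would resolve v2.2 models to the v2.1 class.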
Currently, installing from this branch leads to:
sayakpaul
left a comment
Looks very nice!
However, I am unable to run the code snippets when `diffusers` is installed from the PR branch (`pip install -q git+https://github.com/huggingface/diffusers@add_combined_pipeline_kandinsky`).
yiyixuxu
left a comment
another one of these 🤯 PRs 😂
thank you!
```diff
-        return getattr(diffusers_module, config["_class_name"])
+        pipeline_cls = getattr(diffusers_module, config["_class_name"])
+
+        if load_connected_pipeline:
```
when is this `load_connected_pipeline` needed? when do we need to set this flag to `True`?
so far it seems like we are able to find the correct combined pipeline class just using `AutoPipeline.from_pretrained()`
Yeah I need to write some better docs about it (will do in a follow-up PR). Essentially it's for the case where one wants to load a connected pipeline via `DiffusionPipeline`.
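A sketch of that use case, assuming the Kandinsky 2.1 decoder repo referenced in the PR description (this downloads the model weights, so it is shown for illustration rather than as a runnable test):

```python
import torch
from diffusers import DiffusionPipeline

# Loading the decoder repo with load_connected_pipeline=True pulls in the
# prior pipeline declared on the decoder's model card, returning the
# combined pipeline instead of the bare decoder.
pipe = DiffusionPipeline.from_pretrained(
    "kandinsky-community/kandinsky-2-1",
    load_connected_pipeline=True,
    torch_dtype=torch.float16,
)
```

Without the flag, `DiffusionPipeline.from_pretrained` would load only the decoder pipeline from the repo.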
are we not going to add kandinsky controlnet to sd.next?
pcuenca
left a comment
Very awesome, we could potentially use the same mechanism for SDXL.
```python
    ]
)

AUTO_TEXT2IMAGE_DECODER_PIPELINES_MAPPING = OrderedDict(
```
In what context will these be used?
I'm thinking about potentially creating auto pipelines for SDXL: the combined one is clear, but where would the base and refiner steps go? Is "Decoder" a good name, or too specific to these models?
Yeah, good point. It's not in the public API, so we can definitely change the naming later. I'll make those mappings private for now so they can still be changed afterwards.
```python
prior_prior: PriorTransformer,
prior_image_encoder: CLIPVisionModelWithProjection,
prior_text_encoder: CLIPTextModelWithProjection,
prior_tokenizer: CLIPTokenizer,
prior_scheduler: UnCLIPScheduler,
prior_image_processor: CLIPImageProcessor,
```
No docstrings (not sure if they are important)
Adding all those now :-)
```python
_, hook = cpu_offload_with_hook(cpu_offloaded_model, device, prev_module_hook=hook)

# We'll offload the last model manually.
self.prior_hook = hook
```
Why do we need to store this? Should it be used in `interpolate`?
Current:

```python
if hasattr(self, "final_offload_hook") and self.final_offload_hook is not None:
    self.prior_hook.offload()
```

Suggested:

```python
if hasattr(self, "final_offload_hook") and self.final_offload_hook is not None:
    self.final_offload_hook.offload()
```
If I am understanding this correctly, we should always offload `final_offload_hook`, so we could put it outside the `if`.
Are we missing a `self.prior_hook.offload()` in `interpolate`?
This is actually expected, since for this branch the prior is the final model.
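For context, the hook-chaining pattern behind `cpu_offload_with_hook` can be illustrated with a toy sketch (pure Python, not the accelerate implementation; all names here are made up): each model's hook offloads the previous model when the next one starts running, so the last model in the chain must be offloaded manually, which is why the hook gets stored on `self`.

```python
class OffloadHook:
    """Toy stand-in for an accelerate-style offload hook."""

    def __init__(self, name, prev_hook=None, log=None):
        self.name = name
        self.prev_hook = prev_hook
        self.log = log if log is not None else []

    def pre_forward(self):
        # Before this model runs, move the previous model back to CPU.
        if self.prev_hook is not None:
            self.prev_hook.offload()

    def offload(self):
        self.log.append(f"offload {self.name}")


log = []
hook = None
for name in ["text_encoder", "prior", "unet"]:
    hook = OffloadHook(name, prev_hook=hook, log=log)
    hook.pre_forward()  # simulate this model starting its forward pass

# The last model ("unet" here) is still on the device after the loop;
# offload it manually, mirroring `self.prior_hook = hook` in the PR.
hook.offload()
```

After the run, `log` shows each model offloaded exactly once, in order.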
Ok, I'm merging this PR now. There are a couple of follow-up tasks that one should look into:
…inting (huggingface#4207)

* Add combined pipeline
* Download readme
* Upload
* up
* up
* fix final
* Add enable model cpu offload kandinsky
* finish
* finish
* Fix
* fix more
* make style
* fix kandinsky mask
* fix inpainting test
* add callbacks
* add tests
* fix tests
* Apply suggestions from code review
* docs
* docs
* correct docs
* fix tests
* add warning
* correct docs

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
What does this PR do?
🚨🚨🚨 1. Breaking change - fixes mask input 🚨🚨🚨
NOW: `mask_image` repaints white pixels and preserves black pixels.

Kandinsky was using an incorrect mask format. Instead of using white pixels as a mask (like SD & IF do), Kandinsky models were using black pixels. This needs to be corrected so that the `diffusers` API is aligned: we cannot have different mask formats for different pipelines.

Important => This means that everyone who already used Kandinsky Inpaint in production / in a pipeline now needs to invert their mask (`mask = 1 - mask`).

Once this PR is merged we also need to correct all the model cards (cc @yiyixuxu)
2. Adds combined pipelines
As noticed in #4161 by @vladmandic, `diffusers` currently has an inconsistent design between Kandinsky pipelines and other pipelines. The reason for this is that all Kandinsky pipelines (txt2img, img2img & inpaint) are based on DALL·E 2's unCLIP design, meaning they have to run two diffusion pipelines (a prior and a decoder). Running just the prior or the decoder on its own often makes no sense, so we should give the user an easier UX here while making sure we still keep the pipelines separated so that they can be run independently (e.g. on different nodes).

This PR introduces a mechanism that loads the required prior pipeline directly when loading a decoder pipeline and puts all components in a single "combined" pipeline. Required pipelines are defined in the decoder's model card here: https://huggingface.co/kandinsky-community/kandinsky-2-1/blob/73bf6fba5b4410c671f7c73279ab39932b3ad021/README.md?code=true#L4
Each decoder (txt2img, img2img & inpaint) is therefore now accompanied by a "Combined" pipeline that will automatically be called from `AutoPipelineFor{Text2Img,Img2Img,Inpaint}`.

The following use cases are now supported, making sure Kandinsky models can be used with the same API as other models:
Text 2 Image
Img2Img
Inpaint
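A sketch of the text-to-image case (the img2img and inpaint cases are analogous via the other two auto-classes). The prompt is illustrative, and running this downloads the model weights:

```python
import torch
from diffusers import AutoPipelineForText2Image

# AutoPipelineForText2Image resolves the Kandinsky decoder repo to the
# combined pipeline, which bundles prior and decoder behind one __call__.
pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-2-1", torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()

image = pipe(prompt="A portrait of a cat wearing a top hat").images[0]
```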
To achieve this the following pipelines have been added:
Edit: updated mask in `inpaint` example.