@YashalShakti
Fix guess_mode in StableDiffusionXLControlNetPipeline

What does this PR do?

It fixes a bug that made guess mode unusable in StableDiffusionXLControlNetPipeline.

Reproducing the original bug

Follow the example at https://huggingface.co/diffusers/controlnet-canny-sdxl-1.0, but set guess_mode=True:

images = pipe(
  prompt,
  negative_prompt=negative_prompt,
  image=image,
  controlnet_conditioning_scale=controlnet_conditioning_scale,
  guess_mode=True
).images

The call raises the following error:

RuntimeError                              Traceback (most recent call last)
Cell In[57], line 24
     21 image = np.concatenate([image, image, image], axis=2)
     22 image = Image.fromarray(image)
---> 24 images = pipe(
     25     prompt, negative_prompt=negative_prompt, image=image, controlnet_conditioning_scale=controlnet_conditioning_scale, guess_mode=True
     26     ).images

RuntimeError: The size of tensor a (8192) must match the size of tensor b (4096) at non-singleton dimension 1

Fixes

When ControlNet inference runs on only the conditional batch (guess_mode together with do_classifier_free_guidance), also pass only the corresponding conditional add_text_embeds and add_time_ids.
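The idea can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the helper name controlnet_inputs is hypothetical, NumPy arrays stand in for torch tensors, the shapes are illustrative, and the batch is assumed to be laid out as [unconditional, conditional].

```python
import numpy as np


def controlnet_inputs(prompt_embeds, add_text_embeds, add_time_ids,
                      guess_mode, do_classifier_free_guidance):
    """Pick the embeddings handed to ControlNet (hypothetical helper)."""
    if guess_mode and do_classifier_free_guidance:
        # Assumed layout: first half unconditional, second half conditional.
        # np.split(x, 2)[1] plays the role of tensor.chunk(2)[1] in PyTorch.
        prompt_embeds = np.split(prompt_embeds, 2)[1]
        add_text_embeds = np.split(add_text_embeds, 2)[1]
        add_time_ids = np.split(add_time_ids, 2)[1]
    return prompt_embeds, add_text_embeds, add_time_ids


# Illustrative shapes for one prompt with classifier-free guidance enabled.
prompt_embeds = np.zeros((2, 77, 2048))
add_text_embeds = np.zeros((2, 1280))
add_time_ids = np.zeros((2, 6))

cond = controlnet_inputs(prompt_embeds, add_text_embeds, add_time_ids,
                         guess_mode=True, do_classifier_free_guidance=True)
print([c.shape for c in cond])
```

With all three inputs halved together, their batch dimensions agree with the conditional-only latents passed to ControlNet, which is the mismatch the error above points at.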

Tests

  1. Generated images successfully with guess_mode=True
  2. Ran test_controlnet_sdxl:
pytest tests/pipelines/controlnet/test_controlnet_sdxl.py
...
============================================================= 62 passed, 3 skipped, 165 warnings in 97.70s (0:01:37) ==============================================================

Before submitting

  • [] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [Y] Did you read the contributor guideline?
  • [Y] Did you read our philosophy doc (important for complex PRs)?
  • [] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • [] Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • [] Did you write any new necessary tests?

Who can review?

Chunk and use single add_text_embeds and add_time_ids when using guess_mode and do_classifier_free_guidance
@sayakpaul
Member

Being addressed in #4155. So, closing this one.

Thank you for your contribution!

@sayakpaul sayakpaul closed this Aug 23, 2023