
RuntimeError: Input type (c10::Half) and bias type (float) should be the same when running StableDiffusionXLInstructPix2PixPipeline #5510

@kisisjrlly

Describe the bug

When I run the example for StableDiffusionXLInstructPix2PixPipeline, it fails with a RuntimeError.
I found that the previous commit #4796 was supposed to have resolved this problem, so I also tried the latest code from the main branch, but the same error still occurs.

Reproduction

from diffusers import StableDiffusionXLInstructPix2PixPipeline
from diffusers.utils import load_image
import torch
resolution = 768
image = load_image(
    "https://hf.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png"
).resize((resolution, resolution))
edit_instruction = "Turn sky into a cloudy one"

pipe = StableDiffusionXLInstructPix2PixPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, use_safetensors=True, variant="fp16", local_files_only=True
).to("cuda")

edited_image = pipe(
    prompt=edit_instruction,
    image=image,
    height=resolution,
    width=resolution,
    guidance_scale=3.0,
    image_guidance_scale=1.5,
    num_inference_steps=30,
).images[0]

Logs

Traceback (most recent call last):
  File "textimage2image.py", line 15, in <module>
    edited_image = pipe(
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/zhaoguodong/work/open_source/diffusers/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_instruct_pix2pix.py", line 850, in __call__
    image_latents = self.prepare_image_latents(
  File "/home/zhaoguodong/work/open_source/diffusers/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_instruct_pix2pix.py", line 534, in prepare_image_latents
    image_latents = self.vae.encode(image).latent_dist.mode()
  File "/home/zhaoguodong/work/open_source/diffusers/src/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "/home/zhaoguodong/work/open_source/diffusers/src/diffusers/models/autoencoder_kl.py", line 274, in encode
    h = self.encoder(x)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/zhaoguodong/work/open_source/diffusers/src/diffusers/models/vae.py", line 112, in forward
    sample = self.conv_in(sample)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
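
The error message says the tensor reaching `conv_in` is half precision while the layer's bias is float32. The following is a minimal diagnostic sketch, not a fix, for listing which parameters differ in dtype from a given input. The helper names `group_params_by_dtype` and `dtype_mismatches` are hypothetical, and the stand-in objects only exist so the snippet runs without a GPU or a model download. It assumes each parameter exposes a `.dtype` attribute, as the tensors returned by a `torch.nn.Module`'s `named_parameters()` do.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_params_by_dtype(named_params):
    """Group parameter names by the string form of their dtype.

    `named_params` is any iterable of (name, tensor_like) pairs where
    tensor_like has a `.dtype` attribute.
    """
    groups = defaultdict(list)
    for name, param in named_params:
        groups[str(param.dtype)].append(name)
    return dict(groups)


def dtype_mismatches(named_params, input_dtype):
    """Return the names of parameters whose dtype differs from `input_dtype`."""
    expected = str(input_dtype)
    return [name for name, param in named_params if str(param.dtype) != expected]


# Hypothetical stand-ins mirroring the situation in the traceback:
# float32 parameters in the first encoder convolution, float16 input.
fake_params = [
    ("encoder.conv_in.weight", SimpleNamespace(dtype="torch.float32")),
    ("encoder.conv_in.bias", SimpleNamespace(dtype="torch.float32")),
]

grouped = group_params_by_dtype(fake_params)
mismatched = dtype_mismatches(fake_params, "torch.float16")
print(grouped)
print(mismatched)
```

With the real pipeline, the same helpers could be called as `dtype_mismatches(pipe.vae.named_parameters(), torch.float16)` just before the failing call, to see whether the VAE weights are in a different precision than the input image tensor.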

System Info

pip list:

diffusers                0.21.4
torch                    2.0.1
transformers             4.34.1

The result is the same with the latest source code:

diffusers                0.22.0.dev0          /home/zhaoguodong/work/open_source/diffusers/src

Who can help?

No response

Labels

bug, stale
