Merged
2 changes: 2 additions & 0 deletions docs/source/api/pipelines/unclip.mdx
@@ -28,4 +28,6 @@ The unCLIP model in diffusers comes from kakaobrain's karlo and the original cod

## UnCLIPPipeline
[[autodoc]] pipelines.unclip.pipeline_unclip.UnCLIPPipeline
- __call__
[[autodoc]] pipelines.unclip.pipeline_unclip_image_variation.UnCLIPImageVariationPipeline
- __call__
40 changes: 40 additions & 0 deletions scripts/convert_unclip_txt2img_to_image_variation.py
@@ -0,0 +1,40 @@
import argparse

from diffusers import UnCLIPImageVariationPipeline, UnCLIPPipeline
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection


if __name__ == "__main__":
    parser = argparse.ArgumentParser()

    parser.add_argument("--dump_path", default=None, type=str, required=True, help="Path to the output model.")

    parser.add_argument(
        "--txt2img_unclip",
        default="kakaobrain/karlo-v1-alpha",
        type=str,
        required=False,
        help="The pretrained txt2img unclip.",
    )

    args = parser.parse_args()

    txt2img = UnCLIPPipeline.from_pretrained(args.txt2img_unclip)

    feature_extractor = CLIPImageProcessor()
    image_encoder = CLIPVisionModelWithProjection.from_pretrained("openai/clip-vit-large-patch14")

    img2img = UnCLIPImageVariationPipeline(
        decoder=txt2img.decoder,
        text_encoder=txt2img.text_encoder,
        tokenizer=txt2img.tokenizer,
        text_proj=txt2img.text_proj,
        feature_extractor=feature_extractor,
Comment on lines +1 to +32
Member

This is very informative, but I'm not sure we store this kind of script in the repo. The ones in the folder are usually about converting weights from other checkpoints. What do you think, @patil-suraj?

Contributor Author

Happy to remove and just put in the PR description! lmk @patil-suraj

Contributor

Yeah, I think there's no need to have this script.

Contributor

This is better than not having a script at all. Think it's totally fine to leave it here as is. The main purpose of the scripts is really so that the user can convert the checkpoints themselves - I'm fine with the way it is. Better would be to directly convert from the original checkpoint, but for me this is ok as well and def better than not having anything.

        image_encoder=image_encoder,
        super_res_first=txt2img.super_res_first,
        super_res_last=txt2img.super_res_last,
        decoder_scheduler=txt2img.decoder_scheduler,
        super_res_scheduler=txt2img.super_res_scheduler,
    )

    img2img.save_pretrained(args.dump_path)
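
To make the script's command-line interface concrete, here is a minimal sketch of just its argument handling, using only the standard library. The `build_parser` helper and the `./karlo-image-variation` output directory are assumptions introduced for illustration; neither appears in the PR. The two options and their defaults are copied from the script above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper mirroring the two options the conversion script declares."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--dump_path", default=None, type=str, required=True, help="Path to the output model.")
    parser.add_argument(
        "--txt2img_unclip",
        default="kakaobrain/karlo-v1-alpha",
        type=str,
        required=False,
        help="The pretrained txt2img unclip.",
    )
    return parser


# Illustrative output directory only; any writable path would do.
args = build_parser().parse_args(["--dump_path", "./karlo-image-variation"])
print(args.dump_path, args.txt2img_unclip)
```

Only `--dump_path` is mandatory. Omitting `--txt2img_unclip` falls back to the `kakaobrain/karlo-v1-alpha` checkpoint, so the same invocation pattern should work for any other txt2img unCLIP checkpoint that exposes the same components.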
1 change: 1 addition & 0 deletions src/diffusers/__init__.py
@@ -105,6 +105,7 @@
StableDiffusionPipeline,
StableDiffusionPipelineSafe,
StableDiffusionUpscalePipeline,
UnCLIPImageVariationPipeline,
UnCLIPPipeline,
VersatileDiffusionDualGuidedPipeline,
VersatileDiffusionImageVariationPipeline,
2 changes: 1 addition & 1 deletion src/diffusers/pipelines/__init__.py
@@ -53,7 +53,7 @@
StableDiffusionUpscalePipeline,
)
from .stable_diffusion_safe import StableDiffusionPipelineSafe
-from .unclip import UnCLIPPipeline
+from .unclip import UnCLIPImageVariationPipeline, UnCLIPPipeline
from .versatile_diffusion import (
VersatileDiffusionDualGuidedPipeline,
VersatileDiffusionImageVariationPipeline,
1 change: 1 addition & 0 deletions src/diffusers/pipelines/unclip/__init__.py
@@ -13,4 +13,5 @@
    from ...utils.dummy_torch_and_transformers_objects import UnCLIPPipeline
else:
    from .pipeline_unclip import UnCLIPPipeline
    from .pipeline_unclip_image_variation import UnCLIPImageVariationPipeline
    from .text_proj import UnCLIPTextProjModel
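
The hunk above shows only the tail of a fallback import: a dummy object on one branch and the real pipelines under `else:`. The following is a rough sketch of that general pattern, assuming the elided lines guard on whether the optional dependencies are installed. `OptionalDependencyNotAvailable` is defined locally as a stand-in, and `is_available` and `pick_backend` are hypothetical helpers, not functions from the PR.

```python
import importlib.util


class OptionalDependencyNotAvailable(Exception):
    """Local stand-in for the library's exception; defined here so the sketch is self-contained."""


def is_available(module_name: str) -> bool:
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module_name) is not None


def pick_backend(*required: str) -> str:
    """Return "real" when every required module is importable, otherwise "dummy"."""
    try:
        if not all(is_available(name) for name in required):
            raise OptionalDependencyNotAvailable()
    except OptionalDependencyNotAvailable:
        # In the hunk, this branch imports a placeholder object instead.
        return "dummy"
    else:
        # In the hunk, this branch imports the real pipeline classes.
        return "real"


print(pick_backend("json"))
print(pick_backend("json", "module_that_does_not_exist_xyz"))
```

With this shape, importing the package does not fail when an optional dependency is missing, and the choice between the real and placeholder objects is made once, at import time.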
2 changes: 2 additions & 0 deletions src/diffusers/pipelines/unclip/pipeline_unclip.py
@@ -45,6 +45,8 @@ class UnCLIPPipeline(DiffusionPipeline):
[CLIPTokenizer](https://huggingface.co/docs/transformers/v4.21.0/en/model_doc/clip#transformers.CLIPTokenizer).
prior ([`PriorTransformer`]):
The canonical unCLIP prior to approximate the image embedding from the text embedding.
text_proj ([`UnCLIPTextProjModel`]):
Utility class to prepare and combine the embeddings before they are passed to the decoder.
decoder ([`UNet2DConditionModel`]):
The decoder to invert the image embedding into an image.
super_res_first ([`UNet2DModel`]):