Merged

89 commits
13f1212
Support instruction pix2pix sdxl
kfzyqin Jul 13, 2023
7daa3b2
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 13, 2023
67a401c
Support instruction pix2pix sdxl
kfzyqin Jul 14, 2023
775fb69
Merge branch 'ip2p_sdxl' of github.com:harutatsuakiyama/diffusers int…
kfzyqin Jul 14, 2023
4514be5
Support instruction pix2pix sdxl
kfzyqin Jul 15, 2023
187fc36
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
ecdd293
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
94df45c
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
fb3bf00
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
a31bdcf
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
1354a95
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
92a71bc
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
8028af8
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
e5d3ec4
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
f6df1e8
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
b6eb449
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
bc4377e
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
94e9e10
Support instruction pix2pix sdxl
kfzyqin Jul 13, 2023
ff0b5cd
Support instruction pix2pix sdxl
kfzyqin Jul 14, 2023
ffb851f
[Community] Implementation of the IADB community pipeline (#3996)
tchambon Jul 13, 2023
926fc6d
add kandinsky to readme table (#4081)
yiyixuxu Jul 13, 2023
7f0a22b
[From Single File] Force accelerate to be installed (#4078)
patrickvonplaten Jul 13, 2023
cc8507d
Support instruction pix2pix sdxl
kfzyqin Jul 15, 2023
d0790d8
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
bfe12e9
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
84f5bc9
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
f827c0a
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
4a76d36
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
f43c0d5
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
83a4476
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
9aa1e83
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
c04e813
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
30fddaf
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
0b6fcd4
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
b806da3
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
d946ba1
Merge branch 'ip2p_sdxl' of github.com:harutatsuakiyama/diffusers int…
kfzyqin Jul 16, 2023
7c8c2fd
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 16, 2023
9978715
Support instruction pix2pix sdxl
kfzyqin Jul 16, 2023
fed87ba
Merge branch 'ip2p_sdxl' of github.com:harutatsuakiyama/diffusers int…
kfzyqin Jul 16, 2023
2841046
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 17, 2023
a4f5455
Support instruction pix2pix sdxl
kfzyqin Jul 17, 2023
d2c2647
Merge branch 'ip2p_sdxl' of github.com:harutatsuakiyama/diffusers int…
kfzyqin Jul 17, 2023
3d32d50
Clean up IP2P SDXL code
kfzyqin Jul 17, 2023
f06ed07
Clean up IP2P SDXL code
kfzyqin Jul 17, 2023
ca844a1
Merge remote-tracking branch 'upstream/main' into main
kfzyqin Jul 18, 2023
2c6b5e2
Merge remote-tracking branch 'upstream/main' into ip2p_sdxl
kfzyqin Jul 18, 2023
5969ec7
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 18, 2023
b1cd1f4
Merge branch 'main' of github.com:harutatsuakiyama/diffusers into ip2…
kfzyqin Jul 18, 2023
ab7004e
[IP2P and SDXL] clean up code
kfzyqin Jul 18, 2023
e3af9d9
Merge branch 'ip2p_sdxl' of github.com:harutatsuakiyama/diffusers int…
kfzyqin Jul 18, 2023
aaaef7b
[IP2P and SDXL] clean up code
kfzyqin Jul 18, 2023
698649d
[IP2P and SDXL] clean up code
kfzyqin Jul 18, 2023
8450179
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 18, 2023
a4772e7
[IP2P SDXL] Address code reviews
kfzyqin Jul 19, 2023
8f26227
Merge branch 'ip2p_sdxl' of github.com:harutatsuakiyama/diffusers int…
kfzyqin Jul 19, 2023
b4b9e22
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 19, 2023
c0d6824
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 20, 2023
2a24c8c
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 21, 2023
79c8f61
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
d988708
Merge branch 'ip2p_sdxl' of github.com:harutatsuakiyama/diffusers int…
kfzyqin Jul 21, 2023
ab6611c
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
5e7b961
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
d3837fa
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
cf72806
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
5e52ebd
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
7bc8652
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
bdde572
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
65dace7
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
5e51868
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
a2a4b1b
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
c1c3206
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 21, 2023
591b53d
[IP2P SDXL] Address code reviews, add docs, tests
kfzyqin Jul 21, 2023
df5d593
Merge branch 'ip2p_sdxl' of github.com:harutatsuakiyama/diffusers int…
kfzyqin Jul 21, 2023
9b52201
[IP2P SDXL] Address code reviews
kfzyqin Jul 21, 2023
3e8f11d
[IP2P SDXL] Address code reviews
kfzyqin Jul 21, 2023
5c1fce0
[IP2P SDXL] Add README_SDXL
kfzyqin Jul 21, 2023
ce8fb29
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 21, 2023
dc02a48
[IP2P SDXL] Address code reviews
kfzyqin Jul 21, 2023
8771c58
[IP2P SDXL] Address code reviews
kfzyqin Jul 21, 2023
133611a
[IP2P SDXL] Fix the copy problems
kfzyqin Jul 21, 2023
d70ef60
Merge branch 'main' into ip2p_sdxl
kfzyqin Jul 24, 2023
97eda35
[IP2P SDXL] Add license
kfzyqin Jul 24, 2023
d0ec947
[IP2P SDXL] Add license
kfzyqin Jul 24, 2023
8268f11
[IP2P SDXL] Add license
kfzyqin Jul 24, 2023
e54fe74
[IP2P SDXL] Address code reivew for selecting VAE andd others
kfzyqin Jul 24, 2023
db44a7a
[IP2P SDXL] Update README_sdxl
kfzyqin Jul 24, 2023
48b86a0
[IP2P SDXL] Update __init__
kfzyqin Jul 24, 2023
7f6d782
[IP2P SDXL] Update dummy_torch_and_transformers_and_invisible_waterma…
kfzyqin Jul 24, 2023
ef48169
address patrick's comments and some additions to readmes.
sayakpaul Jul 25, 2023
6 changes: 5 additions & 1 deletion docs/source/en/training/instructpix2pix.mdx
@@ -208,4 +208,8 @@ speed and quality during performance:
Particularly, `image_guidance_scale` and `guidance_scale` can have a profound impact
on the generated ("edited") image (see [here](https://twitter.com/RisingSayak/status/1628392199196151808?s=20) for an example).

If you're looking for some interesting ways to use the InstructPix2Pix training methodology, we welcome you to check out this blog post: [Instruction-tuning Stable Diffusion with InstructPix2Pix](https://huggingface.co/blog/instruction-tuning-sd).

## Stable Diffusion XL

We support InstructPix2Pix fine-tuning of the UNet shipped in [Stable Diffusion XL](https://huggingface.co/papers/2307.01952) via the `train_instruct_pix2pix_xl.py` script. Please refer to the docs [here](https://github.com/huggingface/diffusers/blob/main/examples/instruct_pix2pix/README_sdxl.md).
6 changes: 5 additions & 1 deletion examples/instruct_pix2pix/README.md
@@ -186,4 +186,8 @@ speed and quality during performance:
Particularly, `image_guidance_scale` and `guidance_scale` can have a profound impact
on the generated ("edited") image (see [here](https://twitter.com/RisingSayak/status/1628392199196151808?s=20) for an example).

If you're looking for some interesting ways to use the InstructPix2Pix training methodology, we welcome you to check out this blog post: [Instruction-tuning Stable Diffusion with InstructPix2Pix](https://huggingface.co/blog/instruction-tuning-sd).

## Stable Diffusion XL

We support InstructPix2Pix fine-tuning of the UNet shipped in [Stable Diffusion XL](https://huggingface.co/papers/2307.01952) via the `train_instruct_pix2pix_xl.py` script. Please refer to the docs [here](./README_sdxl.md).
148 changes: 148 additions & 0 deletions examples/instruct_pix2pix/README_sdxl.md
@@ -0,0 +1,148 @@
# InstructPix2Pix SDXL training example

***This is based on the original InstructPix2Pix training example.***

[Stable Diffusion XL](https://huggingface.co/papers/2307.01952) (or SDXL) is the latest image generation model, tailored towards more photorealistic outputs with more detailed imagery and composition than previous SD models. It leverages a three-times-larger UNet backbone. The increase in model parameters is mainly due to more attention blocks and a larger cross-attention context, as SDXL uses a second text encoder.
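
If you want to peek at these components before training, the short sketch below loads the base pipeline and prints the UNet size and the two text encoders. It is only an illustrative snippet (it assumes you have been granted access to the 0.9 weights) and is not part of the training script.

```python
# Illustrative only: inspect the SDXL components mentioned above.
# Assumes access to the gated stabilityai/stable-diffusion-xl-base-0.9 weights.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-0.9", torch_dtype=torch.float16
)

# SDXL conditions on two text encoders instead of one.
print(type(pipe.text_encoder).__name__, type(pipe.text_encoder_2).__name__)

# The UNet backbone is roughly three times larger than in previous SD models.
print(f"UNet parameters: {sum(p.numel() for p in pipe.unet.parameters()) / 1e9:.2f}B")
```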

The `train_instruct_pix2pix_xl.py` script shows how to implement the training procedure and adapt it for Stable Diffusion XL.

***Disclaimer: Even though `train_instruct_pix2pix_xl.py` implements the InstructPix2Pix
training procedure while staying faithful to the [original implementation](https://github.com/timothybrooks/instruct-pix2pix), we have only tested it on a [small-scale dataset](https://huggingface.co/datasets/fusing/instructpix2pix-1000-samples). This can impact the end results. For better results, we recommend longer training runs with a larger dataset. [Here](https://huggingface.co/datasets/timbrooks/instructpix2pix-clip-filtered) you can find a large dataset for InstructPix2Pix training.***

## Running locally with PyTorch

### Installing the dependencies

Refer to the original InstructPix2Pix training example for installing the dependencies.

You will also need to get access to SDXL by filling out the [form](https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9).
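
For a fresh environment, a minimal setup roughly follows the original example; the steps below are a sketch and assume you are working from a clone of the `diffusers` repository:

```bash
# Rough setup sketch following the original InstructPix2Pix example;
# adjust paths and versions to your environment.
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install -e .

cd examples/instruct_pix2pix
pip install -r requirements.txt

# Configure accelerate for your hardware (single GPU, multi-GPU, etc.).
accelerate config

# Log in so the gated SDXL 0.9 weights can be downloaded.
huggingface-cli login
```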

### Toy example

As mentioned before, we'll use a [small toy dataset](https://huggingface.co/datasets/fusing/instructpix2pix-1000-samples) for training. The dataset
is a smaller version of the [original dataset](https://huggingface.co/datasets/timbrooks/instructpix2pix-clip-filtered) used in the InstructPix2Pix paper.
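
If you'd like to take a quick look at the data first, you can load it with the `datasets` library. The column names used below (`input_image`, `edited_image`, `edit_prompt`) match the training script's defaults; treat them as an assumption if you swap in your own dataset.

```python
# Quick look at the toy dataset; the column names are the training
# script's defaults and may differ for other datasets.
from datasets import load_dataset

ds = load_dataset("fusing/instructpix2pix-1000-samples", split="train")
print(ds)                          # row count and column names
example = ds[0]
print(example["edit_prompt"])      # the textual edit instruction
example["input_image"].save("sample_input.png")    # original image
example["edited_image"].save("sample_edited.png")  # target edited image
```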

Configure environment variables such as the dataset identifier and the Stable Diffusion XL
checkpoint:

```bash
export MODEL_NAME="stabilityai/stable-diffusion-xl-base-0.9"
export DATASET_ID="fusing/instructpix2pix-1000-samples"
```

Now, we can launch training:

```bash
python train_instruct_pix2pix_xl.py \
    --pretrained_model_name_or_path=$MODEL_NAME \
    --dataset_name=$DATASET_ID \
    --enable_xformers_memory_efficient_attention \
    --resolution=256 --random_flip \
    --train_batch_size=4 --gradient_accumulation_steps=4 --gradient_checkpointing \
    --max_train_steps=15000 \
    --checkpointing_steps=5000 --checkpoints_total_limit=1 \
    --learning_rate=5e-05 --max_grad_norm=1 --lr_warmup_steps=0 \
    --conditioning_dropout_prob=0.05 \
    --seed=42
```

Additionally, we support performing validation inference to monitor training progress
with Weights and Biases. You can enable this feature with `report_to="wandb"`:

```bash
python train_instruct_pix2pix_xl.py \
    --pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-0.9 \
    --dataset_name=$DATASET_ID \
    --use_ema \
    --enable_xformers_memory_efficient_attention \
    --resolution=512 --random_flip \
    --train_batch_size=4 --gradient_accumulation_steps=4 --gradient_checkpointing \
    --max_train_steps=15000 \
    --checkpointing_steps=5000 --checkpoints_total_limit=1 \
    --learning_rate=5e-05 --lr_warmup_steps=0 \
    --conditioning_dropout_prob=0.05 \
    --seed=42 \
    --val_image_url_or_path="https://datasets-server.huggingface.co/assets/fusing/instructpix2pix-1000-samples/--/fusing--instructpix2pix-1000-samples/train/23/input_image/image.jpg" \
    --validation_prompt="make it in japan" \
    --report_to=wandb
```

We recommend this type of validation as it can be useful for model debugging. Note that you need `wandb` installed to use this. You can install `wandb` by running `pip install wandb`.

[Here](https://wandb.ai/sayakpaul/instruct-pix2pix/runs/ctr3kovq), you can find an example training run that includes some validation samples and the training hyperparameters.

***Note: In the original paper, the authors observed that even when the model is trained with an image resolution of 256x256, it generalizes well to bigger resolutions such as 512x512. This is likely because of the larger dataset they used during training.***

## Training with multiple GPUs

`accelerate` allows for seamless multi-GPU training. Follow the instructions [here](https://huggingface.co/docs/accelerate/basic_tutorials/launch)
for running distributed training with `accelerate`. Here is an example command:

```bash
accelerate launch --mixed_precision="fp16" --multi_gpu train_instruct_pix2pix_xl.py \
    --pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-0.9 \
    --dataset_name=$DATASET_ID \
    --use_ema \
    --enable_xformers_memory_efficient_attention \
    --resolution=512 --random_flip \
    --train_batch_size=4 --gradient_accumulation_steps=4 --gradient_checkpointing \
    --max_train_steps=15000 \
    --checkpointing_steps=5000 --checkpoints_total_limit=1 \
    --learning_rate=5e-05 --lr_warmup_steps=0 \
    --conditioning_dropout_prob=0.05 \
    --seed=42 \
    --val_image_url_or_path="https://datasets-server.huggingface.co/assets/fusing/instructpix2pix-1000-samples/--/fusing--instructpix2pix-1000-samples/train/23/input_image/image.jpg" \
    --validation_prompt="make it in japan" \
    --report_to=wandb
```

## Inference

Once training is complete, we can perform inference:

```python
import requests
import torch
import PIL.Image
import PIL.ImageOps
from diffusers import StableDiffusionXLInstructPix2PixPipeline

model_id = "your_model_id"  # <- replace this
pipe = StableDiffusionXLInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
generator = torch.Generator("cuda").manual_seed(0)

url = "https://datasets-server.huggingface.co/assets/fusing/instructpix2pix-1000-samples/--/fusing--instructpix2pix-1000-samples/train/23/input_image/image.jpg"


def download_image(url):
    image = PIL.Image.open(requests.get(url, stream=True).raw)
    image = PIL.ImageOps.exif_transpose(image)
    image = image.convert("RGB")
    return image


image = download_image(url)
prompt = "make it Japan"
num_inference_steps = 20
image_guidance_scale = 1.5
guidance_scale = 10

edited_image = pipe(
    prompt,
    image=image,
    num_inference_steps=num_inference_steps,
    image_guidance_scale=image_guidance_scale,
    guidance_scale=guidance_scale,
    generator=generator,
).images[0]
edited_image.save("edited_image.png")
```

We encourage you to play with the following three parameters to control the
speed and quality of inference:

* `num_inference_steps`
* `image_guidance_scale`
* `guidance_scale`

Particularly, `image_guidance_scale` and `guidance_scale` can have a profound impact
on the generated ("edited") image (see [here](https://twitter.com/RisingSayak/status/1628392199196151808?s=20) for an example).
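
To explore that trade-off systematically, a small sweep like the hedged sketch below can help; it reuses the `pipe`, `image`, and `prompt` objects from the inference example above, and the specific values are arbitrary starting points rather than recommendations.

```python
# Sketch: sweep guidance values using the `pipe`, `image`, and `prompt`
# objects from the inference example above. Values are arbitrary starting
# points, not tuned recommendations.
for image_guidance_scale in (1.0, 1.5, 2.0):
    for guidance_scale in (5.0, 7.5, 10.0):
        result = pipe(
            prompt,
            image=image,
            num_inference_steps=20,
            image_guidance_scale=image_guidance_scale,
            guidance_scale=guidance_scale,
            generator=torch.Generator("cuda").manual_seed(0),
        ).images[0]
        result.save(f"edited_igs{image_guidance_scale}_gs{guidance_scale}.png")
```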

If you're looking for some interesting ways to use the InstructPix2Pix training methodology, we welcome you to check out this blog post: [Instruction-tuning Stable Diffusion with InstructPix2Pix](https://huggingface.co/blog/instruction-tuning-sd).