Merged
56 commits
abb22b4
Update `examples` README.md to include the latest examples (#2839)
sayakpaul Mar 27, 2023
1d7b4b6
Ruff: apply same rules as in transformers (#2827)
pcuenca Mar 27, 2023
4c26cb9
[Tests] Fix slow tests (#2846)
patrickvonplaten Mar 27, 2023
7bc2fff
Fix StableUnCLIPImg2ImgPipeline handling of explicitly passed image e…
unishift Mar 27, 2023
b10f527
Helper function to disable custom attention processors (#2791)
pcuenca Mar 27, 2023
fab4f3d
improve stable unclip doc. (#2823)
sayakpaul Mar 28, 2023
58fc824
add: better warning messages when handling multiple conditionings. (#…
sayakpaul Mar 28, 2023
d4f846f
[WIP]Flax training script for controlnet (#2818)
yiyixuxu Mar 28, 2023
81125d8
Make dynamo wrapped modules work with save_pretrained (#2726)
pcuenca Mar 28, 2023
42d9501
[Init] Make sure shape mismatches are caught early (#2847)
patrickvonplaten Mar 28, 2023
c0afca2
updated onnx pndm test (#2811)
kashif Mar 28, 2023
585f621
[Stable Diffusion] Allow users to disable Safety checker if loading m…
Stax124 Mar 28, 2023
8bdf423
fix KarrasVePipeline bug (#2828)
junhsss Mar 28, 2023
0f14335
StableDiffusionLongPromptWeightingPipeline: Do not hardcode pad token…
AkiSakurai Mar 28, 2023
b76d9fd
Remove suggestion to use cuDNN benchmark in docs (#2793)
d1g1t Mar 28, 2023
159a0bf
Remove duplicate sentence in docstrings (#2834)
qqaatw Mar 28, 2023
7d75681
Update the legacy inpainting SD pipeline, to allow calling it with on…
cmdr2 Mar 28, 2023
920a15c
Fix link to LoRA training guide in DreamBooth training guide (#2836)
ushuz Mar 28, 2023
663c654
[WIP][Docs] Use DiffusionPipeline Instead of Child Classes when Loadi…
dg845 Mar 28, 2023
25d927a
Add `last_epoch` argument to `optimization.get_scheduler` (#2850)
felixblanke Mar 28, 2023
4d0f412
[WIP] Check UNet shapes in StableDiffusionInpaintPipeline __init__ (#…
dg845 Mar 28, 2023
53377ef
[2761]: Add documentation for extra_in_channels UNet1DModel (#2817)
nipunjindal Mar 28, 2023
1384546
[Tests] Adds a test to check if `image_embeds` None case is handled p…
sayakpaul Mar 28, 2023
37c8248
Update evaluation.mdx (#2862)
tolgacangoz Mar 28, 2023
3980858
Update overview.mdx (#2864)
tolgacangoz Mar 28, 2023
ef4c2fa
Update alt_diffusion.mdx (#2865)
tolgacangoz Mar 28, 2023
03fe36f
Update paint_by_example.mdx (#2869)
tolgacangoz Mar 28, 2023
628fefb
Update stable_diffusion_safe.mdx (#2870)
tolgacangoz Mar 28, 2023
40a7b86
[Docs] Correct phrasing (#2873)
patrickvonplaten Mar 28, 2023
d82b032
[Examples] Add streaming support to the ControlNet training example i…
sayakpaul Mar 29, 2023
3be4891
feat: allow offset_noise in dreambooth training example (#2826)
yamanahlawat Mar 29, 2023
e47459c
[docs] Performance tutorial (#2773)
stevhliu Mar 29, 2023
b202127
[Docs] add an example use for `StableUnCLIPPipeline` in the pipeline …
sayakpaul Mar 30, 2023
b3d5cc4
add flax requirement (#2894)
yiyixuxu Mar 30, 2023
9062b28
Support fp16 in conversion from original ckpt (#2733)
burgalon Mar 30, 2023
4960976
make style
patrickvonplaten Mar 30, 2023
1d033a9
img2img.multiple.controlnets.pipeline (#2833)
mikegarts Mar 30, 2023
a937e1b
add load textual inversion embeddings to stable diffusion (#2009)
piEsposito Mar 30, 2023
51d970d
[docs] add the Stable diffusion with Jax/Flax Guide into the docs (#2…
yiyixuxu Mar 31, 2023
0df4ad5
Add support `Karras sigmas` for StableDiffusionKDiffusionPipeline (#2…
takuma104 Mar 31, 2023
1055175
Fix textual inversion loading (#2914)
GuiyeC Mar 31, 2023
e1144ac
Fix slow tests text inv (#2915)
patrickvonplaten Mar 31, 2023
f3fbf9b
Fix check_inputs in upscaler pipeline to allow embeds (#2892)
d1g1t Mar 31, 2023
7b6caca
Modify example with intel optimization (#2896)
mengfei25 Mar 31, 2023
b3c437e
[2884]: Fix cross_attention_kwargs in StableDiffusionImg2ImgPipeline …
nipunjindal Mar 31, 2023
d36103a
[Tests] Speed up test (#2919)
patrickvonplaten Mar 31, 2023
419660c
Have fix current pipeline link (#2910)
guspan-tanadi Mar 31, 2023
89b23d9
Update image_variation.mdx (#2911)
tolgacangoz Mar 31, 2023
c433562
Update controlnet.mdx (#2912)
tolgacangoz Mar 31, 2023
a5bdb67
fix importing diffusers without transformers installed
patrickvonplaten Mar 31, 2023
7447f75
Update pipeline_stable_diffusion_controlnet.py (#2917)
patrickvonplaten Mar 31, 2023
cd634a8
Check for all different packages of opencv (#2901)
wfng92 Mar 31, 2023
f23d6eb
fix missing import
patrickvonplaten Mar 31, 2023
723933f
add another import
patrickvonplaten Mar 31, 2023
8c530fc
make style
patrickvonplaten Mar 31, 2023
7139f0e
fix: norm group test for UNet3D. (#2959)
sayakpaul Apr 4, 2023
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -439,7 +439,7 @@ Push the changes to your account using:
$ git push -u origin a-descriptive-name-for-my-changes
```

6. Once you are satisfied (**and the checklist below is happy too**), go to the
6. Once you are satisfied, go to the
webpage of your fork on GitHub. Click on 'Pull request' to send your changes
to the project maintainers for review.

4 changes: 3 additions & 1 deletion docs/source/en/_toctree.yml
@@ -4,7 +4,7 @@
- local: quicktour
title: Quicktour
- local: stable_diffusion
title: Stable Diffusion
title: Effective and efficient diffusion
- local: installation
title: Installation
title: Get started
@@ -52,6 +52,8 @@
title: How to contribute a Pipeline
- local: using-diffusers/using_safetensors
title: Using safetensors
- local: using-diffusers/stable_diffusion_jax_how_to
title: Stable Diffusion in JAX/Flax
- local: using-diffusers/weighted_prompts
title: Weighting Prompts
title: Pipelines for Inference
4 changes: 2 additions & 2 deletions docs/source/en/api/pipelines/alt_diffusion.mdx
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.

# AltDiffusion

AltDiffusion was proposed in [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679) by Zhongzhi Chen, Guang Liu, Bo-Wen Zhang, Fulong Ye, Qinghong Yang, Ledell Wu
AltDiffusion was proposed in [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679) by Zhongzhi Chen, Guang Liu, Bo-Wen Zhang, Fulong Ye, Qinghong Yang, Ledell Wu.

The abstract of the paper is the following:

@@ -28,7 +28,7 @@ The abstract of the paper is the following:

## Tips

- AltDiffusion is conceptually exaclty the same as [Stable Diffusion](./api/pipelines/stable_diffusion/overview).
- AltDiffusion is conceptually exactly the same as [Stable Diffusion](./api/pipelines/stable_diffusion/overview).

- *Run AltDiffusion*
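
  A minimal sketch of running it, assuming the `BAAI/AltDiffusion-m9` checkpoint (an illustrative choice) and the standard Stable Diffusion call signature:

  ```python
  # Hedged sketch: AltDiffusion exposes the same interface as Stable Diffusion.
  # The checkpoint name below is an assumption for illustration.
  import torch
  from diffusers import AltDiffusionPipeline

  pipe = AltDiffusionPipeline.from_pretrained("BAAI/AltDiffusion-m9", torch_dtype=torch.float16)
  pipe = pipe.to("cuda")

  prompt = "a photo of an astronaut riding a horse on mars"
  image = pipe(prompt).images[0]
  image.save("altdiffusion_astronaut.png")
  ```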

4 changes: 2 additions & 2 deletions docs/source/en/api/pipelines/overview.mdx
@@ -108,7 +108,7 @@ from the local path.
each pipeline, one should look directly into the respective pipeline.

**Note**: All pipelines have PyTorch's autograd disabled by decorating the `__call__` method with a [`torch.no_grad`](https://pytorch.org/docs/stable/generated/torch.no_grad.html) decorator because pipelines should
not be used for training. If you want to store the gradients during the forward pass, we recommend writing your own pipeline, see also our [community-examples](https://github.com/huggingface/diffusers/tree/main/examples/community)
not be used for training. If you want to store the gradients during the forward pass, we recommend writing your own pipeline, see also our [community-examples](https://github.com/huggingface/diffusers/tree/main/examples/community).

## Contribution

@@ -173,7 +173,7 @@ You can also run this example on colab [![Open In Colab](https://colab.research.

### Tweak prompts reusing seeds and latents

You can generate your own latents to reproduce results, or tweak your prompt on a specific result you liked. [This notebook](https://github.com/pcuenca/diffusers-examples/blob/main/notebooks/stable-diffusion-seeds.ipynb) shows how to do it step by step. You can also run it in Google Colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pcuenca/diffusers-examples/blob/main/notebooks/stable-diffusion-seeds.ipynb).
You can generate your own latents to reproduce results, or tweak your prompt on a specific result you liked. [This notebook](https://github.com/pcuenca/diffusers-examples/blob/main/notebooks/stable-diffusion-seeds.ipynb) shows how to do it step by step. You can also run it in Google Colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pcuenca/diffusers-examples/blob/main/notebooks/stable-diffusion-seeds.ipynb)


### In-painting using Stable Diffusion
2 changes: 1 addition & 1 deletion docs/source/en/api/pipelines/paint_by_example.mdx
@@ -14,7 +14,7 @@ specific language governing permissions and limitations under the License.

## Overview

[Paint by Example: Exemplar-based Image Editing with Diffusion Models](https://arxiv.org/abs/2211.13227) by Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen
[Paint by Example: Exemplar-based Image Editing with Diffusion Models](https://arxiv.org/abs/2211.13227) by Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen.

The abstract of the paper is the following:

6 changes: 3 additions & 3 deletions docs/source/en/api/pipelines/semantic_stable_diffusion.mdx
@@ -24,11 +24,11 @@ The abstract of the paper is the following:

| Pipeline | Tasks | Colab | Demo
|---|---|:---:|:---:|
| [pipeline_semantic_stable_diffusion.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/semantic_stable_diffusion/pipeline_semantic_stable_diffusion) | *Text-to-Image Generation* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ml-research/semantic-image-editing/blob/main/examples/SemanticGuidance.ipynb) | [Coming Soon](https://huggingface.co/AIML-TUDA)
| [pipeline_semantic_stable_diffusion.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/semantic_stable_diffusion/pipeline_semantic_stable_diffusion.py) | *Text-to-Image Generation* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ml-research/semantic-image-editing/blob/main/examples/SemanticGuidance.ipynb) | [Coming Soon](https://huggingface.co/AIML-TUDA)

## Tips

- The Semantic Guidance pipeline can be used with any [Stable Diffusion](./api/pipelines/stable_diffusion/text2img) checkpoint.
- The Semantic Guidance pipeline can be used with any [Stable Diffusion](./stable_diffusion/text2img.mdx) checkpoint.

### Run Semantic Guidance

@@ -67,7 +67,7 @@ out = pipe(
)
```

For more examples check the colab notebook.
For more examples check the Colab notebook.

## StableDiffusionSafePipelineOutput
[[autodoc]] pipelines.semantic_stable_diffusion.SemanticStableDiffusionPipelineOutput
@@ -131,7 +131,7 @@ This should take only around 3-4 seconds on GPU (depending on hardware). The out
![img](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/vermeer_disco_dancing.png)


**Note**: To see how to run all other ControlNet checkpoints, please have a look at [ControlNet with Stable Diffusion 1.5](#controlnet-with-stable-diffusion-1.5)
**Note**: To see how to run all other ControlNet checkpoints, please have a look at [ControlNet with Stable Diffusion 1.5](#controlnet-with-stable-diffusion-1.5).

<!-- TODO: add space -->

@@ -14,7 +14,7 @@ specific language governing permissions and limitations under the License.

## StableDiffusionImageVariationPipeline

[`StableDiffusionImageVariationPipeline`] lets you generate variations from an input image using Stable Diffusion. It uses a fine-tuned version of Stable Diffusion model, trained by [Justin Pinkney](https://www.justinpinkney.com/) (@Buntworthy) at [Lambda](https://lambdalabs.com/)
[`StableDiffusionImageVariationPipeline`] lets you generate variations from an input image using Stable Diffusion. It uses a fine-tuned version of Stable Diffusion model, trained by [Justin Pinkney](https://www.justinpinkney.com/) (@Buntworthy) at [Lambda](https://lambdalabs.com/).

The original codebase can be found here:
[Stable Diffusion Image Variations](https://github.com/LambdaLabsML/lambda-diffusers#stable-diffusion-image-variations)
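
A rough usage sketch; the checkpoint name and image URL are illustrative assumptions:

```python
# Hedged sketch: generate variations of an input image.
import torch
from diffusers import StableDiffusionImageVariationPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImageVariationPipeline.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

init_image = load_image(
    "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/stable_unclip/tarsila_do_amaral.png"
)
images = pipe(init_image, guidance_scale=3.0).images
images[0].save("variation.png")
```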
@@ -28,4 +28,4 @@ Available Checkpoints are:
- enable_attention_slicing
- disable_attention_slicing
- enable_xformers_memory_efficient_attention
- disable_xformers_memory_efficient_attention
- disable_xformers_memory_efficient_attention
4 changes: 2 additions & 2 deletions docs/source/en/api/pipelines/stable_diffusion_safe.mdx
@@ -36,7 +36,7 @@ Safe Stable Diffusion can be tested very easily with the [`StableDiffusionPipeli

### Interacting with the Safety Concept

To check and edit the currently used safety concept, use the `safety_concept` property of [`StableDiffusionPipelineSafe`]
To check and edit the currently used safety concept, use the `safety_concept` property of [`StableDiffusionPipelineSafe`]:
```python
>>> from diffusers import StableDiffusionPipelineSafe

@@ -60,7 +60,7 @@ You may use the 4 configurations defined in the [Safe Latent Diffusion paper](ht

The following configurations are available: `SafetyConfig.WEAK`, `SafetyConfig.MEDIUM`, `SafetyConfig.STRONG`, and `SafetyConfig.MAX`.

### How to load and use different schedulers.
### How to load and use different schedulers

The safe stable diffusion pipeline uses [`PNDMScheduler`] scheduler by default. But `diffusers` provides many other schedulers that can be used with the stable diffusion pipeline such as [`DDIMScheduler`], [`LMSDiscreteScheduler`], [`EulerDiscreteScheduler`], [`EulerAncestralDiscreteScheduler`] etc.
To use a different scheduler, you can either change it via the [`ConfigMixin.from_config`] method or pass the `scheduler` argument to the `from_pretrained` method of the pipeline. For example, to use the [`EulerDiscreteScheduler`], you can do the following:
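
A minimal sketch of both options; the checkpoint name is an illustrative assumption:

```python
# Hedged sketch: two ways to switch to EulerDiscreteScheduler.
# The "AIML-TUDA/stable-diffusion-safe" checkpoint name is an assumption.
import torch
from diffusers import StableDiffusionPipelineSafe, EulerDiscreteScheduler

# Option 1: load the pipeline, then swap the scheduler via from_config.
pipe = StableDiffusionPipelineSafe.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe", torch_dtype=torch.float16
)
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

# Option 2: build the scheduler first and pass it to from_pretrained.
scheduler = EulerDiscreteScheduler.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe", subfolder="scheduler"
)
pipe = StableDiffusionPipelineSafe.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe", scheduler=scheduler, torch_dtype=torch.float16
)
```

Both routes leave the rest of the pipeline configuration untouched; only the sampler changes.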
100 changes: 88 additions & 12 deletions docs/source/en/api/pipelines/stable_unclip.mdx
@@ -32,35 +32,68 @@ we do not add any additional noise to the image embeddings i.e. `noise_level = 0
* [stabilityai/stable-diffusion-2-1-unclip](https://hf.co/stabilityai/stable-diffusion-2-1-unclip)
* [stabilityai/stable-diffusion-2-1-unclip-small](https://hf.co/stabilityai/stable-diffusion-2-1-unclip-small)
* Text-to-image
* Coming soon!
* [stabilityai/stable-diffusion-2-1-unclip-small](https://hf.co/stabilityai/stable-diffusion-2-1-unclip-small)

### Text-to-Image Generation
Stable unCLIP can be leveraged for text-to-image generation by pipelining it with the prior model of KakaoBrain's open source DALL-E 2 replication [Karlo](https://huggingface.co/kakaobrain/karlo-v1-alpha)

Coming soon!
```python
import torch
from diffusers import UnCLIPScheduler, DDPMScheduler, StableUnCLIPPipeline
from diffusers.models import PriorTransformer
from transformers import CLIPTokenizer, CLIPTextModelWithProjection

prior_model_id = "kakaobrain/karlo-v1-alpha"
data_type = torch.float16
prior = PriorTransformer.from_pretrained(prior_model_id, subfolder="prior", torch_dtype=data_type)

prior_text_model_id = "openai/clip-vit-large-patch14"
prior_tokenizer = CLIPTokenizer.from_pretrained(prior_text_model_id)
prior_text_model = CLIPTextModelWithProjection.from_pretrained(prior_text_model_id, torch_dtype=data_type)
prior_scheduler = UnCLIPScheduler.from_pretrained(prior_model_id, subfolder="prior_scheduler")
prior_scheduler = DDPMScheduler.from_config(prior_scheduler.config)

stable_unclip_model_id = "stabilityai/stable-diffusion-2-1-unclip-small"

pipe = StableUnCLIPPipeline.from_pretrained(
stable_unclip_model_id,
torch_dtype=data_type,
variant="fp16",
prior_tokenizer=prior_tokenizer,
prior_text_encoder=prior_text_model,
prior=prior,
prior_scheduler=prior_scheduler,
)

pipe = pipe.to("cuda")
wave_prompt = "dramatic wave, the Oceans roar, Strong wave spiral across the oceans as the waves unfurl into roaring crests; perfect wave form; perfect wave shape; dramatic wave shape; wave shape unbelievable; wave; wave shape spectacular"

images = pipe(prompt=wave_prompt).images
images[0].save("waves.png")
```
<Tip warning={true}>

For text-to-image we use `stabilityai/stable-diffusion-2-1-unclip-small` as it was trained on CLIP ViT-L/14 embedding, the same as the Karlo model prior. [stabilityai/stable-diffusion-2-1-unclip](https://hf.co/stabilityai/stable-diffusion-2-1-unclip) was trained on OpenCLIP ViT-H, so we don't recommend its use.

</Tip>

### Text guided Image-to-Image Variation

```python
import requests
import torch
from PIL import Image
from io import BytesIO

from diffusers import StableUnCLIPImg2ImgPipeline
from diffusers.utils import load_image
import torch

pipe = StableUnCLIPImg2ImgPipeline.from_pretrained(
"stabilityai/stable-diffusion-2-1-unclip", torch_dtype=torch.float16, variation="fp16"
)
pipe = pipe.to("cuda")

url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/stable_unclip/tarsila_do_amaral.png"

response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = load_image(url)

images = pipe(init_image).images
images[0].save("fantasy_landscape.png")
images[0].save("variation_image.png")
```

Optionally, you can also pass a prompt to `pipe` such as:
@@ -69,7 +102,50 @@ Optionally, you can also pass a prompt to `pipe` such as:
prompt = "A fantasy landscape, trending on artstation"

images = pipe(init_image, prompt=prompt).images
images[0].save("fantasy_landscape.png")
images[0].save("variation_image_two.png")
```

### Memory optimization

If you are short on GPU memory, you can enable smart CPU offloading so that models that are not needed
immediately for a computation can be offloaded to CPU:

```python
from diffusers import StableUnCLIPImg2ImgPipeline
from diffusers.utils import load_image
import torch

pipe = StableUnCLIPImg2ImgPipeline.from_pretrained(
"stabilityai/stable-diffusion-2-1-unclip", torch_dtype=torch.float16, variation="fp16"
)
# Offload to CPU.
pipe.enable_model_cpu_offload()

url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/stable_unclip/tarsila_do_amaral.png"
init_image = load_image(url)

images = pipe(init_image).images
images[0]
```

Further memory optimizations are possible by enabling VAE slicing on the pipeline:

```python
from diffusers import StableUnCLIPImg2ImgPipeline
from diffusers.utils import load_image
import torch

pipe = StableUnCLIPImg2ImgPipeline.from_pretrained(
"stabilityai/stable-diffusion-2-1-unclip", torch_dtype=torch.float16, variation="fp16"
)
pipe.enable_model_cpu_offload()
pipe.enable_vae_slicing()

url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/stable_unclip/tarsila_do_amaral.png"
init_image = load_image(url)

images = pipe(init_image).images
images[0]
```

### StableUnCLIPPipeline
2 changes: 1 addition & 1 deletion docs/source/en/conceptual/contribution.mdx
@@ -439,7 +439,7 @@ Push the changes to your account using:
$ git push -u origin a-descriptive-name-for-my-changes
```

6. Once you are satisfied (**and the checklist below is happy too**), go to the
6. Once you are satisfied, go to the
webpage of your fork on GitHub. Click on 'Pull request' to send your changes
to the project maintainers for review.

8 changes: 4 additions & 4 deletions docs/source/en/conceptual/evaluation.mdx
@@ -310,7 +310,7 @@ for idx in range(len(dataset)):
edited_images.append(edited_image)
```

To measure the directional similarity, we first load CLIP's image and text encoders.
To measure the directional similarity, we first load CLIP's image and text encoders:

```python
from transformers import (
@@ -329,7 +329,7 @@ image_encoder = CLIPVisionModelWithProjection.from_pretrained(clip_id).to(device

Notice that we are using a particular CLIP checkpoint, i.e., `openai/clip-vit-large-patch14`. This is because the Stable Diffusion pre-training was performed with this CLIP variant. For more details, refer to the [documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/pix2pix#diffusers.StableDiffusionInstructPix2PixPipeline.text_encoder).

Next, we prepare a PyTorch `nn.module` to compute directional similarity:
Next, we prepare a PyTorch `nn.Module` to compute directional similarity:

```python
import torch.nn as nn
@@ -410,7 +410,7 @@ It should be noted that the `StableDiffusionInstructPix2PixPipeline` exposes t

We can extend the idea of this metric to measure how similar the original image and edited version are. To do that, we can just do `F.cosine_similarity(img_feat_two, img_feat_one)`. For these kinds of edits, we would still want the primary semantics of the images to be preserved as much as possible, i.e., a high similarity score.
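
As a rough sketch, assuming `img_feat_one`/`img_feat_two` and `text_feat_one`/`text_feat_two` are L2-normalized CLIP embeddings of the original/edited images and the original/edit captions (the variable names beyond `img_feat_one` and `img_feat_two` are illustrative):

```python
# Hedged sketch: the two similarity measures discussed above.
# Assumes all features are L2-normalized CLIP embeddings of shape (batch, dim).
import torch.nn.functional as F

# Directional similarity: does the image change track the caption change?
sim_direction = F.cosine_similarity(img_feat_two - img_feat_one, text_feat_two - text_feat_one)

# Content preservation: how close is the edited image to the original?
sim_image = F.cosine_similarity(img_feat_two, img_feat_one)
```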

We can use these metrics for similar pipelines such as the[`StableDiffusionPix2PixZeroPipeline`](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/pix2pix_zero#diffusers.StableDiffusionPix2PixZeroPipeline)`.
We can use these metrics for similar pipelines such as the [`StableDiffusionPix2PixZeroPipeline`](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/pix2pix_zero#diffusers.StableDiffusionPix2PixZeroPipeline).

<Tip>

@@ -550,7 +550,7 @@ FID results tend to be fragile as they depend on a lot of factors:
* The image format (not the same if we start from PNGs vs JPGs).

Keeping that in mind, FID is often most useful when comparing similar runs, but it is
hard to to reproduce paper results unless the authors carefully disclose the FID
hard to reproduce paper results unless the authors carefully disclose the FID
measurement code.

These points apply to other related metrics too, such as KID and IS.
28 changes: 9 additions & 19 deletions docs/source/en/optimization/fp16.mdx
@@ -19,7 +19,6 @@ We'll discuss how the following settings impact performance and memory.
| | Latency | Speedup |
| ---------------- | ------- | ------- |
| original | 9.50s | x1 |
| cuDNN auto-tuner | 9.37s | x1.01 |
| fp16 | 3.61s | x2.63 |
| channels last | 3.30s | x2.88 |
| traced UNet | 3.21s | x2.96 |
@@ -31,18 +30,6 @@
steps.
</em>

## Enable cuDNN auto-tuner

[NVIDIA cuDNN](https://developer.nvidia.com/cudnn) supports many algorithms to compute a convolution. Autotuner runs a short benchmark and selects the kernel with the best performance on a given hardware for a given input size.

Since we’re using **convolutional networks** (other types currently not supported), we can enable cuDNN autotuner before launching the inference by setting:

```python
import torch

torch.backends.cudnn.benchmark = True
```

### Use tf32 instead of fp32 (on Ampere and later CUDA devices)

On Ampere and later CUDA devices matrix multiplications and convolutions can use the TensorFloat32 (TF32) mode for faster but slightly less accurate computations. By default PyTorch enables TF32 mode for convolutions but not matrix multiplications, and unless a network requires full float32 precision we recommend enabling this setting for matrix multiplications, too. It can significantly speed up computations with typically negligible loss of numerical accuracy. You can read more about it [here](https://huggingface.co/docs/transformers/v4.18.0/en/performance#tf32). All you need to do is to add this before your inference:
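
A minimal sketch of that setting:

```python
# Sketch: allow TF32 for CUDA matrix multiplications before running inference.
import torch

torch.backends.cuda.matmul.allow_tf32 = True
```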
@@ -58,7 +45,10 @@ torch.backends.cuda.matmul.allow_tf32 = True
To save more GPU memory and get more speed, you can load and run the model weights directly in half precision. This involves loading the float16 version of the weights, which was saved to a branch named `fp16`, and telling PyTorch to use the `float16` type when loading them:

```Python
pipe = StableDiffusionPipeline.from_pretrained(
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
"runwayml/stable-diffusion-v1-5",

torch_dtype=torch.float16,
@@ -85,13 +75,13 @@ For even additional memory savings, you can use a sliced version of attention th
each head which can save a significant amount of memory.
</Tip>

To perform the attention computation sequentially over each head, you only need to invoke [`~StableDiffusionPipeline.enable_attention_slicing`] in your pipeline before inference, like here:
To perform the attention computation sequentially over each head, you only need to invoke [`~DiffusionPipeline.enable_attention_slicing`] in your pipeline before inference, like here:

```Python
import torch
from diffusers import StableDiffusionPipeline
from diffusers import DiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
pipe = DiffusionPipeline.from_pretrained(
"runwayml/stable-diffusion-v1-5",

torch_dtype=torch.float16,
@@ -415,10 +405,10 @@ To leverage it just make sure you have:
- Cuda available
- [Installed the xformers library](xformers).
```python
from diffusers import StableDiffusionPipeline
from diffusers import DiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
pipe = DiffusionPipeline.from_pretrained(
"runwayml/stable-diffusion-v1-5",
torch_dtype=torch.float16,
).to("cuda")