Description
attention.py currently contains two concurrent attention implementations that do essentially the exact same thing: `class AttentionBlock(nn.Module):` (diffusers/src/diffusers/models/attention.py, line 256 in 62608a9) and `class CrossAttention(nn.Module):`.
We should start deprecating `AttentionBlock` (diffusers/src/diffusers/models/attention.py, line 256 in 62608a9). Deprecating this class won't be easy, as it essentially means we have to force people to re-upload their weights: essentially every model checkpoint that made use of `AttentionBlock` is affected.
I would propose to do this in the following way:
- To begin with, when `AttentionBlock` is called, the code will convert the weights on the fly to the `CrossAttention` format, with a very clear deprecation message that explains in detail how one can save & re-upload the weights to remove the deprecation message.
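The on-the-fly conversion could look roughly like the sketch below. The parameter-name mapping (`query`/`key`/`value`/`proj_attn` on the old block vs. `to_q`/`to_k`/`to_v`/`to_out.0` on `CrossAttention`) is an assumption about the two modules' layouts made for illustration, not a confirmed API:

```python
import warnings


def convert_attention_block_state_dict(old_state_dict):
    """Remap AttentionBlock parameter names to the assumed CrossAttention layout.

    The name mapping below (query -> to_q, etc.) is an illustrative
    assumption; the real modules may use different parameter names.
    """
    name_map = {
        "query": "to_q",
        "key": "to_k",
        "value": "to_v",
        "proj_attn": "to_out.0",
    }
    new_state_dict = {}
    for key, tensor in old_state_dict.items():
        prefix, _, rest = key.partition(".")
        if rest:
            new_state_dict[f"{name_map.get(prefix, prefix)}.{rest}"] = tensor
        else:
            new_state_dict[key] = tensor
    return new_state_dict


def load_with_deprecation(cross_attention_module, old_state_dict):
    """Load old AttentionBlock weights into a CrossAttention module,
    emitting the deprecation message described above."""
    warnings.warn(
        "AttentionBlock is deprecated in favor of CrossAttention. Your weights "
        "were converted on the fly; please save and re-upload the converted "
        "checkpoint to remove this warning.",
        FutureWarning,
    )
    cross_attention_module.load_state_dict(
        convert_attention_block_state_dict(old_state_dict)
    )
```

Saving the model after such a load would then persist the weights under the new names, so the warning disappears on subsequent loads.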
- Open a mass PR on all checkpoints that make use of `AttentionBlock` (these can be retrieved via the config) to convert the weights to the new format.
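Detecting affected checkpoints from their configs could be sketched as below. The assumption that affected configs mention the class name somewhere in their block-type lists is illustrative; the real detection criterion would come from the actual config schema:

```python
def uses_attention_block(config: dict) -> bool:
    """Return True if a model config appears to reference the deprecated
    AttentionBlock.

    Illustrative assumption: affected configs contain the string
    "AttentionBlock" somewhere in their values (e.g. in block-type lists).
    """
    def scan(value) -> bool:
        if isinstance(value, str):
            return "AttentionBlock" in value
        if isinstance(value, dict):
            return any(scan(v) for v in value.values())
        if isinstance(value, (list, tuple)):
            return any(scan(v) for v in value)
        return False

    return scan(config)


# A mass-PR script would then fetch each candidate config (e.g. with
# huggingface_hub.hf_hub_download(repo_id, "config.json")), run this check,
# convert the weights, and open a PR against each affected repository.
```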
- Update https://github.com/apple/ml-stable-diffusion/ to support the new weight format (cc @pcuenca)