Description
attention.py currently contains two concurrent attention implementations that do essentially the exact same thing: `class AttentionBlock(nn.Module):` (diffusers/src/diffusers/models/attention.py, line 256 in 62608a9) and `class CrossAttention(nn.Module):`.
We should start deprecating `AttentionBlock` (diffusers/src/diffusers/models/attention.py, line 256 in 62608a9). Deprecating this class won't be easy, as it essentially means we have to force people to re-upload their weights: essentially every model checkpoint that made use of `AttentionBlock` is affected.
I would propose to do this in the following way:
- To begin with, when `AttentionBlock` is called, the code will convert the weights on the fly to the `CrossAttention` format, with a very clear deprecation message that explains in detail how one can save & re-upload the weights to remove the deprecation message.
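The on-the-fly conversion could look roughly like the sketch below. The parameter-name mapping (`query`/`key`/`value`/`proj_attn` on the old block vs. `to_q`/`to_k`/`to_v`/`to_out.0` on `CrossAttention`) is an assumption about the two modules' layouts made for illustration, not a confirmed API:

```python
import warnings


def convert_attention_block_state_dict(old_state_dict):
    """Remap AttentionBlock parameter names to the assumed CrossAttention layout.

    The name mapping below (query -> to_q, etc.) is an illustrative
    assumption; the real modules may use different parameter names.
    """
    name_map = {
        "query": "to_q",
        "key": "to_k",
        "value": "to_v",
        "proj_attn": "to_out.0",
    }
    new_state_dict = {}
    for key, tensor in old_state_dict.items():
        prefix, _, rest = key.partition(".")
        if rest:
            new_state_dict[f"{name_map.get(prefix, prefix)}.{rest}"] = tensor
        else:
            new_state_dict[key] = tensor
    return new_state_dict


def load_with_deprecation(cross_attention_module, old_state_dict):
    """Load old AttentionBlock weights into a CrossAttention module,
    emitting the deprecation message described above."""
    warnings.warn(
        "AttentionBlock is deprecated in favor of CrossAttention. Your weights "
        "were converted on the fly; please save and re-upload the converted "
        "checkpoint to remove this warning.",
        FutureWarning,
    )
    cross_attention_module.load_state_dict(
        convert_attention_block_state_dict(old_state_dict)
    )
```

Saving the model after such a load would then persist the weights under the new names, so the warning disappears on subsequent loads.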
- Open a mass PR on all checkpoints that make use of `AttentionBlock` (these can be retrieved via the config) to convert the weights to the new format.
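Detecting affected checkpoints from their configs could be sketched as below. The assumption that affected configs mention the class name somewhere in their block-type lists is illustrative; the real detection criterion would come from the actual config schema:

```python
def uses_attention_block(config: dict) -> bool:
    """Return True if a model config appears to reference the deprecated
    AttentionBlock.

    Illustrative assumption: affected configs contain the string
    "AttentionBlock" somewhere in their values (e.g. in block-type lists).
    """
    def scan(value) -> bool:
        if isinstance(value, str):
            return "AttentionBlock" in value
        if isinstance(value, dict):
            return any(scan(v) for v in value.values())
        if isinstance(value, (list, tuple)):
            return any(scan(v) for v in value)
        return False

    return scan(config)


# A mass-PR script would then fetch each candidate config (e.g. with
# huggingface_hub.hf_hub_download(repo_id, "config.json")), run this check,
# convert the weights, and open a PR against each affected repository.
```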
- Update https://github.com/apple/ml-stable-diffusion/ to support the new weight format (cc @pcuenca)