ulysses enabling in native attention path #12563
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Also worked for me, thanks.
enable_gqa=enable_gqa,
)
out = out.permute(0, 2, 1, 3)
if _parallel_config is None:
Here, `supports_context_parallel=True` should also be added to the register call. @sywangyi @sayakpaul
@DefTruth that should also fix #12446 (comment) right? Could you give this a check?
Confirmed, it is fixed.
@sywangyi can we enable it for Ring x Native, too? I don't see ring's reliance on lse, either.
No, ring needs the lse to guarantee precision; see https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_dispatch.py#L1062-L1065
Yeah but it's always
No, see diffusers/src/diffusers/models/attention_dispatch.py, lines 1066 to 1067 in 325a950.
Ah, sorry for the oversight. Thanks for clarifying.
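
For readers following the ring discussion above, here is a minimal single-process sketch of why a ring-style scheme needs the lse. It is not the diffusers implementation: `attention_with_lse` and `merge_partials` are hypothetical helper names, and the toy vectors are made up for illustration. Each key/value block produces a partial output normalized only over its own keys, so the partials must be reweighted by their log-sum-exp values to recover the exact full-attention result; a plain average does not.

```python
import math

def attention_with_lse(q, keys, values):
    """Softmax attention for one query vector; returns (output, log-sum-exp of scores)."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]
    return out, m + math.log(total)

def merge_partials(partials):
    """Combine per-block (output, lse) pairs into the exact full-attention output."""
    lses = [lse for _, lse in partials]
    m = max(lses)
    global_lse = m + math.log(sum(math.exp(l - m) for l in lses))
    dim = len(partials[0][0])
    # Each block's output is weighted by its share of the global softmax mass.
    return [sum(math.exp(lse - global_lse) * out[i] for out, lse in partials)
            for i in range(dim)]

# Toy data (assumed values, for illustration only).
q = [0.5, -1.0]
keys = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [-1.0, 0.5]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

full, _ = attention_with_lse(q, keys, values)

# Two "ranks", each holding half of the keys and values.
blocks = [(keys[:2], values[:2]), (keys[2:], values[2:])]
partials = [attention_with_lse(q, k, v) for k, v in blocks]

merged = merge_partials(partials)  # uses the lse of each block
naive = [(a + b) / 2 for a, b in zip(partials[0][0], partials[1][0])]  # ignores the lse

merged_err = max(abs(a - b) for a, b in zip(merged, full))
naive_err = max(abs(a - b) for a, b in zip(naive, full))
```

In this sketch `merged_err` stays at floating-point noise while `naive_err` is large, which is the precision issue the reply refers to: a backend that does not return the lse cannot do the exact merge.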


Fix the corrupted output when Ulysses is enabled with native attention. Native attention is widely used on non-CUDA platforms, and Ulysses does not rely on the lse, so Ulysses attention can still be used in the native attention path.
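
The claim that Ulysses does not rely on the lse can be illustrated with a small single-process simulation. This is a sketch under stated assumptions, not the code in this PR: the all-to-all exchange is simulated with list reshuffling, and every name here (`seq_to_head_shards`, `head_to_seq_shards`, `ulysses_attention`, the toy sizes) is hypothetical. After the exchange, each rank holds the full sequence for its own subset of heads, so it runs ordinary softmax attention and no cross-rank merge of partial softmax results is needed.

```python
import math

WORLD_SIZE, SEQ_LEN, NUM_HEADS, HEAD_DIM = 2, 4, 2, 3  # toy sizes (assumed)

def make_tensor(seed):
    """Deterministic toy data laid out as [token][head][channel]."""
    return [[[math.sin(seed + 3 * s + 5 * h + 7 * c) for c in range(HEAD_DIM)]
             for h in range(NUM_HEADS)] for s in range(SEQ_LEN)]

def attention(q, k, v):
    """Plain softmax attention for one head; q, k, v are lists of vectors. No lse returned."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]
        total = sum(w)
        out.append([sum(wj * vj[i] for wj, vj in zip(w, v)) / total
                    for i in range(len(v[0]))])
    return out

def shard_sequence(x):
    """Split [token][head][channel] into one contiguous token shard per rank."""
    per_rank = len(x) // WORLD_SIZE
    return [x[r * per_rank:(r + 1) * per_rank] for r in range(WORLD_SIZE)]

def seq_to_head_shards(shards):
    """Simulated all-to-all: [rank][local token][head] -> [rank][local head][full sequence]."""
    heads_per_rank = NUM_HEADS // WORLD_SIZE
    return [[[tok[h] for shard in shards for tok in shard]
             for h in range(r * heads_per_rank, (r + 1) * heads_per_rank)]
            for r in range(WORLD_SIZE)]

def head_to_seq_shards(head_shards):
    """Inverse all-to-all: [rank][local head][full sequence] -> [rank][local token][head]."""
    all_heads = [head for rank_heads in head_shards for head in rank_heads]
    per_rank = SEQ_LEN // WORLD_SIZE
    return [[[head[s] for head in all_heads]
             for s in range(r * per_rank, (r + 1) * per_rank)]
            for r in range(WORLD_SIZE)]

def ulysses_attention(q, k, v):
    """Sequence-sharded in, sequence-sharded out; full attention per head in between."""
    qh, kh, vh = (seq_to_head_shards(shard_sequence(t)) for t in (q, k, v))
    out_heads = [[attention(qh[r][h], kh[r][h], vh[r][h]) for h in range(len(qh[r]))]
                 for r in range(WORLD_SIZE)]
    return head_to_seq_shards(out_heads)

def reference_attention(q, k, v):
    """Unsharded attention over every head, returned as [token][head][channel]."""
    per_head = [attention([t[h] for t in q], [t[h] for t in k], [t[h] for t in v])
                for h in range(NUM_HEADS)]
    return [[per_head[h][s] for h in range(NUM_HEADS)] for s in range(SEQ_LEN)]

q, k, v = make_tensor(0.1), make_tensor(0.2), make_tensor(0.3)
gathered = [tok for shard in ulysses_attention(q, k, v) for tok in shard]
reference = reference_attention(q, k, v)
max_err = max(abs(a - b)
              for tok_a, tok_b in zip(gathered, reference)
              for head_a, head_b in zip(tok_a, tok_b)
              for a, b in zip(head_a, head_b))
```

Because each simulated rank sees every key for its heads, the sharded result matches the unsharded reference without any lse bookkeeping, which is why an attention kernel that returns only the output is sufficient for this scheme.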