    theta = theta * ntk_factor
    freqs = 1.0 / (theta ** (torch.arange(0, dim, 2, dtype=freqs_dtype)[: (dim // 2)] / dim)) / linear_factor  # [D/2]
    freqs = freqs.to(pos.device)
    freqs = (
@sayakpaul
this is the only code change from #9321
(nit): We could add a comment emphasizing the impact of creating torch.arange() on device so as to prevent syncs. This will be a helpful reference for us going forward. What do you think?
We could do something similar for the changes introduced in https://github.com/huggingface/diffusers/pull/9307/files as well.
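For reference, a minimal sketch of the two patterns being discussed. The function names and the default `theta=10000.0` are hypothetical, chosen only for illustration; the formula itself is copied from the quoted diff. The second variant passes `device=` to `torch.arange()` so the tensor is allocated on the target device and no host-to-device copy is needed afterwards. Whether that copy actually triggers a sync in a given pipeline is not something this sketch demonstrates.

```python
import torch


def rope_freqs_cpu_then_move(dim, device, theta=10000.0, linear_factor=1.0,
                             ntk_factor=1.0, freqs_dtype=torch.float32):
    # Pattern from the quoted diff: torch.arange() runs on the default
    # device (normally CPU) and the result is copied to `device` afterwards.
    theta = theta * ntk_factor
    exponents = torch.arange(0, dim, 2, dtype=freqs_dtype)[: (dim // 2)] / dim
    freqs = 1.0 / (theta ** exponents) / linear_factor  # [D/2]
    return freqs.to(device)


def rope_freqs_on_device(dim, device, theta=10000.0, linear_factor=1.0,
                         ntk_factor=1.0, freqs_dtype=torch.float32):
    # Hypothetical alternative: allocate the arange directly on `device`,
    # so every following op already runs there and no copy is required.
    theta = theta * ntk_factor
    exponents = torch.arange(0, dim, 2, dtype=freqs_dtype, device=device)[: (dim // 2)] / dim
    return 1.0 / (theta ** exponents) / linear_factor  # [D/2]


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    a = rope_freqs_cpu_then_move(64, device)
    b = rope_freqs_on_device(64, device)
    # Both variants should agree numerically; only where the work happens differs.
    print(a.device, b.device, torch.allclose(a, b))
```

A short code comment next to the `device=` argument in the real function would serve the same purpose as this sketch.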
this did not cause the sync, though
sayakpaul left a comment
Thanks. Just a single minor comment.
* update
* fix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
create the freqs tensor on device to avoid potential sync