[T5] fix fp16 loading issue #20878
Merged
younesbelkada merged 4 commits into huggingface:main on Dec 26, 2022
Conversation
younesbelkada commented on Dec 22, 2022
Force-pushed from 8bf5c89 to 43006f0
sgugger approved these changes on Dec 23, 2022
sgugger (Collaborator) left a comment:
Thanks for fixing! LGTM with just one nit.
```python
force_upcast_dtype = torch.float32

# For backward compatibility with older versions of `accelerate`
if set_module_tensor_to_device.__code__.co_argcount == 5:
```
sgugger (Collaborator):
Slight nit: can we use the signature and parameter names using inspect? It would be clearer to read. Also add a TODO that this should become a version check at the next version of Accelerate (I will take care of it after next release).
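The reviewer's suggestion could look roughly like this. This is a sketch, not the merged code; the stub function below only mirrors the parameter names of `accelerate`'s helper so the check can be demonstrated:

```python
import inspect

# Illustrative stub mirroring the newer `set_module_tensor_to_device` signature;
# only the parameter names matter for this check.
def set_module_tensor_to_device(module, tensor_name, device, value=None, dtype=None):
    pass

# Clearer than counting `__code__.co_argcount`: ask for the parameter by name.
accepts_dtype = "dtype" in inspect.signature(set_module_tensor_to_device).parameters
# TODO: replace with a version check once the next `accelerate` release is out.
```

Checking `inspect.signature(...).parameters` is robust to keyword-only arguments and reads as intent, whereas `co_argcount` silently breaks if an unrelated argument is added.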
- remove `force_upcast_dtype` as it is used once
- use `inspect`
- add `TODO`
MKhalusova pushed a commit to MKhalusova/transformers that referenced this pull request on Dec 28, 2022
* fix fp16 loading issue
* add backward compatibility
* better refactor
* better readability
- remove `force_upcast_dtype` as it is used once
- use `inspect`
- add `TODO`
silverriver pushed a commit to silverriver/transformers that referenced this pull request on Jan 6, 2023
* fix fp16 loading issue
* add backward compatibility
* better refactor
* better readability
- remove `force_upcast_dtype` as it is used once
- use `inspect`
- add `TODO`
What does this PR do?

This PR mainly fixes https://github.com/huggingface/transformers/actions/runs/3754402958/jobs/6378652143

Since the PR huggingface/accelerate#920 was merged, the fix proposed in #20760 no longer works with the main branch of `accelerate` in some specific cases. To reproduce, use the main branch of `accelerate`.

Why?

I believe this is because the aforementioned PR introduced a new argument, `dtype`, on the function `set_module_tensor_to_device`. If this argument is left at its default of `None`, the target value is automatically cast to the `dtype` of the old tensor, which slightly breaks some assumptions made in #20760.

I believe upstreaming this change in `modeling_utils` by adding support for this new argument is the right fix. As some users might not use the latest version of `accelerate`, I added a small hack to make this change backward compatible, but I am not sure it is the best solution.

Tested this fix on the main branch of `accelerate` and on `accelerate==0.15.0`; all relevant tests pass.

cc @sgugger
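The backward-compatibility hack described above can be sketched roughly as follows. The wrapper name `set_tensor_compat` is hypothetical, and the helper is passed in explicitly so the sketch stays self-contained; the actual PR calls `accelerate`'s `set_module_tensor_to_device` directly inside `modeling_utils`:

```python
import inspect

import torch

def set_tensor_compat(set_module_tensor_to_device, module, name, device, value):
    """Hypothetical wrapper: pass `dtype` only when the installed
    `accelerate` helper accepts it, otherwise upcast the value manually."""
    if "dtype" in inspect.signature(set_module_tensor_to_device).parameters:
        # Newer accelerate: let the helper cast, so fp16 checkpoints
        # are upcast to fp32 instead of inheriting the old tensor's dtype.
        set_module_tensor_to_device(
            module, name, device, value=value, dtype=torch.float32
        )
    else:
        # Older accelerate (e.g. 0.15.0): no `dtype` argument, so upcast
        # the value ourselves before handing it over.
        set_module_tensor_to_device(
            module, name, device, value=value.to(torch.float32)
        )
```

Either branch ends with an fp32 tensor on the module, which is the behavior the fix in #20760 relied on.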