Add variant to transformers #21332
Conversation
Commit: Bumps [onnx](https://github.com/onnx/onnx) from 1.11.0 to 1.13.0 ([release notes](https://github.com/onnx/onnx/releases), [changelog](https://github.com/onnx/onnx/blob/main/docs/Changelog.md), [commits](onnx/onnx@v1.11.0...v1.13.0)). Signed-off-by: dependabot[bot] <support@github.com>

Force-pushed from 62865ff to 818fa15.
The documentation is not available anymore as the PR was closed or merged.
pcuenca left a comment:
I did a quick first pass. I'm not sure about all the design implications for transformers, so I just pointed out small comments and suggestions. Will review and test in depth after design is frozen :)
src/transformers/modeling_utils.py (Outdated)

```python
tf_weights_1_path = get_local_path(TF_WEIGHTS_NAME + ".index")
tf2_weights_2_path = get_local_path(TF2_WEIGHTS_NAME)
flax_weights_path = get_local_path(FLAX_WEIGHTS_NAME)
```
We should do the same thing for Flax weights (add the suffix) in modeling_flax_utils, I suppose (at least in diffusers).
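Concretely, that would amount to something like the following (a sketch only; it assumes the same `variant_suffix` variable is in scope in `modeling_flax_utils`):

```python
# Hypothetical: apply the variant suffix to the Flax checkpoint name too,
# mirroring the PyTorch branch above.
flax_weights_path = get_local_path(FLAX_WEIGHTS_NAME + variant_suffix)
```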
src/transformers/modeling_utils.py (Outdated)

```diff
  filename = WEIGHTS_NAME + variant_suffix
  resolved_archive_file = cached_file(
-     pretrained_model_name_or_path, WEIGHTS_NAME, **cached_file_kwargs
+     pretrained_model_name_or_path, WEIGHTS_NAME + variant_suffix, **cached_file_kwargs
```
Why not just `filename` here?
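That is, the call could reuse the variable defined just above; a sketch of what the suggestion amounts to:

```python
# Reuse the already-computed filename instead of rebuilding the suffixed name.
filename = WEIGHTS_NAME + variant_suffix
resolved_archive_file = cached_file(pretrained_model_name_or_path, filename, **cached_file_kwargs)
```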
src/transformers/modeling_utils.py (Outdated)

```diff
  f" same name. Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a"
- f" directory containing a file named {WEIGHTS_NAME}, {TF2_WEIGHTS_NAME}, {TF_WEIGHTS_NAME} or"
- f" {FLAX_WEIGHTS_NAME}."
+ f" directory containing a file named {WEIGHTS_NAME + variant_suffix}, {TF2_WEIGHTS_NAME},"
```
Add `{WEIGHTS_NAME}` too if we accept the suggestion mentioned before.
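With that change, the error message could list both file names, e.g. (a sketch, not necessarily the merged wording):

```python
# List both the variant-suffixed name and the plain WEIGHTS_NAME fallback.
f" directory containing a file named {WEIGHTS_NAME + variant_suffix}, {WEIGHTS_NAME},"
f" {TF2_WEIGHTS_NAME}, {TF_WEIGHTS_NAME} or {FLAX_WEIGHTS_NAME}."
```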
julien-c left a comment:
`pytorch_model.{variant}.bin` sounds better to me, to keep the file extension (not so important for `.bin`, but more important for `.h5`, `.safetensors` or any other format).

Note that this is different from `pytorch_model.bin.index.json` (sharding scheme), as that file is only a slice (shard) of a bigger valid file, i.e. is not really valid by itself.

Even for …
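A hypothetical helper illustrating the naming scheme proposed here (the name `add_variant` and its exact behavior are assumptions, not necessarily what the PR merged):

```python
from typing import Optional


def add_variant(weights_name: str, variant: Optional[str] = None) -> str:
    """Insert a variant tag before the file extension so the extension is kept,
    e.g. "pytorch_model.bin" + "fp16" -> "pytorch_model.fp16.bin"."""
    if variant is None:
        return weights_name
    name, ext = weights_name.rsplit(".", 1)
    return f"{name}.{variant}.{ext}"


assert add_variant("pytorch_model.bin", "fp16") == "pytorch_model.fp16.bin"
assert add_variant("model.safetensors", "fp16") == "model.fp16.safetensors"
assert add_variant("tf_model.h5") == "tf_model.h5"  # no variant: name unchanged
```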
sgugger left a comment:
Thanks for the PoC! I left a couple of comments.
Force-pushed from 141cfa2 to d9b31ba.
examples/research_projects/decision_transformer/requirements.txt (Outdated, thread resolved)
Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Failing test is unrelated. Think this PR is good for merge. @Wauplin @julien-c good for you?

The resulting folder structure now looks as described in the PR statement: #21332 (comment)
julien-c left a comment:
haven't checked the code, but the file structure lgtm!
Wauplin left a comment:
Looks good to me as well! :)
LysandreJik left a comment:
LGTM, thanks for the PR @patrickvonplaten!
Thanks for the reviews! Merging.
cc @sgugger would it be possible to add this feature to …? I'd like to use it for BLIP-2. For the moment it seems the only way to do this is calling …
Happy to review a PR.
What does this PR do?
This PR adds a `variant` keyword argument to PyTorch's `from_pretrained` and `save_pretrained` so that multiple weight variants can be saved in the model repo. You can try it out by running:
The example loads from this repo: https://huggingface.co/huggingface/the-no-branch-repo/tree/main/text_encoder. The repo is a dummy stable diffusion model, and its folder structure looks as follows:
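An illustrative layout (only the `text_encoder` folder is confirmed by the link above; the exact file listing is an assumption based on the variant naming scheme discussed in this PR):

```
the-no-branch-repo/
└── text_encoder/
    ├── config.json
    ├── pytorch_model.bin       # default weights
    └── pytorch_model.fp16.bin  # loaded with variant="fp16"
```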
cc @pcuenca @patil-suraj @sgugger @LysandreJik @julien-c @osanseviero
[Update] This PR should be ready for merge