
Conversation

@patrickvonplaten
Contributor

@patrickvonplaten patrickvonplaten commented Jan 27, 2023

What does this PR do?

This PR adds a "variant" keyword argument to the PyTorch from_pretrained and save_pretrained methods so that multiple weight variants can be saved in the same model repo.

You can try it out by running:

from transformers import CLIPTextModel

path = "huggingface/the-no-branch-repo"  # or ./text_encoder if local

print("This should work!:")
model = CLIPTextModel.from_pretrained(path, subfolder="text_encoder", variant="no_ema")
print("This should work!:")
model = CLIPTextModel.from_pretrained(path, subfolder="text_encoder", variant="fp16")
print("This should work!:")
model = CLIPTextModel.from_pretrained(path, subfolder="text_encoder")
print("This should NOT work!:")
model = CLIPTextModel.from_pretrained(path, subfolder="text_encoder", variant="other")

The example loads from this repo: https://huggingface.co/huggingface/the-no-branch-repo/tree/main/text_encoder . The repo is a dummy Stable Diffusion model, and its folder structure looks as follows:

├── feature_extractor
│   └── preprocessor_config.json
├── load.py
├── model_index.json
├── safety_checker
│   ├── config.json
│   └── pytorch_model.bin
├── save.py
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   ├── pytorch_model.bin
│   ├── pytorch_model.fp16.bin
│   └── pytorch_model.no_ema.bin
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   └── diffusion_pytorch_model.bin
└── vae
    ├── config.json
    └── diffusion_pytorch_model.bin
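For context, a minimal sketch of how these variant files can be produced and loaded with this PR (the local paths and the base checkpoint are illustrative, not part of the PR):

from transformers import CLIPTextModel

model = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

# Saving with a variant adds the variant to the weight file name
model.save_pretrained("./text_encoder")                    # -> text_encoder/pytorch_model.bin
model.save_pretrained("./text_encoder", variant="fp16")    # -> text_encoder/pytorch_model.fp16.bin
model.save_pretrained("./text_encoder", variant="no_ema")  # -> text_encoder/pytorch_model.no_ema.bin

# Loading picks the file that matches the requested variant
model = CLIPTextModel.from_pretrained("./text_encoder", variant="fp16")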

cc @pcuenca @patil-suraj @sgugger @LysandreJik @julien-c @osanseviero

[Update] This PR should be ready for merge

@patrickvonplaten patrickvonplaten changed the title from Add variant to transformers to [RFC] Add variant to transformers on Jan 27, 2023
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jan 27, 2023

The documentation is not available anymore as the PR was closed or merged.

Member

@pcuenca pcuenca left a comment


I did a quick first pass. I'm not sure about all the design implications for transformers, so I just left small comments and suggestions. I'll review and test in depth after the design is frozen :)


tf_weights_1_path = get_local_path(TF_WEIGHTS_NAME + ".index")
tf2_weights_2_path = get_local_path(TF2_WEIGHTS_NAME)
flax_weights_path = get_local_path(FLAX_WEIGHTS_NAME)
Member


We should do the same thing for Flax weights (add the suffix) in modeling_flax_utils, I suppose (at least in diffusers).

  filename = WEIGHTS_NAME + variant_suffix
  resolved_archive_file = cached_file(
-     pretrained_model_name_or_path, WEIGHTS_NAME, **cached_file_kwargs
+     pretrained_model_name_or_path, WEIGHTS_NAME + variant_suffix, **cached_file_kwargs
Member


Why not just filename here?
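A sketch of what the reviewer is suggesting, i.e. reusing the filename variable defined a few lines above instead of rebuilding the string inline:

filename = WEIGHTS_NAME + variant_suffix
resolved_archive_file = cached_file(
    pretrained_model_name_or_path, filename, **cached_file_kwargs
)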

f" same name. Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a"
f" directory containing a file named {WEIGHTS_NAME}, {TF2_WEIGHTS_NAME}, {TF_WEIGHTS_NAME} or"
f" {FLAX_WEIGHTS_NAME}."
f" directory containing a file named {WEIGHTS_NAME + variant_suffix}, {TF2_WEIGHTS_NAME},"
Member


Add {WEIGHTS_NAME} too if we accept the suggestion mentioned before.
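In other words, a sketch of the error message the reviewer has in mind (illustrative, not the exact wording that was merged):

f" directory containing a file named {WEIGHTS_NAME}, {WEIGHTS_NAME + variant_suffix}, {TF2_WEIGHTS_NAME},"
f" {TF_WEIGHTS_NAME} or {FLAX_WEIGHTS_NAME}."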

Member

@julien-c julien-c left a comment


pytorch_model.{variant}.bin sounds better to me, to keep the file-extension (not so important for .bin, but more important for .h5, .safetensors or any other format)

Note that this is different from pytorch_model.bin.index.json (the sharding scheme), as that file is only a slice (shard) of a bigger valid file, i.e. it is not really valid by itself.
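To illustrate the proposed scheme, a hypothetical helper (the name and signature are not from the PR) that inserts the variant right before the file extension:

import os

def add_variant(weights_name, variant=None):
    # pytorch_model.bin + "fp16" -> pytorch_model.fp16.bin (the extension stays last)
    if variant is None:
        return weights_name
    root, ext = os.path.splitext(weights_name)
    return f"{root}.{variant}{ext}"

print(add_variant("pytorch_model.bin", "fp16"))    # pytorch_model.fp16.bin
print(add_variant("model.safetensors", "no_ema"))  # model.no_ema.safetensors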

@Wauplin
Contributor

Wauplin commented Jan 27, 2023

> pytorch_model.{variant}.bin sounds better to me, to keep the file-extension (not so important for .bin, but more important for .h5, .safetensors or any other format)

Even for .bin files, I'd say it's good to keep the file extension, as it does not break the LFS rules in existing .gitattributes files (see huggingface/the-no-branch-repo, where bin files are uploaded as regular files).
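A quick illustration of the point, assuming the repo's .gitattributes tracks *.bin via LFS as Hub repos typically do:

import fnmatch

# the usual "*.bin" LFS rule still matches when the extension stays last ...
print(fnmatch.fnmatch("pytorch_model.fp16.bin", "*.bin"))  # True
# ... but would miss a name where the variant comes after the extension
print(fnmatch.fnmatch("pytorch_model.bin.fp16", "*.bin"))  # False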

Collaborator

@sgugger sgugger left a comment


Thanks for the PoC! I left a couple of comments.

patrickvonplaten and others added 2 commits January 30, 2023 22:00
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
@patrickvonplaten patrickvonplaten changed the title from [RFC] Add variant to transformers to Add variant to transformers on Jan 30, 2023
@patrickvonplaten
Contributor Author

patrickvonplaten commented Jan 31, 2023

The failing test is unrelated. I think this PR is good to merge.

@Wauplin @julien-c good for you?

The resulting folder structure now looks as described in the PR statement: #21332 (comment)

Member

@julien-c julien-c left a comment


haven't checked the code, but the file structure lgtm!

Contributor

@Wauplin Wauplin left a comment


Looks good to me as well! :)

Member

@LysandreJik LysandreJik left a comment


LGTM, thanks for the PR @patrickvonplaten!

@patrickvonplaten
Contributor Author

Thanks for the reviews! Merging

@patrickvonplaten patrickvonplaten merged commit 90cddfa into main Feb 1, 2023
@patrickvonplaten patrickvonplaten deleted the variant_saving_format branch February 1, 2023 08:21
@NielsRogge
Contributor

cc @sgugger would it be possible to add this feature to push_to_hub as well?

I'd like to use it for BLIP-2. For the moment, it seems the only way to do this is to call save_pretrained("...", variant="fp16") and then manually upload the PyTorch checkpoint to the model repo.
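For reference, a rough sketch of that workaround (the target repo id and local path are placeholders, and the target repo is assumed to already exist on the Hub):

import torch
from huggingface_hub import HfApi
from transformers import Blip2ForConditionalGeneration

model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b", torch_dtype=torch.float16)
model.save_pretrained("./blip2-fp16", variant="fp16")  # weight files get the fp16 variant in their name

# push_to_hub has no variant argument yet, so the saved files are uploaded by hand
HfApi().upload_folder(repo_id="your-username/blip2-opt-2.7b-fp16", folder_path="./blip2-fp16")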

@sgugger
Collaborator

sgugger commented Feb 7, 2023

Happy to review a PR.
