[resize_embedding] Introduce pad_to_multiple_of and guidance #25088
Merged
ArthurZucker merged 16 commits into huggingface:main from pad-tok-negativ on Aug 17, 2023
Conversation
The documentation is not available anymore as the PR was closed or merged.
Force-pushed from 15ff141 to 3cb0d07.
Changed title from "[Tokenization] Introduce support for negative index padding when people forget to have a padding token" to "[resize_embedding] Introduce pad_to_multiple_of and guidance".
ArthurZucker (Collaborator, Author) commented on Aug 1, 2023:

All breaking changes happen in the internal, non-exposed parts of the functions.
sgugger (Collaborator) approved these changes on Aug 17, 2023:

Thanks for cleaning this up!
Millu added a commit to invoke-ai/InvokeAI that referenced this pull request on Nov 10, 2023:
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [x] Bug Fix
- [x] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission

## Have you discussed this change with the InvokeAI team?

- [x] Yes, with @blessedcoolant
- [ ] No, because:

## Have you updated all relevant documentation?

- [ ] Yes
- [ ] No

## Description

This PR updates Transformers to the most recent version and sets the value of `pad_to_multiple_of` for `text_encoder.resize_token_embeddings`, which was introduced with huggingface/transformers#25088 in Transformers 4.32.0.

According to the [Nvidia documentation](https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc), "Performance is better when equivalent matrix dimensions M, N, and K are aligned to multiples of 8 bytes (or 64 bytes on A100) for FP16."

This fixes the following warning, which was popping up before every invocation starting with Transformers 4.32.0:

> You are resizing the embedding layer without providing a pad_to_multiple_of parameter. This means that the new embedding dimension will be None. This might induce some performance reduction as Tensor Cores will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc

This is my first "real" fix PR, so I hope this is fine. Please inform me if there is anything wrong with it. I am glad to help. Have a nice day and thank you!

## Related Tickets & Documents

- Related Issue: huggingface/transformers#26303
- Related Discord discussion: https://discord.com/channels/1020123559063990373/1154152783579197571
- Closes #

## QA Instructions, Screenshots, Recordings

## Added/updated tests?

- [ ] Yes
- [ ] No : _please replace this line with details on why tests have not been included_

## [optional] Are there any post deployment tasks we need to perform?
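A minimal sketch of the change described above, assuming a CLIP text encoder and tokenizer of the kind used in Stable-Diffusion-style pipelines; the checkpoint name and the added token are illustrative, not taken from the InvokeAI code:

```python
# Sketch: resize the text encoder's embeddings after adding tokens,
# passing pad_to_multiple_of so FP16 matmuls can use Tensor Cores.
from transformers import CLIPTextModel, CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

# e.g. a textual-inversion concept token (hypothetical)
tokenizer.add_tokens(["<my-concept>"])

# Align the new vocabulary size to a multiple of 8 (Nvidia's FP16
# recommendation above); omitting pad_to_multiple_of triggers the
# warning quoted in the description under Transformers >= 4.32.0.
text_encoder.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)
```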
What does this PR do?
Fixes #22312.
After internal discussions, it appears that adding the possibility to pad with -1 to tokenizers is not really feasible (nor is it desirable). However, what we can do is, by default, resize the embedding layer to the nearest size that is optimal for the dtype of the model, following this.

Motivations:

- `_get_resized_embeddings` is not exposed, and thus making this automatic can be a big silent win.

Cons:

- `config.optimise_resize` might be needed?
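A minimal sketch of the API this PR introduces, as described above; the checkpoint is chosen only for illustration, and the `pad_to_multiple_of` value follows the Nvidia guidance quoted earlier:

```python
# Sketch: resizing token embeddings with the new pad_to_multiple_of argument.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# GPT-2 ships without a pad token; adding one grows the vocab 50257 -> 50258.
tokenizer.add_special_tokens({"pad_token": "[PAD]"})

# Pad the embedding matrix up to the next multiple of 64: 50258 -> 50304.
# Calling this without pad_to_multiple_of emits the guidance warning instead.
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=64)
print(model.get_input_embeddings().weight.shape[0])  # 50304
```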