
[resize_embedding] Introduce pad_to_multiple_of and guidance #25088

Merged

ArthurZucker merged 16 commits into huggingface:main from ArthurZucker:pad-tok-negativ on Aug 17, 2023

Conversation

ArthurZucker (Collaborator) commented on Jul 25, 2023

What does this PR do?

Fixes #22312.
After internal discussions, it appears that adding the ability to pad with -1 in tokenizers is not really feasible (nor is it desirable).
However, what we can do is, by default, resize the embedding layer to the nearest size that is optimal for the model's dtype, following the NVIDIA Tensor Core guidelines (see the sketch below).

Motivations:

  • `_get_resized_embeddings` is not exposed, so making this automatic can be a big silent win.
  • if properly documented, this should not really cause issues.

Cons:

  • it is not backward compatible, so some kind of `config.optimise_resize` might be needed?
  • it is hidden, and people might not really get why `tokenizer.vocab_size` will be different from the model's embedding dimension.
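
For illustration, a minimal sketch of the usage this PR enables (the model, pad token, and choice of 8 below are arbitrary examples, not part of the PR itself):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Add a pad token that the pretrained vocabulary lacks.
tokenizer.add_special_tokens({"pad_token": "[PAD]"})

# Resize the embedding matrix, rounding the new vocabulary size up to a
# multiple of 8 so FP16 matmuls can use Tensor Cores.
model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)

# The embedding size can now exceed len(tokenizer): 50258 rounds up to 50264.
print(model.get_input_embeddings().weight.shape[0])  # 50264
```

This is exactly the case where `tokenizer.vocab_size` no longer matches the model's embedding dimension, which is the con flagged above.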

HuggingFaceDocBuilderDev commented on Jul 25, 2023

The documentation is not available anymore as the PR was closed or merged.

ArthurZucker changed the title from "[Tokenization] Introduce support for negative index padding when people forget to have a padding token" to "[resize_embedding] Introduce pad_to_multiple_of and guidance" on Aug 1, 2023
ArthurZucker (Collaborator, Author) left a comment:

All breaking changes happen in the internal, non-exposed parts of the functions.

@ArthurZucker ArthurZucker marked this pull request as ready for review August 1, 2023 08:42
@ArthurZucker ArthurZucker requested a review from sgugger August 17, 2023 14:20
sgugger (Collaborator) left a comment:

Thanks for cleaning this up!

Millu added a commit to invoke-ai/InvokeAI that referenced this pull request Nov 10, 2023
## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [X] Bug Fix
- [X] Optimization
- [ ] Documentation Update
- [ ] Community Node Submission


## Have you discussed this change with the InvokeAI team?
- [X] Yes, with @blessedcoolant 
- [ ] No, because:

      
## Have you updated all relevant documentation?
- [ ] Yes
- [ ] No


## Description

This PR updates Transformers to the most recent version and sets the
`pad_to_multiple_of` value for `text_encoder.resize_token_embeddings`,
which was introduced with
huggingface/transformers#25088 in Transformers
4.32.0.

According to the [Nvidia
Documentation](https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc):

> Performance is better when equivalent matrix dimensions M, N, and K are aligned to multiples of 8 bytes (or 64 bytes on A100) for FP16.
This fixes the following warning, which popped up before every
invocation starting with Transformers 4.32.0:

> You are resizing the embedding layer without providing a
> `pad_to_multiple_of` parameter. This means that the new embedding
> dimension will be None. This might induce some performance reduction as
> Tensor Cores will not be available. For more details about this, or help
> on choosing the correct value for resizing, refer to this guide:
> https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
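
A hedged sketch of what the change amounts to (the checkpoint and added token below are illustrative stand-ins, not InvokeAI's actual code):

```python
from transformers import CLIPTextModel, CLIPTokenizer

# Load a Stable Diffusion 1.x style text encoder (illustrative checkpoint).
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

# A textual-inversion style token grows the vocabulary to 49409 entries.
tokenizer.add_tokens("<my-concept>")

# Passing pad_to_multiple_of silences the warning and rounds the embedding
# matrix up to the next multiple of 8 (49409 -> 49416), which keeps FP16
# matmuls Tensor Core friendly.
text_encoder.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)
```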

This is my first "real" fix PR, so I hope this is fine. Please inform me
if there is anything wrong with this. I am glad to help.

Have a nice day and thank you!


## Related Tickets & Documents


- Related Issue:
huggingface/transformers#26303
- Related Discord discussion:
https://discord.com/channels/1020123559063990373/1154152783579197571
- Closes #

## QA Instructions, Screenshots, Recordings


## Added/updated tests?

- [ ] Yes
- [ ] No: _please replace this line with details on why tests have not been included_

## [optional] Are there any post deployment tasks we need to perform?

Successfully merging this pull request may close these issues.

LlamaTokenizer has no pad token, leading to failure during batch-tokenization
