[generate] beam search -- fix output cropping #37080
Merged
ArthurZucker merged 6 commits into huggingface:main, Mar 28, 2025
Conversation
gante commented Mar 28, 2025
```diff
  dct = tok(ARTICLE, return_tensors="pt")
  generated_ids = hf.generate(**dct, num_beams=4)
- result = tok.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ result = tok.batch_decode(generated_ids)[0]
```
gante (Author)
Tests: update the beam search tests to also print special tokens. For example, this updated test fails on main because it returns extra pad tokens, caused by the incorrect crop.
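To see why the test change matters, here is a minimal sketch (with a toy vocabulary, not the real BART tokenizer): decoding with `skip_special_tokens=True` silently drops trailing pad tokens, so the old test could never observe the extra padding; decoding without it exposes the bug.

```python
# Toy decoder illustrating skip_special_tokens (illustrative only; the real
# tokenizer vocabulary and decode logic are more involved).
special = {0: "<pad>", 2: "</s>"}
vocab = {5: "a", 6: "b", 7: "c"}

def decode(ids, skip_special_tokens=False):
    tokens = []
    for i in ids:
        if i in special:
            if skip_special_tokens:
                continue  # special tokens are hidden from the output
            tokens.append(special[i])
        else:
            tokens.append(vocab[i])
    return " ".join(tokens)

ids = [5, 6, 2, 0, 0]  # a generation with extra right-side pad tokens

print(decode(ids, skip_special_tokens=True))  # "a b" -- padding invisible
print(decode(ids))                            # "a b </s> <pad> <pad>" -- bug visible
```

With `skip_special_tokens=True`, both the correct and the over-padded output decode to the same string; only the second call can distinguish them.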
ArthurZucker (Collaborator) approved these changes, Mar 28, 2025
LGTM, thanks for digging and fixing this quickly!
ArthurZucker pushed a commit that referenced this pull request on Mar 28, 2025:
* handle jagged beams
* better comment
* bart -- beam search tests print special tokens
* more bart test updates
* more tests!
* better comment
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request on May 14, 2025:
* handle jagged beams
* better comment
* bart -- beam search tests print special tokens
* more bart test updates
* more tests!
* better comment
soghomon-b pushed a commit to soghomon-b/transformers that referenced this pull request on Aug 24, 2025:
* handle jagged beams
* better comment
* bart -- beam search tests print special tokens
* more bart test updates
* more tests!
* better comment
What does this PR do?
vLLM is seeing some output differences in their CI when beam search is used. The difference can be traced to the beam search refactor (#35802).
Inspecting the outputs, we can see that there are a few additional pad tokens on the right. This is because the output was not being cropped correctly when the selected beam is shorter than the generation length (i.e. when the highest-scoring beam is NOT from the latest decoding iteration, but rather a previously completed beam).
After #35802: output length = input length + number of decoding iterations
Before #35802 and in this PR: output length = length of the longest selected beam
This PR also updates a few beam search tests to check their special tokens, which would have caught this bug.
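The length difference above can be sketched in a few lines. This is a hedged illustration with made-up token ids and helper names, not the PR's actual code: when selected beams are jagged (some finished early with EOS), padding the batch out to `input length + number of decoding iterations` leaves spurious pad tokens, while cropping to the longest selected beam does not.

```python
# Illustrative sketch of the cropping bug (names and values are assumptions).
pad_token_id = 0
beams = [
    [5, 6, 7, 8, 9],  # beam selected from the last decoding iteration
    [5, 6, 2],        # previously completed beam (finished early with EOS=2)
]

def pad_to(beams, length, pad):
    """Right-pad every beam to the given length."""
    return [b + [pad] * (length - len(b)) for b in beams]

# After #35802 (buggy): pad to input length + number of decoding iterations,
# so every row carries extra pad tokens on the right.
num_decoding_iterations = 6
buggy = pad_to(beams, num_decoding_iterations, pad_token_id)

# Before #35802 and in this PR (fixed): crop to the longest selected beam.
longest = max(len(b) for b in beams)
fixed = pad_to(beams, longest, pad_token_id)

print(buggy)  # [[5, 6, 7, 8, 9, 0], [5, 6, 2, 0, 0, 0]]
print(fixed)  # [[5, 6, 7, 8, 9], [5, 6, 2, 0, 0]]
```

Note the buggy output pads even the longest beam, which is exactly the extra-pad-token symptom vLLM observed in its CI.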