Remove redundant alignment of pad_token_id #5487

albertvillanova merged 1 commit into huggingface:main
Conversation
Cursor Bugbot has reviewed your changes and found 1 potential issue.
| "in the vocabulary before using it as a padding token." | ||
| ) | ||
| processing_class.pad_token = pad_token | ||
| model.config.pad_token_id = processing_class.pad_token_id |
Removal relies on feature absent in minimum transformers version

Medium Severity

The removed `model.config.pad_token_id = processing_class.pad_token_id` assignment relies on `Trainer._align_special_tokens()` to handle alignment automatically, but this mechanism only exists in transformers v5.5.0+ (main branch). The project's minimum supported version is `transformers>=4.56.2`, where no such alignment occurs. For models that ship with `config.pad_token_id = None` (e.g., Llama, Mistral), sequence-classification heads use `config.pad_token_id` during forward-pass pooling; without alignment, this can cause a `ValueError` for batch sizes > 1 or incorrect pooling on older supported versions.
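For context, a minimal sketch of the failure mode described above and the manual workaround this PR removes. The checkpoint name is illustrative, and the sketch assumes a transformers version without automatic special-token alignment:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Illustrative checkpoint; Llama-style configs ship with pad_token_id = None.
model_id = "meta-llama/Llama-3.2-1B"
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=1)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # tokenizer-side padding is now defined

# On versions without automatic alignment, model.config.pad_token_id is still
# None here, and a batched forward pass raises:
#   ValueError: Cannot handle batch sizes > 1 if no padding token is defined.
# The manual alignment this PR removes avoided that:
model.config.pad_token_id = tokenizer.pad_token_id
```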
I have checked the transformers code:

- `_align_special_tokens` was implemented in v4.56.0
- it was renamed to `align_special_tokens` in v5.2.0
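For anyone pinning an older transformers release, a hedged sketch of a version guard based on the findings above (the helper is illustrative, not part of this PR):

```python
from packaging import version
import transformers

def maybe_align_pad_token(model, tokenizer):
    """Manual fallback for transformers releases predating automatic
    special-token alignment (added as _align_special_tokens in v4.56.0,
    renamed align_special_tokens in v5.2.0)."""
    if version.parse(transformers.__version__) < version.parse("4.56.0"):
        model.config.pad_token_id = tokenizer.pad_token_id
```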
can you quickly check when `_align_special_tokens` was introduced?
Done @qgallouedec. See: #5487 (comment)
Remove redundant alignment of `pad_token_id`.

Note that this alignment is already done by `transformers.Trainer.train()` → `self.align_special_tokens()` / `self._align_special_tokens()`.

This PR removes the redundant alignment of `pad_token_id` between the tokenizer and the model config, relying instead on `transformers.Trainer.align_special_tokens()`, which is called during training.
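Roughly, the upstream alignment this PR now relies on boils down to something like the following sketch (illustrative only; the actual transformers implementation covers more special tokens and edge cases):

```python
def align_special_tokens_sketch(model, processing_class):
    # Copy the tokenizer's pad token id onto the model config if they differ,
    # so pooling and loss masking see a consistent padding id.
    pad_token_id = getattr(processing_class, "pad_token_id", None)
    if pad_token_id is not None and model.config.pad_token_id != pad_token_id:
        model.config.pad_token_id = pad_token_id
```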
Changes

Padding token management cleanup:

- Removed `model.config.pad_token_id = tokenizer.pad_token_id` in `examples/scripts/prm.py`.
- Removed `model.config.pad_token_id = processing_class.pad_token_id` in the `RewardTrainer` initialization.

Note
Low Risk

Low-risk cleanup that removes duplicate `pad_token_id` assignments and relies on Transformers' built-in special-token alignment during training; behavior could change only if code paths depend on the earlier manual override before `Trainer.train()` runs.
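As a hypothetical illustration of that caveat (the surrounding setup is assumed, not taken from this repo), code that reads the config between trainer construction and training would now observe the unaligned value:

```python
trainer = RewardTrainer(
    model=model,
    args=training_args,
    processing_class=tokenizer,
    train_dataset=dataset,
)

# Before this PR, pad_token_id was already copied onto the model config at
# this point. After this PR, a pre-training read may still see None:
print(model.config.pad_token_id)

trainer.train()  # alignment now happens inside train()
```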
Overview

Removes manual alignment of `model.config.pad_token_id` with the tokenizer/processing class in both `examples/scripts/prm.py` and `RewardTrainer`, relying on Transformers' `Trainer` special-token alignment instead. This reduces redundant configuration mutation and keeps padding-token handling centralized in the upstream training flow.