Remove redundant alignment of pad_token_id #5487

Merged

albertvillanova merged 1 commit into huggingface:main from albertvillanova:rm-redundant_align_special_tokens on Apr 9, 2026

Conversation

@albertvillanova (Member) commented on Apr 9, 2026

Remove redundant alignment of pad_token_id.

Note that this alignment is already done by transformers.Trainer.train(), which calls self._align_special_tokens().

This PR removes the redundant alignment of pad_token_id between the tokenizer and the model config, relying instead on transformers.Trainer._align_special_tokens(), which is called during training.
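
For context, a minimal sketch of what the trainer-side alignment boils down to for the padding token (a hypothetical simplification; the actual transformers implementation also covers other special tokens and the generation config):

    # Simplified illustration only, not the actual transformers code.
    def align_pad_token(model, tokenizer):
        # Mirror the tokenizer's pad token id into the model config so
        # padded batches are handled consistently during training.
        if tokenizer.pad_token_id is not None:
            model.config.pad_token_id = tokenizer.pad_token_id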

Changes

Padding token management cleanup:

  • Removed explicit assignment of model.config.pad_token_id = tokenizer.pad_token_id in examples/scripts/prm.py.
  • Removed explicit assignment of model.config.pad_token_id = processing_class.pad_token_id in the RewardTrainer initialization.
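
With the two removals above, example code can rely on the trainer alone. A minimal usage sketch, where model, tokenizer, and train_dataset are placeholders and argument names follow the TRL RewardTrainer API:

    from trl import RewardTrainer

    trainer = RewardTrainer(
        model=model,
        processing_class=tokenizer,
        train_dataset=train_dataset,
    )
    # No manual `model.config.pad_token_id = tokenizer.pad_token_id` needed:
    # Trainer.train() aligns the special tokens before training starts.
    trainer.train()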

Note

Low Risk

Low-risk cleanup that removes the duplicate pad_token_id assignments and relies on Transformers' built-in special-token alignment during training; behavior could change only if a code path depends on the earlier manual override before Trainer.train() runs.

Overview
Removes manual alignment of model.config.pad_token_id with the tokenizer/processing class in both examples/scripts/prm.py and RewardTrainer, relying on Transformers' Trainer special-token alignment instead.

This reduces redundant configuration mutation and keeps padding token handling centralized in the upstream training flow.


@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue. Inline comment context:

"in the vocabulary before using it as a padding token."
)
processing_class.pad_token = pad_token
model.config.pad_token_id = processing_class.pad_token_id

Removal relies on feature absent in minimum transformers version

Medium Severity

The removed model.config.pad_token_id = processing_class.pad_token_id assignment relies on Trainer._align_special_tokens() to handle alignment automatically, but this mechanism only exists in transformers v5.5.0+ (main branch). The project's minimum supported version is transformers>=4.56.2, where no such alignment occurs. For models whose config ships with pad_token_id = None (e.g., Llama, Mistral), sequence-classification heads use config.pad_token_id during forward-pass pooling; without the alignment, this can raise a ValueError for batch sizes > 1 or produce incorrect pooling on the older supported versions.
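
One hypothetical mitigation, if the manual assignment is to stay removed while older transformers releases remain supported, is to gate it on the installed version (the 5.5.0 cutoff is taken from the comment above; variable names follow the removed code):

    import transformers
    from packaging import version

    # On transformers versions whose Trainer does not yet align special
    # tokens automatically, fall back to the manual assignment.
    if version.parse(transformers.__version__) < version.parse("5.5.0"):
        if model.config.pad_token_id is None and processing_class.pad_token_id is not None:
            model.config.pad_token_id = processing_class.pad_token_id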

Additional Locations (1)

@albertvillanova (Member, Author) commented

I have checked the transformers code:

@qgallouedec (Member) commented

can you quickly check when _align_special_tokens was introduced in trainer? it's not that old iirc

@albertvillanova (Member, Author) commented

Done @qgallouedec. See: #5487 (comment)

@qgallouedec (Member) left a comment

lgtm

@albertvillanova merged commit 8900a14 into huggingface:main on Apr 9, 2026
14 of 16 checks passed