Fix forced_bos_token_id not set in generation_config by Addyk-24 · Pull Request #41521 · huggingface/transformers

Addyk-24 · 2025-10-11T12:19:20Z

What does this PR do?

Fixes incorrect target language generation during evaluation/validation in run_translation.py for multilingual translation models (mBART, M2M100).

Problem

When fine-tuning multilingual models, forced_bos_token_id was only set in model.config and not in model.generation_config. During evaluation, model.generate() reads from generation_config, so the model generated text in the wrong language and BLEU scores were artificially low (previously ~2-5, because the output was in the wrong language).

Solution

Set forced_bos_token_id in both model.config and model.generation_config.
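A minimal sketch of the idea, assuming the model object exposes `config` and `generation_config` attributes. The helper name `set_forced_bos_token_id` and the `SimpleNamespace` stand-in are hypothetical and only keep the example runnable offline; this is not the diff from this PR.

```python
from types import SimpleNamespace


def set_forced_bos_token_id(model, lang_token_id):
    """Store the target-language token ID on both config objects."""
    # Kept on model.config for code that still reads it from there.
    model.config.forced_bos_token_id = lang_token_id
    # Per the PR description, model.generate() reads from generation_config.
    model.generation_config.forced_bos_token_id = lang_token_id
    return model


# Stand-in for a real model (assumption: only these two attributes matter here).
dummy_model = SimpleNamespace(
    config=SimpleNamespace(forced_bos_token_id=None),
    generation_config=SimpleNamespace(forced_bos_token_id=None),
)

# 250003 is the language token ID reported as correct later in this thread.
set_forced_bos_token_id(dummy_model, 250003)
```

With a real checkpoint, the ID would typically be looked up from the tokenizer (for example with `tokenizer.convert_tokens_to_ids("de_DE")`) rather than hard-coded.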

Results:

✅ Generation now produces the correct language token ID and output in the correct target language.
✅ Transformers emits a warning when model.config is modified directly to control generation. Setting the value on model.generation_config removes this warning and ensures Transformers v5+ uses the setting correctly.
✅ All evaluations complete without errors.
✅ Setting decoder_start_token_id instead (alone or together with forced_bos_token_id) causes empty outputs.
✅ With this fix, the target language ID is handled automatically during generation.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@zach-huggingface @Cyrilvallez

Bmingg · 2025-10-11T21:52:38Z

What worked for me was setting model.generation_config.decoder_start_token_id to the target language ID of MBart. When I discovered this bug, I think I checked the forced_bos_token_id of the output of MBart, and it should still be the start token <s>. In this case, I think what was missing was the target language ID after the start token, if I remember correctly.

Addyk-24 · 2025-10-12T06:41:53Z

Thanks for sharing your experience! I ran custom comprehensive tests to verify the correct fix, and here's what I found:

Tests performed:

I tested 4 different approaches on facebook/mbart-large-50-one-to-many-mmt translating en_XX → de_DE:

1. Baseline (no fix):
   - First 5 token IDs: [2, 250002, 64681, 4, 1199] -> generated the wrong target language.
2. Setting forced_bos_token_id in generation_config (my current fix):
   - First 5 token IDs: [2, 250003, 54029, 4, 1225] -> generated the correct language token ID and the correct target language.
3. Setting decoder_start_token_id only:
   - First 5 token IDs: [250003, 2] -> Generated: "" (empty/broken output)
4. Setting both:
   - First 5 token IDs: [250003, 250003, 2] -> Generated: "" (broken, duplicates the language token)
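The four outcomes can be compared mechanically by looking at the token that follows the start token. The sketch below is a hypothetical check, not part of the PR: it assumes the layout seen in the working output (start token ID 2, then the language token), and the helper names are made up for illustration.

```python
def token_after_start(token_ids, start_token_id=2):
    """Return the token that follows the decoder start token, or None."""
    if len(token_ids) >= 2 and token_ids[0] == start_token_id:
        return token_ids[1]
    return None


def has_expected_language(token_ids, expected_lang_id, start_token_id=2):
    """True if the sequence is [start, expected language token, ...]."""
    return token_after_start(token_ids, start_token_id) == expected_lang_id


# Token ID prefixes copied from the four tests reported in this comment.
runs = {
    "baseline": [2, 250002, 64681, 4, 1199],
    "forced_bos_in_generation_config": [2, 250003, 54029, 4, 1225],
    "decoder_start_only": [250003, 2],
    "both": [250003, 250003, 2],
}

EXPECTED_LANG_ID = 250003  # language token reported as correct in this thread

results = {
    name: has_expected_language(ids, EXPECTED_LANG_ID)
    for name, ids in runs.items()
}
```

Only the `forced_bos_in_generation_config` run passes this check; the other three either carry a different language token or do not begin with the start token at all.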

Below is the fix that I made:

Result:

The forced_bos_token_id parameter controls the token that comes right after the EOS start token (ID 2 in the outputs above), which is the target language ID.
Setting decoder_start_token_id to the language ID, on the other hand, seems to break the generation flow.
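To illustrate what "forcing" the token after the start token means, here is a simplified, library-free sketch. It is an assumption about the general technique (mask every candidate except the forced one at the first generated position), not the Transformers implementation; the function names and the toy six-token vocabulary are hypothetical.

```python
import math


def force_bos_scores(scores, cur_len, forced_bos_token_id):
    """Force one token at the first generated position.

    scores: next-token log-scores over a toy vocabulary.
    cur_len: number of tokens already in the decoder sequence.
    """
    # Only the position right after the start token is forced.
    if cur_len != 1:
        return list(scores)
    forced = [-math.inf] * len(scores)
    forced[forced_bos_token_id] = 0.0
    return forced


def greedy_pick(scores):
    """Return the index of the highest-scoring token."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Toy scores: without forcing, token 2 would win.
raw_scores = [-3.0, -2.0, -0.5, -4.0, -1.5, -2.5]
LANG_TOKEN = 3  # stands in for the target language ID

first_token = greedy_pick(force_bos_scores(raw_scores, cur_len=1, forced_bos_token_id=LANG_TOKEN))
later_token = greedy_pick(force_bos_scores(raw_scores, cur_len=2, forced_bos_token_id=LANG_TOKEN))
```

In this toy setup the forced step selects the language token regardless of the raw scores, while later steps are left untouched, which matches the [2, 250003, ...] pattern in the working test.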

Conclusion

Based on tests, this PR ensures correct target language generation for mBART by applying generation_config.forced_bos_token_id.
Using decoder_start_token_id instead causes invalid or empty outputs.
With this fix, the target language ID is automatically handled during generation.

Cyrilvallez · 2025-10-13T14:00:20Z

@Addyk-24 do you mind reverting all unrelated changes please? 🤗 I.e. all style changes (newlines etc) so that we can see the clear diff

Addyk-24 · 2025-10-13T16:36:29Z

@Cyrilvallez Done! I've reverted all unrelated formatting changes. The PR now only includes the fix for setting forced_bos_token_id in generation_config, so the language ID is handled automatically instead of manually. I've also performed 4 custom tests to verify this behavior. Please let me know if any further adjustments are needed. Thanks 🤗

Cyrilvallez · 2025-10-15T10:03:08Z

Thanks! cc @gante here as well, it's a simple change from config to generation_config, maybe worth checking it out to see if it should be upstreamed to generate!

Fix: set forced_bos_token_id via generation_config

5a1857e

Addyk-24 closed this Oct 13, 2025

Addyk-24 force-pushed the fix/model_generation_config_fix branch from 1198033 to 3927ffe on October 13, 2025 16:23

Addyk-24 reopened this Oct 13, 2025

This was referenced Apr 29, 2026

Cumulative feature and defect updates from recent Transformers PRs evalstate/transformers#42

Open

Cumulative defect fixes from recent Transformers PRs evalstate/transformers#43

Open

Addyk-24 wants to merge 1 commit into huggingface:main from Addyk-24:fix/model_generation_config_fix


3 participants
