Skip to content

FIX: enable load_best_model_at_end within SaveStrategy.BEST and initialize metric_for_best_model as loss when SaveStrategy.BEST#40221

Open
thechaos16 wants to merge 4 commits intohuggingface:mainfrom
thechaos16:main
Open

FIX: enable load_best_model_at_end within SaveStrategy.BEST and initialize metric_for_best_model as loss when SaveStrategy.BEST#40221
thechaos16 wants to merge 4 commits intohuggingface:mainfrom
thechaos16:main

Conversation

@thechaos16
Copy link
Copy Markdown

What does this PR do?

  • enable load_best_model_at_end when even if save_strategy is best
    • beforehand, it doesn't show any error, but it doesn't load the best, but the last (because self.state.best_global_step wasn't updated)
  • enable metric_for_best_model when save_strategy is best
    • metric_for_best_model is initialized as loss when load_best_model_at_end is enabled, but not if save_strategy is set as best. I think these two behaviors should be the same.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@zach-huggingface, @SunMarc and @qgallouedec

@thechaos16 thechaos16 force-pushed the main branch 2 times, most recently from 39e3020 to b35a06d Compare August 16, 2025 14:57
…alize metric_for_best_model as loss when SaveStrategy.BEST

Signed-off-by: thechaos16 <thechaos16@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant