Fix: NotebookProgressCallback crash when evaluating with the Trainer #44949
SunMarc merged 6 commits into huggingface:main
Conversation
Force-pushed e61e591 to e25f2a6
Does this fix your issue @HenrikEilers?

@SunMarc Just checking in, happy to make any changes if needed. If the original reporter isn't available, I can also provide more details or tests to help validate the fix.

I can't really test it right now, but if it now allows trainer.evaluate() to be called after training, then yes.

Is there anything left to do for us to get the PR approved?

Just checking in again: this PR should resolve the original issue by allowing trainer.evaluate() after training, as discussed above. Tests are passing on my side, and I'm happy to add more coverage if needed. @SunMarc I'd really appreciate a review when you have time :)
```python
if self.training_tracker is None:
    return control
```
OK, this will work, but we aren't outputting anything in this case, unlike `tt.write_line(values)`. Can you check what output we get and whether it makes sense? Maybe we should add it.
Good point, I've updated on_evaluate to display the metrics as a standalone HTML table when there's no training tracker, so the user still sees the output. I also moved the first_column computation into on_evaluate directly so it doesn't depend on `on_train_begin` having run.
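The fallback described above can be sketched roughly like this. This is a minimal illustration, not the actual transformers implementation; `html_table` and `on_evaluate_sketch` are hypothetical names, and a real `NotebookTracker` would be used instead of a bare `write_line` object.

```python
def html_table(values):
    """Render a metrics dict as a one-row HTML table (illustrative helper)."""
    header = "".join(f"<th>{k}</th>" for k in values)
    row = "".join(f"<td>{v}</td>" for v in values.values())
    return f"<table><tr>{header}</tr><tr>{row}</tr></table>"

def on_evaluate_sketch(training_tracker, metrics):
    # Drop per-run timing before display, mirroring the callback's filtering.
    values = {k: v for k, v in metrics.items() if k != "eval_runtime"}
    if training_tracker is None:
        # evaluate() was called outside a train() run: show the metrics
        # as a standalone table so the user still gets output.
        return html_table(values)
    # Normal path: write into the existing training tracker.
    training_tracker.write_line(values)
    return None
```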
Yeah, Model preparation time shouldn't show up. We shouldn't necessarily filter this one in particular, but rather find out why it shows up here when it doesn't when calling evaluate after train. Thanks for testing!
Found it. The trainer adds eval_model_preparation_time to the metrics dict when the model hasn't been prepared yet (self.accelerator._models is empty), which only happens when evaluate() is called before train(). After training, the model is already prepared, so the metric is never added. The fix is just adding `metrics.pop(f"{metric_key_prefix}_model_preparation_time", None)` alongside the other filtered metrics.
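A small sketch of the filtering step described above, assuming the usual `eval_` metric prefix. The helper name `filter_eval_metrics` and the exact suffix list are illustrative; only `model_preparation_time` is the one this comment adds.

```python
def filter_eval_metrics(metrics, metric_key_prefix="eval"):
    metrics = dict(metrics)  # work on a copy, don't mutate the caller's dict
    # Per-run bookkeeping metrics that shouldn't appear in the display table.
    # model_preparation_time is the one added by this fix.
    for suffix in ("runtime", "samples_per_second", "steps_per_second",
                   "model_preparation_time"):
        metrics.pop(f"{metric_key_prefix}_{suffix}", None)
    return metrics
```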
Force-pushed 08f50cb to 7fcca92
```python
def on_evaluate(self, args, state, control, metrics=None, **kwargs):
    tt = _require(self.training_tracker, "on_train_begin must be called before on_evaluate")
    self.first_column = "Epoch" if args.eval_strategy == IntervalStrategy.EPOCH else "Step"
```
Why do we need to overwrite that? We shouldn't have to.
Because on_evaluate can also be called before on_train_begin (which is the bug this PR fixes), and self.first_column wouldn't exist yet since it's only initialized in on_train_begin. Another option would be to move the initialization to __init__ with a default of "Step", so on_evaluate doesn't need to overwrite it. Defaulting to "Step" makes sense here since, if training hasn't started, there are no epochs to reference. I can do that if you prefer.
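The approach kept in the PR (recomputing the column label on every evaluation) can be isolated like this. `IntervalStrategy` is mimicked locally for self-containment; `first_column_for` is a hypothetical helper name.

```python
from enum import Enum

class IntervalStrategy(str, Enum):
    """Local stand-in for transformers' IntervalStrategy."""
    STEPS = "steps"
    EPOCH = "epoch"

def first_column_for(eval_strategy):
    # Recomputed on each on_evaluate call, so the value exists even when
    # on_train_begin was never run (evaluate() called before train()).
    return "Epoch" if eval_strategy == IntervalStrategy.EPOCH else "Step"
```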
Understood. Let's keep what you did; maybe just add a comment about that above.
Force-pushed 7bcd9b1 to 055eb9f
SunMarc left a comment:
Thanks! Just fix the issue about the log that shows up incorrectly when doing evaluation first, and we can merge this.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Force-pushed 055eb9f to 2d98716
…uggingface#44949)
* Fix NotebookProgressCallback to allow evaluate() before and after train
* Add unit test for NotebookProgressCallback evaluating before and after training
* Skip NotebookProgressCallback tests when IPython is not installed
* Display eval metrics when training tracker is None on NotebookProgressCallback
* Add is_ipython_available and require_ipython test decorator
* Filter model_preparation_time metric and add code comments in on_eval


What does this PR do?
Fixes #44936

This PR fixes an issue with `NotebookProgressCallback` in the `Trainer` where calling evaluate() before or after training would crash due to the training tracker being `None`. The callback now properly handles evaluation even if training has not yet started or has already finished, ensuring metrics can be computed and displayed.

Previously, the `on_evaluate` method assumed that `self.training_tracker` was always initialized, but:
* when evaluate() is called first, `self.training_tracker` has not been initialized by `on_train_begin` yet;
* `on_train_end` sets `self.training_tracker` to `None`, so calling `on_evaluate` afterwards would fail.

Fix: `on_evaluate` now checks whether `self.training_tracker` exists before using it and safely handles the case where it is `None`. This prevents crashes and ensures evaluation can run regardless of training state.

Additionally, new unit tests ensure that evaluation works in these scenarios, and existing notebook callback tests were updated to cover them. This improves the robustness of notebook-based workflows, especially in Jupyter or Colab environments.
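The shape of the new unit test can be illustrated with a minimal stub (hypothetical names; the real test exercises the actual callback in the transformers test suite). The point is simply that calling the evaluation hook with no tracker must not raise.

```python
class NoTrackerCallback:
    """Stand-in for a progress callback whose tracker was never created
    (evaluate() before train()) or was torn down (after on_train_end)."""
    def __init__(self):
        self.training_tracker = None

    def on_evaluate(self, metrics=None):
        if self.training_tracker is None:
            # Guarded path: no crash, metrics still surfaced to the caller.
            return dict(metrics or {})
        self.training_tracker.write_line(metrics)
        return None

cb = NoTrackerCallback()
result = cb.on_evaluate({"eval_loss": 0.25})  # must not raise
```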
Code Agent Policy
Before submitting
Who can review?
@SunMarc
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.