
Fix: NotebookProgressCallback crash when evaluating with the Trainer #44949

Merged
SunMarc merged 6 commits into huggingface:main from Charly21r:fix-evaluate-after-train on Apr 13, 2026

Conversation

Charly21r (Contributor) commented Mar 23, 2026

What does this PR do?

Fixes #44936

This PR fixes an issue with NotebookProgressCallback in the Trainer where calling evaluate() before or after training would crash due to the training tracker being None. The callback now properly handles evaluation even if training has not yet started or if it has already finished, ensuring metrics can be computed and displayed.

Previously, the on_evaluate method assumed that self.training_tracker was always initialized, but:

  • Before training: self.training_tracker has not been initialized by on_train_begin yet.
  • After training: on_train_end sets self.training_tracker to None, so calling on_evaluate afterwards would fail.

Fix: on_evaluate now checks whether self.training_tracker exists before using it, and safely handles cases where it is None. This prevents crashes and ensures evaluation can run regardless of training state.
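
A minimal sketch of the guard's shape (illustrative only: the hook signature and write_line call match the transformers callback API, but the fallback rendering here is simplified, not the PR's exact code):

```python
from transformers import TrainerCallback


class GuardedNotebookCallback(TrainerCallback):
    """Sketch of the fixed on_evaluate; not the merged implementation."""

    def __init__(self):
        self.training_tracker = None  # set in on_train_begin, cleared in on_train_end

    def on_evaluate(self, args, state, control, metrics=None, **kwargs):
        if metrics is None:
            return control
        if self.training_tracker is None:
            # evaluate() was called before train() or after on_train_end:
            # fall back to a standalone rendering instead of dereferencing
            # a None tracker (the real callback builds an HTML table here).
            print(metrics)
            return control
        # Training is in progress: append the metrics to the live tracker.
        self.training_tracker.write_line(metrics)
        return control
```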

Additionally, new unit tests were added to cover this scenario, and the existing notebook callback tests were updated accordingly. This improves the robustness of notebook-based workflows, especially in Jupyter or Colab environments.

Code Agent Policy

  • I confirm that this is not a pure code agent PR.


Who can review?

@SunMarc

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Charly21r force-pushed the fix-evaluate-after-train branch from e61e591 to e25f2a6 on March 23, 2026 at 17:20
SunMarc (Member) commented Mar 24, 2026

Does this fix your issue @HenrikEilers?

Charly21r (Contributor, Author) commented

@SunMarc Just checking in, happy to make any changes if needed. If the original reporter isn’t available, I can also provide more details or tests to help validate the fix.

HenrikEilers commented Mar 27, 2026

> Does this fix your issue @HenrikEilers?

I can't really test it right now, but if it now allows trainer.evaluate() to be called after training, then yes.

HenrikEilers commented

Is there anything left to do for us to get the PR approved?

Charly21r (Contributor, Author) commented

Just checking in again, this PR should resolve the original issue by allowing trainer.evaluate() after training, as discussed above. Tests are passing on my side, and I’m happy to add more coverage if needed.

@SunMarc I'd really appreciate a review when you have time :)

SunMarc (Member) left a review comment

Thanks! Left a comment.

Comment thread on src/transformers/utils/notebook.py (outdated), lines +354 to +355:
```python
if self.training_tracker is None:
    return control
```
SunMarc (Member):

OK, this will work, but we are not outputting anything in this case, e.g. via tt.write_line(values). Can you check what output we get and whether it makes sense? Maybe we should add it.

Charly21r (Contributor, Author):

Good point, I've updated on_evaluate to display the metrics as a standalone HTML table when there's no training tracker, so the user still sees the output. I also moved the first_column computation into on_evaluate directly so it doesn't depend on on_train_begin having run.
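
For context, a rough sketch of standalone rendering in a notebook (the helper name and markup are hypothetical; the PR's actual HTML generation may differ):

```python
from IPython.display import HTML, display


def display_standalone_metrics(metrics: dict) -> None:
    # Hypothetical helper: one header row of metric names, one row of values.
    headers = "".join(f"<th>{key}</th>" for key in metrics)
    values = "".join(f"<td>{value}</td>" for value in metrics.values())
    display(HTML(f"<table><tr>{headers}</tr><tr>{values}</tr></table>"))


display_standalone_metrics({"eval_loss": 0.42, "eval_accuracy": 0.91})
```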

SunMarc (Member):

Can you show me what you get?

Charly21r (Contributor, Author):

Calling evaluate() after train():
[Screenshot: evaluation metrics table, 2026-04-09 19:40]

Calling evaluate() before train():
[Screenshot: evaluation metrics table, 2026-04-09 19:44]

I noticed Model Preparation Time shows up as a column when calling evaluate() before train(). Should I filter it out along with the other runtime metrics?

SunMarc (Member):

Yeah, Model Preparation Time shouldn't show up. We shouldn't necessarily filter this one in particular, but rather find out why it shows up here when it doesn't appear when calling evaluate() after train(). Thanks for testing!

Charly21r (Contributor, Author):

Found it. The Trainer adds eval_model_preparation_time to the metrics dict when the model hasn't been prepared yet (self.accelerator._models is empty), which only happens when evaluate() is called before train(). After training, the model is already prepared, so the metric is never added. The fix is just adding metrics.pop(f"{metric_key_prefix}_model_preparation_time", None) alongside the other filtered metrics.
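
A self-contained sketch of that filtering (the pre-existing key list is reconstructed from memory of notebook.py and may not match the file exactly; only the last pop is the PR's addition):

```python
raw_metrics = {
    "eval_loss": 0.42,
    "eval_runtime": 1.3,
    "eval_model_preparation_time": 0.002,
}
metric_key_prefix = "eval"  # Trainer.evaluate()'s default prefix
metrics = dict(raw_metrics)
# Runtime-style keys the callback already drops before rendering:
for key in ("runtime", "samples_per_second", "steps_per_second", "jit_compilation_time"):
    metrics.pop(f"{metric_key_prefix}_{key}", None)
# The PR's addition: drop the key that only appears when evaluating pre-train.
metrics.pop(f"{metric_key_prefix}_model_preparation_time", None)
print(metrics)  # {'eval_loss': 0.42}
```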

SunMarc (Member):

SG!

Charly21r force-pushed the fix-evaluate-after-train branch 2 times, most recently from 08f50cb to 7fcca92 on April 8, 2026 at 06:50
SunMarc (Member) left a review comment

Thanks, a few nits.

Comment thread on tests/trainer/test_trainer_callback.py (outdated):

```python
def on_evaluate(self, args, state, control, metrics=None, **kwargs):
    tt = _require(self.training_tracker, "on_train_begin must be called before on_evaluate")
    self.first_column = "Epoch" if args.eval_strategy == IntervalStrategy.EPOCH else "Step"
```
SunMarc (Member):

Why do we need to overwrite that? We shouldn't have to.

Charly21r (Contributor, Author):

Because on_evaluate can also be called before on_train_begin (which is the bug this PR fixes), but self.first_column wouldn't exist yet, since it's only initialized in on_train_begin. Another option would be to move the initialization to __init__ with a default of "Step", so on_evaluate doesn't need to overwrite it. Defaulting to "Step" makes sense here: if training hasn't started, there are no epochs to reference. I can do that if you prefer.
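
The alternative floated above would look roughly like this (hypothetical; not what the PR ultimately merged):

```python
class NotebookProgressCallbackAlt:
    """Hypothetical variant: default first_column in __init__."""

    def __init__(self):
        # Before training starts there are no epochs to index by, so
        # "Step" is the only meaningful default column.
        self.first_column = "Step"
```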

SunMarc (Member):

Understood. Let's keep what you did; maybe just add a comment about that above.

Charly21r (Contributor, Author):

Done!

Charly21r force-pushed the fix-evaluate-after-train branch from 7bcd9b1 to 055eb9f on April 9, 2026 at 18:12
SunMarc (Member) left a review comment

Thanks! Just fix the issue with the logs that show up incorrectly when doing evaluation first, and we can merge this.

HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Charly21r force-pushed the fix-evaluate-after-train branch from 055eb9f to 2d98716 on April 10, 2026 at 15:30
SunMarc (Member) left a review comment

Thanks for iterating!

SunMarc enabled auto-merge April 13, 2026 13:27
SunMarc added this pull request to the merge queue Apr 13, 2026
Merged via the queue into huggingface:main with commit 0b5dbfc on Apr 13, 2026
28 checks passed
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
…uggingface#44949)

* Fix NotebookProgressCallback to allow evaluate() before and after train

* Add unit test for NotebookProgressCallback evaluating before and after training

* Skip NotebookProgressCallback tests when IPython is not installed

* Display eval metrics when training tracker is None on NotebookProgressCallback

* Add is_ipython_available and require_ipython test decorator

* Filter model_preparation_time metric and add code comments in on_eval


Development

Successfully merging this pull request may close these issues.

trainer.evaluate() fails after trainer.train()

4 participants