refactor _inner_training_loop to smaller methods #44041

Merged
winglian merged 21 commits into huggingface:main from winglian:refactor-inner-training-loop-reorder-only on Feb 23, 2026

Conversation

@winglian
Collaborator

What does this PR do?

Alternate PR to #43985 that only reorders code.

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@SunMarc left a comment

I saw that you mostly split _inner_training_loop into three separate private methods. I was thinking maybe this could be the chance to try simplifying and reorganizing _inner_training_loop? For example, moving things that feel out of place somewhere else, and simplifying logic that feels a bit strange/complicated.
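
For concreteness, a toy sketch of the kind of split being discussed. All names below are hypothetical illustrations, not the helpers actually introduced by this PR; the point is just the shape, with _inner_training_loop reduced to a thin driver that delegates setup, the per-epoch loop, and teardown to small private methods:

```python
# Toy illustration only; helper names and signatures are hypothetical and
# differ from the actual PR. Shows _inner_training_loop as a thin driver.
from dataclasses import dataclass, field


@dataclass
class _LoopState:
    num_train_epochs: int
    global_step: int = 0
    logs: list[str] = field(default_factory=list)


class ToyTrainer:
    def _inner_training_loop(self, num_train_epochs: int) -> _LoopState:
        state = self._setup_training(num_train_epochs)
        for epoch in range(state.num_train_epochs):
            self._run_epoch(epoch, state)
        return self._finalize_training(state)

    def _setup_training(self, num_train_epochs: int) -> _LoopState:
        # dataloader/optimizer/scheduler creation would live here
        return _LoopState(num_train_epochs=num_train_epochs)

    def _run_epoch(self, epoch: int, state: _LoopState) -> None:
        # per-batch forward/backward/optimizer-step logic would live here
        state.global_step += 1
        state.logs.append(f"finished epoch {epoch} at step {state.global_step}")

    def _finalize_training(self, state: _LoopState) -> _LoopState:
        # checkpointing and metrics aggregation would live here
        state.logs.append("training finished")
        return state


print(ToyTrainer()._inner_training_loop(num_train_epochs=2).logs)
```

Each small method is then individually overridable and testable, which is the main payoff of the split.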

[Ten review comment threads on src/transformers/trainer.py; nine marked Outdated]
@SunMarc
Member

SunMarc commented Feb 19, 2026

/trl-ci

@SunMarc
Member

SunMarc commented Feb 20, 2026

I've tried to simplify/reorganize a bunch of things to make the flow a bit clearer. I tweaked the internals a bit, but I will probably stop there and get this merged first, before it gets too complicated to follow all the changes.
cc @kashif @qgallouedec @winglian

@SunMarc
Member

SunMarc commented Feb 20, 2026

/trl-ci

@SunMarc
Member

SunMarc commented Feb 20, 2026

Some FSDPv2 tests are not passing on trl; investigating. The issue turned out to be the following:

The crash only happens when accelerator._models contains both an unwrapped model (regular Tensors) and an FSDP2-wrapped model (DTensors). GRPOTrainer registers a reward model via accelerator.prepare_model(evaluation_mode=True) during __init__, which adds it to _models without FSDP-wrapping it. Regular Trainer tests don't register extra models, so this only happens with GRPO and RLOO.
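
A minimal sketch of the mixed state described above, assuming the process was launched with an FSDP2 accelerate config (the models and setup here are illustrative stand-ins, not GRPOTrainer's actual code):

```python
# Illustrative sketch; assumes an FSDP2 accelerate config so that
# prepare_model() wraps models with FSDP2.
import torch.nn as nn
from accelerate import Accelerator

accelerator = Accelerator()

# evaluation_mode=True skips FSDP wrapping but still registers the model,
# so accelerator._models now holds a model with regular Tensor parameters.
reward_model = accelerator.prepare_model(nn.Linear(8, 1), evaluation_mode=True)

# The training model does get FSDP2-wrapped, so its parameters are DTensors.
policy = accelerator.prepare_model(nn.Linear(8, 8))

# accelerator._models now mixes an unwrapped and an FSDP2-wrapped model;
# any code that iterates it assuming uniform parameter types can crash.
```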

@qgallouedec
Member

@SunMarc

is it something that is done wrong in grpo/rloo?

[Review comment thread on src/transformers/testing_utils.py]
@qgallouedec
Member

/trl-ci

@qgallouedec
Member

qgallouedec commented Feb 21, 2026

trl tests look good: https://github.com/huggingface/trl/actions/runs/22246318776 (distributed smoke test is in queue, I don't know why it doesn't start)

Collaborator Author

@winglian left a comment

Latest changes all look good. I can't approve the PR since I opened it.

Member

@SunMarc left a comment

Let's get this merged! Thanks!

@SunMarc
Member

SunMarc commented Feb 23, 2026

is it something that is done wrong in grpo/rloo?

No, it was just that I had deleted self.accelerator.free_memory() because I thought we shouldn't need it, but it turns out we did.
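
For context, a hedged sketch of why that one line matters, under the assumption (consistent with the crash analysis above) that Accelerator.free_memory() clears the accelerator's internal registries, including _models, before the training model gets prepared. This is not the actual Trainer code, just an illustration:

```python
# Illustrative sketch, not the actual Trainer code. Assumes an FSDP2
# accelerate config and that free_memory() resets the internal registries.
import torch.nn as nn
from accelerate import Accelerator

accelerator = Accelerator()

# A reward model registered during __init__ (as GRPOTrainer does) stays in
# accelerator._models, unwrapped.
reward_model = accelerator.prepare_model(nn.Linear(8, 1), evaluation_mode=True)

# Without this call at the start of the training loop, the unwrapped reward
# model would sit in _models next to the FSDP2-wrapped training model below,
# reproducing the mixed Tensor/DTensor state from the earlier comment.
accelerator.free_memory()

train_model = accelerator.prepare_model(nn.Linear(8, 8))
```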

@winglian merged commit f26814d into huggingface:main on Feb 23, 2026
25 checks passed
