refactor _inner_training_loop to smaller methods by winglian · Pull Request #44041 · huggingface/transformers

winglian · 2026-02-16T15:40:41Z

What does this PR do?

Alternative to #43985, intended as a reorder-only PR.

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

HuggingFaceDocBuilderDev · 2026-02-16T15:49:04Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

SunMarc

I saw that you mostly split the _inner_training_loop into three separate private methods. I was thinking this could be a chance to try simplifying and reorganizing _inner_training_loop? For example, move things that feel out of place somewhere else and simplify logic that feels a bit strange or complicated.

SunMarc · 2026-02-19T15:31:38Z

/trl-ci

SunMarc · 2026-02-20T13:55:20Z

I've tried to simplify/reorganize a bunch of things to make them a bit clearer. I tweaked the internals a bit, but I will probably stop there and get this merged first, before the changes become too complicated to follow.
cc @kashif @qgallouedec @winglian

SunMarc · 2026-02-20T14:15:51Z

/trl-ci

SunMarc · 2026-02-20T17:04:52Z

Some FSDPv2 tests are not passing on trl; investigating. The issue was the following:

The crash only happens when accelerator._models contains both an unwrapped model (regular Tensors) and an FSDP2-wrapped model (DTensors). GRPOTrainer registers a reward model via accelerator.prepare_model(evaluation_mode=True) during __init__, which adds it to _models without FSDP-wrapping. Regular Trainer tests don't register extra models. This only happens with GRPO and RLOO.

qgallouedec · 2026-02-20T21:47:53Z

@SunMarc

is it something that is done wrong in grpo/rloo?

qgallouedec · 2026-02-21T00:18:59Z

/trl-ci

qgallouedec · 2026-02-21T02:07:03Z

trl tests look good: https://github.com/huggingface/trl/actions/runs/22246318776 (distributed smoke test is in queue, I don't know why it doesn't start)

winglian

Latest changes all look good. I can't approve the PR since I opened it.

SunMarc

Let's get this merged! Thanks!

SunMarc · 2026-02-23T15:56:10Z

> is it something that is done wrong in grpo/rloo?

No, I had just deleted self.accelerator.free_memory() because I thought we shouldn't need it, but it turns out we do need it.
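
As an illustration of the failure mode described in this thread, here is a minimal pure-Python sketch. It does not use Accelerate: `ToyAccelerator`, `ToyModel`, the `sharded` flag, and `has_mixed_models` are hypothetical stand-ins. It assumes, as the thread suggests, that a model prepared with evaluation_mode=True is registered without FSDP wrapping, and that free_memory() drops the models registered so far.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    name: str
    sharded: bool = False  # hypothetical stand-in for "parameters are DTensors"


@dataclass
class ToyAccelerator:
    # Hypothetical registry mirroring the role of accelerator._models.
    _models: list = field(default_factory=list)

    def prepare_model(self, model, evaluation_mode=False):
        # Assumption from the thread: evaluation_mode=True registers the
        # model but skips FSDP wrapping.
        if not evaluation_mode:
            model.sharded = True
        self._models.append(model)
        return model

    def free_memory(self):
        # Assumption: drops references to everything registered so far.
        self._models.clear()

    def has_mixed_models(self):
        # True when wrapped and unwrapped models sit in the same registry,
        # which is the condition under which the crash was reported.
        return len({m.sharded for m in self._models}) > 1


def run(call_free_memory):
    acc = ToyAccelerator()
    # GRPO/RLOO-style setup: reward model registered during __init__.
    acc.prepare_model(ToyModel("reward"), evaluation_mode=True)
    if call_free_memory:
        acc.free_memory()  # called before the training model is prepared
    acc.prepare_model(ToyModel("policy"))
    return acc.has_mixed_models()


print(run(call_free_memory=False))  # True: mixed registry
print(run(call_free_memory=True))   # False: only the wrapped policy remains
```

In this toy model, skipping free_memory() leaves the unwrapped reward model next to the wrapped policy model, which matches why only the GRPO and RLOO tests were affected.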

refactor _inner_training_loop to smaller methods

a50e35e

SunMarc reviewed Feb 16, 2026

winglian and others added 4 commits February 16, 2026 20:25

address PR comments and remove unused args

4bbbac1

move _train_batch_size

48391e9

big update

cda1b84

Merge remote-tracking branch 'origin/main' into refactor-inner-training-loop-reorder-only

f52ffbf

SunMarc added 12 commits February 19, 2026 15:35

move DummyScheduler

c9f3610

style

f4cad42

make it private

46e2041

switch to kwargs_handlers

da07a71

style

c6bee6a

fix resuming

daec115

style

a434736

revert loading after

fa55154

way better now

c18c167

move method

2ced96a

small mistake

7feecab

update

b37736d

winglian mentioned this pull request Feb 19, 2026

integration branch for transformers#44041 axolotl-ai-cloud/axolotl#3420

Closed

winglian force-pushed the refactor-inner-training-loop-reorder-only branch from 6396335 to b37736d Compare February 20, 2026 04:47

SunMarc requested review from kashif and qgallouedec February 20, 2026 13:51

remove tp_size

a697854

account for grpo

cc24432

qgallouedec reviewed Feb 20, 2026

Comment thread src/transformers/testing_utils.py

winglian commented Feb 21, 2026

final touch

2e97639

SunMarc approved these changes Feb 23, 2026

Merge branch 'main' into refactor-inner-training-loop-reorder-only

23a6e8b

winglian merged commit f26814d into huggingface:main Feb 23, 2026
25 checks passed

4 participants