[trainer] fix the GA model_accepts_loss_kwargs #34915

Merged: ArthurZucker merged 4 commits into main from fix-ga-fix, Dec 5, 2024
Conversation

@ArthurZucker (Collaborator) commented Nov 25, 2024

What does this PR do?

Fixes #34577
model_accepts_loss_kwargs was wrongly looking at kwarg names, when it should only check that the forward signature accepts **kwargs at all (since the name can vary: FlashAttentionKwargs, LossKwargs, etc.).
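
For illustration, the kind of check described — detecting a **kwargs catch-all in the forward signature instead of matching a specific parameter name — could be sketched like this (the helper name is hypothetical, not the Trainer's actual code):

```python
import inspect

def accepts_loss_kwargs(model) -> bool:
    # Hypothetical helper: True if model.forward takes a **kwargs catch-all.
    # Checking for VAR_KEYWORD covers any typed-kwargs name
    # (FlashAttentionKwargs, LossKwargs, ...), unlike matching a name.
    params = inspect.signature(model.forward).parameters.values()
    return any(p.kind == inspect.Parameter.VAR_KEYWORD for p in params)
```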

@muellerzr (Contributor) left a comment

TIL! However, as you can see from the failing test, this doesn't always work 😅 (if we can get it to, that's great; I think that's originally why I went with explicit rather than implicit)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker merged commit a928d9c into main on Dec 5, 2024
ArthurZucker deleted the fix-ga-fix branch on Dec 5, 2024 at 15:37
@techkang (Contributor) commented Dec 6, 2024

I think this PR introduced a new bug: if the user uses a user-defined loss function and model_accepts_loss_kwargs is False, the compute_loss function cannot get the num_items_in_batch argument, so the user-defined compute_loss_func will not receive it either.
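
A simplified sketch of the flow being reported (illustrative, not the actual Trainer source; the names mirror the conversation):

```python
def compute_loss(model, inputs, labels, *, model_accepts_loss_kwargs,
                 compute_loss_func=None, num_items_in_batch=None):
    loss_kwargs = {}
    if model_accepts_loss_kwargs:
        loss_kwargs["num_items_in_batch"] = num_items_in_batch
    outputs = model(**inputs, **loss_kwargs)
    if compute_loss_func is not None:
        # If the kwarg was never added above, a user-defined loss
        # function has no way to receive the batch item count.
        return compute_loss_func(outputs, labels, **loss_kwargs)
    return outputs.loss
```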

@ArthurZucker (Collaborator, Author)

model_accepts_loss_kwargs is supposed to be determined at init time and only depends on the forward pass of the model

@techkang (Contributor) commented Dec 6, 2024

I understand now; I misinterpreted the if condition earlier. However, with this PR, when model_accepts_loss_kwargs is True, num_items_in_batch is not passed, which makes the GA loss fix ineffective. Why is that?

@techkang (Contributor) commented Dec 6, 2024

With the newest code, running

export RUN_SLOW=True
pytest tests/trainer/test_trainer.py::TrainerIntegrationPrerunTest::test_gradient_accumulation_loss_alignment

fails with the following error:

======================================================= short test summary info =======================================================
FAILED tests/trainer/test_trainer.py::TrainerIntegrationPrerunTest::test_gradient_accumulation_loss_alignment - AssertionError: 0.9038000000000004 not less than 0.1 : Difference -0.9038000000000004 is not within 0.1
============================================== 1 failed, 2 warnings in 102.43s (0:01:42) ==============================================
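
For context on what this test checks: with gradient accumulation, each micro-batch loss should be summed and then normalized by the item count of the whole accumulated batch; taking a mean per micro-batch diverges whenever micro-batches contain unequal numbers of items. A minimal sketch of that normalization, assuming a plain token-level cross-entropy (illustrative, not the Trainer source):

```python
import torch.nn.functional as F

def scaled_micro_batch_loss(logits, labels, num_items_in_batch):
    # Sum (not mean) over this micro-batch, then divide by the item count
    # of the full accumulated batch; summing these over all micro-batches
    # reproduces the single large-batch mean loss.
    loss = F.cross_entropy(
        logits.view(-1, logits.size(-1)), labels.view(-1), reduction="sum"
    )
    return loss / num_items_in_batch
```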

@ArthurZucker (Collaborator, Author)

Ah shit, the if condition is reversed
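
Illustratively (a sketch, not the exact diff), the inverted guard withheld num_items_in_batch from exactly the models that could accept it:

```python
def loss_kwargs_for(accepts_loss_kwargs: bool, num_items_in_batch: int) -> dict:
    # Buggy version (inverted guard): `if not accepts_loss_kwargs: ...`
    # withheld the count precisely from models that could take it.
    # Intended guard:
    if accepts_loss_kwargs:
        return {"num_items_in_batch": num_items_in_batch}
    return {}
```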

@ArthurZucker (Collaborator, Author)

Opened a PR for a fix, thanks!


Development

Successfully merging this pull request may close this issue: Mismatched keyword argument names of llama make GA fix invalid

4 participants