fix(flaky): Different approach to make sure loss exists #43804

Merged
tarekziade merged 4 commits into main from tarekziade-test_load_balancing_loss-flaky
Feb 20, 2026

fix(flaky): Different approach to make sure loss exists#43804
tarekziade merged 4 commits intomainfrom
tarekziade-test_load_balancing_loss-flaky

Conversation

@tarekziade
Collaborator

@tarekziade tarekziade commented Feb 6, 2026

What does this PR do?

The difference check was returning False 40% of the time, and this was reproducible locally.

Tested with:

pytest -svx tests/models/ernie4_5_moe/test_modeling_ernie4_5_moe.py -k test_load_balancing_loss --flake-finder

Moved to a deterministic approach.
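A deterministic check of this kind can be sketched without running the model at all: compute a Switch-Transformers-style auxiliary load-balancing loss directly from fixed router probabilities, once over all tokens and once with the padding token masked out, and assert the two values differ. This is a minimal illustration under assumed semantics, not the PR's actual test; the helper name, the 2-expert setup, and the probability values are invented for the example.

```python
# Minimal sketch (not the PR's test): Switch-Transformers-style load
# balancing loss, aux = num_experts * sum_e(frac_tokens_e * mean_prob_e),
# computed over a fixed, deterministic set of router probabilities.

def load_balancing_loss(router_probs, mask):
    """router_probs: per-token softmax over experts; mask: 1 = real token, 0 = pad."""
    num_experts = len(router_probs[0])
    kept = [p for p, m in zip(router_probs, mask) if m]
    n = len(kept)
    loss = 0.0
    for e in range(num_experts):
        # fraction of (unmasked) tokens whose argmax expert is e
        frac = sum(1 for p in kept if max(range(num_experts), key=p.__getitem__) == e) / n
        # mean router probability assigned to expert e over unmasked tokens
        mean_prob = sum(p[e] for p in kept) / n
        loss += frac * mean_prob
    return num_experts * loss

# Fixed router probabilities for 4 tokens over 2 experts; the last token is
# padding and routes heavily to expert 1, so masking it must change the loss.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.1, 0.9]]

with_padding = load_balancing_loss(probs, mask=[1, 1, 1, 1])
without_padding = load_balancing_loss(probs, mask=[1, 1, 1, 0])
assert abs(with_padding - without_padding) > 1e-6  # deterministic, never flaky
```

Because the inputs are hard-coded, the two losses are bitwise-reproducible across runs, which is the property the flaky random-input version lacked.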

@tarekziade
Collaborator Author

run-slow: ernie4_5_moe

@github-actions
Contributor

github-actions Bot commented Feb 6, 2026

This comment contains run-slow, running the specified jobs:

models: ["models/ernie4_5_moe"]
quantizations: []

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Contributor

github-actions Bot commented Feb 6, 2026

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN 6e9ab092 merge commit
PR 3cb2c2ea branch commit
main 711f2797 base commit

✅ No failing test specific to this PR 🎉 👏 !


# This is to mimic torch.testing.assert_not_close
self.assertNotAlmostEqual(include_padding_result.aux_loss.item(), result.aux_loss.item())
# We make sure that masking can change the loss using a deterministic synthetic example.
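For context on the quoted assertion: unittest's assertNotAlmostEqual fails only when the two values agree to the given precision (7 decimal places by default, i.e. when round(a - b, 7) == 0), which makes it a rough inverse of torch.testing.assert_close for scalar losses. A small standalone illustration, with values invented for the example:

```python
import unittest

tc = unittest.TestCase()

# Differs in the 7th decimal place: round(a - b, 7) != 0, so this passes.
tc.assertNotAlmostEqual(0.1234567, 0.1234569)

# Agrees to 7 decimal places: assertNotAlmostEqual raises AssertionError.
try:
    tc.assertNotAlmostEqual(0.123456789, 0.123456781)
    raised = False
except AssertionError:
    raised = True
assert raised  # the two values were "almost equal", so the check failed
```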
Collaborator Author


I assume calling load_balancing_loss_func is good enough to validate the loss diff, and it could be its own test. But it also ignores the model() path. Is this ok?

Collaborator


This looks alright!

@tarekziade
Collaborator Author

May be related: #43775

Collaborator

@ArthurZucker ArthurZucker left a comment


ty


@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: ernie4_5_moe

@tarekziade
Collaborator Author

run-slow: ernie4_5_moe

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/ernie4_5_moe"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN 2059c112 workflow commit (merge commit)
PR 86bc798c branch commit (from PR)
main 1618d44b base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

@tarekziade tarekziade merged commit 6b55285 into main Feb 20, 2026
21 checks passed
@tarekziade tarekziade deleted the tarekziade-test_load_balancing_loss-flaky branch February 20, 2026 07:45