Enforce manual seed to reduce flakiness by tarekziade · Pull Request #43794 · huggingface/transformers

tarekziade · 2026-02-06T09:54:59Z

This patch aims to reduce flakiness in CI tests. We identified two causes of nondeterministic behavior:

- Some tests were not using a fixed RNG seed, which reduced determinism.
- The CLI tests were occasionally triggering I/O errors due to writes on a closed stdout.
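The second cause is easy to reproduce outside the test suite. The sketch below is only an illustration of the failure mode and of one possible guard, not the change made in this PR: `io.StringIO` stands in for a captured stdout, and `safe_write` is a hypothetical helper name.

```python
import io


def safe_write(stream, text):
    """Write text to stream; return False instead of raising if it is closed.

    Hypothetical helper for illustration only, not part of transformers.
    """
    if getattr(stream, "closed", False):
        return False
    try:
        stream.write(text)
    except ValueError:
        # io objects raise ValueError ("I/O operation on closed file")
        # when they are written to after close().
        return False
    return True


captured = io.StringIO()  # stand-in for a captured stdout
wrote_while_open = safe_write(captured, "hello\n")
contents = captured.getvalue()
captured.close()  # e.g. the capture is torn down before a late write

try:
    captured.write("late output\n")
    raw_write_failed = False
except ValueError:
    raw_write_failed = True  # the failure mode described above

wrote_after_close = safe_write(captured, "late output\n")
```

A guard like this hides late writes rather than preventing them; keeping the capture buffer open for the lifetime of the test is the other option.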

This branch was run multiple times and appears to reduce flakiness in all previously unseeded tests. While there’s no deterministic way to prove the improvement, using fixed seeds is still a best practice.
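To make the seeding point concrete, here is a minimal sketch using only the standard library. It assumes a stand-in `build_and_run` function in place of a real model; in the actual test suite the equivalent call would presumably be `torch.manual_seed` or `transformers.set_seed` rather than `random.seed`.

```python
import random

SEED = 42  # the value this PR's commits settle on


def build_and_run(n_weights=4):
    """Stand-in for 'create a randomly initialized model and run it'."""
    weights = [random.gauss(0.0, 1.0) for _ in range(n_weights)]
    return sum(weights)


# Reseed before each construction so both "models" draw the same weights.
random.seed(SEED)
output_a = build_and_run()
random.seed(SEED)
output_b = build_and_run()

# Without reseeding, the second call continues the RNG stream instead.
random.seed(SEED)
first = build_and_run()
second = build_and_run()
```

Here `output_a` and `output_b` are identical, while `second` is drawn from a later point in the stream and so differs from `first`. A test that compares two such outputs only passes reliably when both are seeded.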

As a follow-up, we could centralize the seed initialization in a shared test fixture, avoiding the need to set it explicitly in individual tests.
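One way such a shared fixture could look is sketched below. This is an assumption about the shape of the follow-up, not code from this PR: it uses `unittest`'s `setUp` hook and the standard-library RNG, and `SeededTestCase` is a hypothetical name.

```python
import io
import random
import unittest

SEED = 42


class SeededTestCase(unittest.TestCase):
    """Hypothetical base class that reseeds the RNG before every test."""

    def setUp(self):
        super().setUp()
        random.seed(SEED)


class ExampleTest(SeededTestCase):
    def test_first_draw_is_reproducible(self):
        # A fresh generator with the same seed yields the same first value.
        self.assertEqual(random.random(), random.Random(SEED).random())

    def test_not_affected_by_other_tests(self):
        # setUp reseeded, so draws made by earlier tests do not matter.
        self.assertEqual(random.random(), random.Random(SEED).random())


def run_example():
    """Run ExampleTest quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    runner = unittest.TextTestRunner(stream=io.StringIO())
    return runner.run(suite)


result = run_example()
```

Individual tests then inherit the seeding without calling it explicitly, which is the duplication the follow-up is meant to remove.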

HuggingFaceDocBuilderDev · 2026-02-06T10:04:08Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

tarekziade · 2026-02-06T15:41:32Z

run-slow: doge, donut, esm, fastspeech2_conformer, mimi, minimax_m2, mistral, mixtral, musicgen, musicgen_melody, nllb_moe, qwen2, qwen2_moe, qwen3, qwen3_moe, recurrent_gemma

github-actions · 2026-02-06T15:42:44Z

This comment contains run-slow, running the specified jobs:

models: ["models/doge", "models/donut", "models/esm", "models/fastspeech2_conformer", "models/mimi", "models/minimax_m2", "models/mistral", "models/mixtral", "models/musicgen", "models/musicgen_melody", "models/nllb_moe", "models/qwen2", "models/qwen2_moe", "models/qwen3", "models/qwen3_moe", "models/recurrent_gemma"]
quantizations: []

Rocketknight1

Fix LGTM! I'm in favour of blindly setting seeds everywhere, especially in tests where we compare two model outputs.

github-actions · 2026-02-06T15:56:18Z

CI Results

Workflow Run ⚙️

Commit Info

Context	Commit	Description
RUN	ea868138	merge commit
PR	69eafe69	branch commit
main	b9042c4e	base commit

✅ No failing test specific to this PR 🎉 👏 !

github-actions · 2026-02-06T16:10:42Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: doge, donut, esm, fastspeech2_conformer, mimi, minimax_m2, mistral, mixtral, musicgen, musicgen_melody, nllb_moe, qwen2, qwen2_moe, qwen3, qwen3_moe, recurrent_gemma

tarekziade force-pushed the tarekziade-flaky-test_generate branch 2 times, most recently from 127d3e0 to 95f626c February 6, 2026 15:01

tarekziade mentioned this pull request Feb 6, 2026

preventing I/O errors on closed streams in the cli helper #43797

Closed

tarekziade added 3 commits February 6, 2026 16:33

trying to enforce manual seed to see if that impacts flakiness

413cdd0

use 42 everywhere (the usual hitchhicker ref that's used in several spots)

18fd709

prevents capture buffers from being closed

69eafe6

tarekziade force-pushed the tarekziade-flaky-test_generate branch from 95f626c to 69eafe6 February 6, 2026 15:33

tarekziade requested a review from Rocketknight1 February 6, 2026 15:35

tarekziade changed the title ~~[WIP] trying to enforce manual seed to see if that impacts flakiness~~ Enforce manual seed to reduce flakiness Feb 6, 2026

Rocketknight1 approved these changes Feb 6, 2026

Comment thread tests/models/seamless_m4t/test_modeling_seamless_m4t.py Outdated

tarekziade added 2 commits February 6, 2026 16:56

add a helper in the mixin to reduce test code dupe

3a3a7e1

Merge branch 'main' into tarekziade-flaky-test_generate

e732a13

tarekziade self-assigned this Feb 6, 2026

tarekziade mentioned this pull request Feb 6, 2026

chore(test): add a set_seed pytest fixture #43805

Closed

tarekziade merged commit 0c89522 into main Feb 6, 2026
26 checks passed

tarekziade deleted the tarekziade-flaky-test_generate branch February 6, 2026 16:30

Rocketknight1 mentioned this pull request Feb 6, 2026

More flaky generate tests #43713

Closed

jayzuccarelli mentioned this pull request Feb 8, 2026

chore(tests): add set_seed pytest fixture for determinism #43829

Closed

ydshieh mentioned this pull request Apr 6, 2026

Fix Qwen2IntegrationTest #45268

Merged

tarekziade merged 5 commits into main from tarekziade-flaky-test_generate
