
[testing] Fix JetMoeIntegrationTest#41377

Merged
ydshieh merged 3 commits into main from fix_jetmoe
Oct 8, 2025

Conversation

Collaborator

@ydshieh ydshieh commented Oct 6, 2025

What does this PR do?

tests/models/jetmoe/test_modeling_jetmoe.py::JetMoeIntegrationTest::test_model_8b_generation

causes the pytest process to be killed (and sometimes hang for hours), see the log below.

It loads the model on CPU. Let's use "auto".

The outputs changed after #41324 (which is a fix of #40132), so the test still fails, but it no longer causes the pytest process to be killed or the job to hang.

There is a new warning showing:

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
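The warning above can be illustrated with a minimal pure-Python sketch (the token ids and the `pad_token_id` value here are made up for illustration): when the pad token shares its id with eos, a naively inferred mask cannot tell padding apart from a genuine end-of-sequence token.

```python
pad_token_id = 2  # same id as eos, as in the warning above

batch = [
    [2, 2, 5, 6, 7],  # left-padded prompt
    [3, 4, 5, 6, 2],  # full-length prompt that genuinely ends with eos
]

# Naive inference: treat every occurrence of the pad/eos id as padding.
inferred_mask = [[0 if tok == pad_token_id else 1 for tok in row] for row in batch]

# The second row's real eos token is wrongly masked out -> unreliable results,
# which is why the warning asks for an explicit attention_mask instead.
print(inferred_mask)  # [[0, 0, 1, 1, 1], [1, 1, 1, 1, 0]]
```

Passing the mask explicitly (built from the tokenizer's padding information, before pad and eos become indistinguishable) avoids the ambiguity entirely.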

cc @ArthurZucker for this output change issue.

Log when running before this PR:

2025-09-30T16:33:22.8891725Z tests/models/jetmoe/test_modeling_jetmoe.py::JetMoeIntegrationTest::test_model_8b_batched_generation
2025-09-30T16:33:22.8892381Z -------------------------------- live log call ---------------------------------
2025-09-30T16:33:22.8893314Z WARNING transformers.generation.configuration_utils:logging.py:328 The following generation flags are not valid and may be ignored: ['temperature']. Set TRANSFORMERS_VERBOSITY=info for more details.
2025-09-30T16:33:23.3206200Z FAILED [ 99%]
2025-09-30T16:39:21.4805460Z Killed
2025-09-30T16:39:21.4847256Z tests/models/jetmoe/test_modeling_jetmoe.py::JetMoeIntegrationTest::test_model_8b_generation

@ydshieh ydshieh requested a review from ArthurZucker October 6, 2025 15:50
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@vasqu
Contributor

vasqu commented Oct 7, 2025

Something is fishy, the outputs shouldn't change tbh, checking

Contributor

@vasqu vasqu left a comment


Thx lgtm! Just a small nit. Outputs have changed but I submitted a PR for that in #41423

    def test_model_8b_logits(self):
        input_ids = [1, 306, 4658, 278, 6593, 310, 2834, 338]
-       model = JetMoeForCausalLM.from_pretrained("jetmoe/jetmoe-8b")
+       model = JetMoeForCausalLM.from_pretrained("jetmoe/jetmoe-8b", device_map="auto", torch_dtype=torch.bfloat16)
Contributor


My only nit: don't we want to use fp16 for more consistency, or was the model saved in bf16?

Collaborator Author


I would love to use fp16, but there is a memory issue when using "auto", see this internal discussion:

https://huggingface.slack.com/archives/C01NE71C4F7/p1758532074369679

and it seems that nowadays many model weights are saved in bf16 (which is normal, as we want to train with it).
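For context on the bf16-vs-fp16 trade-off discussed here, a small pure-Python sketch (the `to_bfloat16` helper is hypothetical, written just for this illustration): bfloat16 keeps float32's 8-bit exponent but only 7 mantissa bits, so it can represent values that overflow fp16's maximum of 65504, at the cost of precision.

```python
import struct

def to_bfloat16(x: float) -> float:
    # bfloat16 is the top 16 bits of an IEEE-754 float32:
    # same 8-bit exponent, but only 7 mantissa bits (vs 23 in float32).
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

# 70000.0 overflows fp16 (max finite value 65504) but survives bf16,
# just with coarse rounding due to the short mantissa:
print(to_bfloat16(70000.0))  # -> 69632.0
```

This range-vs-precision trade-off is why checkpoints trained in bf16 cannot in general be cast to fp16 without risking overflow.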

Contributor


Argh, gotcha. Fair enough but that's not a good bug 😓 guess we are fine since bf16 dominates

@github-actions
Contributor

github-actions Bot commented Oct 8, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: jetmoe

@ydshieh ydshieh enabled auto-merge (squash) October 8, 2025 13:06
@ydshieh ydshieh merged commit e064dc0 into main Oct 8, 2025
20 checks passed
@ydshieh ydshieh deleted the fix_jetmoe branch October 8, 2025 13:11
omsherikar pushed a commit to omsherikar/transformers that referenced this pull request Oct 8, 2025
* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
AhnJoonSung pushed a commit to AhnJoonSung/transformers that referenced this pull request Oct 12, 2025
* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


3 participants