
[testing] Fix JetMoeIntegrationTest#41377

Merged
ydshieh merged 3 commits into main from fix_jetmoe
Oct 8, 2025

Conversation

Collaborator

@ydshieh ydshieh commented Oct 6, 2025

What does this PR do?

tests/models/jetmoe/test_modeling_jetmoe.py::JetMoeIntegrationTest::test_model_8b_generation

causes the pytest process to be killed (and sometimes hang for hours), see the log below.

It loads the model on CPU. Let's use "auto".

The outputs changed after #41324 (which is a fix of #40132), so the test still fails, but it no longer causes the pytest process to be killed or the job to hang.

There is a new warning showing:

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
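The warning above can be illustrated with a minimal pure-Python sketch (the token ids and the `pad_token_id` value here are made up for illustration): when the pad token shares its id with eos, a naively inferred mask cannot tell padding apart from a genuine end-of-sequence token.

```python
pad_token_id = 2  # same id as eos, as in the warning above

batch = [
    [2, 2, 5, 6, 7],  # left-padded prompt
    [3, 4, 5, 6, 2],  # full-length prompt that genuinely ends with eos
]

# Naive inference: treat every occurrence of the pad/eos id as padding.
inferred_mask = [[0 if tok == pad_token_id else 1 for tok in row] for row in batch]

# The second row's real eos token is wrongly masked out -> unreliable results,
# which is why the warning asks for an explicit attention_mask instead.
print(inferred_mask)  # [[0, 0, 1, 1, 1], [1, 1, 1, 1, 0]]
```

Passing the mask explicitly (built from the tokenizer's padding information, before pad and eos become indistinguishable) avoids the ambiguity entirely.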

cc @ArthurZucker for this output change issue.

Log when running before this PR:

2025-09-30T16:33:22.8891725Z tests/models/jetmoe/test_modeling_jetmoe.py::JetMoeIntegrationTest::test_model_8b_batched_generation
2025-09-30T16:33:22.8892381Z -------------------------------- live log call ---------------------------------
2025-09-30T16:33:22.8893314Z WARNING transformers.generation.configuration_utils:logging.py:328 The following generation flags are not valid and may be ignored: ['temperature']. Set TRANSFORMERS_VERBOSITY=info for more details.
2025-09-30T16:33:23.3206200Z FAILED [ 99%]
2025-09-30T16:39:21.4805460Z Killed
2025-09-30T16:39:21.4847256Z tests/models/jetmoe/test_modeling_jetmoe.py::JetMoeIntegrationTest::test_model_8b_generation

@ydshieh ydshieh requested a review from ArthurZucker October 6, 2025 15:50
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@vasqu
Contributor

vasqu commented Oct 7, 2025

Something is fishy, the outputs shouldn't change tbh, checking

Contributor

@vasqu vasqu left a comment


Thx lgtm! Just a small nit. Outputs have changed but I submitted a PR for that in #41423

    def test_model_8b_logits(self):
        input_ids = [1, 306, 4658, 278, 6593, 310, 2834, 338]
-       model = JetMoeForCausalLM.from_pretrained("jetmoe/jetmoe-8b")
+       model = JetMoeForCausalLM.from_pretrained("jetmoe/jetmoe-8b", device_map="auto", torch_dtype=torch.bfloat16)
Contributor


My only nit: don't we want to use fp16 for more consistency, or was the model saved in bf16?

Collaborator Author


I would love to use fp16, but there is a memory issue when using "auto", see this internal discussion:

https://huggingface.slack.com/archives/C01NE71C4F7/p1758532074369679

and it seems that nowadays many model weights are saved in bf16 (which is normal, as we want to train with it).
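For context on the bf16-vs-fp16 trade-off discussed here, a small pure-Python sketch (the `to_bfloat16` helper is hypothetical, written just for this illustration): bfloat16 keeps float32's 8-bit exponent but only 7 mantissa bits, so it can represent values that overflow fp16's maximum of 65504, at the cost of precision.

```python
import struct

def to_bfloat16(x: float) -> float:
    # bfloat16 is the top 16 bits of an IEEE-754 float32:
    # same 8-bit exponent, but only 7 mantissa bits (vs 23 in float32).
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

# 70000.0 overflows fp16 (max finite value 65504) but survives bf16,
# just with coarse rounding due to the short mantissa:
print(to_bfloat16(70000.0))  # -> 69632.0
```

This range-vs-precision trade-off is why checkpoints trained in bf16 cannot in general be cast to fp16 without risking overflow.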

Contributor


Argh, gotcha. Fair enough but that's not a good bug 😓 guess we are fine since bf16 dominates

@github-actions
Contributor

github-actions Bot commented Oct 8, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: jetmoe

@ydshieh ydshieh enabled auto-merge (squash) October 8, 2025 13:06
@ydshieh ydshieh merged commit e064dc0 into main Oct 8, 2025
20 checks passed
@ydshieh ydshieh deleted the fix_jetmoe branch October 8, 2025 13:11
omsherikar pushed a commit to omsherikar/transformers that referenced this pull request Oct 8, 2025
* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
AhnJoonSung pushed a commit to AhnJoonSung/transformers that referenced this pull request Oct 12, 2025
* fix

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>


3 participants