Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Something is fishy, the outputs shouldn't change tbh, checking
```diff
 def test_model_8b_logits(self):
     input_ids = [1, 306, 4658, 278, 6593, 310, 2834, 338]
-    model = JetMoeForCausalLM.from_pretrained("jetmoe/jetmoe-8b")
+    model = JetMoeForCausalLM.from_pretrained("jetmoe/jetmoe-8b", device_map="auto", torch_dtype=torch.bfloat16)
```
My only nit: don't we want to use fp16 for more consistency, or was the model saved in bf16?
I would love to use fp16, but there is a memory issue when using "auto"; see this internal discussion:
https://huggingface.slack.com/archives/C01NE71C4F7/p1758532074369679
Also, nowadays many model weights are saved in bf16 (which is expected, since we want to train with it).
Argh, gotcha. Fair enough, but that's not a good bug 😓 I guess we are fine since bf16 dominates.
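As a side note on the trade-off discussed above: bfloat16 is effectively float32 with the low 16 mantissa bits dropped, so it keeps float32's 8-bit exponent range (values far beyond fp16's ~65504 maximum stay finite) at the cost of precision. A minimal pure-Python sketch of the bit layout (this is just an illustration of the format, not how PyTorch converts dtypes internally):

```python
import struct

def f32_to_bf16_bits(x: float) -> int:
    """Truncate an IEEE-754 float32 to bfloat16 by dropping the low 16 bits."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bf16_bits_to_f32(bits: int) -> float:
    """Widen a bfloat16 bit pattern back to float32 (zero-fill the mantissa)."""
    (x,) = struct.unpack("<f", struct.pack("<I", bits << 16))
    return x

# fp16 tops out around 65504, so 1e30 would overflow to inf there;
# bf16 keeps float32's exponent, so the value survives truncation.
big = 1e30
roundtrip = bf16_bits_to_f32(f32_to_bf16_bits(big))
print(roundtrip)  # finite, within ~1% of 1e30 (only 7 mantissa bits remain)
```

This is why checkpoints trained in bf16 round-trip safely through bf16 at inference time, while casting them to fp16 risks overflow in large activations or weights.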
[For maintainers] Suggested jobs to run (before merge): run-slow: jetmoe
* fix
* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
What does this PR do?

The test currently loads the model on CPU, which causes the pytest process to be killed (and sometimes to hang for hours); see the log below. Let's use "auto" instead. The outputs change after #41324 (which is a fix for #40132), so the tests still fail, but they no longer kill the pytest process or hang the job. There is also a new warning showing.

cc @ArthurZucker for this output change issue.

Log when running before this PR: