Gemma4 training with text-only samples #45454
|
run-slow: gemma3, gemma4 |
|
This comment contains models: ["models/gemma3", "models/gemma4"] |
|
run-slow: gemma3, gemma4 |
|
This comment contains models: ["models/gemma3", "models/gemma4"] |
CI results (Model CI Report): ❌ 7 new failed tests from this PR 😭
|
|
I guess I'll recheck after the tests are fixed 😄 |
|
The tests are failing on main too, so the failures are not related to this PR. It is the CLIP issue 😢 |
|
There seem to be some different failures though, even setting aside the CLIP issue, no? E.g. |
|
Ah sorry, I have been seeing these failures too much today, so I didn't read much into the test names. Fixed and, to be sure, ran the gemma3 slow tests by locally patching |
|
Thanks for getting this in quickly. Confirmed the zeros workaround held up through a full text-only SFT run on Gemma 4 31B (6144 steps, no issues), so removing the assertion should be clean for that use case. The PaliGemma distinction in the PR description is the right framing for anyone wondering why the check existed in the first place. |
|
run-slow: gemma3, gemma4 |
|
This comment contains models: ["models/gemma3", "models/gemma4"] |
CI results (Model CI Report): ❌ 1 new failed test from this PR 😭
|
|
[For maintainers] Suggested jobs to run (before merge) run-slow: gemma3, gemma4, git |
What does this PR do?
Fixes #45200
As per the title, this error was actually needed only in PaliGemma (PG). Other models don't have such a prefix/suffix separation when training.
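
For context, here is a rough sketch of what the "zeros workaround" mentioned in the conversation could have looked like on the data side before this change. It is an illustration only: it assumes the check was triggered by a missing type-id field on text-only batches, and both the helper name `add_zero_token_type_ids` and the default field name `token_type_ids` are hypothetical (the exact field Gemma 4 expects is not stated in this thread). Plain Python lists are used instead of tensors to keep it self-contained.

```python
from typing import Dict, List

Batch = Dict[str, List[List[int]]]


def add_zero_token_type_ids(batch: Batch, key: str = "token_type_ids") -> Batch:
    """Return a shallow copy of `batch` with an all-zero type-id row per sequence.

    Hypothetical helper: `key` is an assumed field name, not confirmed by the PR.
    All zeros means "every position is a text token", i.e. no image tokens.
    """
    out = dict(batch)  # do not mutate the caller's batch
    if key not in out:
        out[key] = [[0] * len(seq) for seq in batch["input_ids"]]
    return out


# A toy text-only batch with two padded sequences.
batch = {
    "input_ids": [[2, 10, 11, 12], [2, 20, 21, 0]],
    "attention_mask": [[1, 1, 1, 1], [1, 1, 1, 0]],
}
patched = add_zero_token_type_ids(batch)
```

If the check is removed as described above, text-only batches should no longer need this kind of padding field at all.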