Gemma4 training with text-only samples #45454

Merged
zucchini-nlp merged 6 commits into huggingface:main from zucchini-nlp:gemma4-token-types
Apr 22, 2026

@zucchini-nlp
Member

What does this PR do?

Fixes #45200

As per the title. This error was actually only needed in PaliGemma (PG); other models don't have such a prefix/suffix separation when training.

@zucchini-nlp
Member Author

run-slow: gemma3, gemma4

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/gemma3", "models/gemma4"]
quantizations: []

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp
Member Author

run-slow: gemma3, gemma4

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
|---|---|---|
| RUN | 0f34f2d3 | workflow commit (merge commit) |
| PR | 27669773 | branch commit (from PR) |
| main | 331ea339 | base commit (on main) |

⚠️ No test being reported (jobs are skipped or cancelled)!

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/gemma3", "models/gemma4"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
|---|---|---|
| RUN | e6fe75b3 | workflow commit (merge commit) |
| PR | 0c6ffce6 | branch commit (from PR) |
| main | b6f9463e | base commit (on main) |

Model CI Report

7 new failed tests from this PR 😭

  • gemma3:
    tests/models/gemma3/test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_bidirectional_image_attention (✅ ⟹ ❌)
    tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch (❌ ⟹ ❌)
    tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_batch_crops (❌ ⟹ ❌)
    tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_bf16 (❌ ⟹ ❌)
    tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_crops (❌ ⟹ ❌)
    tests/models/gemma3/test_modeling_gemma3.py::Gemma3IntegrationTest::test_model_4b_multiimage (❌ ⟹ ❌)

  • gemma4:
    tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_export_text_only (❌ ⟹ ❌)

@vasqu
Contributor

vasqu commented Apr 15, 2026

I guess I'll recheck after the tests are fixed 😄

@zucchini-nlp
Member Author

Tests are failing on main, so this is not related. It is the CLIP issue 😢

@vasqu
Contributor

vasqu commented Apr 15, 2026

There seem to be some different failures though, even with the CLIP issue, no? E.g. test_modeling_gemma3.py::Gemma3Vision2TextModelTest::test_bidirectional_image_attention

@zucchini-nlp
Member Author

zucchini-nlp commented Apr 15, 2026

Ah sorry, I have been seeing these failures too much today, so I didn't read much into the test names.

Fixed and, to be sure, I ran the gemma3 slow tests by locally patching conversion_mapping. Gemma4 isn't affected by the CLIP issue, and slow CI is passing in the bot comment.

@dentity007

Thanks for getting this in quickly. Confirmed the zeros workaround held up through a full text-only SFT run on Gemma 4 31B (6144 steps, no issues), so removing the assertion should be clean for that use case. The PaliGemma distinction in the PR description is the right framing for anyone wondering why the check existed in the first place.
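
For readers landing here from the linked issue, the zeros workaround can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from this PR or from transformers: the helper name `add_zero_mm_token_type_ids` is hypothetical, and the batch is assumed to be a plain dict whose `input_ids` is a list of token-id lists. The key name `mm_token_type_ids` is taken from the linked issue title.

```python
from typing import Dict, List

Batch = Dict[str, List[List[int]]]


def add_zero_mm_token_type_ids(batch: Batch) -> Batch:
    """Return a shallow copy of `batch` with an all-zeros `mm_token_type_ids`.

    Hypothetical helper: it marks every position as a text token, which is
    the "default to zeros" behaviour requested in the linked issue.
    """
    out = dict(batch)  # shallow copy so the caller's dict is not mutated
    if "mm_token_type_ids" not in out:
        # One zero per token, mirroring the shape of `input_ids`.
        out["mm_token_type_ids"] = [[0] * len(row) for row in batch["input_ids"]]
    return out


# Toy text-only batch (token ids are made up for illustration).
batch = {"input_ids": [[2, 10, 11, 12], [2, 20, 21, 0]]}
patched = add_zero_mm_token_type_ids(batch)
```

In a real training loop the same idea would apply to tensors in a data collator. With the assertion removed by this PR, the workaround should no longer be needed for text-only samples.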

@zucchini-nlp
Member Author

run-slow: gemma3, gemma4

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/gemma3", "models/gemma4"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
|---|---|---|
| RUN | dab5e345 | workflow commit (merge commit) |
| PR | 9d312d13 | branch commit (from PR) |
| main | 9dff7ca5 | base commit (on main) |

Model CI Report

1 new failed tests from this PR 😭

  • gemma4:
    tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_export_text_only (❌ ⟹ ❌)

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma3, gemma4, git

@zucchini-nlp zucchini-nlp added this pull request to the merge queue Apr 22, 2026
Merged via the queue into huggingface:main with commit 6979d69 Apr 22, 2026
21 checks passed
@zucchini-nlp zucchini-nlp deleted the gemma4-token-types branch April 22, 2026 10:44
Development

Successfully merging this pull request may close these issues.

[Gemma 4] mm_token_type_ids required for text-only fine-tuning - should default to zeros

5 participants