casually dropping the most capable open weights on the planet#45192

Merged
Cyrilvallez merged 5 commits into huggingface:main from RyanMullins:my-third-model
Apr 2, 2026

Conversation

@RyanMullins
Contributor


What does this PR do?

model previously unable to use tools

Code Agent Policy

The Transformers repo is currently being overwhelmed by a large number of PRs and issue comments written by
code agents, and we are bottlenecked by our ability to review and respond to them. As a result,
we ask that new users not submit pure code agent PRs at this time.
You may use code agents for drafting or to help you diagnose issues. We also ask that autonomous "OpenClaw"-like agents
not open any PRs or issues for the moment.

PRs that appear to be fully agent-written will probably be closed without review, and we may block users who do this
repeatedly or maliciously.

This is a rapidly-evolving situation that's causing significant shockwaves in the open-source community. As a result,
this policy is likely to be updated regularly in the near future. For more information, please read CONTRIBUTING.md.

  • I confirm that this is not a pure code agent PR.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker @Cyrilvallez @eustlb @zucchini-nlp @Rocketknight1

RyanMullins and others added 2 commits April 2, 2026 10:24
---------

Co-authored-by: Douglas Reid <dougreid@google.com>
Co-authored-by: Luciano Martins <lucianomartins@google.com>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Phil Culliton <philculliton@google.com>
Co-authored-by: Sara Smoot <sarasmoot@google.com>
Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Eustache Le Bihan <eustache.lebihan@huggingface.co>
Co-authored-by: Joshua Lochner <joshua@huggingface.co>
Co-authored-by: Matthew Carrigan <matt@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Jeff Dean <jeff@google.com>
@github-actions
Contributor

github-actions Bot commented Apr 2, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, gemma3n, gemma4

Collaborator

@ArthurZucker ArthurZucker left a comment


🚀

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@Cyrilvallez Cyrilvallez left a comment


🚀

@ngxson
Member

ngxson commented Apr 2, 2026

🚀

@Cyrilvallez Cyrilvallez merged commit 91b1ab1 into huggingface:main Apr 2, 2026
23 of 28 checks passed
ArthurZucker pushed a commit that referenced this pull request Apr 2, 2026
* model previously unable to use tools



---------

Co-authored-by: Douglas Reid <dougreid@google.com>
Co-authored-by: Luciano Martins <lucianomartins@google.com>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Phil Culliton <philculliton@google.com>
Co-authored-by: Sara Smoot <sarasmoot@google.com>
Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Eustache Le Bihan <eustache.lebihan@huggingface.co>
Co-authored-by: Joshua Lochner <joshua@huggingface.co>
Co-authored-by: Matthew Carrigan <matt@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Jeff Dean <jeff@google.com>

* fix sign: commit was not added before

* re-add latest commit about rms norm

* preemptively skip the integration tests for now

---------

Co-authored-by: Douglas Reid <dougreid@google.com>
Co-authored-by: Luciano Martins <lucianomartins@google.com>
Co-authored-by: Mayank Chaturvedi <imayank@google.com>
Co-authored-by: Phil Culliton <philculliton@google.com>
Co-authored-by: Sara Smoot <sarasmoot@google.com>
Co-authored-by: Sindhu Raghuram <sindhuraghuram@google.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Eustache Le Bihan <eustache.lebihan@huggingface.co>
Co-authored-by: Joshua Lochner <joshua@huggingface.co>
Co-authored-by: Matthew Carrigan <matt@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
Co-authored-by: Jeff Dean <jeff@google.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
danielhanchen added a commit to unslothai/unsloth that referenced this pull request Apr 2, 2026
Gemma-4 support landed in transformers main
(huggingface/transformers#45192). Update the version pin from
5.5.0.dev0 to 5.5.0 across loader, Studio version switcher,
and the MLX installer. Also fix install_gemma4_mlx.sh which
referenced a non-existent v5.5-release branch -- pin it to
the correct commit (91b1ab1) instead.
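The pinning fix described in the commit above amounts to installing from an exact commit rather than a release branch. A hedged one-line sketch (repo URL and pip/git usage assumed; the hash 91b1ab1 is the merge commit of this PR):

```shell
# Hedged sketch: install transformers pinned to the exact merge commit
# (91b1ab1) instead of the non-existent v5.5-release branch.
pip install "git+https://github.com/huggingface/transformers.git@91b1ab1"
```

Pinning to a commit hash stays reproducible even if branch names change or are deleted.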
danielhanchen added a commit to unslothai/unsloth that referenced this pull request Apr 2, 2026
@robertgshaw2-redhat

elite pr title :)

@emidoots

emidoots commented Apr 2, 2026

very exciting, congrats!

pass


@unittest.skip("Integration Tests are not up-to-date yet! TODO Cyril: update me pretty pretty please!")
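The @unittest.skip decorator quoted above is the standard unittest mechanism for temporarily disabling tests, as done here for the integration tests. A minimal self-contained sketch (the class and test names are hypothetical, not the actual Gemma4 suite):

```python
import unittest

class Gemma4IntegrationTestSketch(unittest.TestCase):
    # Hypothetical stand-in for the skipped integration tests.

    @unittest.skip("Integration tests are not up-to-date yet")
    def test_generation(self):
        self.fail("never executes: unittest records this test as skipped")

    def test_fast_check(self):
        # Non-skipped tests in the same class still run normally.
        self.assertEqual(1 + 1, 2)

# Run the class programmatically and inspect the result.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(Gemma4IntegrationTestSketch)
result = unittest.TestResult()
suite.run(result)
print(len(result.skipped), result.testsRun)  # → 1 2
```

Skipped tests still count toward testsRun (startTest fires for them), which is why runners can report "23 of 28 checks passed"-style totals that include skips.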
Collaborator


@Cyrilvallez update your pretty pretty please :-)

Collaborator


and there are 2 failures (non-integration, but slow tests, if you are motivated):

FAILED tests/models/gemma4/test_modeling_gemma4.py::Gemma4TextModelTest::test_torch_compile_for_training - torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
GuardOnDataDependentSymNode: Could not guard on data-dependent expression Eq(u0, 0) (unhinted: Eq(u0, 0)).  (Size-like symbols: none)

consider using data-dependent friendly APIs such as guard_or_false, guard_or_true and statically_known_true.
Caused by: (transformers/src/transformers/integrations/moe.py:231 in _grouped_mm_fallback_backward)
For more information, run with TORCH_LOGS="dynamic"
For extended logs when we create symbols, also add TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="u0"
If you suspect the guard was triggered from C++, add TORCHDYNAMO_EXTENDED_DEBUG_CPP=1
For more debugging help, see https://docs.google.com/document/d/1HSuTTVvYH1pTew89Rtpeu84Ht3nQEFTYhAX3Ypa_xJs/edit?usp=sharing

For C++ stack trace, run with TORCHDYNAMO_EXTENDED_DEBUG_CPP=1

Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
FAILED tests/models/gemma4/test_modeling_gemma4.py::Gemma4Vision2TextModelTest::test_sdpa_can_dispatch_on_flash - RuntimeError: No available kernel. Aborting execution.

marvinzh pushed a commit to marvinzh/transformers that referenced this pull request Apr 3, 2026
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Apr 4, 2026
shibizhao pushed a commit to shibizhao/unsloth-npu that referenced this pull request Apr 7, 2026
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
jamesbraza added a commit to EdisonScientific/SkyRL that referenced this pull request Apr 23, 2026
Gemma4Config (added in huggingface/transformers#45192)
and other composite VLM configs (e.g., Qwen2.5-VL) nest attention fields under
text_config rather than exposing them on the top-level config. The ulysses
monkey patch read model.config.num_attention_heads directly, which raises
AttributeError for these models.

PreTrainedConfig.get_text_config returns self for text-only models and the
text sub-config for VLMs, so this is a no-op for Qwen3/Llama3/DeepSeek and
unblocks Gemma4 in transformers 5.6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
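The get_text_config convention the commit relies on can be sketched with stand-in classes (TextConfig, VLMConfig, and num_heads below are hypothetical illustrations, not the real transformers classes):

```python
# Hedged sketch of the get_text_config pattern described above.

class TextConfig:
    """Stand-in for a text-only model config."""
    def __init__(self, num_attention_heads):
        self.num_attention_heads = num_attention_heads

    def get_text_config(self):
        # Text-only configs return themselves.
        return self

class VLMConfig:
    """Stand-in for a composite VLM config (e.g. Gemma4, Qwen2.5-VL)."""
    def __init__(self, text_config):
        # Attention fields are nested under text_config rather than
        # exposed on the top-level config.
        self.text_config = text_config

    def get_text_config(self):
        return self.text_config

def num_heads(config):
    # Reading config.num_attention_heads directly would raise
    # AttributeError for VLMConfig; going through get_text_config
    # works uniformly for both shapes.
    return config.get_text_config().num_attention_heads

print(num_heads(TextConfig(32)))             # → 32
print(num_heads(VLMConfig(TextConfig(16))))  # → 16
```

This is why the no-op claim holds for text-only models: their get_text_config returns self, so the patched read is equivalent to the direct attribute access it replaces.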
jamesbraza added a commit to EdisonScientific/SkyRL that referenced this pull request Apr 23, 2026
jamesbraza added a commit to EdisonScientific/SkyRL that referenced this pull request Apr 29, 2026


8 participants