
Add Gemma4ForSequenceClassification (missing from gemma4 module — Gemma 2/3 have it) #45373

@LarsKlawitter

Description


Feature request

Please add `Gemma4ForSequenceClassification` (and ideally `Gemma4TextForSequenceClassification`) to `transformers.models.gemma4`. The current `__all__` in `modeling_gemma4.py` contains only:

```python
__all__ = [
    "Gemma4AudioModel",
    "Gemma4ForCausalLM",
    "Gemma4ForConditionalGeneration",
    "Gemma4Model",
    "Gemma4PreTrainedModel",
    "Gemma4TextModel",
    "Gemma4VisionModel",
]
```

No classification head variants are exported, which is inconsistent with every prior Gemma release:

| Model family | ForSequenceClassification | TextForSequenceClassification |
| -- | -- | -- |
| gemma | ✅ | N/A (text-only) |
| gemma2 | ✅ | N/A (text-only) |
| gemma3 | ✅ | ✅ |
| gemma3n | ✅ | ✅ |
| gemma4 | ❌ | ❌ |

As a result, `AutoModelForSequenceClassification.from_pretrained("google/gemma-4-E4B", num_labels=3)` raises:

```
ValueError: Unrecognized configuration class <class 'transformers.models.gemma4.configuration_gemma4.Gemma4Config'>
for this kind of AutoModel: AutoModelForSequenceClassification.
```

Motivation

AutoModelForSequenceClassification is the standard entry point for fine-tuning causal LMs as classifiers — a common use case across research (domain classification, sentiment, reward modelling) and production (content moderation, intent detection, financial signal prediction). Excluding it from gemma4 forces users to either:

1. Roll a custom classifier-head wrapper (breaks the AutoModel contract and duplicates logic that already exists in `GenericForSequenceClassification`);
2. Fine-tune generatively and parse the argmax over label tokens (worse calibration, and loses the softmax-head properties); or
3. Skip Gemma 4 entirely and stay on Gemma 3 or a different family.

We are currently working around this in a financial-NLP fine-tuning pipeline by subclassing `GenericForSequenceClassification` and registering the result via `AutoModelForSequenceClassification.register()`. It works, but it is exactly the kind of boilerplate the AutoModel registry is supposed to eliminate, and it has to be duplicated by every project that wants to fine-tune Gemma 4 as a classifier.
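For readers unfamiliar with the mechanism, the failure and the workaround both come down to the Auto class being a config-class → model-class mapping. The following is a toy sketch of that dispatch in plain Python — `ToyAutoModel` and the two stub classes are illustrative names, not transformers internals:

```python
# Toy sketch of the Auto-class dispatch: a mapping from config class to
# model class. All names here are illustrative stubs, not transformers code.
class Gemma4Config:
    """Stub standing in for transformers' Gemma4Config."""

class Gemma4TextForSequenceClassification:
    """Stub standing in for the classifier subclass described above."""
    def __init__(self, config):
        self.config = config

class ToyAutoModel:
    _registry = {}

    @classmethod
    def register(cls, config_cls, model_cls):
        cls._registry[config_cls] = model_cls

    @classmethod
    def from_config(cls, config):
        try:
            return cls._registry[type(config)](config)
        except KeyError:
            raise ValueError(
                f"Unrecognized configuration class {type(config)} "
                f"for this kind of AutoModel: {cls.__name__}."
            ) from None

# Before registering, the lookup fails just like the reported error:
try:
    ToyAutoModel.from_config(Gemma4Config())
except ValueError as err:
    print(err)

# After registering -- the boilerplate every downstream project currently
# has to carry, and the step the upstream change would make unnecessary:
ToyAutoModel.register(Gemma4Config, Gemma4TextForSequenceClassification)
model = ToyAutoModel.from_config(Gemma4Config())
print(type(model).__name__)  # Gemma4TextForSequenceClassification
```

The real workaround is the same shape: one subclass plus one `register()` call per project.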

Gemma 4 E4B is a Mixture-of-Experts model (`Gemma4TextExperts` + `Gemma4TextRouter`, confirmed in `modeling_gemma4.py`), which makes it particularly inconvenient for users to hand-roll a classification head: the correct pooling behaviour and `last_non_pad_token` handling that `GenericForSequenceClassification` provides for free become something every user has to re-implement while also navigating MoE-specific LoRA pitfalls. Adding `Gemma4(Text)ForSequenceClassification` upstream would spare every downstream user from reinventing the same wheel.
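To make concrete what users would otherwise re-implement, here is a minimal plain-Python sketch of last-non-pad-token pooling — toy data structures rather than tensors, illustrating the logic rather than the library's actual code:

```python
# Minimal sketch of last-non-pad-token pooling: for each sequence, select the
# hidden state at the position of the last non-padding token. This is a plain
# Python stand-in for the tensor logic a classification head needs.
def pool_last_non_pad(hidden_states, attention_mask):
    """hidden_states: [batch][seq_len][hidden]; attention_mask: [batch][seq_len] of 0/1."""
    pooled = []
    for states, mask in zip(hidden_states, attention_mask):
        # Index of the last position where the mask is 1 (right padding assumed).
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(states[last])
    return pooled

# Two sequences of length 4 with hidden size 2; the second is right-padded.
hidden = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]],
    [[1.0, 1.1], [1.2, 1.3], [9.0, 9.0], [9.0, 9.0]],
]
mask = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
]
print(pool_last_non_pad(hidden, mask))  # [[0.7, 0.8], [1.2, 1.3]]
```

Getting this wrong (e.g. always taking position -1) silently classifies from padding embeddings, which is exactly the footgun the generic head exists to prevent.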

Your contribution

The implementation should be trivial — Gemma 3 provides the exact template. In `transformers/models/gemma3/modeling_gemma3.py`:

```python
class Gemma3TextForSequenceClassification(GenericForSequenceClassification, Gemma3PreTrainedModel):
    config: Gemma3TextConfig
    input_modalities = ("text",)
```

And its multimodal sibling `Gemma3ForSequenceClassification`, which wires up a `Gemma3Model` + a `nn.Linear(text_config.hidden_size, num_labels)` head. The same pattern applied to Gemma 4 should require only a few lines of code plus:

- Adding the class(es) to `__all__` in `modeling_gemma4.py`
- Registering them in `auto/modeling_auto.py` against `Gemma4Config` (and `Gemma4TextConfig`, if it exists)
I'm happy to open a PR if that would help. The change is mechanical, but I'd like to confirm first which of the two variants the maintainers prefer (multimodal + text-only, or text-only alone) — Gemma 3 shipped both, so matching that seems sensible.
