
Make Gemma4ClippableLinear inherit from nn.Linear for PEFT/LoRA compatibility #45388

Closed
albertorkive wants to merge 2 commits into huggingface:main from albertorkive:gemma4-clippable-linear-lora

Conversation

@albertorkive

What does this PR do?

Makes Gemma4ClippableLinear inherit from nn.Linear instead of wrapping one via composition, enabling PEFT/LoRA to discover and target vision/audio encoder layers.

Problem: PEFT's LoRA module discovery uses isinstance(module, nn.Linear) to find targetable layers. The current Gemma4ClippableLinear subclasses nn.Module and stores an internal self.linear = nn.Linear(...), so PEFT skips all vision and audio encoder projections (q_proj, k_proj, v_proj, o_proj, ffw layers). Users cannot fine-tune the Gemma4 vision tower with LoRA.
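
A minimal, self-contained illustration of the discovery problem (the simplified class below mirrors the description above; it is a sketch, not the actual transformers code):

import torch.nn as nn

class ComposedClippableLinear(nn.Module):  # current design: composition
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        return self.linear(x).clamp(-10.0, 10.0)  # illustrative clipping bounds

m = ComposedClippableLinear(4, 4)
print(isinstance(m, nn.Linear))         # False -> PEFT's LoRA discovery skips it
print(isinstance(m.linear, nn.Linear))  # True  -> only the inner module qualifies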

Fix:

  • Change Gemma4ClippableLinear to inherit from nn.Linear directly (see the sketch after this list)
  • Weight lives as self.weight (standard nn.Linear) instead of self.linear.weight
  • Clipping behavior is fully preserved
  • Add _remap_legacy_keys state dict pre-hook for backward compatibility with existing checkpoints that store weight under "linear.weight"
  • Update weight converter to use new key names
  • Fix three .linear.weight references in forward methods
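
A minimal sketch of the reworked class under stated assumptions: the clip_min/clip_max names, the bias argument, and clamping the input (rather than the output) are illustrative choices, not the actual Gemma4 signature:

import torch.nn as nn

class Gemma4ClippableLinear(nn.Linear):
    def __init__(self, in_features, out_features, bias=True, clip_min=None, clip_max=None):
        super().__init__(in_features, out_features, bias=bias)
        self.clip_min = clip_min
        self.clip_max = clip_max
        # remap legacy "linear.weight" checkpoint keys before load_state_dict runs
        self._register_load_state_dict_pre_hook(self._remap_legacy_keys)

    @staticmethod
    def _remap_legacy_keys(state_dict, prefix, *args, **kwargs):
        old_key = prefix + "linear.weight"
        if old_key in state_dict:
            state_dict[prefix + "weight"] = state_dict.pop(old_key)

    def forward(self, x):
        if self.clip_min is not None or self.clip_max is not None:
            x = x.clamp(self.clip_min, self.clip_max)
        return super().forward(x)

Because the class now is an nn.Linear, isinstance(module, nn.Linear) holds and LoRA discovery can wrap it directly.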

Backward compatibility: Existing checkpoints with linear.weight keys load correctly via the pre-hook remap. Verified with strict=True:

# Old checkpoint format
old_sd = {"linear.weight": ..., "input_min": ..., ...}

# Loads into new class without errors
new_module.load_state_dict(old_sd, strict=True)  # works

Files changed

  • src/transformers/models/gemma4/modular_gemma4.py — source of truth
  • src/transformers/models/gemma4/modeling_gemma4.py — generated, matching changes
  • src/transformers/models/gemma4/convert_gemma4_weights.py — updated key names

How to reproduce the bug

from transformers import Gemma4ForConditionalGeneration
import torch.nn as nn

model = Gemma4ForConditionalGeneration.from_pretrained("google/gemma-4-12b-it")

# These are all False — PEFT can't target them
for name, mod in model.named_modules():
    if "vision_tower" in name and hasattr(mod, "linear"):
        print(f"{name}: isinstance(nn.Linear) = {isinstance(mod, nn.Linear)}")
        # Prints: False for every ClippableLinear module

After this PR, all Gemma4ClippableLinear modules pass isinstance(mod, nn.Linear).

Make Gemma4ClippableLinear inherit from nn.Linear for PEFT/LoRA compatibility

Gemma4ClippableLinear previously subclassed nn.Module and wrapped an
internal nn.Linear via composition. This prevented PEFT/LoRA from
discovering these layers since it uses isinstance(module, nn.Linear).

Change ClippableLinear to inherit from nn.Linear directly, preserving
the optional input/output clamping behavior. Add a state dict pre-hook
to remap legacy "linear.weight" keys from existing checkpoints to the
new "weight" key for backward compatibility.

Also update the weight converter and fix three .linear.weight references
in forward methods.
@Rocketknight1
Member

cc @ArthurZucker @Cyrilvallez

Collaborator

@ArthurZucker ArthurZucker left a comment

SGTM! I don't know why this wasn't done earlier (cc @Cyrilvallez, since you worked on this: might there have been a reason?)

Comment on lines +155 to +161
@staticmethod
def _remap_legacy_keys(state_dict, prefix, *args, **kwargs):
    old_key = prefix + "linear.weight"
    new_key = prefix + "weight"
    if old_key in state_dict:
        state_dict[new_key] = state_dict.pop(old_key)

Collaborator

This should be done in the conversion_mapping with our WeightRenaming API!

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma4

Collaborator

@ArthurZucker ArthurZucker left a comment

better!

mapping["gemma4"] = [
    WeightRenaming(r"\.linear\.weight", ".weight"),
]
Collaborator

Valid if all layers use this! I didn't check, but we might need a restriction on the layer path?

Author

No other module in the Gemma3/3n/4 tree uses self.linear as an attribute name, so this should be safe.
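
A quick way to audit that claim on a loaded model (a sketch; it assumes the composition-based modules expose the inner layer as `.linear`):

import torch.nn as nn

# list every module path that exposes an inner nn.Linear named "linear"
hits = [
    name
    for name, mod in model.named_modules()
    if isinstance(getattr(mod, "linear", None), nn.Linear)
]
print(hits)  # should only contain ClippableLinear-style modules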

@Cyrilvallez
Member

Unfortunately, we cannot do that, as it fully breaks quantization! Quantization methods replace nn.Linear modules with their own, and here they would replace these layers, but the custom forward with the added clipping would then be fully lost!
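
To make the failure mode concrete, here is a generic module-swap loop of the kind quantization backends use (a sketch, not any specific quantizer's code; quant_linear_cls stands in for a quantized linear class):

import torch.nn as nn

def naive_quantize_swap(model, quant_linear_cls):
    # typical pattern: replace every nn.Linear with a quantized equivalent
    for name, child in model.named_children():
        if isinstance(child, nn.Linear):
            # once Gemma4ClippableLinear subclasses nn.Linear, it matches here
            # too, and the swap silently drops its clipping forward
            setattr(model, name, quant_linear_cls(child.in_features, child.out_features))
        else:
            naive_quantize_swap(child, quant_linear_cls)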

@Cyrilvallez
Member

If you want to use peft, you need to explicitly set which modules you want to target manually. I think we had internal discussions about making that easier, cc @merveenoyan @BenjaminBossan
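
With the current composition design, the inner nn.Linear is already reachable at the ".linear" suffix, so it can be targeted by name. A minimal sketch (the regex and LoRA hyperparameters are illustrative, not verified against the actual Gemma4 module paths):

from peft import LoraConfig, get_peft_model

# when target_modules is a string, PEFT treats it as a regex and
# full-matches it against module names, so the inner nn.Linear
# submodules of the vision tower can be selected directly
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=r".*vision_tower.*\.linear",
)
peft_model = get_peft_model(model, config)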

@BenjaminBossan
Member

Exactly as Cyril said, it's a matter of setting the correct target modules. Changing the parent class is not the solution.
