Gemma4 resizing per layer inputs #45324
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```python
def get_input_embeddings(self):
    return self.model.get_input_embeddings()

def set_input_embeddings(self, value):
    self.model.set_input_embeddings(value)
```
Same as the base class, so no need to override.
```python
# The tying happens from decoder to lm-head, but when resizing
# the resized embed is assigned only to the head. Then tying weights
# again reverts everything back. So we have to update the decoder here
if self.config.tie_word_embeddings:
    self.model.decoder.embed_tokens = new_embeddings
```
not a fan of it tbh, but ig it's better than overriding resize
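To see why the snippet above is needed, here is a minimal toy sketch of the tying problem it works around. The class and attribute names (`ToyLM`, `embed_tokens`, `lm_head`, `resize`) are illustrative stand-ins, not the actual transformers internals:

```python
import torch.nn as nn

class ToyLM(nn.Module):
    """Toy tied-embedding LM; names are illustrative, not transformers' own."""

    def __init__(self, vocab_size, hidden=8):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden)
        self.lm_head = nn.Linear(hidden, vocab_size, bias=False)
        # Tying: head and decoder share one weight tensor.
        self.lm_head.weight = self.embed_tokens.weight

    def resize(self, new_vocab_size):
        old = self.embed_tokens
        new_embed = nn.Embedding(new_vocab_size, old.embedding_dim)
        n = min(old.num_embeddings, new_vocab_size)
        new_embed.weight.data[:n] = old.weight.data[:n]
        # If only the head got the resized table, re-tying from the stale
        # decoder would revert it; so the decoder is updated too (the PR's fix).
        self.lm_head.weight = new_embed.weight
        self.embed_tokens = new_embed

model = ToyLM(10)
model.resize(16)
# After the fix, both modules share the same resized storage.
assert model.embed_tokens.weight.data_ptr() == model.lm_head.weight.data_ptr()
assert model.embed_tokens.num_embeddings == 16
```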
```python
# Input ids should be expanded to the new maximum size of the vocabulary
inputs_dict["input_ids"][:, -2] = new_model_vocab_size - 1
```
If we had this check earlier, we'd have known that the gemma4 resize doesn't work well. Added now.
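A tiny illustration of why the test probes the new maximum vocab index: a lookup at `new_vocab - 1` only succeeds if the embedding table really grew, so any table the resize forgot fails immediately. The sizes here are made up for the sketch:

```python
import torch
import torch.nn as nn

old_vocab, new_vocab, hidden = 10, 16, 4
stale = nn.Embedding(old_vocab, hidden)    # a table the resize missed
resized = nn.Embedding(new_vocab, hidden)  # a correctly grown table

input_ids = torch.zeros(2, 5, dtype=torch.long)
input_ids[:, -2] = new_vocab - 1  # same probe as in the test above

# The grown table handles the probe fine...
assert resized(input_ids).shape == (2, 5, hidden)
# ...while the stale one raises an out-of-range IndexError.
try:
    stale(input_ids)
    raise AssertionError("expected the out-of-range lookup to fail")
except IndexError:
    pass
```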
@bot /repo
Repo consistency bot fixed some files and pushed the changes.
ArthurZucker left a comment
SGTM, it's a bit dirty to have to call the add-hook on the module, but it's model-specific so fine by me!
[For maintainers] Suggested jobs to run (before merge) run-slow: blip, colmodernvbert, gemma3, gemma3n, gemma4, lfm2_vl, paligemma, qwen3_vl, qwen3_vl_moe, t5gemma
What does this PR do?
Fixes #45276 and #45335
In gemma4, the per-layer inputs also have to be resized, as long as they aren't part of the soft multimodal tokens.
Repro for T5 gemma:
Gemma3n has soft mm tokens, and the current state of 3n is not good. I see unused vocab entries in mm_projection 😢, and if we simply apply resizing to the per-layer inputs, we'll get even more unused entries. It could be done correctly if we filter out the mm-tokens, but I'd prefer to leave that for now.
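A sketch of the "correct way" mentioned above: grow only the text rows of a per-layer input table while leaving the soft multimodal token rows untouched, so no extra unused entries appear. The layout here (text rows first, mm rows last, all sizes) is an assumption for illustration, not Gemma3n's actual internals:

```python
import torch
import torch.nn as nn

def resize_text_rows(table: nn.Embedding, new_text_vocab: int, mm_tokens: int) -> nn.Embedding:
    """Resize only the text portion of a table whose last `mm_tokens`
    rows are soft multimodal tokens (assumed layout, for illustration)."""
    hidden = table.embedding_dim
    new_table = nn.Embedding(new_text_vocab + mm_tokens, hidden)
    old_text = table.num_embeddings - mm_tokens
    n = min(old_text, new_text_vocab)
    new_table.weight.data[:n] = table.weight.data[:n]                      # copy text rows
    new_table.weight.data[new_text_vocab:] = table.weight.data[old_text:]  # keep mm rows as-is
    return new_table

text_vocab, mm_tokens, hidden = 10, 4, 8
per_layer = nn.Embedding(text_vocab + mm_tokens, hidden)

resized = resize_text_rows(per_layer, 16, mm_tokens)
assert resized.num_embeddings == 16 + mm_tokens
# The mm rows survive the resize untouched.
assert torch.equal(resized.weight.data[16:], per_layer.weight.data[10:])
```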