[gemma4] resize_token_embeddings does not affect embed_tokens_per_layer or output_embeddings #45276

@KoichiYasuoka

Description

System Info

  • transformers version: 5.5.0
  • Platform: Linux-6.6.113+-x86_64-with-glibc2.35
  • Python version: 3.12.13
  • Huggingface_hub version: 1.8.0
  • Safetensors version: 0.7.0
  • Accelerate version: 1.13.0
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (accelerator?): 2.10.0+cu128 (CUDA)
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: Tesla T4

(Google Colaboratory GPU)

Who can help?

@zucchini-nlp @Cyrilvallez

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Steps to reproduce the behavior:

from transformers import Gemma4ForConditionalGeneration

# Load the model and grab the three embedding tables that should stay in sync
mdl = Gemma4ForConditionalGeneration.from_pretrained("google/gemma-4-E2B-it")
e = mdl.get_input_embeddings()                       # embed_tokens
f = mdl.model.language_model.embed_tokens_per_layer  # per-layer embeddings
g = mdl.get_output_embeddings()                      # output projection
print(e.num_embeddings, f.num_embeddings, g.out_features)
assert e.num_embeddings == f.num_embeddings == g.out_features  # passes

# Grow the vocabulary by one token
e = mdl.resize_token_embeddings(e.num_embeddings + 1)
f = mdl.model.language_model.embed_tokens_per_layer
g = mdl.get_output_embeddings()
print(e.num_embeddings, f.num_embeddings, g.out_features)
assert e.num_embeddings == f.num_embeddings == g.out_features  # fails: only e was resized

Expected behavior

All of e.num_embeddings, f.num_embeddings, and g.out_features should increase to 262145 after the resize.
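Until this is fixed, one possible workaround is to grow the out-of-sync table by hand, the same way resize_token_embeddings grows embed_tokens: allocate a larger embedding and copy the old rows over. The sketch below is a minimal, hypothetical helper using plain PyTorch only (resize_embedding is not a transformers API), demonstrated on a toy table of the same vocabulary size:

```python
import torch
import torch.nn as nn

def resize_embedding(old: nn.Embedding, new_num_embeddings: int) -> nn.Embedding:
    # Allocate a fresh table and copy the overlapping rows from the old one;
    # any newly added rows keep their default random initialization.
    new = nn.Embedding(new_num_embeddings, old.embedding_dim)
    n = min(old.num_embeddings, new_num_embeddings)
    with torch.no_grad():
        new.weight[:n] = old.weight[:n]
    return new

# Toy stand-in for embed_tokens_per_layer: 262144 rows, as in the report
emb = nn.Embedding(262144, 8)
emb = resize_embedding(emb, 262145)
print(emb.num_embeddings)  # 262145
```

The same copy-and-replace step would have to be applied to the model's per-layer table (and the output projection re-tied, e.g. via tie_weights) for the assertion in the reproduction to pass.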
