
Prevent Reinitialization of Resized LM Head When tie_word_embeddings is False #35141#36221

Open
sambhavnoobcoder wants to merge 12 commits into huggingface:main from sambhavnoobcoder:output-embedding-reinitilaization

Conversation

@sambhavnoobcoder
Contributor

Issue Description

When a model is configured with tie_word_embeddings=False, calling resize_token_embeddings() followed by post_init() unintentionally reinitializes the output embeddings (the LM head). This happens because the LM head created during resizing lacks the _is_hf_initialized flag, so post_init() treats it as uninitialized and overwrites its weights.
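
A minimal reproduction sketch (the checkpoint name and the +10 vocabulary increase are illustrative; post_init() is called explicitly here to trigger the re-initialization path):

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

# Illustrative: any causal LM config with untied embeddings shows the issue.
config = AutoConfig.from_pretrained("gpt2", tie_word_embeddings=False)
model = AutoModelForCausalLM.from_config(config)

# Snapshot the LM head before resizing.
before = model.get_output_embeddings().weight.detach().clone()

model.resize_token_embeddings(config.vocab_size + 10)
model.post_init()  # re-runs init on modules not marked _is_hf_initialized

after = model.get_output_embeddings().weight[: before.shape[0]]
print(torch.allclose(before, after))  # False before this fix, True with it
```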

Solution

Added _is_hf_initialized = True flag to the new LM head in _get_resized_lm_head(). This ensures that post_init() recognizes the module as already initialized and skips reinitialization, preserving the intended weights.

The change is minimal and targeted:

  • Only affects cases where tie_word_embeddings=False
  • Maintains backward compatibility
  • Preserves existing initialization behavior for new tokens
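
A sketch of the one-line change inside `_get_resized_lm_head()` (surrounding code abridged; variable names approximate the real implementation rather than reproducing the exact diff):

```python
# Inside PreTrainedModel._get_resized_lm_head (abridged):
new_lm_head = nn.Linear(*new_lm_head_shape, bias=has_new_lm_head_bias)
new_lm_head = new_lm_head.to(old_lm_head.weight.device, dtype=old_lm_head.weight.dtype)

# New: mark the freshly built head as initialized so a later post_init()
# skips it and keeps the weights copied over from the old head below.
new_lm_head._is_hf_initialized = True

# ... existing logic copies the old rows (and bias) into new_lm_head ...
```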

Test Coverage

Core Test: test_model_resize_embeddings.py

test_resize_embeddings_no_reinit

This test verifies:

  1. Takes initial snapshot of LM head weights
  2. Resizes embeddings (+10 tokens)
  3. Verifies original weights preserved after resize
  4. Calls post_init()
  5. Verifies original weights still match initial snapshot
  • Uses torch.allclose() for element-wise weight comparison
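
A condensed sketch of the check (not the literal test; `model` stands in for a model instance built with tie_word_embeddings=False):

```python
import torch

def check_no_reinit(model, extra_tokens=10):
    # 1. Snapshot the LM head before resizing.
    old = model.get_output_embeddings().weight.detach().clone()

    # 2. Resize the vocabulary.
    model.resize_token_embeddings(model.config.vocab_size + extra_tokens)

    # 3. The original rows must survive the resize.
    resized = model.get_output_embeddings().weight
    assert torch.allclose(old, resized[: old.shape[0]])

    # 4./5. They must also survive a subsequent post_init().
    model.post_init()
    after = model.get_output_embeddings().weight
    assert torch.allclose(old, after[: old.shape[0]])
```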

test_new_tokens_initialization

This test verifies:

  1. Resizes vocabulary (+10 tokens)
  2. Examines only the new token weights
  3. Verifies that new weights:
    • Are not all zeros
    • Stay within reasonable bounds (abs value < 100)
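
A sketch of that check (again with `model` standing in for an untied test model):

```python
import torch

def check_new_token_init(model, extra_tokens=10):
    old_vocab = model.config.vocab_size
    model.resize_token_embeddings(old_vocab + extra_tokens)

    # Only the rows appended for the new tokens are inspected.
    new_rows = model.get_output_embeddings().weight[old_vocab:]
    assert not torch.all(new_rows == 0)      # not left as zeros
    assert torch.all(new_rows.abs() < 100)   # stays within reasonable bounds
```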

test_resize_embeddings_with_bias

This test verifies:

  1. Creates model with tie_word_embeddings=False
  2. Adds bias to LM head (which is not present by default)
  3. Takes snapshots of both weights and bias
  4. Resizes embeddings (+10 tokens)
  5. Calls post_init()
  6. Verifies:
    • Original weights preserved in resized LM head
    • Original bias values preserved in resized LM head
    • Uses torch.allclose() for element-wise comparison of both weights and bias
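
A sketch of the bias variant (the bias is attached manually because the default head has none; helper and variable names are illustrative):

```python
import torch
from torch import nn

def check_resize_with_bias(model, extra_tokens=10):
    # Replace the default bias-free head with one that carries a bias.
    head = model.get_output_embeddings()
    biased = nn.Linear(head.in_features, head.out_features, bias=True)
    biased.weight.data.copy_(head.weight.data)
    model.set_output_embeddings(biased)

    old_w = biased.weight.detach().clone()
    old_b = biased.bias.detach().clone()

    model.resize_token_embeddings(model.config.vocab_size + extra_tokens)
    model.post_init()

    resized = model.get_output_embeddings()
    assert torch.allclose(old_w, resized.weight[: old_w.shape[0]])
    assert torch.allclose(old_b, resized.bias[: old_b.shape[0]])
```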

Test Results

All tests pass successfully, confirming:

  • Original token embeddings remain unchanged through resize and post_init
  • New token embeddings are properly initialized
  • No regressions in existing functionality
[Screenshot: passing test run, 2025-02-17]

Implementation Details

The fix is implemented in src/transformers/modeling_utils.py, adding a single line to mark the new LM head as initialized immediately after creation.

Backwards Compatibility

This change:

  • Does not modify existing APIs
  • Maintains current behavior for tied embeddings
  • Only affects the internal initialization state of resized LM heads

Fixes #35141

cc: @ArthurZucker @Rocketknight1

@Rocketknight1
Member

Hi @sambhavnoobcoder, sorry for missing this one earlier! The solution seems good, but can we move the test into an existing file rather than a separate file for just that test? I think it might fit in the test_tokenization_common.py file, although you'll need to remove the lines referring to a specific model.

@sambhavnoobcoder force-pushed the output-embedding-reinitilaization branch from ca71974 to fc7c069 on April 23, 2025 at 07:47


Development

Successfully merging this pull request may close these issues.

resizing token embeddings causes output embedding to be reinitialized in post_init when tie_word_embedding is False
