Correctly create tied key mapping in post_init, and dynamic tie weight #42270
Cyrilvallez merged 18 commits into main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
ArthurZucker left a comment
missing a bit of representative doc! Let's take T5 as an example? Or RT-DETR? To have a complex list
```python
for prefix, submodule in self.named_modules():
    if isinstance(submodule, PreTrainedModel):
        # Will dynamically check the config if it has changed
        submodel_tied_weights = submodule.get_expanded_tied_weights_keys(all_submodels=False)
```
don't know if we really have to go the inheritance path here?
given that we do named_parameters afterwards
Yes, in order to check the proper subconfig... No better way unfortunately as sometimes we cannot get the subconfig in a proper way
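The traversal discussed in this thread can be sketched with a small standalone toy (no real `PreTrainedModel` here, just a stand-in class, so names and signatures are illustrative only): walk named submodules, collect each submodel's own tied keys, and re-prefix them with the submodule path so they are valid on the parent model.

```python
# Toy sketch of the submodule traversal; `Submodel` stands in for
# PreTrainedModel, and the tied keys are {target_name: source_name} dicts.
class Submodel:
    def __init__(self, tied):
        self._tied = tied

    def get_expanded_tied_weights_keys(self, all_submodels=False):
        # In the real code this would re-check the submodule's config
        return dict(self._tied)


def collect_tied_keys(named_modules):
    all_tied = {}
    for prefix, submodule in named_modules:
        if isinstance(submodule, Submodel):
            prefix_dot = f"{prefix}." if prefix else ""
            for target, source in submodule.get_expanded_tied_weights_keys(all_submodels=False).items():
                # Re-prefix so the keys are valid relative to the parent model
                all_tied[prefix_dot + target] = prefix_dot + source
    return all_tied


modules = [("decoder", Submodel({"lm_head.weight": "embed.weight"}))]
print(collect_tied_keys(modules))
# {'decoder.lm_head.weight': 'decoder.embed.weight'}
```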
```python
source_name = "^" + source_name
target_name = "^" + target_name
# In this case, the keys stored in `all_tied_weights_keys` are already correct
if not recompute_mapping:
```
To update with setter and getter for `tie_words_embedding`, no?
No, was already checked before!
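The anchoring behavior quoted in this thread can be illustrated with a hypothetical helper (the function names here are not from the PR): prefixing a tied-key pattern with `^` makes it match only from the start of a parameter path, which prevents accidental matches on nested submodule parameters.

```python
import re


def anchor(pattern: str) -> str:
    # Anchor the regex at the beginning of the parameter name, as in the diff
    return pattern if pattern.startswith("^") else "^" + pattern


def matches(pattern: str, param_name: str) -> bool:
    return re.match(anchor(pattern), param_name) is not None


print(matches(r"lm_head\.weight", "lm_head.weight"))          # True
print(matches(r"lm_head\.weight", "decoder.lm_head.weight"))  # False: anchored
```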
[For maintainers] Suggested jobs to run (before merge): run-slow: esm, hubert, idefics, openai, sew, sew_d, unispeech, unispeech_sat, wav2vec2, wavlm
ArthurZucker left a comment
Thanks for iterating! I like that it's explicit now!
@Cyrilvallez @ArthurZucker As a result, when training Qwen/Qwen3-0.6B, the gradients update completely different parameters before and after this PR. I believe the original behavior is more consistent with the model's intended design.
Correctly create tied key mapping in post_init, and dynamic tie weight (huggingface#42270)

* add dynamic
* improve
* doc
* true dynamic
* everywhere
* improve
* fix
* more
* small fix
* small fix
* fix duplicates
* fix
* doc
* fix
* improve doc
* comment
* more doc
* style
Fixes huggingface#43883

After huggingface#42270, `all_tied_weights_keys` is initialized in `post_init()`, but remote models loaded with `trust_remote_code=True` don't always call `post_init()` properly, causing an `AttributeError` when loading models like Molmo.

This fix adds defensive checks in two methods:

- `_adjust_tied_keys_with_tied_pointers()`: initialize an empty dict if missing, then detect tied weights via data pointers
- `mark_tied_weights_as_initialized()`: return early if the attribute is missing

This allows remote models to load successfully while maintaining tied weight detection.
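A hedged sketch of the two defensive checks described above (the method names follow the commit message, but the real transformers implementations do more work than shown here). Both tolerate remote-code models whose custom `__init__` never called `post_init()`:

```python
# Minimal stand-in for a remote-code model; the real class is a PreTrainedModel.
class RemoteModelSketch:
    def _adjust_tied_keys_with_tied_pointers(self):
        # Initialize an empty mapping if post_init() was skipped
        if not hasattr(self, "all_tied_weights_keys"):
            self.all_tied_weights_keys = {}
        # ...then detect tied weights via shared data pointers (omitted)...

    def mark_tied_weights_as_initialized(self):
        # Return early instead of raising AttributeError
        if not hasattr(self, "all_tied_weights_keys"):
            return
        # ...mark the tied parameters as initialized (omitted)...


m = RemoteModelSketch()              # post_init() never ran, attribute missing
m.mark_tied_weights_as_initialized() # no AttributeError, just returns
m._adjust_tied_keys_with_tied_pointers()
print(m.all_tied_weights_keys)       # {}
```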
As we rely more and more on `self.all_tied_weight_keys` everywhere (i.e. the list of tied keys obtained during `post_init`) for multiple manipulations (device_map computation, cuda warmup, post-processing of `from_pretrained`...), it becomes very important that the (few) models containing regex patterns for their `_tied_weights_keys` mapping have the patterns expanded to fit in `all_tied_weight_keys` as well, instead of containing simple patterns that are skipped in different ways for all downstream applications.

This PR fixes that, by expanding correctly at `post_init` time, so the mappings contain correct param names everywhere. It also allows for recomputing this mapping dynamically in `tie_weights`, so that it stays correct when calling `tie_weights` after having modified the config.
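The expansion described in this PR can be sketched as follows (a hypothetical standalone helper, not the actual transformers code): regex patterns from a model's `_tied_weights_keys` mapping (`{target_pattern: source_name}`) are matched against the real parameter names at `post_init` time, so that the resulting mapping only ever contains concrete, fully expanded names.

```python
import re


def expand_tied_weights_keys(tied_patterns, parameter_names):
    """Expand regex target patterns into concrete parameter names."""
    expanded = {}
    for target_pattern, source_name in tied_patterns.items():
        # Anchor the pattern at the start of the name, as in the PR diff
        regex = re.compile("^" + target_pattern)
        for name in parameter_names:
            if regex.match(name):
                expanded[name] = source_name
    return expanded


params = [
    "lm_head.weight",
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
]
print(expand_tied_weights_keys({r"lm_head\.weight": "model.embed_tokens.weight"}, params))
# {'lm_head.weight': 'model.embed_tokens.weight'}
```

Because the expansion reads the live parameter names, re-running it inside `tie_weights` is what makes the mapping stay correct after a config change.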