Fix spurious position_ids warnings for at least 40 architectures #45437

Merged
tomaarsen merged 2 commits into huggingface:main from tomaarsen:fix/spurious_pos_ids_warnings
Apr 16, 2026

@tomaarsen
Member

What does this PR do?

Supersedes #45385

Code Agent Policy

  • I confirm that this is not a pure code agent PR.

I used an agent to track down how many architectures this affected.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Details

A lot of my older BERT-adjacent models have started getting warnings:

>>> from transformers import AutoModel
>>> model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
Loading weights: 100%|██████████████████████████████████████████████████████████████████| 103/103 [00:00<00:00, 3529.81it/s, Materializing param=pooler.dense.weight] 
BertModel LOAD REPORT from: sentence-transformers/all-MiniLM-L6-v2                                                                                                    
Key                     | Status     |  |                                                                                                                             
------------------------+------------+--+-                                                                                                                            
embeddings.position_ids | UNEXPECTED |  |                                                                                                                             
                                                                                                                                                                      
Notes:                                                                                                                                                                
- UNEXPECTED    :can be ignored when loading from different task/architecture; not ok if you expect identical arch.

This warning is reasonable in itself: BERT doesn't need any position_ids to be stored in the weights, because it's literally just a torch.arange. But the warnings only started popping up after the v5 release, and they aren't useful: users might get freaked out, not realising that their models will still work 100% correctly.
I found out that this is an issue on a lot of architectures, so I propose a generic fix, just like with rotary_emb.inv_freq.
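
To make the torch.arange point concrete, here is a minimal sketch, assuming PyTorch is installed. `TinyEmbeddings` is a hypothetical stand-in written for illustration, not the actual BERT embeddings class. It shows that a buffer registered with `persistent=False` is left out of the state dict, so an older checkpoint that still carries it reports the key as unexpected:

```python
import torch
from torch import nn


class TinyEmbeddings(nn.Module):
    """Hypothetical stand-in for a BERT-style embeddings module."""

    def __init__(self, max_positions: int = 8, hidden_size: int = 4, persistent: bool = False):
        super().__init__()
        self.position_embeddings = nn.Embedding(max_positions, hidden_size)
        # position_ids is fully determined by max_positions, so it need not be saved.
        self.register_buffer(
            "position_ids", torch.arange(max_positions).unsqueeze(0), persistent=persistent
        )


old_style = TinyEmbeddings(persistent=True)   # mimics older checkpoints: buffer is saved
new_style = TinyEmbeddings(persistent=False)  # mimics current models: buffer is not saved

old_keys = set(old_style.state_dict())
new_keys = set(new_style.state_dict())

# Loading the old state dict into the new module leaves position_ids unmatched.
result = new_style.load_state_dict(old_style.state_dict(), strict=False)
unexpected = list(result.unexpected_keys)
print(unexpected)
```

The extra key is simply dropped on load; the module keeps the buffer it built in `__init__`.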

I asked an agent to find all architectures with a persistent=False position_ids, then use my https://github.com/huggingface/sentence-transformers/blob/main/tests/base/modules/transformer/transformers_tiny_models.json mapping to grab old (and tiny) models that might emit this warning:

| # | Architecture | Tiny-random model | Before fix | After fix |
|---|---|---|---|---|
| 1 | aimv2 | (none on hub) | | |
| 2 | albert | hf-internal-testing/tiny-random-AlbertModel | warned: embeddings.position_ids | not warned |
| 3 | align | hf-internal-testing/tiny-random-AlignModel | warned: text_model.embeddings.position_ids | not warned |
| 4 | altclip | hf-internal-testing/tiny-random-AltCLIPModel | warned: text_model.roberta.embeddings.position_ids, vision_model.embeddings.position_ids | not warned |
| 5 | bert | hf-internal-testing/tiny-random-BertLMHeadModel | warned: bert.embeddings.position_ids | not warned |
| 6 | bert_generation | (none on hub) | | |
| 7 | big_bird | hf-internal-testing/tiny-random-BigBirdModel | warned: embeddings.position_ids | not warned |
| 8 | blip | hf-internal-testing/tiny-random-BlipModel | warned: text_model.embeddings.position_ids | not warned |
| 9 | blip_2 | hf-internal-testing/tiny-random-Blip2Model | not warned | not warned |
| 10 | bridgetower | (none on hub) | | |
| 11 | camembert | hf-internal-testing/tiny-random-camembert | warned: embeddings.position_ids | not warned |
| 12 | canine | hf-internal-testing/tiny-random-CanineModel | warned: char_embeddings.position_ids | not warned |
| 13 | chinese_clip | hf-internal-testing/tiny-random-ChineseCLIPModel | warned: text_model.embeddings.position_ids, vision_model.embeddings.position_ids | not warned |
| 14 | clip | hf-internal-testing/tiny-random-CLIPModel | warned: text_model.embeddings.position_ids, vision_model.embeddings.position_ids | not warned |
| 15 | clipseg | hf-internal-testing/tiny-random-CLIPSegModel | warned: text_model.embeddings.position_ids, vision_model.embeddings.position_ids | not warned |
| 16 | convbert | hf-internal-testing/tiny-random-ConvBertModel | warned: embeddings.position_ids | not warned |
| 17 | data2vec | hf-internal-testing/tiny-random-Data2VecTextModel | warned: embeddings.position_ids | not warned |
| 18 | deberta | hf-internal-testing/tiny-random-DebertaModel | warned: embeddings.position_ids | not warned |
| 19 | deberta_v2 | hf-internal-testing/tiny-random-DebertaV2Model | warned: embeddings.position_ids | not warned |
| 20 | distilbert | hf-internal-testing/tiny-random-DistilBertModel | not warned | not warned |
| 21 | electra | hf-internal-testing/tiny-random-ElectraModel | warned: embeddings.position_ids | not warned |
| 22 | eomt | (none on hub) | | |
| 23 | ernie | hf-internal-testing/tiny-random-ErnieModel | warned: embeddings.position_ids | not warned |
| 24 | esm | hf-internal-testing/tiny-random-EsmModel | warned: embeddings.position_ids | not warned |
| 25 | evolla | (none on hub) | | |
| 26 | flaubert | hf-internal-testing/tiny-random-FlaubertModel | not warned | not warned |
| 27 | flava | hf-internal-testing/tiny-random-FlavaModel | warned: text_model.embeddings.position_ids | not warned |
| 28 | fnet | hf-internal-testing/tiny-random-FNetModel | warned: embeddings.position_ids | not warned |
| 29 | git | hf-internal-testing/tiny-random-GitModel | warned: embeddings.position_ids, image_encoder.vision_model.embeddings.position_ids | not warned |
| 30 | groupvit | hf-internal-testing/tiny-random-GroupViTModel | warned: text_model.embeddings.position_ids | not warned |
| 31 | ibert | hf-internal-testing/tiny-random-IBertModel | warned: embeddings.position_ids | not warned |
| 32 | instructblip | (none on hub) | | |
| 33 | instructblipvideo | (none on hub) | | |
| 34 | janus | (none on hub) | | |
| 35 | jina_embeddings_v3 | (none on hub) | | |
| 36 | kosmos2 | hf-internal-testing/tiny-random-Kosmos2Model | not warned | not warned |
| 37 | layoutlm | hf-internal-testing/tiny-random-LayoutLMModel | warned: embeddings.position_ids | not warned |
| 38 | layoutlmv2 | hf-internal-testing/tiny-random-LayoutLMv2Model | error (ImportError) | error (ImportError) |
| 39 | layoutlmv3 | hf-internal-testing/tiny-random-LayoutLMv3Model | warned: embeddings.position_ids | not warned |
| 40 | lilt | hf-internal-testing/tiny-random-LiltModel | warned: embeddings.position_ids | not warned |
| 41 | markuplm | hf-internal-testing/tiny-random-MarkupLMModel | warned: embeddings.position_ids | not warned |
| 42 | megatron_bert | hf-internal-testing/tiny-random-MegatronBertModel | warned: embeddings.position_ids | not warned |
| 43 | metaclip_2 | (none on hub) | | |
| 44 | mlcd | (none on hub) | | |
| 45 | mobilebert | hf-internal-testing/tiny-random-MobileBertModel | warned: embeddings.position_ids | not warned |
| 46 | mpnet | hf-internal-testing/tiny-random-MPNetModel | warned: embeddings.position_ids | not warned |
| 47 | nomic_bert | (none on hub) | | |
| 48 | nystromformer | hf-internal-testing/tiny-random-NystromformerModel | warned: embeddings.position_ids | not warned |
| 49 | openai (GPT-1) | hf-internal-testing/tiny-random-OpenAIGPTLMHeadModel | warned: transformer.position_ids | not warned |
| 50 | ovis2 | (none on hub) | | |
| 51 | owlv2 | hf-internal-testing/tiny-random-Owlv2Model | not warned | not warned |
| 52 | owlvit | hf-internal-testing/tiny-random-OwlViTModel | not warned | not warned |
| 53 | paddleocr_vl | (none on hub) | | |
| 54 | pp_doclayout_v2 | (none on hub) | | |
| 55 | rembert | hf-internal-testing/tiny-random-RemBertModel | warned: embeddings.position_ids | not warned |
| 56 | roberta | hf-internal-testing/tiny-random-RobertaModel | warned: embeddings.position_ids | not warned |
| 57 | roberta_prelayernorm | hf-internal-testing/tiny-random-RobertaPreLayerNormModel | warned: embeddings.position_ids | not warned |
| 58 | roc_bert | hf-internal-testing/tiny-random-RoCBertModel | warned: embeddings.position_ids | not warned |
| 59 | siglip | hf-internal-testing/tiny-random-SiglipModel | not warned | not warned |
| 60 | siglip2 | (none on hub) | | |
| 61 | splinter | hf-internal-testing/tiny-random-SplinterModel | warned: embeddings.position_ids | not warned |
| 62 | squeezebert | hf-internal-testing/tiny-random-SqueezeBertModel | warned: embeddings.position_ids | not warned |
| 63 | videomt | (none on hub) | | |
| 64 | vilt | hf-internal-testing/tiny-random-ViltModel | warned: embeddings.text_embeddings.position_ids | not warned |
| 65 | visual_bert | hf-internal-testing/tiny-random-VisualBertModel | warned: embeddings.position_ids | not warned |
| 66 | x_clip | hf-internal-testing/tiny-random-XCLIPModel | warned: text_model.embeddings.position_ids, vision_model.embeddings.position_ids | not warned |
| 67 | xlm | hf-internal-testing/tiny-random-XLMModel | warned: position_ids | not warned |
| 68 | xlm_roberta | hf-internal-testing/tiny-xlm-roberta | not warned | not warned |
| 69 | xlm_roberta_xl | hf-internal-testing/tiny-random-XLMRobertaXLModel | warned: embeddings.position_ids | not warned |
| 70 | xmod | hf-internal-testing/tiny-random-XmodModel | warned: embeddings.position_ids | not warned |
| 71 | yoso | hf-internal-testing/tiny-random-YosoModel | warned: embeddings.position_ids | not warned |

In short: it found 54 checkpoints, and 45 of those emitted the position_ids warnings. Now, none of them do. This should clean up a lot of the warnings that users are experiencing with my older https://huggingface.co/sentence-transformers models.

Sidenote: bros, clap, mra register position_ids as a persistent buffer. We can (or perhaps should?) probably set these to no longer be persistent.

Who can review?

@Cyrilvallez @zucchini-nlp

  • Tom Aarsen

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: owlv2, owlvit

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=45437&sha=9bdb99

Member

@zucchini-nlp zucchini-nlp left a comment

Ig the failing tests aren't related

Comment on lines +4612 to +4615
# Same idea for `position_ids`: used to be a persistent buffer, now `persistent=False` in most models.
has_position_ids_buffers = any(buffer.endswith("position_ids") for buffer, _ in self.named_buffers())
if has_position_ids_buffers:
    additional_unexpected_patterns.append(r"(^|\.)position_ids$")
Member

fine by me, I don't think there is any model where position ids are so special and must be loaded from ckpt

Member Author

Exactly, I think there shouldn't be any. Plus, I believe this only suppresses the warning where the model isn't expecting the key but the checkpoint has it as an extra, so these are always cases where the weights are just being ignored (i.e. no actual change in behaviour, just fewer warnings).
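
For anyone wanting to check which keys the new pattern covers, a small standalone sketch using only the standard library follows. The regex is copied from the diff above; the key list reuses names from the table in the description, plus one hypothetical key (`embeddings.token_position_ids`) added as a negative control:

```python
import re

# Pattern copied from the diff under review.
POSITION_IDS_PATTERN = re.compile(r"(^|\.)position_ids$")

checkpoint_keys = [
    "embeddings.position_ids",
    "text_model.roberta.embeddings.position_ids",
    "position_ids",
    "embeddings.position_embeddings.weight",  # regular weight, must not match
    "embeddings.token_position_ids",          # hypothetical key, must not match
]

# Keys that would be dropped from the UNEXPECTED report by this pattern.
ignored = [key for key in checkpoint_keys if POSITION_IDS_PATTERN.search(key)]
print(ignored)
```

The `(^|\.)` prefix anchors the match to a full key component, so only keys whose last component is exactly `position_ids` are ignored.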

@tomaarsen tomaarsen added this pull request to the merge queue Apr 16, 2026
Merged via the queue into huggingface:main with commit 341bb45 Apr 16, 2026
29 checks passed
@tomaarsen tomaarsen deleted the fix/spurious_pos_ids_warnings branch April 16, 2026 12:19