Default auto 🚨 🚨 #42805
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Cyrilvallez left a comment
Very happy to break this and have auto by default!!
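For context, a minimal sketch of the new default behavior (the `dtype` argument and the float16 result for opt-125m are taken from the discussion below; treat the rest as an assumption):

```python
import torch
from transformers import AutoModelForCausalLM

# With this PR, omitting dtype is assumed to behave like dtype="auto": weights
# load in the checkpoint's stored dtype (float16 for opt-125m) instead of
# being upcast to torch.float32, torch's default dtype.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
assert model.dtype == torch.float16
```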
```diff
-dtype = get_state_dict_dtype(state_dict)
+dtype = get_state_dict_dtype(state_dict, getattr(config, "dtype", None))
```
Why do we need to change this? Both calls are already inside branches where we know that `config.dtype` does not exist anyway.
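To spell out that argument, a hedged sketch (illustrative names and control flow, not the actual `from_pretrained` internals): if the helper is only ever called on the branch where `config.dtype` is absent, forwarding it is a no-op.

```python
import torch

def get_state_dict_dtype(state_dict):
    # stand-in for the real helper: first floating-point dtype in the weights
    return next(t.dtype for t in state_dict.values() if t.is_floating_point())

def resolve_dtype(config, state_dict):
    config_dtype = getattr(config, "dtype", None)
    if config_dtype is not None:
        return config_dtype
    # This branch is only reached when config.dtype is absent, so forwarding
    # getattr(config, "dtype", None) to the helper would always pass None.
    return get_state_dict_dtype(state_dict)
```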
```diff
-def get_state_dict_dtype(state_dict):
+def get_state_dict_dtype(state_dict, config_dtype: Optional[torch.dtype] = None):
```
I don't think this function needs to be changed, see previous comment
```python
if getattr(self.config, "dtype", None) is None:
    default_dtype = torch.get_default_dtype()
    self.config.dtype = default_dtype
    for sub_config_key in self.config.sub_configs:
        if (sub_config := getattr(self.config, sub_config_key)) is not None and getattr(
            sub_config, "dtype", None
        ) is None:
            sub_config.dtype = default_dtype
```
Do we need to write it in `__init__`? In any case, there's no need to do it on all the subconfigs: all submodels run `__init__` during the whole init process, so writing it only to `self.config` is enough; each submodel will do it for its own config.
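A hedged sketch of that recursion argument (toy classes, not the real `PreTrainedModel` internals): the composite model's `__init__` fixes its own config's dtype, and each submodel's `__init__` does the same for its sub-config, so no explicit loop over `sub_configs` is needed.

```python
import torch

class FakePreTrainedModel:
    def __init__(self, config):
        self.config = config
        # every model, top-level or nested, patches its own config on init
        if getattr(config, "dtype", None) is None:
            config.dtype = torch.get_default_dtype()

class FakeSubConfig:
    dtype = None

class FakeCompositeConfig:
    dtype = None
    def __init__(self):
        self.text_config = FakeSubConfig()

class FakeCompositeModel(FakePreTrainedModel):
    def __init__(self, config):
        super().__init__(config)  # sets config.dtype for the top-level config
        # the submodel runs the same __init__ on its own sub-config
        self.text_model = FakePreTrainedModel(config.text_config)

config = FakeCompositeConfig()
model = FakeCompositeModel(config)
assert config.dtype == torch.get_default_dtype()
assert config.text_config.dtype == torch.get_default_dtype()
```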
[For maintainers] Suggested jobs to run (before merge): run-slow: beit, bigbird_pegasus, blip_2, data2vec, edgetam, gpt_oss, internvl, sam2, sam3, timm_wrapper
Cyrilvallez left a comment
Let's gooooo the nice defaults 🚀🚨
I understand that this is a big change that is intended to break some stuff. E.g. for opt-125m, the auto dtype is now float16 instead of float32. However, this PR also makes it so that the following snippet now fails:

```python
import torch
from transformers import AutoModelForCausalLM

model_id = "facebook/opt-125m"

# previous commit (8a2a83d574fd461697a29410a36737ed112f8ba7)
# this passes
model = AutoModelForCausalLM.from_pretrained(model_id)
assert model.dtype == torch.float32
model.half()
assert {p.dtype for p in model.parameters()} == {torch.float16}
assert model.dtype == torch.float16, f"not float16, got {model.dtype} instead"  # passes

# after this commit (6217adc6c8f0be7b5374e6a46129ad2214e4c6ed)
model = AutoModelForCausalLM.from_pretrained(model_id)
assert model.dtype == torch.float16  # <= used to be float32
model.float()
assert {p.dtype for p in model.parameters()} == {torch.float32}
assert model.dtype == torch.float32, f"not float32, got {model.dtype} instead"  # fails
# AssertionError: not float32, got torch.float16 instead
```
Ha damn, I just merged #42825 but seeing this now... Will open a new one, it indeed looks like relying on the config is a bad idea in those cases!
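For what it's worth, a minimal sketch (toy module, not the actual transformers fix) of why deriving `dtype` from the parameters stays correct after `.float()`/`.half()`, while a value cached on the config goes stale:

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    @property
    def dtype(self) -> torch.dtype:
        # read the dtype off the parameters themselves, not off a cached config
        return next(self.parameters()).dtype

model = TinyModel().half()
assert model.dtype == torch.float16
model.float()
assert model.dtype == torch.float32  # stays in sync, unlike a cached config value
```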
* default to `"auto"` dtype
* the actual change?
* up?
* style
* up?
* only sam models were broken with this
* fix sams
* update
* fix sam2 now
* up
* this?
* proper fix
* lol
* fix
* fixes
* nit
* fix
* fix copies
* fixes
* fix bigbird
* revert one bit
What does this PR do?
Supersedes #34919