
fix(config): annotate PreTrainedConfig.dtype as Any to fix pydantic schema generation (#45070)#45129

Closed
IgnazioDS wants to merge 1 commit into huggingface:main from IgnazioDS:fix/pretrained-config-dtype-pydantic-annotation

Conversation

@IgnazioDS

Problem

Fixes #45070. PreTrainedConfig.dtype was annotated as Union[str, "torch.dtype"] | None. Since torch is only imported under TYPE_CHECKING, pydantic's schema builder encounters the "torch.dtype" forward reference at runtime and fails with:

PydanticUndefinedAnnotation: name 'torch' is not defined

This breaks any user code that includes PreTrainedConfig as a field in a pydantic model (e.g. vLLM speculator configs, config validators).

The repro from the issue:

from pydantic import BaseModel, ConfigDict, Field
from transformers import PretrainedConfig

class MyModelConfig(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    sub_config: PretrainedConfig = Field(...)

MyModelConfig.model_rebuild(force=True)  # PydanticUndefinedAnnotation

Fix

Change dtype: Union[str, "torch.dtype"] | None = None to dtype: Any = None as suggested by @zucchini-nlp in the issue.

Any is semantically correct for pydantic's purposes — the field accepts string or torch.dtype values at runtime. The forward reference is now only needed for static type checkers (which still see the correct type through TYPE_CHECKING-gated imports in user code). The runtime behaviour is unchanged.
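The pattern can be sketched with a stdlib-only stand-in (`DemoConfig` is a hypothetical mirror of the patched field, not the actual transformers class):

```python
from typing import Any, TYPE_CHECKING, get_type_hints

if TYPE_CHECKING:
    import torch  # seen only by static type checkers, never imported at runtime


class DemoConfig:
    # Hypothetical mirror of the patched field: the runtime annotation is
    # plain Any, so no "torch.dtype" forward reference has to be resolved
    # when pydantic (or typing.get_type_hints) inspects the class.
    dtype: Any = None


# Resolving the annotations succeeds without torch being importable.
hints = get_type_hints(DemoConfig)
```

Because `Any` resolves without evaluating any forward reference, schema builders that introspect the class at runtime no longer trip over the missing `torch` name.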

Test Plan

  • MyModelConfig.model_rebuild(force=True) no longer raises PydanticUndefinedAnnotation
  • PreTrainedConfig can be used as a pydantic field with arbitrary_types_allowed=True
  • Existing dtype handling in __post_init__ still works (assigns torch.dtype objects at runtime)
  • Existing model loading tests pass
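The failure mode and the fix can both be reproduced without torch or pydantic installed, using only `typing.get_type_hints` (the `DemoOld`/`DemoNew` classes below are illustrative stand-ins, not the real config class):

```python
from typing import Any, Optional, Union, get_type_hints


class DemoOld:
    # Mirrors the old annotation: "torch.dtype" is an unresolvable
    # forward reference because torch is never imported at runtime.
    dtype: Optional[Union[str, "torch.dtype"]] = None


class DemoNew:
    # Mirrors the fix: Any requires no forward-reference resolution.
    dtype: Any = None


try:
    get_type_hints(DemoOld)
    old_resolved = True
except NameError:  # name 'torch' is not defined
    old_resolved = False

new_hints = get_type_hints(DemoNew)  # resolves cleanly
```

This is the same resolution step pydantic performs while building a schema, which is why the old annotation raised `PydanticUndefinedAnnotation` and the new one does not.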

🤖 Generated with Claude Code

…chema generation

Fixes huggingface#45070. PreTrainedConfig.dtype was annotated as
Union[str, "torch.dtype"] | None. Since torch is only imported under
TYPE_CHECKING, pydantic's schema builder encounters the "torch.dtype"
forward reference at runtime and fails with
PydanticUndefinedAnnotation: name 'torch' is not defined.

The annotation is changed to Any, which is semantically correct for
pydantic's purposes (the field accepts arbitrary values) and avoids the
forward-reference resolution failure. The runtime behaviour is unchanged
— dtype can still hold str or torch.dtype values.

As noted by @zucchini-nlp in the issue, this is the minimal fix.


Successfully merging this pull request may close these issues.

v5.4.0 breaks PretrainedConfig field in pydantic model
