
[auto_docstring] needs to be only run on __doc__ #45056

Open
ArthurZucker wants to merge 3 commits into main from fix-auto-doc

Conversation

@ArthurZucker
Collaborator

@ArthurZucker ArthurZucker commented Mar 27, 2026

What does this PR do?

This one took a while because I wanted to check benchmarks.
The win is not huge, but a win is a win.

@ArthurZucker ArthurZucker marked this pull request as ready for review March 27, 2026 11:36
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker ArthurZucker marked this pull request as draft March 27, 2026 13:09
@ArthurZucker
Collaborator Author

Benchmark Update 4 — Decoration speedup (warm process, without PyTorch)

Setup: same Python process, all imports and caches already warm (inspect signature cache, regex, auto-module). Both branches measured in the same process using explicit sys.path injection to bypass the editable install. 50 rounds × 3 real config classes.


Decoration cost per class

| `@auto_docstring` call | cost | what it does |
|---|---|---|
| branch | ~0.35 µs / class | stores a `_LazyDocClass` closure |
| main | ~1 106 µs / class | generates the full docstring eagerly |
| ratio | ~3 160× | |

branch: 0.001 ms / 3 classes  =  0.35 µs/class   ← just stores a closure
main:   3.317 ms / 3 classes  = 1106 µs/class    ← full generation happens here

Cached cls.__doc__ access after generation: ~60 ns/class on both (identical).


What this means for inference / training

| operation | main | branch |
|---|---|---|
| `from transformers import LlamaConfig` | pays ~1 ms to generate the doc immediately | pays ~0.35 µs to store a closure |
| `model.forward(inputs)` | `__doc__` never touched | `__doc__` never touched |
| `LlamaConfig.__doc__` (explicit access) | ~0 ns (already generated) | ~1 ms (generated once, then cached) |
| `LlamaConfig.__doc__` again | ~60 ns | ~60 ns |

Inference and training never read __doc__. On main, each from transformers import Xxx pays ~1 ms to generate the docstring whether or not it is ever used. On branch, that cost is deferred and only paid if .__doc__ is explicitly accessed.
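The deferred-generation behaviour described above can be sketched as follows. This is a minimal stand-in, not the actual transformers implementation — `LazyDoc`, `make_lazy`, and `expensive_generator` are hypothetical names. It relies on the fact (also noted in the PR's `_LazyDocClass` docstring) that CPython's `type.__doc__` getter dispatches to a descriptor stored in `cls.__dict__['__doc__']`:

```python
# Minimal sketch of the lazy-__doc__ mechanism (hypothetical names: LazyDoc,
# make_lazy, expensive_generator — not the real transformers internals).
class LazyDoc:
    def __init__(self, generator):
        self._generator = generator

    def __get__(self, obj, objtype=None):
        # CPython's type.__doc__ getter calls __get__(None, cls) on whatever
        # descriptor is stored in cls.__dict__['__doc__'].
        doc = self._generator()
        objtype.__doc__ = doc  # cache: later lookups return the plain string
        return doc


def make_lazy(cls, generator):
    cls.__doc__ = LazyDoc(generator)  # decoration just stores a closure
    return cls


calls = []

def expensive_generator():
    calls.append(1)  # stands in for the ~1 ms docstring build on main
    return "Generated docstring."


class Config:
    pass


make_lazy(Config, expensive_generator)
assert calls == []                                # import-time cost: none
assert Config.__doc__ == "Generated docstring."   # first access pays generation
assert Config.__doc__ == "Generated docstring."   # cached afterwards
assert calls == [1]                               # generated exactly once
```

After the first access the descriptor has replaced itself with the plain string, so subsequent `cls.__doc__` lookups are ordinary attribute reads.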


Why this does not show up in cold-process import benchmarks

The ~1 ms generation cost is negligible compared to Python startup (~200 ms) + transformers package init (~600 ms) + optional PyTorch import (~1 500 ms). The cold-process noise floor is ~50 ms, so a ~1–5 ms per-class saving is invisible there. The benefit accumulates across all decorated classes but is swamped by startup variance in single-class measurements.

@ArthurZucker ArthurZucker marked this pull request as ready for review March 27, 2026 13:47
Contributor

Copilot AI left a comment


Pull request overview

This PR updates auto_docstring to defer class docstring generation until cls.__doc__ is first accessed (while keeping method/function docstrings generated eagerly), and adds benchmark coverage to measure import/doc-access impact.

Changes:

  • Introduces a lazy class-docstring descriptor and refactors docstring builders into “generate” helper functions.
  • Keeps method docstrings eager and updates generation to prefer the unwrapped (__wrapped__) function for source docstrings/signatures.
  • Adds a new tests/benchmarks suite (with a stub benchmark fixture fallback) to measure import/doc-access/from_pretrained timing.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| src/transformers/utils/auto_docstring.py | Implements lazy class docstrings via a descriptor; refactors generation into helper functions and updates decorator docs. |
| tests/benchmarks/test_lazy_docstring_benchmarks.py | Adds informational benchmarks for import time and docstring access paths, plus an optional slow from_pretrained benchmark. |
| tests/benchmarks/conftest.py | Adds a stub benchmark fixture to gracefully skip benchmarks when pytest-benchmark isn't installed. |

Comment on lines +4438 to 4452
# Capture the raw source-code docstring **before** any lazy machinery is attached so
# that the generator closure can use it safely without risking re-entry.
original_doc = cls.__dict__.get("__doc__")

def _generator():
    return _generate_class_docstring(
        cls,
        custom_intro=custom_intro,
        custom_args=custom_args,
        checkpoint=checkpoint,
        _original_doc=original_doc,
    )

_apply_lazy_doc(cls, _generator)
return cls

Copilot AI Mar 29, 2026


original_doc = cls.__dict__.get("__doc__") can capture the lazy descriptor itself if auto_class_docstring() is called more than once on the same class (or if the doc was already made lazy elsewhere). In that case _generate_class_docstring(..., _original_doc=original_doc) will later treat a non-str as the raw docstring and can break parsing/formatting. Consider normalizing here (only keep str/None, or if the existing value is _LazyDocClass, reuse its cached value / generator result safely) to make auto_class_docstring idempotent.
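One way to realize the suggested normalization is sketched below. `capture_original_doc` is a hypothetical helper name, not code from the PR: only a plain `str` is accepted as the raw source docstring, so re-decorating a class whose `__doc__` is already a lazy descriptor stays safe and idempotent.

```python
# Hypothetical normalization helper sketching the suggested fix: only a plain
# str (or None) is treated as the raw source docstring; anything else — e.g.
# an already-installed lazy descriptor — is discarded.
def capture_original_doc(cls):
    doc = cls.__dict__.get("__doc__")
    return doc if isinstance(doc, str) else None


class WithDoc:
    """Real docstring."""

class NoDoc:
    pass

class AlreadyLazy:
    pass

AlreadyLazy.__doc__ = object()  # simulate a previously installed descriptor

assert capture_original_doc(WithDoc) == "Real docstring."
assert capture_original_doc(NoDoc) is None        # classes store __doc__=None
assert capture_original_doc(AlreadyLazy) is None  # non-str is discarded
```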

from transformers.utils.auto_docstring import auto_method_docstring

def _dummy(x: int, y: int = 0) -> int:
    r"""x (`int`): First number.\ny (`int`, *optional*): Second number."""

Copilot AI Mar 29, 2026


The _dummy docstring is declared as a raw string containing a literal \n, so it will not contain an actual newline. If the goal is to simulate a typical multi-line docstring format for auto_method_docstring, use a real newline (multi-line triple-quoted string) so the benchmark reflects realistic parsing/formatting behavior.

Suggested change
r"""x (`int`): First number.\ny (`int`, *optional*): Second number."""
"""x (`int`): First number.
y (`int`, *optional*): Second number."""

Comment on lines +40 to +42
pytestmark = pytest.mark.skipif(
    not HAS_BENCHMARK, reason="pytest-benchmark not installed (pip install pytest-benchmark)"
)

Copilot AI Mar 29, 2026


These are benchmarks that assert nothing, but if pytest-benchmark is installed they will run as normal tests in any full pytest invocation. To avoid accidental slow/side-effectful runs, consider adding an additional opt-in guard (e.g., skip unless an env var like RUN_BENCHMARKS=1 is set), or place/rename the file so it isn’t collected by default.

Comment on lines +84 to +91
# Reset the lazy state so every round re-generates.
from transformers.utils.auto_docstring import auto_class_docstring

def setup():
auto_class_docstring(BaseImageProcessor)

def access():
return BaseImageProcessor.__doc__

Copilot AI Mar 29, 2026


This benchmark mutates BaseImageProcessor in-place by reapplying auto_class_docstring, which can permanently replace __doc__ for the rest of the process (and affect any subsequent tests in the same session). Consider snapshotting/restoring the original BaseImageProcessor.__dict__.get("__doc__") around the benchmark, or using an isolated throwaway class/module for the benchmark to avoid leaking global state.

Suggested change

# Reset the lazy state so every round re-generates.
from transformers.utils.auto_docstring import auto_class_docstring

def setup():
    auto_class_docstring(BaseImageProcessor)

def access():
    return BaseImageProcessor.__doc__

# Use a throwaway subclass so we don't mutate the global BaseImageProcessor.
class _BenchBaseImageProcessor(BaseImageProcessor):
    pass

# Reset the lazy state so every round re-generates.
from transformers.utils.auto_docstring import auto_class_docstring

def setup():
    auto_class_docstring(_BenchBaseImageProcessor)

def access():
    return _BenchBaseImageProcessor.__doc__

Comment on lines +4091 to +4102
class _LazyDocClass:
    """
    Descriptor stored directly in ``cls.__dict__['__doc__']`` to defer class docstring
    generation until the first ``cls.__doc__`` access.

    Python's ``type.__doc__`` C-level getter checks whether the stored value has a
    ``__get__`` method and, if so, calls it — exactly like normal descriptor dispatch.
    This lets us intercept ``cls.__doc__`` without changing the class's metaclass.

    On the first access the generator is invoked, the result is cached, and the descriptor
    replaces itself with the plain string so that all subsequent lookups are zero-overhead.
    """

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new lazy class-docstring mechanism is a behavior change with subtle interactions (e.g. inspect.getdoc, repeated decoration, and ensuring no generation happens until __doc__ is accessed). There are existing tests/utils/test_auto_docstring.py end-to-end tests, but none that assert the laziness property itself; adding a focused unit test would help prevent regressions.
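A self-contained sketch of such a focused test, using a stand-in `lazy_doc` decorator rather than the real `auto_class_docstring` (both `lazy_doc` and `_LazyDoc` are hypothetical names), so the assertions only illustrate the intended laziness contract:

```python
class _LazyDoc:
    # Stand-in descriptor mimicking the PR's _LazyDocClass behaviour.
    def __init__(self, gen):
        self.gen = gen

    def __get__(self, obj, objtype=None):
        doc = self.gen()
        objtype.__doc__ = doc  # replace the descriptor with the plain string
        return doc


def lazy_doc(generator):
    # Stand-in for auto_class_docstring: defer generation to first access.
    def decorate(cls):
        cls.__doc__ = _LazyDoc(generator)
        return cls
    return decorate


def test_doc_generation_is_lazy_and_cached():
    calls = []

    def gen():
        calls.append(1)
        return "lazy doc"

    @lazy_doc(gen)
    class Dummy:
        pass

    assert calls == []                  # decoration alone generates nothing
    assert Dummy.__doc__ == "lazy doc"  # first access triggers generation
    assert calls == [1]
    assert Dummy.__doc__ == "lazy doc"  # later accesses hit the cached string
    assert calls == [1]                 # the generator ran exactly once


test_doc_generation_is_lazy_and_cached()
```

The real test would import the decorator from `transformers.utils.auto_docstring` and could additionally check `inspect.getdoc` and repeated decoration, as the comment notes.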

Copilot generated this review using guidance from repository custom instructions.
@Cyrilvallez
Member

cc @yonigozlan!

