rewrite _process_parameter_type in auto_docstring.py to improve usability by Yacklin · Pull Request #42431 · huggingface/transformers

Yacklin · 2025-11-26T18:09:20Z

What does this PR do?

When using kimi model, kimi developers used something like "bool|int" as type hint for parameters in function signature and unfortunately, due to python's nature, format of type hint like "bool|int" is not completely the same as "typing.Union[bool, int]" until python 3.14. One of the biggest differences is when accessing param.annotation.__name__, "bool|int" would raise error that has no attribute "__name__". This error bothers me a lot so i have to fix it. I found a beautiful way to fix it and its backward compatibility could be extended to 3.4 (I just tried it a little bit but you could try versions before 3.4).

Please be advised that param_name and func are unused in this rewrited version. I chose not to remove these two parameters just in case something bad would happen.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@Rocketknight1
(I accidentally deleted the previous PR, here is the new one)

Rocketknight1 · 2025-11-27T15:35:14Z

Approved the doc build run! You should get a link to the docs with your PR in about 30 minutes - can you check that everything is working as expected?

HuggingFaceDocBuilderDev · 2025-11-27T15:43:54Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Yacklin · 2025-11-28T20:45:37Z

Approved the doc build run! You should get a link to the docs with your PR in about 30 minutes - can you check that everything is working as expected?

i think yes. Could you please let me know if anything should be changed in my code for this PR? I tested this rewritten fuction several times as well. Also, i am a little bit new to git and github so sometimes i might mess up PR or commits. Sorry for that in advance. @Rocketknight1

Yacklin · 2025-11-28T21:08:39Z

i am wondering why sometimes tests failed but sometimes didn't? And my changes have nothing to do with these failures

Yacklin · 2025-11-28T21:13:16Z

Approved the doc build run! You should get a link to the docs with your PR in about 30 minutes - can you check that everything is working as expected?

Okay i do find something changed, which i would prefer actually.
current one:

updated one:

Might be a little bit verbose but i think it's okay

Yacklin · 2025-11-28T22:12:49Z

Maybe i should provide more details about this PR.
When using kimi 48B instruct model, the decorator @auto_docstring would be applied to some functions in modelling_kimi like this:

But, auto_docstring would call _process_parameter_type to go through every single parameter's type hint to call this type hint's annotation.__name__. But before python 3.14, bool|int is not the same as Union[bool, int]. It means type hint like "bool|hint" has no attribute __name__. Then _process_parameter_type would raise errors, which prevents users from using kimi 48B instruct models.

Btw, the downstream users of kimi 48B noticed this problem then they changed it but it seems that they didn't notify kimi this problem:

(coming from modelling_kimi.py of cerebras/Kimi-Linear-REAP-35B-A3B-Instruct)

My PR could solve this problem. Otherwise, python 3.14 is needed to use kimi 48B instruct model and i don't think transformers users could switch to python 3.14 easily
@Rocketknight1

Rocketknight1

Hey @Yacklin, I took a look at the code! I think the bug you're fixing is real, and I think the cause is that we do lots of hacky accesses to things like annotation.__name__, like you said. I don't like if ":" in parameter_str though - doing these kinds of string manipulations also feels very hacky. It's likely to create bugs in future if anything changes with typing, or if anyone writes an unusual type hint.

Is there a clean way to correctly resolve all type hints into some kind of canonical form, and then just operate on that, rather than relying on string representations and attribute accesses?

Yacklin · 2025-12-02T16:06:14Z

Hey @Yacklin, I took a look at the code! I think the bug you're fixing is real, and I think the cause is that we do lots of hacky accesses to things like annotation.__name__, like you said. I don't like if ":" in parameter_str though - doing these kinds of string manipulations also feels very hacky. It's likely to create bugs in future if anything changes with typing, or if anyone writes an unusual type hint.

Is there a clean way to correctly resolve all type hints into some kind of canonical form, and then just operate on that, rather than relying on string representations and attribute accesses?

Hi @Rocketknight1 , string manipulation is unavoidable in this function because this function would return something like tuple[str, bool]. But i could try to write an implmentation to minimize string manipulation.

Rocketknight1 · 2025-12-03T15:25:45Z

Hi @Yacklin, some manipulation is unavoidable at the end, but I think you should always be able to get the type hint from param.annotation rather than needing to use str(param). What we really want to do is just grab that annotation in the safest and most Pythonic way possible and turn it into a "canonical" form regardless of whether it was written as Optional[Union[str, bool]] or str | bool | None. I don't think flattening it to a string is the safest way to do that, and will probably get very messy for nested types!

Yacklin · 2025-12-03T19:14:28Z

Hi @Yacklin, some manipulation is unavoidable at the end, but I think you should always be able to get the type hint from param.annotation rather than needing to use str(param). What we really want to do is just grab that annotation in the safest and most Pythonic way possible and turn it into a "canonical" form regardless of whether it was written as Optional[Union[str, bool]] or str | bool | None. I don't think flattening it to a string is the safest way to do that, and will probably get very messy for nested types!

Hi @Rocketknight1 ,

This situation is more complex than i thought. Please read my comments and implementations carefully. Let me know what you think.

import inspect
import types
import typing
from typing import get_origin, Union, get_args, Optional

def _process_parameter_type(param, param_name, func):
    """
    Process and format a parameter's type annotation.

    Args:
        param (`inspect.Parameter`): The parameter from the function signature
        param_name (`str`): The name of the parameter
        func (`function`): The function the parameter belongs to
    """
    """
    the minimum version of python that transformers supports is 3.9.0.
    but this is the source of problem.

    if user who is using python 3.9.x tried to access models that are using features coming from types.UnionType, like int|float, which is introduced in python 3.10,
    error would be raised (even before this function is executed). Python developers came up with an idea that would squash int|float into string "int|float", using 'from __future__ import annotations'.
    But this is something models developers should care, transformers developers could only warn models developers to solve this issue in their own model script.

    the best way to solve this issue is to level up the minimum python version from 3.9 to 3.10

    if user is using python 3.9, this error would be raised before this function is hit.

    unfortunately, most models developers didn't pay attention to it.

    If python 3.9 users tried to access models that are using features coming from types.UnionType, like int|float, which is introduced in python 3.10,
    users have to contact models developers to support backward-compatibility. Transformers team has nothing to do with it, unless upgrading min version from
    3.9 to 3.10, through voting or something.
    """

    # this PR only takes python 3.10 into consideration, unfortunately.

    param_annotation = param.annotation
    print(param_annotation)
    default_value = param.default
    origin = get_origin(param_annotation)
    args = get_args(param_annotation)
    optional_flag = False
    type_hint_str = ""

    # Optional[sth], under the hood is equivalent to Union[sth, None]
    # if the type hint is None, I mean something like age:None, it would be considered optional, very few developers would write it though.
    # if the type hint is Optional[None] or Union[None], it would be considered optional, very few developers would write it though.
    if (
        (origin is typing.Union and type(None) in args)
        or default_value is not inspect._empty
        or (origin is types.UnionType and type(None) in args)
        or param_annotation is types.NoneType
        or param_annotation is None
    ):
        optional_flag = True

    """
    okay here is another problem, to get the string representation of type hint int, param.annotation.__name__ has to be accessed BUT
    types.UnionType, like int|str, before python 3.14 has no attribute __name__, to get the string representation of int|str, access param.annotation ONLY.
    """
    param_str: str = str(param)
    if param_annotation is not inspect._empty:
        if "=" in param_str:
            type_hint_str = (
                param_str.split(":", maxsplit=1)[1].split("=", maxsplit=1)[0].strip()
            )
        else:
            type_hint_str = param_str.split(":", maxsplit=1)[1].strip()

    return type_hint_str, optional_flag

To make sure python 3.9 users could use features like int|str, which is introduced in python 3.10 correctly, what IDE and PEP suggests is to use "from __future__ import annotations", within from __future__ import annotations, you could see it clearly that type hints would all be converted to strings at Runtime.
I don't mean to but this is what python developers did.

cturan · 2025-12-03T21:31:32Z

I can confirm that it fixed the issue and I was able to get it working, thank you.

Yacklin · 2025-12-03T21:34:00Z

I can confirm that it fixed the issue and I was able to get it working, thank you.

But i need to make reviewers happy about my changes, which is harder than writing a PR

Rocketknight1 · 2025-12-04T14:18:20Z

man you're lucky i'm on watch and isn't sgugger anymore or he would have had you guillotined for this level of sass 😅

Rocketknight1 · 2025-12-04T15:15:41Z

Also, quick comment about your implementation: Python 3.9 is now end-of-life. As a result, we're moving to a minimum supported version of 3.10, and it's not necessary to consider Py3.9 compatibility in new code. It's fine for your PR to assume that everyone will be running 3.10 or newer

Yacklin · 2025-12-04T15:54:30Z

man you're lucky i'm on watch and isn't sgugger anymore or he would have had you guillotined for this level of sass 😅

Oh Thanks to let me know hahaha

Yacklin · 2025-12-04T15:55:18Z

Also, quick comment about your implementation: Python 3.9 is now end-of-life. As a result, we're moving to a minimum supported version of 3.10, and it's not necessary to consider Py3.9 compatibility in new code. It's fine for your PR to assume that everyone will be running 3.10 or newer

If you are happy with the implementation i have in the comment, i would put it in my branch later. Thanks to let me know that 3.9 is end of life.

…https://github.com/Yacklin/transformers into rewrite-_process_parameter_type-to-improve-usability

Yacklin · 2025-12-11T18:08:16Z

@Yacklin you gotta calm down lol. I'm the watcher for this entire repo so I have like 50 issues a day and this isn't a priority one. I'll make a fix and hopefully improve the docstring generation in general, but you need to let go of the laptop and touch grass while I do instead of pinging me every two hours!

Sorry i just felt surprised by your implementation but do let me know if you ran into troubles while updating this PR

Rocketknight1 · 2025-12-11T19:32:59Z

cc @Yacklin it seems to be giving the same outputs as the old system now, but cleaned up a lot, and working correctly for | union types (I made a test in GPT-J for that, but we can revert before merging).

Want to take a look and tell me if you're happy with it, and then I'll ping a core maintainer if so?

Yacklin

Hi @Rocketknight1 , I just tested it locally and i noticed that if the type hint is None, like age: None, it will mark it non-optional. It's an edge case but good to let you know.

Yacklin · 2025-12-15T18:16:06Z

-            param_type = re.sub(r"Optional\[(.*?)\]", r"\1", param_type)
+    if param.annotation == inspect.Parameter.empty:
+        return "", False
+    # This is, astonishingly, the right way to do it: https://docs.python.org/3/library/typing.html#typing.Union


You are right. This is best approach and i ignored it lol

Yacklin · 2025-12-15T18:36:48Z

Also, if the parameter has default value, it should be considered optional. It seems that your implementation didn't include this. I am wondering whether this function is supposed to consider default value or not?

Rocketknight1 · 2025-12-16T16:07:35Z

@Yacklin I think that behaviour for type hints like age: None is fine - that type hint should almost never appear, since it indicates that the only acceptable argument value is None. The much more common case is something like int | none or Union[int, None] or Optional[int], which all mean the same thing, and I think it handles that correctly, right?

Yacklin · 2025-12-17T18:05:11Z

@Yacklin I think that behaviour for type hints like age: None is fine - that type hint should almost never appear, since it indicates that the only acceptable argument value is None. The much more common case is something like int | none or Union[int, None] or Optional[int], which all mean the same thing, and I think it handles that correctly, right?

@Rocketknight1 , A quick question: Should default value be considered in this function? I mean if there is a default value for a parameter, this parameter should be optional. If other functions are designed to handle this, simply ignore what i highlighted here.
I just tested it and it seems that the my idea of implementation is slightly different from yours. Since you are a watcher, i will take yours.

Yacklin · 2025-12-17T19:28:44Z

maybe we could revert changes on gptj since this PR doesn't do anything with it.

github-actions · 2025-12-17T19:39:01Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: gptj

github-actions · 2025-12-17T19:46:58Z

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=42431&sha=0f3518

Rocketknight1 · 2025-12-18T14:11:41Z

Yeah, those were mostly for testing. Will revert the GPT-J changes, keep the rest and merge!

…to-improve-usability' into rewrite-_process_parameter_type-to-improve-usability

…sability

Yacklin · 2025-12-18T15:15:36Z

I just realized that when the type hint of a parameter is None, the only acceptable argument is None for this parameter. And it should be marked as non-optional. But since nobody would annotate a parameter with None, the if statement i wrote to handle None is useless. You could remove that line, if necessary. Anyway, thanks for editing!

Yacklin · 2025-12-18T15:19:51Z

@Rocketknight1 , also, the way this function handles default value seems to be not perfect (my bad). If a parameter has no type hint but has default value, it would be considered non-optional. Maybe it would introduce some problems in the future.

…lity (huggingface#42431) * rewrite to improve its usability * rewrite to improve its usability * Clarify comment about function parameter elements * Update implementation of _process_parameter_type * rewrite to improve its usability * Clarify comment about function parameter elements * reformat it a little bit * reformat it a little bit * used a wrong ruff version..... this one should be good * update the string manipulation * Trying for more consistency * make fixup * Try another approach * Don't include "None" in the out_str when we're already setting optional * Add some new-style types to GPT-J to see what happens * Correct use of UnionType * make fixup * Add a little snarky comment about typing just because * Correctly return the same strings as the old function * Drop unnecessary args * Remove redundant args information * add one more elif statement to deal with the case when type hint is None * add if statement to handle the parameter with default value * Revert GPT-J changes * Trigger tests --------- Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Matt <rocketknight1@gmail.com>

rewrite to improve its usability

a9025ed

Yacklin force-pushed the rewrite-_process_parameter_type-to-improve-usability branch from bf593b7 to f951d0c Compare November 28, 2025 21:07

Yacklin force-pushed the rewrite-_process_parameter_type-to-improve-usability branch from f951d0c to 98d26ea Compare November 30, 2025 18:52

Rocketknight1 reviewed Dec 1, 2025

View reviewed changes

Rocketknight1 mentioned this pull request Dec 3, 2025

4.57.3 auto_docstring crashes with AttributeError #42582

Closed

4 tasks

Yacklin and others added 2 commits December 3, 2025 15:20

rewrite to improve its usability

f87cd5f

Clarify comment about function parameter elements

bf887c0

Yacklin force-pushed the rewrite-_process_parameter_type-to-improve-usability branch from 98d26ea to bf887c0 Compare December 3, 2025 19:20

Yacklin requested a review from Rocketknight1 December 3, 2025 19:53

Yacklin and others added 3 commits December 4, 2025 12:13

Update implementation of _process_parameter_type

4771d2a

rewrite to improve its usability

c339617

Clarify comment about function parameter elements

0914915

Yacklin force-pushed the rewrite-_process_parameter_type-to-improve-usability branch from bf887c0 to 0914915 Compare December 4, 2025 16:14

Yacklin added 2 commits December 4, 2025 12:17

Merge branch 'rewrite-_process_parameter_type-to-improve-usability' of …

31a6a47

…https://github.com/Yacklin/transformers into rewrite-_process_parameter_type-to-improve-usability

reformat it a little bit

0a8e1d3

Don't include "None" in the out_str when we're already setting optional

ecb9000

Rocketknight1 added 6 commits December 11, 2025 18:28

Add some new-style types to GPT-J to see what happens

f697852

Correct use of UnionType

58dca01

make fixup

c91f05b

Add a little snarky comment about typing just because

1abc716

Correctly return the same strings as the old function

e9c848c

Drop unnecessary args

d45e7d5

Remove redundant args information

faefabf

Yacklin commented Dec 15, 2025

View reviewed changes

add one more elif statement to deal with the case when type hint is None

a779683

add if statement to handle the parameter with default value

0f35188

Rocketknight1 and others added 3 commits December 18, 2025 14:12

Revert GPT-J changes

4e5a4ad

Merge remote-tracking branch 'origin/rewrite-_process_parameter_type-…

6f87cc4

…to-improve-usability' into rewrite-_process_parameter_type-to-improve-usability

Merge branch 'main' into rewrite-_process_parameter_type-to-improve-u…

a5092e5

…sability

Rocketknight1 approved these changes Dec 18, 2025

View reviewed changes

Trigger tests

e6c67bb

Rocketknight1 merged commit 728f34c into huggingface:main Dec 18, 2025
25 checks passed

Yacklin deleted the rewrite-_process_parameter_type-to-improve-usability branch December 18, 2025 15:15

Conversation

Yacklin commented Nov 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Before submitting

Who can review?

Uh oh!

Rocketknight1 commented Nov 27, 2025

Uh oh!

HuggingFaceDocBuilderDev commented Nov 27, 2025

Uh oh!

Yacklin commented Nov 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Yacklin commented Nov 28, 2025

Uh oh!

Yacklin commented Nov 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Yacklin commented Nov 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Rocketknight1 left a comment

Choose a reason for hiding this comment

Uh oh!

Yacklin commented Dec 2, 2025

Uh oh!

Rocketknight1 commented Dec 3, 2025

Uh oh!

Yacklin commented Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cturan commented Dec 3, 2025

Uh oh!

Yacklin commented Dec 3, 2025

Uh oh!

Rocketknight1 commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Rocketknight1 commented Dec 4, 2025

Uh oh!

Yacklin commented Dec 4, 2025

Uh oh!

Yacklin commented Dec 4, 2025

Uh oh!

Yacklin commented Dec 11, 2025

Uh oh!

Rocketknight1 commented Dec 11, 2025

Uh oh!

Yacklin left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Yacklin Dec 15, 2025

Choose a reason for hiding this comment

Uh oh!

Yacklin commented Dec 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Rocketknight1 commented Dec 16, 2025

Uh oh!

Yacklin commented Dec 17, 2025

Uh oh!

Yacklin commented Dec 17, 2025

Uh oh!

github-actions Bot commented Dec 17, 2025

Uh oh!

github-actions Bot commented Dec 17, 2025

Uh oh!

Rocketknight1 commented Dec 18, 2025

Uh oh!

Uh oh!

Yacklin commented Dec 18, 2025

Uh oh!

Yacklin commented Dec 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Yacklin commented Nov 26, 2025 •

edited

Loading

Yacklin commented Nov 28, 2025 •

edited

Loading

Yacklin commented Nov 28, 2025 •

edited

Loading

Yacklin commented Nov 28, 2025 •

edited

Loading

Yacklin commented Dec 3, 2025 •

edited

Loading

Rocketknight1 commented Dec 4, 2025 •

edited

Loading

Yacklin left a comment •

edited

Loading

Yacklin commented Dec 15, 2025 •

edited

Loading