Skip to content

email.message.Message: Improve payload methods and Explain why __getitem__ isn't typed to return None#11095

Merged
Avasam merged 8 commits into
python:mainfrom
Avasam:Message.__get_item__
Dec 14, 2023
Merged

email.message.Message: Improve payload methods and Explain why __getitem__ isn't typed to return None#11095
Avasam merged 8 commits into
python:mainfrom
Avasam:Message.__get_item__

Conversation

@Avasam
Copy link
Copy Markdown
Collaborator

@Avasam Avasam commented Dec 1, 2023

Addresses part of #2333

From #2333 (comment):

Taking a look at this test run: https://github.com/Avasam/typeshed/pull/36/files#diff-ced02b01b42b244e9b7bc2e12ef7a215bde89a29072c99ade903f58fb83a3bc7R36

spack would have to change their assert to first store the value in a local variable then assert that variable: https://github.com/spack/spack/blob/develop/lib/spack/spack/oci/oci.py#L138-L142

twine implicitely uses header_factory. https://github.com/pypa/twine/blob/main/twine/commands/check.py#L73C40-L73C54 They could replace with

    _, header = msg.policy.header_store_parse("content-type", value)`
    return msg.get_content_type(), header.params

urllib3 is a false-positive, and an interesting one: https://github.com/urllib3/urllib3/blob/main/src/urllib3/contrib/emscripten/fetch.py#L382C24-L382C24

I don't think the false-positives a None union here would introduce is something we want. Especially when .get exists for that purpose.
I'm sure the explanation in my comment could be made clearer. Feel free to improve it.

@github-actions

This comment has been minimized.

Copy link
Copy Markdown
Collaborator

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be using unions with Any? E.g. we definitely want to handle the payload case, so get_payload() -> _PayloadType | Any seems better

@Avasam
Copy link
Copy Markdown
Collaborator Author

Avasam commented Dec 1, 2023

Should these be using unions with Any? E.g. we definitely want to handle the payload case, so get_payload() -> _PayloadType | Any seems better

Let's try it, I can always revert it. (also relates to #11094 )

@github-actions

This comment has been minimized.

@Avasam
Copy link
Copy Markdown
Collaborator Author

Avasam commented Dec 1, 2023

Could be a true positive given that pip is passing around the base Message type ?
Could also just be a case of a Union return type too complex that may or me not be solved with generic types. Which would mean back to Any for now.

Comment thread stdlib/email/message.pyi
Comment thread stdlib/email/message.pyi Outdated
def get_unixfrom(self) -> str | None: ...
def attach(self, payload: Message) -> None: ...
def get_payload(self, i: int | None = None, decode: bool = False) -> Any: ... # returns _PayloadType | None
def get_payload(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think pip error might be unavoidable, but we can definitely do better using overloads. if i is not None, then return is message or none. if i is None, then I think it's either str | list[str] if decode is False and bytes if decode is True.

I don't actually see how we get list[Message], but this is all quite gross

Copy link
Copy Markdown
Collaborator Author

@Avasam Avasam Dec 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I got it all figured out :) This is more than I originally planned for this PR and doesn't address the linked issue, but sure, why not.

The pip error is still there. But should be more accurate. I AM curious about the cases/examples where payload can be a Message (or multipart list[Message]).

Just like for the header, the payload Union issue could be solved with a generic. But that would cascade a lot without defaults. Or overriding HttpMessage's different methods if it's true it should only use strings rather than Header and/or Message

@github-actions

This comment has been minimized.

@Avasam Avasam changed the title Explain why Message.__getitem__ isn't typed to return None email.message.Message: Improve payload methods and Explain why __getitem__ isn't typed to return None Dec 2, 2023
Copy link
Copy Markdown
Collaborator

@hauntsaninja hauntsaninja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! (I think we could have gotten away with nominal typing for set_payload, but duck typing can't hurt!)

Comment thread stdlib/email/message.pyi Outdated
@github-actions

This comment has been minimized.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Dec 7, 2023

Diff from mypy_primer, showing the effect of this PR on open source code:

pip (https://github.com/pypa/pip)
+ src/pip/_internal/metadata/_json.py:82: error: Incompatible types in assignment (expression has type "Message | Any | str | list[Message | str]", target has type "str | list[str]")  [assignment]

@Avasam
Copy link
Copy Markdown
Collaborator Author

Avasam commented Dec 14, 2023

@AlexWaygood @hauntsaninja Is this OK to merge?

@AlexWaygood
Copy link
Copy Markdown
Member

AlexWaygood commented Dec 14, 2023

I haven't looked too deeply at this (and while I'm happy to if you want me to, I probably won't get to it in the next few days, unfortunately), but @hauntsaninja approved it earlier, and I trust both his and your judgement :)

@Avasam
Copy link
Copy Markdown
Collaborator Author

Avasam commented Dec 14, 2023

I feel confortable with this. It's #11114 I'm less certain about. If something comes up it can be updated there or in a new PR.

@Avasam Avasam merged commit c390c70 into python:main Dec 14, 2023
@Avasam Avasam deleted the Message.__get_item__ branch December 14, 2023 23:28
@asottile-sentry
Copy link
Copy Markdown
Contributor

fwiw the annotations to get_payload() broke this usage here -- unclear how to proceed since the usage itself is safe (it's always str) but the typing now complains:

https://github.com/getsentry/sentry/blob/58dda0436f67e728eda3e1caad9766b3d6188b7a/src/sentry/runner/commands/sendmail.py#L15

src/sentry/runner/commands/sendmail.py:15: error: Argument "message" to "send_mail" has incompatible type "Message | str | list[Message | str] | Any"; expected "str"  [arg-type]

@srittau
Copy link
Copy Markdown
Collaborator

srittau commented Mar 11, 2024

@asottile-sentry I had a similar problem. #11574 should fix this by changing the argument type back to Any.

@Avasam
Copy link
Copy Markdown
Collaborator Author

Avasam commented Mar 11, 2024

I think it should be safe to coerce to a string?
Message's own string representation is just the string message. And whilst static type-checking can't infer whether you have a multipart message, I assume you already expect to always have a single-part there. And going back to Any wouldn't be any safer than stringifying a list.


Alternatively, the middle overload

    @overload  # either
    def get_payload(self, i: None = None, decode: Literal[False] = False) -> _PayloadType | _MultipartPayloadType | Any: ...

could be kept as only returning Any (too complex, until we get something like #566 or try to encode the single/multi-part with Generic typing, probably after #11422)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants