Skip to content

Conversation

@bolkedebruin
Copy link
Contributor

@bolkedebruin bolkedebruin commented Nov 17, 2022

It is possible to fully deserialize XCom values in the webserver by specifying a deserialize parameter. This creates a potential security issue as deserialization instantiates arbitrary objects.

Custom XCom backends should implement orm_deserialize_value themselves if they want to retain the behavior. However, it is strongly suggested to deserialize on the client side for the above mentioned reasons.

See also: #26343

cc @uranusjr @ashb @herunyu


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues labels Nov 17, 2022
It is possible to fully deserialize XCom values
in the webserver by specifying a `deserialize` parameter.
This creates a potential security issue as deserialization
instantiates arbitrary objects.

Custom XCom backends should implement `orm_deserialize_value`
themselves if they want to retain the behavior. However,
it is strongly suggested to deserialize on the client side for
the above mentioned reasons.
@bolkedebruin bolkedebruin force-pushed the remove_deserialization branch from e9b6f89 to 06fc728 Compare November 17, 2022 10:39
@ashb
Copy link
Member

ashb commented Nov 17, 2022

The changes we are introducing in #27540 are what makes this behaviour a potential security risk now (though tbf this never "worked" well if you stored xcom as pickle as it wouldn't be able to send the "inflated" value over an HTTP transport anyway.

Trying to decide if we just remove that feature entirely, or if we need to keep it for non-pickle+plain JSON cases (which does sound complex, just talking out loud about options)

@ashb ashb added this to the Airflow 2.5.0 milestone Nov 17, 2022
@bolkedebruin
Copy link
Contributor Author

bolkedebruin commented Nov 17, 2022

You could DOS the webserver with a large pickle without #27540 which is also a security issue :-).

As I mentioned I think the proper way (i.e. architecturally correct) to deserialize over-the-wire is to do this on the client side. After all if using a transport like HTTP we are serializing again.

@ashb
Copy link
Member

ashb commented Nov 17, 2022

As I mentioned I think the proper way to deserialize over-the-wire is to do this on the client side

That implies the client has the credentials to perform that, and doing it on the server almost seems like it was explicitly the reason the OP asked for this feature in #26174

@bolkedebruin
Copy link
Contributor Author

bolkedebruin commented Nov 17, 2022

That's not neccessarily true. With JSON serialization all the information to deserialize is available to you. With deserialize as an option we outsource understanding if it is a risk to deserialize to the client, thus user input and that is not acceptable. As the OP implemented a custom XCom backend they have the possibility to choose server side what to deserialize entirely by overriding orm_deserialize_value. I don't consider that a best practice though.

@uranusjr
Copy link
Member

Sounds to me the most reasonable approach here would be to add a config to allow this feature.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Jan 10, 2023
@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Jan 13, 2023
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Feb 27, 2023
@pierrejeambrun pierrejeambrun added the type:bug-fix Changelog: Bug Fixes label Mar 1, 2023
@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Mar 2, 2023
@pierrejeambrun pierrejeambrun removed this from the Airflow 2.5.3 milestone Mar 27, 2023
@pierrejeambrun pierrejeambrun added this to the Airflow 2.5.4 milestone Mar 27, 2023
@ashb ashb modified the milestones: Airflow 2.6.1, Airflow 2.7.0 May 6, 2023
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Jun 21, 2023
@github-actions github-actions bot closed this Jun 26, 2023
@ephraimbuddy ephraimbuddy removed this from the Airflow 2.7.0 milestone Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:API Airflow's REST/HTTP API area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues stale Stale PRs per the .github/workflows/stale.yml policy file type:bug-fix Changelog: Bug Fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants