-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Allow serialization of custom objects #26735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow serialization of custom objects #26735
Conversation
We can make the serializer handle arbitrary custom objects provided they implement methods `airflow_serialize` and `airflow_deserialize`.
|
One thought: We need to be careful we don't open up artibrarty object inflation vulnerabilities this way. (There were security problems in Rails where you could give it some session data and it would treat it as YAML, and due to oddness in YAML spec, end up creating arbitrary ruby objects which was used to pop reverse shells on Rails installs.) |
|
Yes, preventing security issues would be the biggest thing. Currently we do that by requiring custom types to be registered via a plugin (e.g. timetables), but if we were to do that for any custom types, it may be easier to use a custom serialiser pattern instead, similar to how |
can you add more detail? i'm interested in what you're talking about but don't follow. separately, concerning security risks... perhaps we need to be specific about the context. suppose we allow custom serialization in the xcom context, not in the base serialization code which is used in many places. if someone wanted to do something malicious, and they had the ability to write a task that sent this malicious object through xcom, why would they need to bother sending it through xcom -- they could do whatever malicious work they wanted in the task itself? we're not talking about e.g. taking user input strings from the web UI for example... and if it's just in the task execution context, it's not run in the scheduler or webserver. |
|
Something like this class MyAirflowPlugin(AirflowPlugin):
custom_serde = MyCustomSerialization()
class MyCustomSerialization:
def serialize(self, obj):
if isinstance(obj, MyObj):
return "MyObj", {"a": obj.a, "b": obj.get_b()}
def deserialize(self, type_, data):
if type_ == "MyObj":
return MyObj(a=data["a"], ...)When the serializer encounters something it doesn’t recognise, it’d call the custom serialiser. Similarly, if the serialised data contains a |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions. |
|
This is handled for some use cases by #27540 I think |
We can make the serializer handle arbitrary custom objects provided they implement methods
airflow_serializeandairflow_deserialize.Why?
Well..... one potential use is it could provide flexibility to handle user-defined objects through XCom.
Additionally, it could be desirable to manage serialization protocol in the objects themselves rather than always having to update the base serialization class.