Description
PythonVirtualenvOperator optionally uses dill for serialization. dill could be replaced with cloudpickle as the optional serialization library, and this should be a mostly drop-in replacement.
Use case / motivation
Currently, the PythonVirtualenvOperator optionally uses dill in place of stock pickle to serialize advanced types. However, most major distributed compute frameworks have opted to shift to cloudpickle. As a result, using dill in Airflow can add a redundant dependency when tasks call out to other distributed compute frameworks (e.g., farming compute-heavy tasks out to a remote Dask cluster), and it can interfere with how those tools serialize their own tasks.
Since both dill and cloudpickle are largely drop-in replacements for pickle, the migration should be fairly minor.
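As a rough illustration of why the swap should be small: pickle, dill, and cloudpickle all expose pickle-style `dumps`/`loads` functions, so the serializer can be chosen by module name. The sketch below is not Airflow code. The helpers `get_serializer` and `roundtrip` are hypothetical names invented for this example, and falling back to stock pickle when the optional package is not installed is an assumption about the desired behavior.

```python
import importlib
import pickle

# Modules that expose pickle-compatible dumps()/loads() functions.
SUPPORTED = ("pickle", "dill", "cloudpickle")


def get_serializer(name="pickle"):
    """Return a module with pickle-style dumps/loads (hypothetical helper)."""
    if name not in SUPPORTED:
        raise ValueError(f"unsupported serializer: {name}")
    try:
        return importlib.import_module(name)
    except ImportError:
        # Assumption: if the optional package is missing, use stock pickle.
        return pickle


def roundtrip(obj, name="pickle"):
    """Serialize and deserialize obj with the chosen library."""
    serializer = get_serializer(name)
    return serializer.loads(serializer.dumps(obj))


if __name__ == "__main__":
    payload = {"task_id": "example", "args": [1, 2, 3]}
    # The calling code is identical whichever library is selected.
    for lib in SUPPORTED:
        print(lib, roundtrip(payload, lib) == payload)
```

Because the call sites only touch `dumps` and `loads`, switching libraries in this sketch comes down to changing the module name.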
Related Issues