I'm investigating if openeo-processes-python would be useful to add to the VITO backend in some form and I found that the set of dependencies of openeo-processes-python is quite heavy.
Installation in a fresh virtual env results in these dependencies:
$ pip freeze
bokeh==2.1.1
click==7.1.2
cloudpickle==1.4.1
dask==2.19.0
distributed==2.19.0
fsspec==0.7.4
HeapDict==1.0.1
Jinja2==2.11.2
llvmlite==0.33.0
locket==0.2.0
MarkupSafe==1.1.1
msgpack==1.0.0
numba==0.50.1
numpy==1.19.0
-e git+git@github.com:Open-EO/openeo-processes-python.git@c5cc64af94ba83872d5f7ee990ce1a64a0cc83c1#egg=openeo_processes
packaging==20.4
pandas==1.0.5
partd==1.1.0
Pillow==7.1.2
psutil==5.7.0
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.1
PyYAML==5.3.1
scipy==1.5.0
six==1.15.0
sortedcontainers==2.2.2
tblib==1.6.0
toolz==0.10.0
tornado==6.0.4
typing-extensions==3.7.4.2
xarray==0.15.1
xarray-extras==0.4.2
zict==2.0.0
There is a lot in that list (e.g. bokeh, click, jinja2, msgpack, PyYAML, tornado, tblib, psutil, MarkupSafe, locket ...) that is quite far from the core functionality we're looking for: implementation of basic openEO (math) processes.
The direct dependencies are currently just (openeo-processes-python/setup.cfg, lines 29 to 33 in 38c6eea):
install_requires =
    dask[complete]
    numpy
    xarray
    xarray-extras
I guess dask[complete] is the one that drags in all these other dependencies.
First: is there an important reason to depend on dask[complete]? These are the only dask-related lines in the whole repo:
src/openeo_processes/utils.py:import dask
src/openeo_processes/utils.py: is_dar = isinstance(data, dask.array.core.Array)
So why not just depend on dask[array]?
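A rough sketch of how that check could work with a guarded import, so that only dask.array is touched and a missing dask is tolerated. The names optional_import and is_dask_array are hypothetical and not taken from the repo; only the isinstance check itself mirrors utils.py.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical module-level handle; None when dask (or its array extra) is missing.
da = optional_import("dask.array")


def is_dask_array(data):
    """True only if dask.array is importable and data is a dask array."""
    return da is not None and isinstance(data, da.Array)
```

With this shape, plain numpy inputs (or anything else) just return False when dask is not installed, instead of the import failing at module load time.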
Furthermore, would there be interest in making the dependencies on dask and xarray optional? End users could then cherry-pick which calculation "backends" and dependencies they pull into their project. For example:
pip install openeo-processes-python would support just the numpy stuff
pip install openeo-processes-python[xarray] would support xarray as well
pip install openeo-processes-python[dask] would support dask as well
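To make the proposal concrete, here is a minimal sketch of how those extras could be declared, written as a setup.py-style fragment rather than the setup.cfg the project currently uses. The extra names follow the commands above; which extra xarray-extras belongs to, and the "all" extra, are my assumptions.

```python
# Hypothetical packaging fragment: only numpy stays a hard requirement.
INSTALL_REQUIRES = ["numpy"]

# Extras named after the proposed pip commands.
EXTRAS_REQUIRE = {
    "xarray": ["xarray", "xarray-extras"],  # assumption: xarray-extras goes with xarray
    "dask": ["dask[array]"],
}
# Assumed convenience extra that pulls in every optional backend.
EXTRAS_REQUIRE["all"] = sorted(
    {req for reqs in EXTRAS_REQUIRE.values() for req in reqs}
)


def requirements_for(*extras):
    """List the requirements requested for the given extras."""
    reqs = list(INSTALL_REQUIRES)
    for extra in extras:
        reqs.extend(EXTRAS_REQUIRE[extra])
    return reqs
```

These two structures would be passed to setuptools as setup(install_requires=INSTALL_REQUIRES, extras_require=EXTRAS_REQUIRE); the same split can be expressed in setup.cfg under [options.extras_require].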