Description
As part of the Pangeo project, we have been exploring the concept of "cloud optimized netCDF", building on "cloud optimized GeoTIFF". Zarr is an open-source Python library and storage spec "providing an implementation of chunked, compressed, N-dimensional arrays." The spec is simple, clearly documented, and well suited for use with cloud object stores.
Last year, we (@rabernat, myself, and others from the xarray/dask/pangeo projects) wrote an experimental xarray backend for zarr, and we have been testing it on public clouds since then. The community is eager to see formal effort put behind these concepts.
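To make "chunked, compressed arrays in an object store" concrete, here is a minimal standard-library sketch. It is not the zarr library, and it is not guaranteed to be byte-compatible with a real Zarr store. It assumes a Zarr v2-style layout: a JSON `.zarray` metadata key, chunk keys such as `0.1`, and the `_ARRAY_DIMENSIONS` attribute the xarray backend uses to record dimension names. The helper names `write_chunked` and `read_chunk` are hypothetical, and a plain dict stands in for the object store.

```python
import json
import struct
import zlib
from itertools import product


def write_chunked(store, name, data, shape, chunks, dims):
    """Write a 2-D list of floats into a dict-like store, one key per chunk."""
    rows, cols = shape
    crows, ccols = chunks
    # Array metadata, loosely modelled on a Zarr v2 ".zarray" document.
    store[f"{name}/.zarray"] = json.dumps({
        "zarr_format": 2, "shape": list(shape), "chunks": list(chunks),
        "dtype": "<f8", "compressor": {"id": "zlib", "level": 1},
        "fill_value": 0.0, "order": "C", "filters": None,
    }).encode()
    # Dimension names live in the attributes, as the xarray backend does.
    store[f"{name}/.zattrs"] = json.dumps({"_ARRAY_DIMENSIONS": list(dims)}).encode()
    n_i = -(-rows // crows)  # ceiling division: chunks along axis 0
    n_j = -(-cols // ccols)  # chunks along axis 1
    for i, j in product(range(n_i), range(n_j)):
        values = []
        for r in range(i * crows, (i + 1) * crows):
            for c in range(j * ccols, (j + 1) * ccols):
                # Edge chunks are padded to full size with the fill value.
                values.append(data[r][c] if r < rows and c < cols else 0.0)
        raw = struct.pack(f"<{len(values)}d", *values)
        store[f"{name}/{i}.{j}"] = zlib.compress(raw, 1)


def read_chunk(store, name, i, j):
    """Fetch and decompress a single chunk without touching any other key."""
    meta = json.loads(store[f"{name}/.zarray"])
    crows, ccols = meta["chunks"]
    raw = zlib.decompress(store[f"{name}/{i}.{j}"])
    flat = struct.unpack(f"<{crows * ccols}d", raw)
    return [list(flat[r * ccols:(r + 1) * ccols]) for r in range(crows)]


store = {}  # stand-in for a bucket: key -> bytes
data = [[float(r * 6 + c) for c in range(6)] for r in range(4)]
write_chunked(store, "temp", data, shape=(4, 6), chunks=(2, 3), dims=("y", "x"))
print(sorted(store))
print(read_chunk(store, "temp", 1, 0))  # [[12.0, 13.0, 14.0], [18.0, 19.0, 20.0]]
```

Because every chunk is an independent key, a reader only needs to fetch the objects that overlap its selection, which is what makes this layout a good fit for object stores.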
This proposal would do the following:
- Complete a netCDF+zarr spec (work got started in zarr-developers/zarr-python#276, "DOC: zarr spec v3: adds optional dimensions and the 'netZDF' format")
- Adapt the xarray backend to support any spec changes
- Add missing functionality to the xarray zarr backend such as the ability to append to datasets
Other possible development objectives include:
- Cloud-API-specific stores for zarr (e.g. zarr-developers/zarr-python#252, "WIP: google cloud storage class")
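For context on what a cloud-specific store involves: as far as I understand the zarr-python v2 storage interface, a store is any object that behaves like a `MutableMapping` from string keys to bytes. The sketch below illustrates that shape only. `FakeBucketClient` and `BucketStore` are hypothetical names, the client methods are invented for illustration, and none of this is the code in the linked PR or a real cloud SDK.

```python
from collections.abc import MutableMapping


class FakeBucketClient:
    """Hypothetical in-memory stand-in for a cloud SDK client (not a real API)."""

    def __init__(self):
        self._objects = {}

    def upload(self, key, data):
        self._objects[key] = bytes(data)

    def download(self, key):
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]

    def list_keys(self, prefix):
        return [k for k in self._objects if k.startswith(prefix)]


class BucketStore(MutableMapping):
    """Expose everything under one bucket prefix as a key -> bytes mapping."""

    def __init__(self, client, prefix):
        self.client = client
        self.prefix = prefix.rstrip("/") + "/"

    def __getitem__(self, key):
        try:
            return self.client.download(self.prefix + key)
        except KeyError:
            # Report the store-relative key, not the full object name.
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        self.client.upload(self.prefix + key, value)

    def __delitem__(self, key):
        try:
            self.client.delete(self.prefix + key)
        except KeyError:
            raise KeyError(key) from None

    def __iter__(self):
        n = len(self.prefix)
        for full_key in self.client.list_keys(self.prefix):
            yield full_key[n:]  # strip the prefix so keys are store-relative

    def __len__(self):
        return sum(1 for _ in self)


bucket_store = BucketStore(FakeBucketClient(), "my-bucket/dataset.zarr")
bucket_store["temp/.zarray"] = b"{}"
bucket_store["temp/0.0"] = b"chunk-bytes"
print(sorted(bucket_store))        # ['temp/.zarray', 'temp/0.0']
print("temp/0.0" in bucket_store)  # True
```

A real implementation would swap the fake client for the provider's SDK and would need to think about listing cost, retries, and consistency, which is where API-specific stores earn their keep.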
per: https://twitter.com/rabernat/status/1039210134600396800
NumFOCUS project: Xarray
ESIP member institution: NCAR