Support for pandas DataFrame subclasses

When dask uses partd for eg shuffle operations, the dataframes always come back as a `pandas.DataFrame`, even if a subclass was stored (xref https://github.com/geopandas/dask-geopandas/issues/59#issuecomment-864469674).

For example:

```
import geopandas
gdf = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))

import partd
# dask.dataframe shuffle operations use PandasBlocks
p = partd.PandasBlocks(partd.Dict())

p.append({"gdf": gdf})
res = p.get("gdf")

>>> type(gdf)
pandas.core.frame.DataFrame
>>> type(res)
pandas.core.frame.DataFrame
```

To be able to use dask's shuffle operations with `dask_geopandas`, which uses a pandas subclass as the partition type, the subclass should be preserved in the partd roundtrip (or are there other ways that you can override / dispatch this operation in dask?). 
I was wondering how other dask.dataframe subclasses handle this, but eg `dask_cudf` doesn't seem to support "disk"-based shuffling.



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Support for pandas DataFrame subclasses #52

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Support for pandas DataFrame subclasses #52

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions