Closed
Hi,
I'm using pyarrow 0.10.
I have a DataFrame about 90 GB in size in memory, with one object column containing strings of up to 27 characters.
basket_plateau.to_parquet("basket_plateau.parquet", compression=None) writes this file to disk just fine
basket_plateau = pd.read_parquet("basket_plateau.parquet") fails, however:
ArrowIOError: Arrow error: Capacity error: BinaryArray cannot contain more than 2147483646 bytes, have 2147483655
I can reproduce the exact same error when I use pyarrow directly:
pq.write_table(pa.Table.from_pandas(basket_plateau), "basket_plateau.parquet")
basket_plateau = pq.read_table("basket_plateau.parquet")
Kind regards,
Fred