Closed
Description
FixedSizeListArrays seem to be either written to or read from Parquet files incorrectly.
When the Parquet file is read back, nulls/Nones are returned where the original values should be.
import pyarrow as pa
import pyarrow.parquet as pq
import numpy as np
np_data = np.arange(20*4).reshape(20, 4).astype(np.float64)
pa_data = pa.FixedSizeListArray.from_arrays(np_data.ravel(), 4)
assert np_data.tolist() == pa_data.tolist()
schema = pa.schema([pa.field("rectangle", pa_data.type)])
table = pa.table({"rectangle": pa_data}, schema=schema)
pq.write_table(table, "test.parquet")
in_table = pq.read_table("test.parquet")
# rectangle is filled with nulls
assert in_table.column("rectangle").to_pylist() == pa_data.tolist()
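To narrow down whether every row or only some rows come back as null, the failing comparison above can be broken into a per-row diff. This is only a rough diagnostic sketch: `diff_rows`, `count_null_rows` and `roundtrip_report` are hypothetical helper names that are not part of pyarrow, and the file path and the smaller 2x4 test array are assumptions chosen for illustration.

```python
def diff_rows(expected, actual):
    """Return (index, expected_row, actual_row) for every row that differs."""
    if len(expected) != len(actual):
        raise ValueError("row counts differ: %d vs %d" % (len(expected), len(actual)))
    return [
        (i, e, a)
        for i, (e, a) in enumerate(zip(expected, actual))
        if e != a
    ]


def count_null_rows(rows):
    """Count rows that came back as None instead of a list of values."""
    return sum(row is None for row in rows)


def roundtrip_report(path="test.parquet"):
    """Write a small FixedSizeListArray to Parquet, read it back, and summarize.

    Hypothetical helper: pyarrow is imported here so the two functions above
    stay usable without it.
    """
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Two rows of four float64 values each, mirroring the "rectangle" column.
    values = pa.array([float(i) for i in range(8)], type=pa.float64())
    data = pa.FixedSizeListArray.from_arrays(values, 4)
    table = pa.table({"rectangle": data})

    pq.write_table(table, path)
    restored = pq.read_table(path).column("rectangle")

    expected_rows = data.to_pylist()
    actual_rows = restored.to_pylist()
    return {
        "written_type": str(data.type),
        "read_type": str(restored.type),
        "null_rows": count_null_rows(actual_rows),
        "mismatches": diff_rows(expected_rows, actual_rows),
    }
```

Calling `roundtrip_report()` on an affected pyarrow version should show which rows were replaced by nulls and whether the type read back differs from the type written, which helps tell a write-side problem from a read-side one.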
Reporter: Simon Perkins / @sjperkins
Related issues:
- [C++][Parquet] Add support for schema translation from parquet nodes back to arrow for missing types (is superseded by)
Note: This issue was originally created as ARROW-10232. Please see the migration documentation for further details.