
[Python] Expose more metadata in pyarrow.parquet.ParquetFile.metadata #34180

@deanm0000


Describe the enhancement requested

I'm not sure whether this issue pertains to all implementations of Arrow, including pyarrow, or just C++, but it is related to #14870.

I'm guessing it affects pyarrow, since pq.ParquetFile.metadata.to_dict()['row_groups'][0]['columns'][0]['statistics'].keys() doesn't include the min_value and max_value keys.

So the feature request is to include min_value and max_value in that metadata.

Additionally, I think Parquet stores metadata on whether a column is sorted (I might be confused on that point); if it does, it would be good to expose that too.

Component(s)

Python
