[Rust] [Parquet] Refactor file module to help adding sources #26146

@asfimport

Description

Currently, the Parquet reader is tightly coupled to file system reads, which makes it hard to add other sources. For instance, an S3 implementation would need a reader that loads entire columns at once rather than issuing buffered reads of a few KB.

To improve modularity, we could try to move as much logic as possible to the generic traits (FileReader, RowGroupReader...) and reduce the code in the implementing structs (SerializedFileReader, SerializedRowGroupReader...) to the part that is specific to file/buffered reads.
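
To make the idea concrete, here is a minimal sketch of what a source-agnostic layer could look like. It is not the crate's actual API: the `ChunkSource` trait, the `InMemorySource` struct, and the `footer_metadata_len` function are hypothetical names made up for illustration. The only format fact it relies on is that a Parquet file ends with a 4-byte little-endian metadata length followed by the `PAR1` magic bytes.

```rust
use std::convert::TryFrom;
use std::io;

/// Hypothetical abstraction (not part of the parquet crate): any source
/// that can report its size and return an arbitrary byte range.
/// A local file, an in-memory buffer, or an S3 object could implement it.
pub trait ChunkSource {
    /// Total size of the underlying object in bytes.
    fn total_len(&self) -> u64;
    /// Read exactly `length` bytes starting at offset `start`.
    fn read_range(&self, start: u64, length: usize) -> io::Result<Vec<u8>>;
}

/// Illustrative in-memory source, standing in for a remote object
/// whose ranges are fetched in one request each.
pub struct InMemorySource {
    pub data: Vec<u8>,
}

impl ChunkSource for InMemorySource {
    fn total_len(&self) -> u64 {
        self.data.len() as u64
    }

    fn read_range(&self, start: u64, length: usize) -> io::Result<Vec<u8>> {
        let start = usize::try_from(start)
            .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "offset too large"))?;
        // Reject ranges that overflow or extend past the end of the data.
        let end = start
            .checked_add(length)
            .filter(|&end| end <= self.data.len())
            .ok_or_else(|| io::Error::new(io::ErrorKind::UnexpectedEof, "range past end"))?;
        Ok(self.data[start..end].to_vec())
    }
}

/// Source-independent logic: read the length of the footer metadata.
/// This is the kind of code that could live at the generic level and be
/// shared by every source instead of being tied to file reads.
pub fn footer_metadata_len<S: ChunkSource>(source: &S) -> io::Result<u32> {
    const FOOTER_SIZE: u64 = 8; // 4-byte length + 4-byte magic
    let total = source.total_len();
    if total < FOOTER_SIZE {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "file too small"));
    }
    let tail = source.read_range(total - FOOTER_SIZE, FOOTER_SIZE as usize)?;
    if tail[4..8] != b"PAR1"[..] {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "missing PAR1 magic"));
    }
    Ok(u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]))
}
```

With this split, an S3-backed implementation would only need to provide `total_len` and `read_range` (fetching a whole column chunk per call if it wants to), while footer parsing and similar logic stay shared.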

Reporter: Rémi Dettai / @rdettai
Assignee: Rémi Dettai / @rdettai

PRs and other links:

Note: This issue was originally created as ARROW-10135. Please see the migration documentation for further details.
