[Python] Make pyarrow.parquet work with the new filesystem interfaces #25773

@asfimport

Description

Internally, the pyarrow.parquet module is the one place where the "legacy" pyarrow.filesystem filesystems are still used.

They are used in:

  • ParquetWriter

  • ParquetManifest/ParquetDataset

  • write_to_dataset

For ParquetWriter, we need to update it to work with the new filesystems (ParquetWriter is not dataset related, so it won't be deprecated).

For ParquetManifest/ParquetDataset, an update might not be needed: those classes might themselves get deprecated (to be discussed -> ARROW-9720), and with the use_legacy_dataset=False option ParquetDataset already uses the new datasets implementation.

For write_to_dataset, this might depend on how the writing capabilities of the dataset project evolve.

Reporter: Joris Van den Bossche / @jorisvandenbossche
Assignee: Joris Van den Bossche / @jorisvandenbossche

Note: This issue was originally created as ARROW-9718. Please see the migration documentation for further details.
