Closed
Description
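To be able to remove the legacy custom Python implementation of ParquetDataset (ARROW-15868), we first need to deprecate the parts of it that the new API will not support: the custom attributes, the unsupported keywords, and `use_legacy_dataset=True` itself.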
This issue is meant as a parent issue to keep an overview of the different tasks.
Reporter: Joris Van den Bossche / @jorisvandenbossche
Subtasks:
- [Python] Start with deprecating ParquetDataset custom attributes
- [Python] Start to raise deprecation warnings when using use_legacy_dataset=True in parquet.py
- [Python] Start raising deprecation warnings for ParquetDataset keywords that won't be supported with the new API
- [Python] ParquetDataset deprecation: change Deprecation to FutureWarnings
- [Python] Deprecate the (common_)metadata(_path) attributes of ParquetDataset
- [Python] Change use_legacy_dataset default and deprecate no-longer supported keywords in parquet.write_to_dataset
- [Python] Support row_group_size/chunk_size keyword in pq.write_to_dataset with use_legacy_dataset=False
- [Python] Switch default and deprecate use_legacy_dataset=True in ParquetDataset
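
As a rough illustration of the pattern these subtasks describe (switch the default, then warn when the legacy path is requested explicitly), here is a minimal sketch. The function `open_dataset` and its return values are hypothetical stand-ins, not the actual code in parquet.py; only the `use_legacy_dataset` keyword name and the use of `FutureWarning` come from the subtask titles above.

```python
import warnings


def open_dataset(path, use_legacy_dataset=None):
    """Hypothetical stand-in for a ParquetDataset-style entry point."""
    # None means "not specified"; the default now selects the new implementation.
    if use_legacy_dataset is None:
        use_legacy_dataset = False

    if use_legacy_dataset:
        # FutureWarning is shown to end users by default, whereas
        # DeprecationWarning is hidden unless triggered from __main__.
        warnings.warn(
            "Passing 'use_legacy_dataset=True' is deprecated and will be "
            "removed in a future version.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return ("legacy", path)

    return ("datasets", path)
```

Calling `open_dataset("data/", use_legacy_dataset=True)` in this sketch emits one `FutureWarning` and still takes the legacy branch, so existing code keeps working during the deprecation period.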
Related issues:
- [Python] Remove the legacy ParquetDataset custom python-based implementation #31303 (blocks)
- [Python] Long-term fate of pyarrow.parquet.ParquetDataset #25775 (is a child of)
- [Python] Remove the test usage of legacy dataset #20233 (is required by)
Note: This issue was originally created as ARROW-16119. Please see the migration documentation for further details.