Conversation

@AlenkaF
Member

@AlenkaF AlenkaF commented Jan 29, 2024

This PR removes the pyarrow.filesystem and pyarrow.hdfs filesystems that have been deprecated since 2.0.0.
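For downstream code, a quick way to see whether this removal matters is to scan sources for imports of the two removed modules. The sketch below is only an illustration and is not part of this PR: the helper names `find_legacy_imports` and `_is_removed` are hypothetical, and it inspects import statements only, so attribute-style access through an aliased `pyarrow` import is not detected.

```python
import ast

# Modules removed by this PR (as named in the PR description).
REMOVED_MODULES = ("pyarrow.filesystem", "pyarrow.hdfs")


def _is_removed(module_name):
    """Return True if module_name is a removed module or a submodule of one."""
    return any(
        module_name == removed or module_name.startswith(removed + ".")
        for removed in REMOVED_MODULES
    )


def find_legacy_imports(source):
    """Return sorted (line_number, module_name) pairs for imports of removed modules.

    Hypothetical helper: it only looks at import statements in `source`.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if _is_removed(alias.name):
                    hits.append((node.lineno, alias.name))
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            if _is_removed(node.module):
                hits.append((node.lineno, node.module))
            elif node.module == "pyarrow":
                # Handles the "from pyarrow import hdfs" style.
                for alias in node.names:
                    full_name = "pyarrow." + alias.name
                    if _is_removed(full_name):
                        hits.append((node.lineno, full_name))
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        "import pyarrow as pa\n"
        "import pyarrow.hdfs\n"
        "from pyarrow.filesystem import LocalFileSystem\n"
        "from pyarrow import fs\n"
    )
    for lineno, module in find_legacy_imports(sample):
        print(f"line {lineno}: imports removed module {module}")
```

Imports of `pyarrow.fs`, the newer filesystem interface, are deliberately left unflagged.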

Member

@jorisvandenbossche jorisvandenbossche left a comment

Looking good!

I assume there will be some declarations in libarrow.pxd that were used in _hdfsio.pyx and now are no longer used, and can be removed.

@github-actions github-actions bot added awaiting changes and removed awaiting review labels Jan 30, 2024
@AlenkaF AlenkaF force-pushed the gh-20127-remove-deprecated-filesystem branch from b479920 to e005766 Compare January 30, 2024 09:56
@github-actions github-actions bot added awaiting change review and removed awaiting changes labels Jan 30, 2024
@AlenkaF AlenkaF marked this pull request as ready for review January 30, 2024 11:25
@pitrou
Member

pitrou commented Feb 27, 2024

Is this ready for review @AlenkaF ?

@AlenkaF
Member Author

AlenkaF commented Feb 27, 2024

I still need to fix one failing test. I added a try-except for it (see #39825 (comment)), but I need to change the approach. I plan to work on it tomorrow!

@AlenkaF AlenkaF force-pushed the gh-20127-remove-deprecated-filesystem branch from 285d909 to 142af85 Compare February 28, 2024 11:16
@AlenkaF
Member Author

AlenkaF commented Feb 28, 2024

Ready for review @jorisvandenbossche @pitrou

Member

@pitrou pitrou left a comment

Thanks a lot @AlenkaF. Here are a couple of comments, but this LGTM in principle.

AlenkaF and others added 2 commits February 28, 2024 17:37
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
@github-actions github-actions bot added awaiting changes and removed awaiting change review labels Feb 29, 2024
Member

@jorisvandenbossche jorisvandenbossche left a comment

Looks good to me now! Just a few small nits.

@github-actions github-actions bot added awaiting changes and removed awaiting change review labels Feb 29, 2024
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
@github-actions github-actions bot added awaiting change review, awaiting changes and removed awaiting changes, awaiting change review labels Feb 29, 2024
@github-actions github-actions bot added awaiting change review and removed awaiting changes labels Mar 4, 2024
@jorisvandenbossche
Member

The one failure on Windows is #40337

@jorisvandenbossche jorisvandenbossche merged commit 2b194ad into apache:main Mar 4, 2024
@jorisvandenbossche jorisvandenbossche removed the awaiting change review label Mar 4, 2024
@github-actions github-actions bot added the awaiting merge label Mar 4, 2024
@AlenkaF AlenkaF deleted the gh-20127-remove-deprecated-filesystem branch March 4, 2024 12:47
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 2b194ad.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them.

jorisvandenbossche added a commit that referenced this pull request Mar 5, 2024
…sis setup (#40363)

### Rationale for this change

Small follow-up on #39825, which removed the `test_hdfs.py` file itself, but didn't remove it from the hypothesis script

* GitHub Issue: #20127

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
npennequin added a commit to criteo/cluster-pack that referenced this pull request Nov 14, 2024
The current implementation of EnhancedFileSystem is based on the legacy
pyarrow filesystem interface that was removed in pyarrow 16.0.0
(apache/arrow#39825).

We can entirely replace EnhancedFileSystem with fsspec. For HDFS fsspec
relies on the new pyarrow filesystem interface.

Behavior change note: for put, fsspec doesn't preserve file permissions

Resolves #87
jcuquemelle pushed a commit to criteo/cluster-pack that referenced this pull request Nov 21, 2024
The current implementation of EnhancedFileSystem is based on the legacy
pyarrow filesystem interface that was removed in pyarrow 16.0.0
(apache/arrow#39825).

We can entirely replace EnhancedFileSystem with fsspec. For HDFS fsspec
relies on the new pyarrow filesystem interface.

Behavior change note: for put, fsspec doesn't preserve file permissions

Resolves #87
Development

Successfully merging this pull request may close these issues.

[Python] Remove the deprecated pyarrow.filesystem legacy implementations

3 participants