ARROW-12983: [C++][Python][R] Properly overflow to chunked array in Python-to-Arrow conversion #10556
This is unrelated to the fix, but it is a quality-of-life improvement for testing speed.
Created a Jira issue for the changelog: https://issues.apache.org/jira/browse/ARROW-13142
Since we haven't caught this issue from the R side either, I assume there are no large memory tests in the R bindings (or at least none that are exercised). @nealrichardson @romainfrancois, could you help us out here?
.github/workflows/python.yml
Just an experiment to see whether GHA is able to execute these tests.
Builds get killed due to OOM.
IIRC, on the R side the chunker isn't used anyway (this was mentioned in the original PR).
I thought it had been introduced via 7184c3f (I didn't look at the R code).
According to the GHA docs, the hosted macOS runners should have 14GB of RAM available. I'm going to verify that, since it would be nice if we could exercise the large memory tests somewhere.
@lidavidm @pitrou The GHA macOS hosted agents indeed provide 14GB of RAM, which means that we can exercise some of the large memory tests. After enabling the large memory tests in the macOS Python build, the build time increased from 18 minutes to 22 minutes, which seems like a nice tradeoff in exchange for actually running the large memory tests.
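For readers unfamiliar with how the test groups are switched on: the local test command in this PR sets `PYARROW_TEST_SLOW=ON` and `PYARROW_TEST_LARGE_MEMORY=ON`. Below is a minimal sketch of how such flags could be interpreted. The helper name `group_enabled` and the accepted spellings are assumptions for illustration, not pyarrow's actual conftest code.

```python
import os

# Hypothetical helper: the accepted spellings are an assumption,
# not necessarily what pyarrow's test configuration recognizes.
_TRUTHY = {"1", "true", "on", "yes", "y"}
_FALSY = {"0", "false", "off", "no", "n"}


def group_enabled(group, environ=None, default=False):
    """Return True if PYARROW_TEST_<GROUP> is set to a truthy value."""
    environ = os.environ if environ is None else environ
    raw = environ.get("PYARROW_TEST_" + group.upper())
    if raw is None:
        # Flag not set at all: fall back to the group's default.
        return default
    value = raw.strip().lower()
    if value in _TRUTHY:
        return True
    if value in _FALSY:
        return False
    raise ValueError(f"unrecognized value for test group {group!r}: {raw!r}")


if __name__ == "__main__":
    # Reports whether the large memory group would run in this environment.
    print(group_enabled("large_memory"))
```

Passing the environment as a plain mapping keeps the helper easy to check without touching the real process environment.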
lidavidm left a comment
Thanks for working this out!
+1, merging on green
The build failures are unrelated, merging.
…ython-to-Arrow conversion

Still need to port the R changes from apache#10470

Tested locally using:
```
PYARROW_TEST_SLOW=ON PYARROW_TEST_LARGE_MEMORY=ON ./run_test.sh -sv pyarrow/tests/
```

Closes apache#10556 from kszucs/fff

Authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
Signed-off-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
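As a rough illustration of what "overflow to chunked array" means: Arrow's binary and string types use 32-bit offsets, so a single array can hold only about 2 GiB of data, and a conversion that exceeds this has to start a new chunk. The pure-Python sketch below shows the idea with a small, configurable capacity. The function name `split_into_chunks` and the `max_chunk_bytes` parameter are hypothetical, and this is not the C++ chunker touched by this PR.

```python
def split_into_chunks(values, max_chunk_bytes):
    """Group byte strings into chunks whose total size stays within max_chunk_bytes.

    Illustrative only: the real conversion tracks builder capacity in C++.
    """
    chunks = []
    current = []
    current_size = 0
    for value in values:
        size = len(value)
        if size > max_chunk_bytes:
            # A single value larger than the capacity cannot fit in any chunk.
            raise ValueError("single value exceeds the chunk capacity")
        if current and current_size + size > max_chunk_bytes:
            # Appending would overflow the current chunk: finish it, start a new one.
            chunks.append(current)
            current = []
            current_size = 0
        current.append(value)
        current_size += size
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Tiny capacity of 6 bytes so the overflow is visible without large memory.
    print(split_into_chunks([b"aaaa", b"bbbb", b"cc"], 6))
```

With a 6-byte capacity, the second value does not fit next to the first and opens a new chunk, while the third still fits into that second chunk.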
