ARROW-4120: [Python] Testing utility for checking for "macro" memory leaks detectible with psutil.Process #6551
…hrough Arrow memory pools
…ough Arrow memory pools. Tests for ARROW-7956
Appveyor build: https://ci.appveyor.com/project/wesm/arrow/builds/31293989
+1. I'll open a JIRA about running these tests nightly (along with ARROW-4046 for the "large memory" tests, since the two seem to go together).
The code in https://github.com/dask/distributed/blob/master/distributed/pytest_resourceleaks.py seems superior to what I implemented, but what is here will enable us to begin writing memory leak tests, and we can improve the "harness" around these tests later.
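For readers unfamiliar with the idea, here is a minimal sketch of a "macro" leak check built on `psutil.Process`. It is not the utility added in this PR: the names `check_macro_leak`, `iterations`, `threshold_mb`, and `get_memory` are hypothetical, and the threshold is an arbitrary illustration. The approach is to measure the process's resident set size (RSS), run the function many times, and fail if RSS grew by more than the threshold.

```python
import gc


def _rss_bytes():
    # Assumes psutil is installed. memory_info().rss is the resident set
    # size of the current process, in bytes.
    import psutil
    return psutil.Process().memory_info().rss


def check_macro_leak(func, iterations=100, threshold_mb=50,
                     get_memory=_rss_bytes):
    """Run func repeatedly; raise if process memory grows past threshold_mb.

    Hypothetical helper for illustration only. get_memory is injectable so
    that the check can be exercised without psutil.
    """
    gc.collect()  # drop collectable garbage before taking the baseline
    baseline = get_memory()
    for _ in range(iterations):
        func()
    gc.collect()  # avoid counting not-yet-collected Python garbage as a leak
    growth_mb = (get_memory() - baseline) / 2**20
    if growth_mb > threshold_mb:
        raise AssertionError(
            "memory grew by {:.1f} MB over {} iterations".format(
                growth_mb, iterations))
    return growth_mb
```

RSS is a coarse signal: allocators may hold on to freed memory, so a check like this can only catch leaks that are large relative to the threshold, which is why many iterations and a generous threshold are typical.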
@pytest.mark.pandas
def test_deserialize_pandas_arrow_7956():
    df = pd.DataFrame({'a': np.arange(10000),
                       'b': [pd.util.testing.rands(5) for _ in range(10000)]})
Future note: we should not use pandas.util.testing, as it is deprecated (my bad for using it in the reproducer example).
Ah, thanks for letting me know. Looks like we have a couple of lingering usages:
python/pyarrow/tests/test_adhoc_memory_leak.py
35: 'b': [pd.util.testing.rands(5) for _ in range(10000)]})
python/pyarrow/tests/test_plasma.py
411: pd.util.testing.assert_frame_equal(df, result)
Yes, I already cleaned up most of them some time ago, but clearly missed the plasma one.
Also adds a unit test for ARROW-7956 that fails in 0.15.0 and 0.15.1 but passes in 0.16.0.
Since these tests are time-consuming, I disabled them by default, but we need to run them regularly. I opened ARROW-8048 about setting up a nightly build to run them.