ARROW-12431: [Python] Mask is inverted when creating FixedSizeBinaryArray #10199
@@ -2714,6 +2714,51 @@ def test_array_masked():
     assert arr.type == pa.int64()
 
 
+def test_binary_array_masked():
+    # ARROW-12431
+    masked_basic = pa.array([b'\x05'], type=pa.binary(1),
+                            mask=np.array([False]))
+    assert [b'\x05'] == masked_basic.to_pylist()
+
+    # Fixed Length Binary
+    masked = pa.array(np.array([b'\x05']), type=pa.binary(1),
+                      mask=np.array([False]))
+    assert [b'\x05'] == masked.to_pylist()
+
+    masked_nulls = pa.array(np.array([b'\x05']), type=pa.binary(1),
+                            mask=np.array([True]))
+    assert [None] == masked_nulls.to_pylist()
+
+    # Variable Length Binary
+    masked = pa.array(np.array([b'\x05']), type=pa.binary(),
+                      mask=np.array([False]))
+    assert [b'\x05'] == masked.to_pylist()
+
+    masked_nulls = pa.array(np.array([b'\x05']), type=pa.binary(),
+                            mask=np.array([True]))
+    assert [None] == masked_nulls.to_pylist()
+
+    # Fixed Length Binary, copy
+    npa = np.array([b'aaa', b'bbb', b'ccc']*10)
+    arrow_array = pa.array(npa, type=pa.binary(3),
+                           mask=np.array([False, False, False]*10))
+    npa[npa == b"bbb"] = b"XXX"
+    assert ([b'aaa', b'bbb', b'ccc']*10) == arrow_array.to_pylist()
+
+
+def test_binary_array_strided():
+    # Masked
+    nparray = np.array([b"ab", b"cd", b"ef"])
+    arrow_array = pa.array(nparray[::2], pa.binary(2),
+                           mask=np.array([False, False]))
+    assert [b"ab", b"ef"] == arrow_array.to_pylist()
+
+    # Unmasked
+    nparray = np.array([b"ab", b"cd", b"ef"])
+    arrow_array = pa.array(nparray[::2], pa.binary(2))
+    assert [b"ab", b"ef"] == arrow_array.to_pylist()
+
+
 def test_array_invalid_mask_raises():
     # ARROW-10742
     cases = [
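For readers following the mask tests: the tests expect `mask=True` to produce a null, while Arrow validity bitmaps use the opposite sense (a set bit means the value is valid, least-significant bit first). The sketch below is plain Python, not pyarrow's implementation; the helper names `mask_to_validity_bitmap` and `apply_validity` are hypothetical and only illustrate why the mask has to be inverted when it is packed into a bitmap.

```python
def mask_to_validity_bitmap(mask):
    """Pack a mask (True = null) into a validity bitmap (bit set = valid).

    Bits are numbered least-significant first within each byte.
    Hypothetical helper for illustration only.
    """
    bitmap = bytearray((len(mask) + 7) // 8)
    for i, is_null in enumerate(mask):
        if not is_null:  # invert: a masked entry must NOT get a validity bit
            bitmap[i // 8] |= 1 << (i % 8)
    return bytes(bitmap)


def apply_validity(values, bitmap):
    """Return the values with None wherever the validity bit is unset."""
    return [
        v if (bitmap[i // 8] >> (i % 8)) & 1 else None
        for i, v in enumerate(values)
    ]


values = [b"\x05", b"\x06", b"\x07"]
mask = [False, True, False]  # only the middle entry is null

bitmap = mask_to_validity_bitmap(mask)
result = apply_validity(values, bitmap)
print(result)
```

Dropping the `not` in `mask_to_validity_bitmap` would null out exactly the entries that should be kept, which is the kind of inversion named in the PR title.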
Does this solve the strided conversion case? If so, perhaps you can add a test for it?
Sadly not. I expected it would, but I wrote some tests and it wasn't enough. That's why I opened https://issues.apache.org/jira/browse/ARROW-12667 as a follow-up issue, so that I can test it across the various types and make sure it works in all cases.
Also added tests and a fix for strided binary arrays (with and without a mask).
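As background on the strided case: in `test_binary_array_strided`, `nparray[::2]` is a view over the original buffer whose items are 2 bytes wide but 4 bytes apart. The sketch below uses only the standard library and a hypothetical helper, `read_fixed_size_strided`, to show how a reader that assumes contiguous storage would pick up the wrong element. It is an illustration of the general pitfall, not a description of the actual pyarrow code path.

```python
def read_fixed_size_strided(buf, item_size, stride, count, offset=0):
    """Collect `count` items of `item_size` bytes, `stride` bytes apart.

    Hypothetical helper for illustration only.
    """
    view = memoryview(buf)
    items = []
    for i in range(count):
        start = offset + i * stride
        items.append(bytes(view[start:start + item_size]))
    return items


buf = b"abcdef"  # three contiguous 2-byte values: ab, cd, ef

# Honouring the stride of an every-other-element view (2 items, 4 bytes apart)
every_other = read_fixed_size_strided(buf, item_size=2, stride=4, count=2)

# Wrongly assuming the items are contiguous (stride == item_size)
contiguous = read_fixed_size_strided(buf, item_size=2, stride=2, count=2)

print(every_other, contiguous)
```

The first call matches what the test asserts (`[b"ab", b"ef"]`); the second returns `b"cd"` in place of `b"ef"`.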