Closed
Description
The C++ docs of SetLookupOptions have this explanation of the skip_nulls option:
/// Whether nulls in `value_set` count for lookup.
///
/// If true, any null in `value_set` is ignored and nulls in the input
/// produce null (IndexIn) or false (IsIn) values in the output.
/// If false, any null in `value_set` is successfully matched in
/// the input.
bool skip_nulls;
(from arrow/cpp/src/arrow/compute/api_scalar.h, lines 78 to 84 in 8b9f6b9)
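However, for IsIn this explanation doesn't seem to hold in practice: both settings give the same result, and the null in the input yields true even with skip_null=True: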
In [16]: arr = pa.array([1, 2, None])
In [17]: pc.is_in(arr, value_set=pa.array([1, None]), skip_null=True)
Out[17]:
<pyarrow.lib.BooleanArray object at 0x7fcf666f9408>
[
true,
false,
true
]
In [18]: pc.is_in(arr, value_set=pa.array([1, None]), skip_null=False)
Out[18]:
<pyarrow.lib.BooleanArray object at 0x7fcf666b13a8>
[
true,
false,
true
]
This documentation was added in #7695 (ARROW-8989).
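For comparison, here is a minimal pure-Python sketch of what the docstring describes for IsIn. It only models the documented semantics and is not Arrow's implementation; `documented_is_in` is a hypothetical helper name, and Python's `None` stands in for null.

```python
def documented_is_in(values, value_set, skip_nulls=False):
    """Model of IsIn as described by the SetLookupOptions docstring.

    Hypothetical helper for illustration only; None stands in for null.
    """
    # Non-null members of value_set are always eligible for matching.
    non_null = {v for v in value_set if v is not None}
    # A null in value_set only counts for lookup when skip_nulls is False.
    null_matches = (not skip_nulls) and any(v is None for v in value_set)
    # Per the docstring, a null input yields False when nulls are skipped
    # and True when a null in value_set is allowed to match.
    return [null_matches if v is None else v in non_null for v in values]


values = [1, 2, None]
value_set = [1, None]
print(documented_is_in(values, value_set, skip_nulls=True))   # [True, False, False]
print(documented_is_in(values, value_set, skip_nulls=False))  # [True, False, True]
```

Under this reading, skip_nulls=True should give [true, false, false] for the input above, whereas the kernel returned [true, false, true] in both cases.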
BTW, for "index_in", it works as documented:
In [19]: pc.index_in(arr, value_set=pa.array([1, None]), skip_null=True)
Out[19]:
<pyarrow.lib.Int32Array object at 0x7fcf666f04c8>
[
0,
null,
null
]
In [20]: pc.index_in(arr, value_set=pa.array([1, None]), skip_null=False)
Out[20]:
<pyarrow.lib.Int32Array object at 0x7fcf666f0ee8>
[
0,
null,
1
]
Reporter: Joris Van den Bossche / @jorisvandenbossche
Assignee: Antoine Pitrou / @pitrou
Note: This issue was originally created as ARROW-10663. Please see the migration documentation for further details.