Closed
Description
From the Python side, you can create an "invalid" UnionArray:
binary = pa.array([b'a', b'b', b'c', b'd'], type='binary')
int64 = pa.array([1, 2, 3], type='int64')
types = pa.array([0, 1, 0, 0, 2, 1, 0], type='int8')  # <- the value 2 is out of bounds for the number of children
value_offsets = pa.array([0, 0, 2, 1, 1, 2, 3], type='int32')
a = pa.UnionArray.from_dense(types, value_offsets, [binary, int64])

For example, converting it to a Python list leads to a segfault:
In [7]: a.to_pylist()
Segmentation fault (core dumped)

On the other hand, explicit validation does not raise an error:
In [8]: a.validate()

Should validation raise an error in this case? Currently the C++ ValidateVisitor for UnionArray does nothing. If it did, validate() could be called from the Python API to avoid creating invalid arrays and the resulting segfaults.
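To illustrate the kind of check being requested, here is a minimal pure-Python sketch. The helper `check_dense_union` is hypothetical and not part of pyarrow; it works on plain lists and assumes the type ids are the default child indices 0..n-1, as in the example above. It is not how the C++ ValidateVisitor is or would be implemented.

```python
def check_dense_union(type_ids, value_offsets, child_lengths):
    """Raise ValueError if the buffers of a dense union are inconsistent.

    Hypothetical helper, not a pyarrow API. Assumes type ids map directly
    to child indices (0 .. number of children - 1).
    """
    if len(type_ids) != len(value_offsets):
        raise ValueError("type_ids and value_offsets differ in length")

    n_children = len(child_lengths)
    for slot, (type_id, offset) in enumerate(zip(type_ids, value_offsets)):
        # Each type id must select an existing child array.
        if not 0 <= type_id < n_children:
            raise ValueError(
                f"slot {slot}: type id {type_id} is out of bounds "
                f"for {n_children} children"
            )
        # Each offset must point inside the selected child array.
        if not 0 <= offset < child_lengths[type_id]:
            raise ValueError(
                f"slot {slot}: offset {offset} is out of bounds "
                f"for child {type_id} of length {child_lengths[type_id]}"
            )
```

With the arrays from the description, this could be called as `check_dense_union(types.to_pylist(), value_offsets.to_pylist(), [len(binary), len(int64)])` before `pa.UnionArray.from_dense`, and it would reject the type id 2 at slot 4 instead of letting the invalid array be built.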
Reporter: Joris Van den Bossche / @jorisvandenbossche
Assignee: Antoine Pitrou / @pitrou
PRs and other links:
Note: This issue was originally created as ARROW-6157. Please see the migration documentation for further details.