ARROW-4975: [C++] Support concatenation of UnionArrays #11843
lidavidm
left a comment
Tricky indeed. I think just concatenating the dense union children (instead of trying to slice out only those elements which are referenced by an offset) is fine.
      {child_one_sliced, child_two_sliced}));
  ASSERT_OK(expected_sliced->ValidateFull());
  AssertArraysEqual(*expected_sliced, *concat_sliced_arrays);
}
Can we also test concatenation of an array (1) which is not sliced, but whose children are sliced or have an offset, and (2) which is sliced, and whose children additionally have an offset?
bkietz
left a comment
This is looking great, thanks!
This makes it more readable. Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
bkietz
left a comment
LGTM
CI failures seemed like unrelated flakes, so I'm restarting those jobs to see if we can get to all-green.
merging
Benchmark runs are scheduled for baseline = e903a21 and contender = a93c493. a93c493 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
This PR adds support for concatenation of union arrays.
For sparse union arrays this is trivial: the type buffers and child arrays are concatenated like the other concatenate implementations.
For dense union arrays the following approach is used:
Does this make sense or should we slice child arrays (when required) and reflect this in the concatenated offsets buffer?
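To make the question concrete, here is a minimal sketch of the "concatenate whole children" option that the review comments favor. It does not use the Arrow API: `DenseUnion` and `ConcatenateDense` are hypothetical stand-ins built on `std::vector`, child values are plain `int`s, and type ids are assumed to equal child indices. Shifting each offset by the length its child has accumulated so far is an assumption about how the offsets would be adjusted, not a description of the merged implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a dense union array (NOT arrow::DenseUnionArray).
// Assumption: a type id is used directly as the index of its child.
struct DenseUnion {
  std::vector<int8_t> type_ids;            // child selected by each slot
  std::vector<int32_t> offsets;            // position of the slot's value in that child
  std::vector<std::vector<int>> children;  // child values, ints for simplicity
};

// Concatenate dense unions by appending every child in full and shifting
// each input's offsets by the length its child had before the append.
DenseUnion ConcatenateDense(const std::vector<DenseUnion>& inputs,
                            std::size_t num_children) {
  DenseUnion out;
  out.children.resize(num_children);
  for (const DenseUnion& in : inputs) {
    // Lengths of the output children before this input is appended.
    std::vector<int32_t> base(num_children);
    for (std::size_t c = 0; c < num_children; ++c) {
      base[c] = static_cast<int32_t>(out.children[c].size());
    }
    // Type ids are copied as-is; offsets are shifted per child.
    for (std::size_t i = 0; i < in.type_ids.size(); ++i) {
      const int8_t id = in.type_ids[i];
      out.type_ids.push_back(id);
      out.offsets.push_back(in.offsets[i] + base[static_cast<std::size_t>(id)]);
    }
    // Children are appended whole, including values no offset refers to.
    for (std::size_t c = 0; c < num_children; ++c) {
      out.children[c].insert(out.children[c].end(), in.children[c].begin(),
                             in.children[c].end());
    }
  }
  return out;
}
```

For example, concatenating an input with type ids `{0, 1, 0}`, offsets `{0, 0, 1}` and children `{10, 11}`, `{20}` with an input with type ids `{1, 1}`, offsets `{0, 1}` and children `{}`, `{30, 31}` yields offsets `{0, 0, 1, 1, 2}`. The trade-off is that unreferenced child values are carried along, in exchange for not having to work out which child ranges each input actually uses.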
This PR also removes a check in `DenseUnionArray::Make` that rejected empty offsets buffers. This made it impossible to construct empty dense union arrays. I discussed removing this check with @bkietz.