ARROW-5588: [C++] Better support for building union arrays #4781
bkietz
commented
Jul 2, 2019
- Simplify DenseUnionBuilder
- Add SparseUnionBuilder
- MakeBuilder can now produce a {Sparse,Dense}UnionBuilder
- ArrayFromJSON can now produce union arrays
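For readers unfamiliar with the two layouts these builders target, here is a rough, self-contained sketch. It is a toy model only: `ToyDenseUnion` and `ToySparseUnion` are hypothetical names, not Arrow classes, and the two children (int32 and string) are fixed for brevity. A dense union stores a type id and an offset per slot and each child holds only its own values; a sparse union stores only type ids and every child is as long as the union itself.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Toy model of the two union layouts. NOT Arrow's API; names are hypothetical.
// Child 0 holds int32 values, child 1 holds strings.
struct ToyDenseUnion {
  std::vector<int8_t> type_ids;   // which child each slot refers to
  std::vector<int32_t> offsets;   // position of the slot's value in that child
  std::vector<int32_t> ints;      // child 0: only the int values
  std::vector<std::string> strs;  // child 1: only the string values

  void AppendInt(int32_t v) {
    type_ids.push_back(0);
    offsets.push_back(static_cast<int32_t>(ints.size()));
    ints.push_back(v);
  }
  void AppendStr(std::string v) {
    type_ids.push_back(1);
    offsets.push_back(static_cast<int32_t>(strs.size()));
    strs.push_back(std::move(v));
  }
};

struct ToySparseUnion {
  std::vector<int8_t> type_ids;   // which child each slot refers to
  std::vector<int32_t> ints;      // child 0: same length as type_ids
  std::vector<std::string> strs;  // child 1: same length as type_ids

  void AppendInt(int32_t v) {
    type_ids.push_back(0);
    ints.push_back(v);
    strs.emplace_back();  // placeholder keeps the children aligned
  }
  void AppendStr(std::string v) {
    type_ids.push_back(1);
    ints.push_back(0);  // placeholder keeps the children aligned
    strs.push_back(std::move(v));
  }
};
```

Appending an int, a string, then another int leaves the dense children with two and one values respectively, while both sparse children end up with three slots.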
I'll disable that test until I can address the issue in ListBuilder.
566014c to 7b2b41d
7b2b41d to a20f739
a20f739 to a1c6225
cpp/src/arrow/array/builder_union.cc
Would it simplify things to require that children.size() be the same as max_type_code()? If it isn't, you're asking for trouble. The user can always fill the gaps with NullBuilder.
I can let children_[i] be the builder for type_id == i instead of the builder which will finish into child_data[i]. The latter seems like it might be an implicit contract, though; I'll investigate.
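To make the two indexing schemes concrete, here is a minimal sketch that keeps both: one vector in finished-child order and a second lookup table indexed by type_id. Everything in it (`ToyChild`, `ToyUnionChildren`, `AddChild`, `ChildFor`, `kUnused`) is hypothetical and only illustrates the idea; it is not the code in builder_union.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a child builder; not an Arrow class.
struct ToyChild {
  std::vector<int> values;
};

// Marks a type_id that has no child registered.
constexpr int kUnused = -1;

class ToyUnionChildren {
 public:
  // Registers a child under `type_id` and returns its position in the
  // finished child list (the order in which children were added).
  int AddChild(int8_t type_id) {
    assert(type_id >= 0);
    const std::size_t id = static_cast<std::size_t>(type_id);
    if (id >= by_type_id_.size()) {
      by_type_id_.resize(id + 1, kUnused);  // gaps stay marked as unused
    }
    assert(by_type_id_[id] == kUnused);  // each type_id maps to one child
    children_.emplace_back();
    by_type_id_[id] = static_cast<int>(children_.size()) - 1;
    return by_type_id_[id];
  }

  // Constant-time lookup of the child that handles `type_id`.
  // The reference is invalidated by a later AddChild call.
  ToyChild& ChildFor(int8_t type_id) {
    assert(type_id >= 0);
    const std::size_t id = static_cast<std::size_t>(type_id);
    assert(id < by_type_id_.size());
    const int pos = by_type_id_[id];
    assert(pos != kUnused);
    return children_[static_cast<std::size_t>(pos)];
  }

  std::size_t num_children() const { return children_.size(); }

 private:
  std::vector<ToyChild> children_;  // order of the finished child list
  std::vector<int> by_type_id_;     // type_id -> index into children_
};
```

With this split, sparse type codes (say 5 and 2) cost only a few unused table entries, and the finished child order stays independent of the codes.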
cpp/src/arrow/array/builder_union.h
This might be unnecessary if we maintain an invariant on children_.
a1c6225 to bab294c
@fsaintjacques Travis is green https://travis-ci.org/bkietz/arrow/builds/556909415
fsaintjacques
left a comment
LGTM, some small changes and nothing serious. Good job on simplifying the indirection in the builders.
cpp/src/arrow/array-dict-test.cc
ToString? The same applies to the other references.
This was a hack to get more info out of the assertion failure than "types are unequal".
NB: serialize.cc is probably broken
cdcd799 to 0efe91d
Ok, it passes locally with lint, so I'll proceed to merge.
- Simplify DenseUnionBuilder
- Add SparseUnionBuilder
- MakeBuilder can now produce a {Sparse,Dense}UnionBuilder
- ArrayFromJSON can now produce union arrays
Author: Benjamin Kietzman <bengilgit@gmail.com>
Closes #4781 from bkietz/5588-Better-support-for-building-UnionArrays and squashes the following commits:
38e2828 <Benjamin Kietzman> iwyu #include <limits>
0efe91d <Benjamin Kietzman> address review comments
17e6e27 <Benjamin Kietzman> construct offset_builder_ with a MemoryPool
4131fe3 <Benjamin Kietzman> separate child builder array indexable by type_id
fd64c1b <Benjamin Kietzman> rewrite union builder to share a base class, let children_ be indexed by type_id
37de5f2 <Benjamin Kietzman> explicit uint8_t for msvc
673916e <Benjamin Kietzman> Disable ListOfDictionary test until ListBuilder is updated
cf1c5be <Benjamin Kietzman> revert changes to reader.cc
5742db9 <Benjamin Kietzman> debugging: highlight the broken case and a similar one
5b1ec93 <Benjamin Kietzman> improve doccomments, dedupe test code
33fade1 <Benjamin Kietzman> Adding support for DenseUnions to ArrayFromJSON
6245c82 <Benjamin Kietzman> add SparseUnionBuilder and MakeBuilder case
8d4f36d <Benjamin Kietzman> add tests for building lists where the value builder has mutable type
351905d <Benjamin Kietzman> add test for lazily typed union builder
7902d12 <Benjamin Kietzman> first pass at updating DenseUnionBuilder
d20055a <Benjamin Kietzman> minor refactors, adding some asserts