ARROW-1280: [C++] add fixed size list type #4278
fsaintjacques
left a comment
I think it's worth adding support in the integration tests, as Java already supports this type:
https://github.com/apache/arrow/blob/master/integration/integration_test.py
cpp/src/arrow/array.h
Outdated
I think I prefer the explicit int64_t i as parameter. A bit better for the ABI.
Even though it's independent of index?
I would rather remove the argument. Or you can provide two separate overloads. A variadic function looks bizarre here.
Allowing it to be called with an argument means I don't have to rewrite pretty-print :)
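The two-overload option suggested above can be sketched as follows. This is only an illustration: `FixedSizeListSketch` is a hypothetical stand-in, not Arrow's actual class, and the signatures are assumptions made for the example rather than what the PR ended up with.

```cpp
#include <cstdint>

// Hypothetical stand-in for a fixed-size list array; NOT Arrow's actual class.
class FixedSizeListSketch {
 public:
  explicit FixedSizeListSketch(int32_t list_size) : list_size_(list_size) {}

  // Every slot has the same length, so no index is needed.
  int32_t value_length() const { return list_size_; }

  // Index-taking overload: the index is ignored, but generic code written for
  // variable-size lists (e.g. a printer calling value_length(i)) still compiles.
  int32_t value_length(int64_t /*i*/) const { return list_size_; }

 private:
  int32_t list_size_;
};
```

With both overloads, `a.value_length()` and `a.value_length(i)` return the same value, and no variadic signature is needed.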
cpp/src/arrow/array.h
Outdated
Is it worth having a common abstract class?
There's not much opportunity for code reuse in arrays/builders. The only advantage I can think of would be easier reuse of code which consumes ListArray (but only code which doesn't directly access the offsets buffer). Those algorithms could use ArrayData BasicListArray::GetView(int64_t i) or similar. This seems like speculative generality to me; I'd be more open to doing it if you can think of existing code which would benefit from being refactored this way.
FixedSizeListScalar and ListScalar are identical but also trivial.
Agreed with @bkietz. Also code reuse can be achieved through templating (we generally care about performing non-virtual calls anyway).
👍
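As a rough illustration of the templating point made in this thread, the sketch below shares one algorithm between a variable-size and a fixed-size list layout without a common base class. `VarListSketch`, `FixedListSketch`, and `TotalChildValues` are hypothetical names invented for this example; they are not Arrow classes or functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the two list layouts; NOT Arrow's actual classes.
struct VarListSketch {
  std::vector<int32_t> offsets;  // length() + 1 entries
  int64_t length() const { return static_cast<int64_t>(offsets.size()) - 1; }
  int32_t value_length(int64_t i) const {
    const auto k = static_cast<std::size_t>(i);
    return offsets[k + 1] - offsets[k];
  }
};

struct FixedListSketch {
  int64_t num_slots;
  int32_t list_size;
  int64_t length() const { return num_slots; }
  // Slot i starts at i * list_size in the child values; no offsets buffer.
  int64_t value_offset(int64_t i) const { return i * list_size; }
  int32_t value_length(int64_t /*i*/) const { return list_size; }
};

// One algorithm, instantiated per layout at compile time: no virtual calls.
template <typename ListLike>
int64_t TotalChildValues(const ListLike& list) {
  int64_t total = 0;
  for (int64_t i = 0; i < list.length(); ++i) total += list.value_length(i);
  return total;
}
```

Only code that never touches the offsets buffer directly can be shared this way, which matches the caveat raised earlier in the thread.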
45e309c to b47e059
pitrou
left a comment
Thank you. Overall this looks good. Still some comments below.
Side question: should support for FixedSizeListType be added to ArrayFromJSON? Perhaps as a separate PR?
cpp/src/arrow/ipc/json-internal.cc
Outdated
You might add a simple test for this.
cpp/src/arrow/compare.cc
Outdated
It would be nice if you could dig a bit (or perhaps open a JIRA about this, explaining the issue and/or adding a reproducer).
Confirmed, this is a bug.
cpp/src/arrow/array-list-test.cc
Outdated
IIUC, this builds an invalid FixedSizeListArray? No values are appended to the child array.
It does build an invalid array, but the point of this test is just to check that the field name is correctly propagated to the built array.
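To make the invalidity concrete: a fixed-size list with `length` slots of `list_size` elements each needs at least `length * list_size` values in its child array, so an empty child under non-empty slots cannot be valid. A minimal sketch of that check, assuming a hypothetical helper `ChildLengthIsConsistent` (not an Arrow API) and ignoring array offsets:

```cpp
#include <cstdint>

// Hypothetical helper, NOT an Arrow API: returns true when the child array
// holds enough values to back `length` slots of `list_size` elements each.
inline bool ChildLengthIsConsistent(int64_t length, int32_t list_size,
                                    int64_t child_length) {
  return child_length >= length * static_cast<int64_t>(list_size);
}
```

For example, two slots of size 3 with no child values fail the check, while the same slots with six child values pass.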
@pitrou Added ArrayFromJSON impl
78982ab to 80f444f
@kou There is a meson-related error on Travis-CI, could you advise?
Looks like there's also a MinGW test failure:
Ok, apparently the Travis-CI failure is unrelated:
pitrou
left a comment
LGTM.
80f444f to 920a251
Turns out the MinGW failure was sporadic. Merging.
Thanks @bkietz! @fsaintjacques Integration tests are tracked in https://issues.apache.org/jira/browse/ARROW-1278.
If the MinGW failure occurs again, I'll look into it.
Adds integration tests for fixed_size_list

Also adds support for fixed_size_list to RecordBatchSerializer, which was omitted in #4278

Author: Benjamin Kietzman <bengilgit@gmail.com>

Closes #4309 from bkietz/1278-integration-tests-for-fixed-size-list and squashes the following commits:

8b356f3 <Benjamin Kietzman> revert removal of ninja-build from dockerfile
e7ed001 <Benjamin Kietzman> fix flake8 error
8ab4efc <Benjamin Kietzman> Adding integration tests for fixed_size_list
fixed_size_list(<value field or type>, list_size)