-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-39013: [Go][Integration] Support cABI import/export of StringView #39019
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e050c73 to
2505971
Compare
| return ListViewColumn | ||
|
|
||
| def _get_type(self): | ||
| return OrderedDict([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FTR, we don't need OrderedDict anymore as regular dicts are now ordered by default.
| """ | ||
| def case_wrapper(test_case): | ||
| if serial: | ||
| return case_runner(test_case) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this addition deliberate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes: printer.cork() was failing to dump output when the test segfaulted. In any case, corking the printer adds no value when we're running in serial.
felipecrv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm adding more comments.
go/arrow/array/list.go
Outdated
| // input.Len() > 0 && input.NullN() != input.Len() | ||
| func maxListViewOffset32(input arrow.ArrayData) int { | ||
| // input.DataType() is ListViewType if Offset=int32 or LargeListViewType if Offset=int64 | ||
| // input.Len() > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove the NullN() != Len() pre-condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RangeOfUsedValues was causing an error in IPC writing where it was being used to determine how much of the values array to write. This could drop values referenced by views under null bits, which later produced negative offsets. I was trying to rewrite this function to also consider views under null bits (less efficient for concat but still usable for IPC writing). I have reverted the change to RangeOfUsedValues and IPC writing no longer tries to slice the values array. This is less optimal for IPC writing but we can add optimizations back later, for now I'd like to get the integration test passing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why it's important to keep the documentation about the pre-condition ... && input.NullN() != input.Len(). The concept of "minumum offset" is undefined when there are 0 offsets to look at (an empty set can't have a minimum) -- the caller can betetr decide what's appropriate to do when all values are null.
felipecrv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot of noise in this PR. Can you split the refactoring you did from the cABI changes?
|
@pitrou any further issues here / remaining things to be addressed? |
|
if there's no objections or no other changes requested I'll merge this at the end of the day |
|
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 3e182f2. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
…gView (apache#39019) ### Rationale for this change The Go implementation should support import/export of the new data types. This will enable integration testing between the C++ and Go implementations. ### What changes are included in this PR? Added import/export for the new data types and arrays of data of those types. ### Are these changes tested? Yes, they will be covered by the integration tests and existing Go unit tests. ### Are there any user-facing changes? This is a user facing change * Closes: apache#39013 Lead-authored-by: Benjamin Kietzman <bengilgit@gmail.com> Co-authored-by: Matt Topol <zotthewizard@gmail.com> Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com> Signed-off-by: Matt Topol <zotthewizard@gmail.com>
Rationale for this change
The Go implementation should support import/export of the new data types. This will enable integration testing between the C++ and Go implementations.
What changes are included in this PR?
Added import/export for the new data types and arrays of data of those types.
Are these changes tested?
Yes, they will be covered by the integration tests and existing Go unit tests.
Are there any user-facing changes?
This is a user facing change