GH-39013: [Go][Integration] Support cABI import/export of StringView #39019

bkietz · 2023-12-01T01:40:00Z

Rationale for this change

The Go implementation should support import/export of the new data types. This will enable integration testing between the C++ and Go implementations.

What changes are included in this PR?

Added import/export for the new data types and arrays of data of those types.

Are these changes tested?

Yes, they will be covered by the integration tests and existing Go unit tests.

Are there any user-facing changes?

Yes, this is a user-facing change.

Closes: [Go][Integration] Add support for C abi Utf8View to the go implementation #39013


pitrou · 2023-12-13T17:30:13Z

dev/archery/archery/integration/datagen.py

+        return ListViewColumn
+
+    def _get_type(self):
+        return OrderedDict([


FTR, we don't need OrderedDict anymore, as regular dicts now preserve insertion order by default.

pitrou · 2023-12-13T17:32:01Z

dev/archery/archery/integration/runner.py

        """
        def case_wrapper(test_case):
+            if serial:
+                return case_runner(test_case)


Is this addition deliberate?

Yes: printer.cork() was failing to dump output when the test segfaulted. In any case, corking the printer adds no value when we're running in serial.

felipecrv

I'm adding more comments.

dev/archery/archery/integration/datagen.py

felipecrv · 2023-12-13T18:12:10Z

go/arrow/array/list.go

-//	input.Len() > 0 && input.NullN() != input.Len()
-func maxListViewOffset32(input arrow.ArrayData) int {
+//	input.DataType() is ListViewType if Offset=int32 or LargeListViewType if Offset=int64
+//	input.Len() > 0


Why remove the NullN() != Len() pre-condition?

RangeOfUsedValues was causing an error in IPC writing, where it was being used to determine how much of the values array to write. This could drop values referenced by views under null bits, which later produced negative offsets. I was trying to rewrite this function to also consider views under null bits (less efficient for concat but still usable for IPC writing). I have reverted the change to RangeOfUsedValues, and IPC writing no longer tries to slice the values array. This is less optimal for IPC writing, but we can add optimizations back later; for now I'd like to get the integration test passing.

This is why it's important to keep the documentation about the pre-condition ... && input.NullN() != input.Len(). The concept of "minimum offset" is undefined when there are 0 offsets to look at (an empty set can't have a minimum), so the caller can better decide what's appropriate to do when all values are null.
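The point about an empty set having no minimum can be sketched as follows. This is an illustration under stated assumptions, not the function in go/arrow/array/list.go: `minViewOffset` is a hypothetical helper over plain slices, and it reports the all-null case through an `ok` flag instead of relying on a documented pre-condition.

```go
package main

import "fmt"

// minViewOffset is a hypothetical helper for illustration only.
// It scans the offsets of non-null views and returns the lowest one.
// ok is false when no view is valid, because an empty set has no minimum.
func minViewOffset(offsets []int32, valid []bool) (lowest int32, ok bool) {
	for i, off := range offsets {
		if !valid[i] {
			continue // view is under a null bit
		}
		if !ok || off < lowest {
			lowest, ok = off, true
		}
	}
	return lowest, ok
}

func main() {
	// Some views are valid: the minimum is well defined.
	fmt.Println(minViewOffset([]int32{7, 3, 5}, []bool{true, false, true}))
	// Every view is null: the caller has to decide what to do.
	fmt.Println(minViewOffset([]int32{7, 3}, []bool{false, false}))
}
```

Either approach, a documented pre-condition or an explicit `ok` result, leaves the decision about the all-null case with the caller.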

go/arrow/array/list.go

felipecrv

There is a lot of noise in this PR. Can you split the refactoring you did from the cABI changes?

go/arrow/cdata/cdata.go

go/arrow/cdata/cdata_exports.go

go/arrow/compute/fieldref.go

go/arrow/internal/flatbuf/Binary.go

go/arrow/internal/debug/log_on.go

go/arrow/internal/flatbuf/DictionaryBatch.go

go/arrow/array/list.go

zeroshade · 2023-12-15T20:21:58Z

@pitrou any further issues here / remaining things to be addressed?

zeroshade · 2023-12-18T15:28:06Z

If there are no objections and no other changes requested, I'll merge this at the end of the day.

conbench-apache-arrow · 2023-12-20T10:26:22Z

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 3e182f2.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

…gView (apache#39019)

### Rationale for this change

The Go implementation should support import/export of the new data types. This will enable integration testing between the C++ and Go implementations.

### What changes are included in this PR?

Added import/export for the new data types and arrays of data of those types.

### Are these changes tested?

Yes, they will be covered by the integration tests and existing Go unit tests.

### Are there any user-facing changes?

This is a user facing change

* Closes: apache#39013

Lead-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Co-authored-by: Matt Topol <zotthewizard@gmail.com>
Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Matt Topol <zotthewizard@gmail.com>

bkietz requested review from assignUser, kou, raulcd and zeroshade as code owners December 1, 2023 01:40

github-actions bot added the Component: Go and awaiting committer review labels Dec 1, 2023

bkietz and others added 12 commits December 12, 2023 12:21

apacheGH-39013: [Go][Integration] Support cABI import/export of Strin…

56f7134

…gView in Go

amend integration JSON field names

5dd6ea5

debugging, maybe

9766f85

add datagen.py support for list view

a276351

debugging session

b89bd69

fix failing binaryview test

8e80f5a

add c ABI support for list view

c656338

ensure offsets under null bits are not ignored

0c33184

move GetBytes, GetData and other utilities to type_traits.go

33a679f

fmt

44d8f73

GetOffsets

8e4a09c

unused import

2505971

bkietz force-pushed the 39013-go-utf8view-c-abi branch from e050c73 to 2505971 Compare December 12, 2023 17:22

GetData/GetBytes with empty slices

7005868

bkietz requested a review from felipecrv December 13, 2023 17:23

pitrou requested changes Dec 13, 2023


github-actions bot added the awaiting changes label and removed the awaiting committer review label Dec 13, 2023

fmt

0678158

github-actions bot added the awaiting change review label and removed the awaiting changes label Dec 13, 2023

felipecrv reviewed Dec 13, 2023


felipecrv requested changes Dec 13, 2023


ensure slices have full capacity

14fe8ec

github-actions bot added the awaiting changes label Dec 14, 2023

bkietz added 3 commits December 15, 2023 08:55

randomize offsets

03a12f9

update Integration.rst

b923766

review comments

e10341a

github-actions bot added the Component: Documentation, awaiting change review and awaiting changes labels and removed the awaiting changes and awaiting change review labels Dec 15, 2023

avoid use of reflect.SliceHeader

cb6a4ef

github-actions bot added the awaiting change review and awaiting changes labels and removed the awaiting changes and awaiting change review labels Dec 15, 2023

fix listview slicing optimization

297f456

github-actions bot added the awaiting change review label and removed the awaiting changes label Dec 15, 2023

zeroshade approved these changes Dec 15, 2023


github-actions bot added the awaiting merge label and removed the awaiting change review label Dec 15, 2023

felipecrv reviewed Dec 15, 2023


go/arrow/array/list.go (outdated, resolved)

Add pre-condition comment back

e3467e9

felipecrv approved these changes Dec 15, 2023


zeroshade merged commit 3e182f2 into apache:main Dec 19, 2023

zeroshade removed the awaiting merge label Dec 19, 2023
