Skip to content

Conversation

@bkietz
Copy link
Member

@bkietz bkietz commented May 8, 2019

  • FixedSizeListType, parameterized by the list size (similar to FixedSizeBinaryType), constructed with fixed_size_list(<value field or type>, list_size)
  • FixedSizeListArray, including Equal and pretty print impls
  • FixedSizeListBuilder, works with MakeBuilder

Copy link
Contributor

@fsaintjacques fsaintjacques left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It think it's worth adding support in the integration tests as Java already support this type:

https://github.com/apache/arrow/blob/master/integration/integration_test.py

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer the explicit int64_t i as parameter. A bit better for the ABI.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though it's independent of index?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather remove the argument. Or you can provide two separate overrides. A variadic function looks bizarre here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing it to be called with an argument means I don't have to rewrite pretty-print :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it worth having common abstract class?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's not much opportunity for code reuse in arrays/builders. The only advantage I can think of would be easier reuse of code which consumes ListArray (but only code which doesn't directly access the offsets buffer). Those algorithms could use ArrayData BasicListArray::GetView(int64_t i) or similar. This seems like speculative generality to me; I'd be more open to doing it if you can think of existing code which would benefit from being refactored this way.

FixedSizeListScalar and ListScalar are identical but also trivial.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with @bkietz. Also code reuse can be achieved through templating (we generally care about performing non-virtual calls anyway).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@bkietz bkietz force-pushed the 1280-Implement-Fixed-Size-List-type branch from 45e309c to b47e059 Compare May 9, 2019 17:34
Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Overall this looks good. Still some comments below.

Side question: should support for FixedSizeListType added to ArrayFromJSON? Perhaps as a separate PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might add a simple test for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if you could dig a bit (or perhaps open a JIRA about this, explaining the issue and/or adding a reproducer).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed, this is a bug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this builds an invalid FixedSizeListArray? No values are appended to the child array.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does build an invalid array, but the point of this test is just to check that the field name is correctly propagated to the built array

@bkietz
Copy link
Member Author

bkietz commented May 9, 2019

@pitrou Added ArrayFromJSON impl

@bkietz bkietz force-pushed the 1280-Implement-Fixed-Size-List-type branch from 78982ab to 80f444f Compare May 10, 2019 19:50
@pitrou
Copy link
Member

pitrou commented May 13, 2019

@kou There is a meson-related error on Travis-CI, could you advise?
See https://travis-ci.org/apache/arrow/jobs/530907671

@pitrou
Copy link
Member

pitrou commented May 13, 2019

Looks like there's also a MinGW test failure:
https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/24461942/job/mper2jipn6kj5bxj

Error: test: csv.gz(TableTest::#save and .load::path:::format::load: auto detect): Arrow::Error::Io: [csv-reader][read]: IOError: zlib inflate failed: invalid distance too far back
c:/Ruby26/lib/ruby/gems/2.6.0/gems/gobject-introspection-3.3.6/lib/gobject-introspection/loader.rb:616:in `invoke'
c:/Ruby26/lib/ruby/gems/2.6.0/gems/gobject-introspection-3.3.6/lib/gobject-introspection/loader.rb:616:in `invoke'
c:/Ruby26/lib/ruby/gems/2.6.0/gems/gobject-introspection-3.3.6/lib/gobject-introspection/loader.rb:533:in `block in define_method'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:158:in `block (2 levels) in load_from_path'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:147:in `block (2 levels) in wrap_input'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:140:in `open_encoding_convert_stream'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:146:in `block in wrap_input'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:125:in `block in open_decompress_input'
C:/projects/arrow/ruby/red-arrow/lib/arrow/block-closable.rb:25:in `open'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:124:in `open_decompress_input'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:145:in `wrap_input'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:157:in `block in load_from_path'
C:/projects/arrow/ruby/red-arrow/lib/arrow/block-closable.rb:25:in `open'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:156:in `load_from_path'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:39:in `load'
C:/projects/arrow/ruby/red-arrow/lib/arrow/csv-loader.rb:26:in `load'
C:/projects/arrow/ruby/red-arrow/lib/arrow/table-loader.rb:158:in `load_as_csv'
C:/projects/arrow/ruby/red-arrow/lib/arrow/table-loader.rb:50:in `load'
C:/projects/arrow/ruby/red-arrow/lib/arrow/table-loader.rb:22:in `load'
C:/projects/arrow/ruby/red-arrow/lib/arrow/table.rb:27:in `load'
C:/projects/arrow/ruby/red-arrow/test/test-table.rb:503:in `block (5 levels) in <class:TableTest>'

@pitrou
Copy link
Member

pitrou commented May 13, 2019

Ok, apparently the Travis-CI failure is unrelated:
https://issues.apache.org/jira/browse/ARROW-5306

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@pitrou pitrou force-pushed the 1280-Implement-Fixed-Size-List-type branch from 80f444f to 920a251 Compare May 13, 2019 13:25
@pitrou
Copy link
Member

pitrou commented May 13, 2019

Turns out the MinGW failure was sporadic. Merging.

@pitrou pitrou closed this in ea271b3 May 13, 2019
@pitrou
Copy link
Member

pitrou commented May 13, 2019

Thanks @bkietz !

@fsaintjacques Integration tests are tracked in https://issues.apache.org/jira/browse/ARROW-1278.

@kou
Copy link
Member

kou commented May 13, 2019

If the MinGW failure is caused again, I'll look into it.

emkornfield pushed a commit that referenced this pull request Jun 14, 2019
Adds integration tests for fixed_size_list
Also adds support for fixed_size_list to RecordBatchSerializer, which was omitted in #4278

Author: Benjamin Kietzman <bengilgit@gmail.com>

Closes #4309 from bkietz/1278-integration-tests-for-fixed-size-list and squashes the following commits:

8b356f3 <Benjamin Kietzman> revert removal of ninja-build from dockerfile
e7ed001 <Benjamin Kietzman> fix flake8 error
8ab4efc <Benjamin Kietzman> Adding integration tests for fixed_size_list
@bkietz bkietz deleted the 1280-Implement-Fixed-Size-List-type branch February 25, 2021 16:10
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
Adds integration tests for fixed_size_list
Also adds support for fixed_size_list to RecordBatchSerializer, which was omitted in apache#4278

Author: Benjamin Kietzman <bengilgit@gmail.com>

Closes apache#4309 from bkietz/1278-integration-tests-for-fixed-size-list and squashes the following commits:

8b356f3 <Benjamin Kietzman> revert removal of ninja-build from dockerfile
e7ed001 <Benjamin Kietzman> fix flake8 error
8ab4efc <Benjamin Kietzman> Adding integration tests for fixed_size_list
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants