Skip to content

Conversation

@bkietz
Copy link
Member

@bkietz bkietz commented May 20, 2019

Implements MapArray as a subclass of ListArray, where each value in the list is a key: item pair. (This naming is not the most natural, but value is taken.)

MapType::keys_sorted() is currently stored but unused- for example MapBuilder does not check inserted keys for correct ordering. MapType is printed as map<utf8, int32> and map<int32, float64, keys_sorted> for unsorted, sorted keys respectively.

Map arrays are created with ArrayFromJSON by providing for each pair an array of length 2 containing the key and the mapped item (example).

@wesm
Copy link
Member

wesm commented May 20, 2019

Will review this when I can. Should we try to inherit MapArray from ListArray? Seems like it might make things easier but I will take a closer look

@bkietz bkietz force-pushed the 1207-Implement-Map-logical-type branch 2 times, most recently from 59bb5e0 to d62efcf Compare May 22, 2019 15:30
@pitrou pitrou changed the title ARROW-1207: C++ implement map logical type ARROW-1207: [C++] implement map logical type May 28, 2019
@pitrou
Copy link
Member

pitrou commented May 28, 2019

@bkietz You should be able to fix the Travis-CI failures by rebasing from master.

@wesm wesm changed the title ARROW-1207: [C++] implement map logical type ARROW-1207: [C++] Implement MapArray, MapBuilder, MapType classes May 28, 2019
@wesm wesm changed the title ARROW-1207: [C++] Implement MapArray, MapBuilder, MapType classes ARROW-1207: [C++] Implement MapArray, MapBuilder, MapType classes, and IPC support May 28, 2019
@bkietz bkietz force-pushed the 1207-Implement-Map-logical-type branch from 64fdad9 to ac878bb Compare May 28, 2019 16:56
@codecov-io
Copy link

Codecov Report

Merging #4352 into master will increase coverage by 0.84%.
The diff coverage is 65.83%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4352      +/-   ##
==========================================
+ Coverage   88.41%   89.26%   +0.84%     
==========================================
  Files         784      639     -145     
  Lines      100266    89293   -10973     
  Branches     1251        0    -1251     
==========================================
- Hits        88651    79705    -8946     
+ Misses      11379     9588    -1791     
+ Partials      236        0     -236
Impacted Files Coverage Δ
cpp/src/arrow/type_traits.h 85.71% <ø> (ø) ⬆️
cpp/src/arrow/visitor.h 50% <ø> (ø) ⬆️
cpp/src/arrow/visitor_inline.h 88.11% <ø> (ø) ⬆️
cpp/src/arrow/ipc/metadata-internal.cc 88.17% <0%> (-1.89%) ⬇️
cpp/src/arrow/util/concatenate.cc 88.05% <0%> (-3.46%) ⬇️
cpp/src/arrow/scalar.cc 48.83% <0%> (-7.92%) ⬇️
cpp/src/arrow/visitor.cc 0% <0%> (ø) ⬆️
cpp/src/arrow/compute/kernels/take.cc 89.32% <0%> (-1.77%) ⬇️
cpp/src/arrow/scalar.h 85.71% <0%> (-2.53%) ⬇️
cpp/src/arrow/compare.cc 79.31% <0%> (-1.53%) ⬇️
... and 164 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d7838e...ac878bb. Read the comment docs.

@bkietz
Copy link
Member Author

bkietz commented May 28, 2019

@pitrou rebased and green

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bkietz . A couple more comments below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"keys_sorted=true" perhaps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you prefer

@bkietz bkietz force-pushed the 1207-Implement-Map-logical-type branch from ac878bb to 45895f3 Compare June 4, 2019 15:50
@bkietz bkietz force-pushed the 1207-Implement-Map-logical-type branch from 80d69bc to 9fb8700 Compare June 11, 2019 13:09
@pitrou
Copy link
Member

pitrou commented Jun 11, 2019

@wesm @bkietz I have a question (perhaps late) about the type layout. Why does the map have a struct child array instead of directly having two child arrays? This means, for example, that a map array can theoretically have null key-value pairs...

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, but see question about the type layout definition.

if (children.size() != 1) {
return Status::Invalid("Map must have exactly 1 child field");
}
if ( // FIXME(bkietz) temporarily disabled: this field is sometimes read nullable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know where the issue lies?

@bkietz
Copy link
Member Author

bkietz commented Jun 11, 2019

@pitrou This allows MapArray to be a specialization of ListArray, with corresponding reuse of code (comparison, for example)

@bkietz
Copy link
Member Author

bkietz commented Jun 11, 2019

Null pairs are disallowed with DCHECKs, and neither MapBuilder nor ArrayFromJSON will produce a MapArray with a null pair.

@pitrou
Copy link
Member

pitrou commented Jun 11, 2019

Thanks. I was just curious. Since the definition was already in the format spec, I won't object :-)

@pitrou pitrou closed this in dede1e6 Jun 11, 2019
@pitrou
Copy link
Member

pitrou commented Jun 11, 2019

This is a nice step forward. @bkietz is there a JIRA open for the integration issue? If not, can you open one?

@bkietz
Copy link
Member Author

bkietz commented Jun 11, 2019

@BryanCutler merged

JIRA for integration tests: https://issues.apache.org/jira/browse/ARROW-1279

@BryanCutler
Copy link
Member

Nice! The Java MapVector is merged also, so I think we can just enable the integration test and hope for green :)
Were you going to make a PR @bkietz ?

@BryanCutler
Copy link
Member

Well, I got the integration tests running and there was only 1 error!
C++ producing, Java consuming

==========================================================
Testing file /tmp/tmplmqwtdnv/generated_map.json
==========================================================

14:14:18.822 [main] ERROR org.apache.arrow.tools.Integration - Incompatible files
java.lang.IllegalArgumentException: Map data should be a non-nullable struct type

Looks like the struct field was written as nullable
arrow-map-integration-test-error

@bkietz bkietz deleted the 1207-Implement-Map-logical-type branch February 25, 2021 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants