Skip to content

Conversation

@trxcllnt
Copy link
Contributor

@trxcllnt trxcllnt commented Sep 12, 2019

This PR closes ARROW-5762 by aligning the Map implementation with the spec, enabling its inclusion in the integration tests.

Feature-wise, the biggest change is to the Struct and Map rows. Row is now an abstract base class extended by StructRow and MapRow.

Row implements JS's native Map interface (and will pass row instanceof Map checks). To align with JS Maps, the Symbol.iterator return type changes from IterableIterator<TValue> to IterableIterator<[TKey, TValue]>.

Struct fields are uniform and known in advance, whereas Map fields vary by row. The StructRow will still take advantage of the Object.create() technique to create fast instances using a single StructRow instance as its prototype.

However MapRow must either be dynamic or have its fields defined on construction, so I've changed MapRow to return an ES6 Proxy if available (supported everywhere except IE11) and fall back to Object.defineProperties() if not.

Copy link
Member

@TheNeuralBit TheNeuralBit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 LGTM other than the CI failure.. looks like there's an issue with ordering in es2015-umd

return row;
};
})();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you run the performance tests to make sure this change doesn't affect perf for the row proxies? I don't think it would, but doesn't hurt to make sure.

function generateMap<T extends Map_>(this: TestDataVectorGenerator, type: T, length = 100, nullCount = length * 0.2 | 0, children = type.children.map((f) => this.visit(f.type, length, nullCount))): GeneratedVector<V<T>> {
const vecs = children.map(({ vector }) => vector);
const cols = children.map(({ values }) => values);
function generateMap<T extends Map_>(this: TestDataVectorGenerator,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding a simple test for MapVector to vector-tests.ts as well? The generated data tests are nice for completeness, but I like that vector-tests serve as a form of documentation and a sanity check in case anything goes wrong in this test

@codecov-io
Copy link

Codecov Report

Merging #5371 into master will increase coverage by 1.98%.
The diff coverage is 72.85%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #5371       +/-   ##
===========================================
+ Coverage   87.57%   89.56%    +1.98%     
===========================================
  Files        1041      102      -939     
  Lines      151814     6603   -145211     
  Branches     1447     1495       +48     
===========================================
- Hits       132958     5914   -127044     
+ Misses      18492      678    -17814     
+ Partials      364       11      -353
Impacted Files Coverage Δ
js/src/Arrow.ts 100% <ø> (ø) ⬆️
js/src/visitor/jsonvectorassembler.ts 100% <ø> (ø) ⬆️
js/src/util/vector.ts 97.05% <100%> (+1.22%) ⬆️
js/src/data.ts 91.66% <100%> (ø) ⬆️
js/src/vector/base.ts 89.13% <100%> (ø) ⬆️
js/src/visitor/vectorassembler.ts 82.85% <100%> (ø) ⬆️
js/src/ipc/metadata/message.ts 93.26% <100%> (ø) ⬆️
js/src/vector/struct.ts 100% <100%> (+22.22%) ⬆️
js/src/visitor/get.ts 100% <100%> (ø) ⬆️
js/src/visitor/vectorloader.ts 98.79% <100%> (ø) ⬆️
... and 950 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6a2a3de...3a0eca9. Read the comment docs.

@wesm wesm closed this in abc7860 Sep 15, 2019
pprudhvi pushed a commit to pprudhvi/arrow that referenced this pull request Sep 16, 2019
This PR closes [ARROW-5762](https://issues.apache.org/jira/browse/ARROW-5762) by aligning the Map implementation with the spec, enabling its inclusion in the integration tests.

Feature-wise, the biggest change is to the `Struct` and `Map` rows. `Row` is now an abstract base class extended by `StructRow` and `MapRow`.

Row implements [JS's native Map](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map) interface (and will pass `row instanceof Map` checks). To align with JS Maps, the `Symbol.iterator` return type changes from `IterableIterator<TValue>` to `IterableIterator<[TKey, TValue]>`.

`Struct` fields are uniform and known in advance, whereas `Map` fields vary by row. The `StructRow` will still take advantage of the `Object.create()` technique to create fast instances using a single `StructRow` instance as its prototype.

However `MapRow` must either be dynamic or have its fields defined on construction, so I've changed `MapRow` to return an ES6 Proxy if available ([supported everywhere except IE11](https://caniuse.com/#search=Proxy)) and fall back to `Object.defineProperties()` if not.

Closes apache#5371 from trxcllnt/ARROW-5762/map-type and squashes the following commits:

3a0eca9 <ptaylor> ensure generated strings are unique
da2a1ed <ptaylor> Merge branch 'master' into ARROW-5762/map-type
58f244a <ptaylor> reimplement Map type

Authored-by: ptaylor <paul.e.taylor@me.com>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants