
@wesm
Member

@wesm wesm commented Jun 12, 2017

cc @BryanCutler; this ended up being rather tedious.

One uncertainty: I wasn't sure what to write for the typeLayout field for dictionary fields. We don't use this at all in C++, so let me know what you need in Java (if anything).

wesm added 5 commits June 11, 2017 17:57
Change-Id: Ib91476b24b1bfe11be7607d9bc0a324a6d99ca2d
Change-Id: Id87c9f768b26e0c5548f69f077e370cdefcbc464
Change-Id: I4a341f9fdd6b2fb12311908a30b26c85e76a8781
Change-Id: I30a98d0245971bc156d227206f02215cd027b4a0
Change-Id: I6d1146de9131d323d3ac4c66f74750dad4a82f95
@BryanCutler
Member

Since typeLayout is optional in Java, you don't need to write anything and it will default to a 32-bit Int. I believe this field is for specifying an Int of a different bit width, so does C++ just use a 32-bit index?

@wesm
Member Author

wesm commented Jun 13, 2017

We don't use the typeLayout at all on read; we use the indexType from the dictionary metadata
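
For illustration, reading the index type from the dictionary metadata while ignoring typeLayout could look roughly like the sketch below. The JSON key names (`dictionary`, `indexType`, `bitWidth`, `isSigned`) and the helper `resolve_index_type` are assumptions made for this example; they are not copied from the PR or from the C++ reader.

```python
import json

# Hypothetical JSON fragment for one dictionary-encoded field. The key names
# are illustrative assumptions, not taken from the PR or the linked gist.
FIELD_JSON = """
{
  "name": "dict1",
  "dictionary": {
    "id": 0,
    "indexType": {"name": "int", "bitWidth": 16, "isSigned": true}
  },
  "typeLayout": {"vectors": []}
}
"""


def resolve_index_type(field):
    """Return (bit_width, is_signed) for a dictionary field's indices.

    Only the dictionary metadata is consulted; typeLayout is deliberately
    ignored, mirroring the behaviour described in the comment above.
    """
    dictionary = field.get("dictionary")
    if dictionary is None:
        raise ValueError("field is not dictionary-encoded")
    index_type = dictionary["indexType"]
    return index_type["bitWidth"], index_type["isSigned"]


field = json.loads(FIELD_JSON)
bit_width, is_signed = resolve_index_type(field)
print(bit_width, is_signed)
```

Because the typeLayout entry is never read, leaving it empty or filling it with the index layout makes no difference to a reader written this way.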

Change-Id: I07aa680108570f2ea9cb462376e540fa0038b991
@wesm
Member Author

wesm commented Jun 13, 2017

OK, dictionaries are now at the top level. Here's an example JSON:

https://gist.github.com/wesm/5100e41173a3b5e53437b7a887e4383a#file-gistfile1-json

@wesm
Member Author

wesm commented Jun 13, 2017

As you can see in the JSON, the typeLayout I am writing is the index type

@wesm
Member Author

wesm commented Jun 13, 2017

+1. Merging this; we can reconcile JSON format discrepancies when the Java version is complete and we add the integration tests themselves.

@asfgit asfgit closed this in 25ba44c Jun 13, 2017
@BryanCutler
Member

BryanCutler commented Jun 13, 2017

Sorry @wesm, I got mixed up between indexType and typeLayout in my response. Java expects the encoded Field to have a typeLayout to describe the dictionary type so it can then proceed to read the dictionary data. How do you do that in C++ if that metadata isn't there?

@BryanCutler
Member

I'll post my PR soon. There are still a couple of things that need to be worked out, but it can read/write JSON so we can start testing.

@wesm wesm deleted the ARROW-460 branch June 13, 2017 18:43
@wesm
Member Author

wesm commented Jun 13, 2017

The typeLayout right now is deterministic based on the other type metadata, so we don't need it. I can change the typeLayout to be the dictionary type layout if it causes a problem
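
As a rough sketch of what "deterministic based on the other type metadata" can mean, a layout can be computed from a type description alone. The function `derive_type_layout`, the vector names, and the bit widths below are simplified assumptions for illustration, not the actual C++ or Java implementation.

```python
def derive_type_layout(type_meta):
    """Derive a list of buffer descriptions from type metadata alone.

    Hypothetical and simplified: only two type names are handled, and the
    vector names and bit widths are assumptions made for this example.
    """
    validity = {"type": "VALIDITY", "typeBitWidth": 1}
    name = type_meta["name"]
    if name == "int":
        # Fixed-width integers: a validity bitmap plus one data buffer.
        return [validity, {"type": "DATA", "typeBitWidth": type_meta["bitWidth"]}]
    if name == "utf8":
        # Variable-length strings: validity, 32-bit offsets, then byte data.
        return [
            validity,
            {"type": "OFFSET", "typeBitWidth": 32},
            {"type": "DATA", "typeBitWidth": 8},
        ]
    raise NotImplementedError(f"no layout rule for type {name!r}")


# The same metadata always yields the same layout, so storing the layout
# alongside the type adds no information for a reader that applies these rules.
int16_layout = derive_type_layout({"name": "int", "bitWidth": 16, "isSigned": True})
utf8_layout = derive_type_layout({"name": "utf8"})
print(int16_layout)
print(utf8_layout)
```

Under these assumptions, the open question in the thread reduces to which type description is passed in for a dictionary field: the index type or the dictionary value type.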
