-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-5586: [R] convert Array of LIST type to R lists #4575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
r/src/array__to_vector.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a relatively costly thing to do. I'm not sure how we can escape it. Do you keep a reference to the generated Slice/Array, or does it get consumed and transformed in something else. See #4366 (comment) for reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
::Allocate() makes an R list, and the various ::Ingest.*() fill that list.
I'll look if/how I can do this without actually calling Slice(), might be easy enough to just scan through values_array I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The challenging part is dealing with the bitmap and null_count.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be a huge timesink, I think it's ok for now, but be wary that this will be a problem on large Array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I don't think it's too much trouble to skip using Slice and use the bitmap of the values_array etc ... I'll have a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 yeah it's about the Array__as_vector(slice) which needs slice as a proper array already. The Converter api is designed to ingest all of an array, so I guess it would need some extra methods to only ingest a slice of an array without. I'll open another issue for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r/src/arrowExports.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this file very error prone to read. I'd recommend going selective import route.
arrowExports.cpp # with ifdef import of impl/mock depending on ARROW_R_WITH_ARROW
arrowExports_impl.cpp
arrowExports_mock.cpp
Don't do this in this PR, open a followup ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I would expect anyone to actually read that file (hence it is supposed to be hidden in pull requests, ...) but I've open that issue:
r/src/array__to_vector.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth adding support for FixedSizeList?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is new to me, that probably is another issue/pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, but the code will be similar, almost equal, FixedSizeList also export value_offset.
r/src/array.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are "low-level" functions, they don't perform any bound checking, you can segfault with the wrong i. Do they need to be exported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷♂ what's the rule about what is export worthy ?
Codecov Report
@@ Coverage Diff @@
## master #4575 +/- ##
==========================================
- Coverage 83.42% 74.98% -8.44%
==========================================
Files 209 56 -153
Lines 16062 3358 -12704
Branches 1253 0 -1253
==========================================
- Hits 13399 2518 -10881
+ Misses 2384 840 -1544
+ Partials 279 0 -279
Continue to review full report at Codecov.
|
8229213 to
300f897
Compare
We don't have the reverse operation yet (convert from an R data structure) to a list array, so those aren't easy to make, but e.g. we can get some with the json reader:
Created on 2019-06-14 by the reprex package (v0.3.0.9000)