-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-9848: [Rust] Implement 0.15 IPC alignment #8174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@maxburke this will likely affect you, please have a look at it before the next release if you can. The Rust writer wasn't writing continuation markers consistently, and this is fixed. Readers should be able to continue reading, but it's worth testing out with your workloads. |
ed4bb31 to
7eee08a
Compare
| pub use self::gen::SparseTensor::*; | ||
| pub use self::gen::Tensor::*; | ||
|
|
||
| static ARROW_MAGIC: [u8; 6] = [b'A', b'R', b'R', b'O', b'W', b'1']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is const better than static here? IIRC I used static for ARROW_MAGIC when const either wasn't a thing yet, or I was unfamiliar with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I think this can be const.
|
I'll look into the integration failures during the week, but it's C++ null arrays. I noticed similar failures when I was trying to read a null array from pyarrow. Seems I'm likely not understanding the spec correctly (odd that C++ is the only failure when Java and others pass). |
|
In C++ we are more hardened against metadata integrity issues from all the IPC fuzzing, e.g. https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/metadata_internal.cc#L747 According to the Flatbuffers spec, this field should be length-0, not null. |
|
Thanks Wes, from looking at the cpp source where the error occurs, the 0-length case makes me wonder if this might be a fbs issue in Rust. I'll have a look at whether the Rust impl distinguishes between a null and empty array value. |
Children should not be None/null, but should be an empty list
|
@wesm I found the problem, I was setting children as null instead of an empty list, so not a library issue. Thanks again |
|
@andygrove @paddyhoran @sunchao the changes here still work with the 0.14.x golden files, and 0.15+ as integration tests pass. Would you like to perform an in-depth review, or can I merge the changes? I have a follow-up that enables more tests, which I'd like to try wrap up over/before the weekend |
paddyhoran
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't do an in-depth review but from a high level this looks good. Let's merge this and keep the momentum going. Great job @nevi-me
| pub use self::gen::SparseTensor::*; | ||
| pub use self::gen::Tensor::*; | ||
|
|
||
| static ARROW_MAGIC: [u8; 6] = [b'A', b'R', b'R', b'O', b'W', b'1']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I think this can be const.
Thanks for the heads-up, @nevi-me, we'll give it a solid runthrough the next time we pull. |
Changes introduced in 0.15.0 changed the buffer alignment by adding a continuation marker to messages.
The previous behaviour was then marked as legacy, while both worked under V4 of the IPC metadata version.
This change catches the Rust implementation up to other languages, and has the consequence that more integration tests now pass.
The change is applied on top of clippy changes, and can be reviewed/merged after the relevant PRs.