ARROW-6883: [C++][Python] Allow writing dictionary deltas #8811
Acknowledged, will review when I can.
Force-pushed: 47a1e91 to 3e11e6b
* Add an ipc::IpcWriteOptions member to govern emission of dictionary deltas. If the option is enabled, deltas are detected by checking whether the new dictionary starts with the last emitted one for the same field. However, for nested dictionaries, deltas are not emitted for the outer dictionary, as the read path doesn't support it.
* Add a stats() method to ipc::StreamDecoder.
* Expose the IPC statistics in Python, and add tests.
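The prefix check described in the first bullet can be illustrated with a small pure-Python sketch. This is not the C++ implementation from this PR: the function name `classify_dictionary_update` and its return labels are hypothetical, dictionaries are modeled as plain lists, the "unchanged" case is an assumption of this sketch, and the nested-dictionary restriction is not modeled.

```python
from typing import List, Optional, Sequence, Tuple


def classify_dictionary_update(
    last_emitted: Optional[Sequence], new: Sequence
) -> Tuple[str, List]:
    """Decide what to send for one field (hypothetical helper).

    Returns a (kind, payload) pair where kind is one of
    "initial", "unchanged", "delta" or "replacement".
    """
    new_values = list(new)
    if last_emitted is None:
        # Nothing was emitted for this field yet: send the full dictionary.
        return "initial", new_values

    old_values = list(last_emitted)
    n = len(old_values)

    if new_values == old_values:
        # Same contents: nothing needs to be sent (assumption of this sketch).
        return "unchanged", []

    if len(new_values) > n and new_values[:n] == old_values:
        # The new dictionary starts with the last emitted one,
        # so only the appended values need to be sent as a delta.
        return "delta", new_values[n:]

    # Otherwise the whole dictionary has to be sent again.
    return "replacement", new_values
```

For example, going from `["a", "b"]` to `["a", "b", "c"]` yields a delta containing only `["c"]`, while going to `["b", "a", "c"]` yields a full replacement because the old values are no longer a prefix.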
Force-pushed: 3e11e6b to 16beb03
Force-pushed: 16beb03 to be9d921
cc @lidavidm
The Flight changes look good to me (though there's still ARROW-10787).
wesm left a comment
+1 here. Sorry for the long delay.
cpp/src/arrow/ipc/options.h
Outdated
It might help to say explicitly that otherwise the order of dictionaries may be altered in the passage from sender to receiver. (I assume that's what you mean here by "stream compatibility", but others might be scratching their heads.)
No, I simply meant that it's off for compatibility with implementations that don't support delta dictionaries.