ARROW-10960: [C++][FlightRPC] Default to empty buffer instead of null #8962
Thanks for catching & taking a look at this! I think, rather than change the IPC class, we should make the FlightData Protobuf deserializer act more like a real Protobuf parser, and fill in an empty body instead of leaving the buffer as null if there was not a buffer read from the wire. That should probably happen here: arrow/cpp/src/arrow/flight/serialization_internal.cc Lines 376 to 377 in 519e9da
I'll admit that even having written the code, it's hard to trace the path from here to where you currently have the fix... Here is where we invoke the deserializer: arrow/cpp/src/arrow/flight/serialization_internal.cc Lines 429 to 432 in 25c736d
That in turn is called from a peekable reader:
The reader fills in a FlightData which gets passed to the RecordBatchReader here: arrow/cpp/src/arrow/flight/server.cc Line 190 in 519e9da
Calling ReadNext implicitly bounces through this glue class: arrow/cpp/src/arrow/flight/server.cc Lines 96 to 119 in 519e9da
Which finally actually constructs the IPC Message in this PR: arrow/cpp/src/arrow/flight/serialization_internal.cc Lines 382 to 384 in 25c736d
And FWIW, there's a long-standing issue (ARROW-4419) for testing the Flight implementation against a 'plain gRPC' client/server to catch issues like this; we intend for Flight to still be compatible with regular gRPC clients that haven't been specially optimized.
@lidavidm Thank you for the pointers! That's very helpful; I'm going to try making the fix in the spot you suggested instead and push an update in a bit.
Note I took a look at the Java side of this and the fix may not be so simple, since some code paths in Java expect there to be 0 buffers, and providing a default buffer breaks them; C++ may have the same issue. (Those paths are arguably also wrong, since Protobuf implementations can write the empty field...)
I made the same sort of fix in the spot you pointed out, to each of the 3 byte fields in
Yuck. Should I open a new Jira ticket to track the Java side or should this ticket cover the same problem in the Java and C++ implementations?
Hm, isn't the C++ client currently writing the empty field? Or is it writing the field to |
Should be good once CI passes, thank you!
I filed ARROW-10939 and am looking at it right now. I also cross-linked the three outstanding issues on Jira.
Sorry, I was talking about the Java implementation - unilaterally filling in an empty buffer on deserialization when not provided there broke other code which assumed there would be no buffer instead of an empty one. (So what I meant is that while the original code would be fine with any Protobuf implementation that doesn't write out empty fields, it would break with one that did write them out.)
Yes, I meant that because the C++ client is currently a Protobuf implementation that is writing out an empty field (I think), wouldn't the archery integration tests with the Java server and C++ client on the generated_null test currently be failing?
Ah sorry - C++ is fine because it only writes out those bytes conditionally, for message types that require a body (e.g. a RecordBatch). But if an implementation wrote out an (empty) body for a Schema, the Java client/server would break.
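To make the "writes out those bytes conditionally" point concrete, here is a minimal sketch, not the actual Arrow serializer. ToyMessageType and FieldsWritten are hypothetical names invented for illustration; only the FlightData field names data_header and data_body are real. It assumes a writer that emits data_body only for message types that carry a body.

```cpp
#include <string>
#include <vector>

// Hypothetical message kinds; the real IPC format has more.
enum class ToyMessageType { kSchema, kRecordBatch };

// Returns the names of the FlightData fields a toy writer would emit.
// Assumption: the body field is written only for types that require one,
// and then it is written even when the body has zero bytes.
std::vector<std::string> FieldsWritten(ToyMessageType type) {
  std::vector<std::string> fields = {"data_header"};
  if (type == ToyMessageType::kRecordBatch) {
    fields.push_back("data_body");  // present on the wire, possibly empty
  }
  return fields;
}
```

Under this assumption a Schema message never carries data_body, so a reader that expects 0 buffers for a Schema keeps working, while a writer that always emitted an empty data_body would trip it.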
// Set default values for unspecified FlightData fields
if (out->app_metadata == nullptr) {
  out->app_metadata = std::make_shared<Buffer>(nullptr, 0);
}
if (out->metadata == nullptr) {
  out->metadata = std::make_shared<Buffer>(nullptr, 0);
}
I think the test failures might come from this, actually - if there's a metadata buffer, that causes some code to assume this must be a schema or record batch and try to parse it accordingly (but it'll then encounter the empty buffer and fail to parse it).
Also, the app_metadata buffer can be omitted - it's meant to be optional so having an empty vs null one is no big deal.
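A rough sketch of why defaulting metadata is risky, under the assumption described above that downstream code treats a non-null metadata buffer as "there is an IPC message to parse". ToyOutcome and Interpret are hypothetical names, and std::string stands in for a buffer; this is not the real Flight code path.

```cpp
#include <memory>
#include <string>

// Hypothetical outcomes of looking at a decoded FlightData.
enum class ToyOutcome { kNoIpcMessage, kParsed, kParseError };

// Assumption: a null metadata pointer means "no schema/record batch here",
// while any non-null buffer is handed to the IPC parser.
ToyOutcome Interpret(const std::shared_ptr<std::string>& metadata) {
  if (metadata == nullptr) {
    return ToyOutcome::kNoIpcMessage;  // e.g. a message with only app_metadata
  }
  if (metadata->empty()) {
    return ToyOutcome::kParseError;  // empty buffer cannot be parsed as a message
  }
  return ToyOutcome::kParsed;
}
```

In this toy model, replacing a null metadata with an empty buffer turns the harmless kNoIpcMessage case into kParseError, which matches the test failures suspected in the comment.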
Whoops! So much for trying to be overly helpful 😅 Pushed a change so this just handles body.
lidavidm
left a comment
Thanks for fixing this!
The three CI failures look like build environment flakes (and are failing on master).
(repeating context from the Jira issue)
Problem
ProtoBuf proto3 specifies that if a message does not contain a particular singular element, the field should get the default value. However, when the C++ flight-test-integration-server gets a DoPut request with a FlightData message for a record batch containing no items, and the FlightData is missing the data_body field, the server responds with an error "Expected body in IPC message of type record batch".
What happens
If I run the C++ flight-test-integration-server and the C++ flight-test-integration-client with the generated_null_trivial test case, the test passes and I see this in wireshark:
Note the data_body field is present but has no value.
If I run the Rust flight-test-integration-client that I'm working on developing, it does not send the data_body field at all if there are no bytes to send. I see this in wireshark:
Note the data_body field is not present.
The C++ server then returns the error message Expected body in IPC message of type record batch, which comes from this check for message body called in ReadNext of the record batch stream reader.
What I expect to happen
Instead of returning an error message because of a null pointer, the Message should get the default value of empty bytes.
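The expected behavior can be sketched roughly as follows. This is a toy model, not the Arrow deserializer: ToyBuffer, ToyFlightData, Decode, and HasBody are hypothetical stand-ins, and std::optional models "field absent from the wire". It only illustrates the proto3 rule that an absent bytes field reads as empty bytes, applied to body alone.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <optional>
#include <vector>

// Hypothetical stand-in for a buffer type: it just owns some bytes.
struct ToyBuffer {
  std::vector<uint8_t> bytes;
  std::size_t size() const { return bytes.size(); }
};

// Hypothetical stand-in for the decoded FlightData payload.
struct ToyFlightData {
  std::shared_ptr<ToyBuffer> metadata;  // stays null when absent
  std::shared_ptr<ToyBuffer> body;      // defaulted to empty when absent
};

// std::nullopt means the field did not appear on the wire at all.
ToyFlightData Decode(const std::optional<std::vector<uint8_t>>& wire_metadata,
                     const std::optional<std::vector<uint8_t>>& wire_body) {
  ToyFlightData out;
  if (wire_metadata) {
    out.metadata = std::make_shared<ToyBuffer>(ToyBuffer{*wire_metadata});
  }
  if (wire_body) {
    out.body = std::make_shared<ToyBuffer>(ToyBuffer{*wire_body});
  }
  // proto3-style default: an absent bytes field behaves like empty bytes.
  if (out.body == nullptr) {
    out.body = std::make_shared<ToyBuffer>();
  }
  return out;
}

// Mirrors the kind of "expected body" check a reader might perform.
bool HasBody(const ToyFlightData& data) { return data.body != nullptr; }
```

With this default in place, a client that omits data_body and a client that sends it with zero bytes decode to the same thing, so a null-pointer check like HasBody no longer distinguishes them.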
The fix
@shepmaster and I worked on this fix together, but I'm not at all confident this is the exact right fix.
It's hard for me to trace through the code from a FlightData data_body to an IPC Message body, but this does fix the problem. I'm also not sure what other cases this change might affect.
I don't know how to write a unit test for this because the C++ code doesn't generate this case, but there will be a test coming in the form of a Rust flight integration test client.
Feedback much appreciated!