Add octet streaming of sketchs in MSQ#16269
Conversation
cryptoe
left a comment
There was a problem hiding this comment.
Partial review. Will finish the review soon.
|
|
||
| @Override | ||
| public void exceptionCaught(ClientResponse<BytesFullResponseHolder> clientResponse, Throwable e) | ||
| { |
There was a problem hiding this comment.
Why would this be empty ?
Shouldn't we do something here ?
There was a problem hiding this comment.
Similar to org.apache.druid.java.util.http.client.response.BytesFullResponseHandler#exceptionCaught, the response would not be ClientResponse#finished(T). My understanding is that nothing else needs to be done in this case.
|
Did some quick benchmarks. |
cryptoe
left a comment
There was a problem hiding this comment.
Left comments. Overall lgtm once the review comments are addressed.
| final int length = Double.BYTES + 2 * Integer.BYTES + bucketKeyArray.length + serializedSnapshot.length; | ||
|
|
||
| outputStream.write( | ||
| ByteBuffer.allocate(Integer.BYTES + length) |
There was a problem hiding this comment.
Please mention that 4 bytes are for the length.
There was a problem hiding this comment.
Added a comment
|
|
||
| protected abstract byte[] serializeKeyCollector(KeyCollectorSnapshot collectorSnapshot); | ||
|
|
||
| public byte[] serialize(KeyCollectorSnapshot collectorSnapshot) |
There was a problem hiding this comment.
Please add test cases for null and empty collectors.
for all the serde methods.
There was a problem hiding this comment.
Added test cases for empty collectors. We don't allow null collectors, we throw an error if the request tries to get a non existent collector before this point. Added a notnull annotation instead.
There are a few issues with using Jackson serialization in sending datasketches between controller and worker in MSQ. This caused a blowup due to holding multiple copies of the sketch being stored.
This PR aims to resolve this by switching to deserializing the sketch payload without Jackson.
The PR adds a new query parameter used during communication between controller and worker while fetching sketches, "sketchEncoding".
If the value of this parameter is OCTET, the sketch is returned as a binary encoding, done by ClusterByStatisticsSnapshotSerde.
If the value is not the above, the sketch is encoded by Jackson as before.
There are a few issues with using Jackson serialization in sending datasketches between controller and worker in MSQ. This caused a blowup due to holding multiple copies of the sketch being stored.
This PR aims to resolve this by switching to deserializing the sketch payload without Jackson.
The PR adds a new query parameter used during communication between controller and worker while fetching sketches, "sketchEncoding".
ClusterByStatisticsSnapshotSerde.Backward compatibility
Query param
The worker client now sends a query parameter which specifies the desired format with the request to fetch the sketch. If the worker receives this parameter, it returns the response encoded as a byte array, with the response header "Content-Type":"octet-stream". If the flag is missing, it defaults to the older behavior of Jackson serialized array.
The newer worker client checks the header and deserializes the response accordingly.
If a newer controller is running with an older worker, the worker will ignore the query parameter, and the controller will not see the response header, and use Jackson deserialization.
If a newer worker is running with an older controller, the query param will not be sent to the worker, which will again default to using Jackson, which the controller will automatically handle.
The above has been tested with an older WorkerClient and newer WorkerChatHandler, and with an older WorkerChatHandler and WorkerClient.
Byte format
The byte format contains a currently empty header byte, which can be used to store version information. This should allow updating the format in the future, if future versions check the header and deserialize accordingly.
Release note
Benchmarks
Benchmarks for jackson serde vs octet serde. The octet serde seems to be around 40% faster in most cases.
This PR has: