HDDS-7228. ChecksumByteBufferImpl.update() is expensive #3759
This is a hack, which uses reflection to reset the isReadOnly field of ByteBuffer to false, so that ChecksumByteBufferImpl.update performs checksum calculation directly without memory copy. Change-Id: Ib571f2d63149e763990880d44278cb2d26c11b55
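For readers unfamiliar with the trick, here is a minimal sketch of what such a reflection hack might look like in isolation. It is not the code from this PR: the class name `ReadOnlyFlagHack` and both helpers are hypothetical, and it assumes the JDK's `java.nio.ByteBuffer` keeps a private field named `isReadOnly`, which is an implementation detail. On newer JDKs the access may be refused unless the `java.nio` package is opened (for example with `--add-opens java.base/java.nio=ALL-UNNAMED`), so the sketch falls back gracefully.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ReadOnlyFlagHack {

  // Hypothetical helper: tries to clear the private isReadOnly flag.
  // Returns true if the flag was cleared, false if reflection was refused.
  public static boolean tryClearReadOnly(ByteBuffer buffer) {
    try {
      // Assumption: the JDK in use declares this field on ByteBuffer.
      Field f = ByteBuffer.class.getDeclaredField("isReadOnly");
      f.setAccessible(true); // may warn or throw on Java 9+
      f.setBoolean(buffer, false);
      return true;
    } catch (ReflectiveOperationException | RuntimeException e) {
      // Field missing or access denied: leave the buffer untouched.
      return false;
    }
  }

  // Computes CRC32 over the remaining bytes of the buffer.
  public static long crc32(ByteBuffer buffer) {
    CRC32 crc = new CRC32();
    crc.update(buffer);
    return crc.getValue();
  }

  public static void main(String[] args) {
    byte[] data = "hello ozone".getBytes(StandardCharsets.UTF_8);
    ByteBuffer readOnly = ByteBuffer.wrap(data).asReadOnlyBuffer();

    System.out.println("hasArray before: " + readOnly.hasArray());
    boolean cleared = tryClearReadOnly(readOnly);
    System.out.println("flag cleared: " + cleared);
    System.out.println("hasArray after: " + readOnly.hasArray());
    // The checksum is the same either way; only the copy is avoided.
    System.out.println("crc32: " + crc32(readOnly));
  }
}
```

When the flag is cleared, `hasArray()` starts returning true and the checksum code can read the backing array directly. When it is not, the checksum is still correct but goes through the copying path.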
dineshchitlangia
left a comment
@jojochuang Pretty hack! LGTM.
I'm still trying to come up with a better implementation. This is hacky.
I've marked this ready for review - any reason not to commit this? It may not be perfect, but it's a lot better than what is there at the moment without this change.
Hmm, some acceptance tests are failing, which I suspect is related.
Various suggestions to suppress the warning - https://stackoverflow.com/questions/46454995/how-to-hide-warning-illegal-reflective-access-in-java-9-without-jvm-argument I suspect it is the warning message that is breaking the acceptance tests, as they are likely expecting a certain output on stdout / stderr and this warning is breaking that expectation.
Yes, the test expects empty output for these commands. This can be suppressed by:
ozone/hadoop-ozone/dist/src/shell/ozone/ozone Lines 89 to 95 in b3c8484
    public void update(ByteBuffer buffer) {
      // this is a hack to not do memory copy.
      try {
        isReadyOnlyField.setBoolean(buffer, false);
I must be missing something here, but isn't buffer defined as readonly by us here:
ozone/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/ChecksumByteBuffer.java
Lines 42 to 43 in b3c8484
    default void update(byte[] b, int off, int len) {
      update(ByteBuffer.wrap(b, off, len).asReadOnlyBuffer());
Instead of using reflection, can't we simply remove the .asReadOnlyBuffer() call?
I guess there are other places that are not so simple to change, e.g. this function used by computeChecksum / verifyChecksum:
ozone/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/utils/BufferUtils.java
Lines 68 to 75 in b3c8484
    public static List<ByteBuffer> getReadOnlyByteBuffers(
        List<ByteString> byteStrings) {
      List<ByteBuffer> buffers = new ArrayList<>();
      for (ByteString byteString : byteStrings) {
        buffers.add(byteString.asReadOnlyByteBuffer());
      }
      return buffers;
    }
It's the protobuf that originates the buffer as read-only. When the protobuf bytes are received, the data goes into a protobuf ByteString, which is backed by a byte array. ByteString has some methods to return a ByteBuffer, one of which is asReadOnlyByteBuffer. This wraps the original byte array and returns a read-only buffer. At that stage the Java checksum interface only accepts a byte or a byte array. The java.util.zip.CRC32 implementation does have an overload that takes a ByteBuffer, but this is what it does:
    public void update(ByteBuffer buffer) {
        int pos = buffer.position();
        int limit = buffer.limit();
        assert (pos <= limit);
        int rem = limit - pos;
        if (rem <= 0)
            return;
        if (buffer instanceof DirectBuffer) {
            crc = updateByteBuffer(crc, ((DirectBuffer)buffer).address(), pos, rem);
        } else if (buffer.hasArray()) {
            crc = updateBytes(crc, buffer.array(), pos + buffer.arrayOffset(), rem);
        } else {
            byte[] b = new byte[rem];
            buffer.get(b);
            crc = updateBytes(crc, b, 0, b.length);
        }
        buffer.position(limit);
    }
buffer.hasArray() checks for readOnly and returns false if it is readonly, meaning we cannot get at the array, and it will end up copying it.
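The hasArray() behaviour described above can be checked with a small standalone program. This is only an illustrative sketch; the class name `HasArrayDemo` and its helper are made up for the example and are not part of Ozone.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class HasArrayDemo {

  // Computes CRC32 over the remaining bytes of the buffer.
  public static long crc32(ByteBuffer buffer) {
    CRC32 crc = new CRC32();
    crc.update(buffer);
    return crc.getValue();
  }

  public static void main(String[] args) {
    byte[] data = {1, 2, 3, 4};

    // A plain wrapped buffer exposes its backing array.
    ByteBuffer writable = ByteBuffer.wrap(data);
    // The read-only view shares the same bytes but hides the array.
    ByteBuffer readOnly = writable.asReadOnlyBuffer();

    System.out.println("writable.hasArray(): " + writable.hasArray());
    System.out.println("readOnly.hasArray(): " + readOnly.hasArray());

    // Both produce the same checksum; the read-only one takes the
    // copying branch inside CRC32.update(ByteBuffer).
    System.out.println(crc32(writable.duplicate()) == crc32(readOnly.duplicate()));
  }
}
```

The two buffers give identical checksums, so the only cost of the read-only view is the extra temporary byte array allocated on every update.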
@jojochuang Is there any estimate of when this can be fixed? This change gives a significant boost to Ozone performance, and I want to continue investigating performance once this issue is resolved.
@jojochuang I've added the JVM arg to avoid the warning, plus changed initialization of the I think we should refine the
We should probably move these to a log message, rather than printing directly to stdout / stderr, which is what I assume e.printStackTrace() does. Otherwise I think we should commit this, as it greatly improves what is there performance wise.
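As a rough illustration of that suggestion, the sketch below routes the reflection failure to a logger instead of calling e.printStackTrace(). It is an assumption-laden stand-in, not the PR's code: the class `ReflectionInitLogging` and the method `lookupReadOnlyField` are hypothetical, and `java.util.logging` is used only to keep the example self-contained (the project's own logging framework would be the natural choice). The message is logged at FINE, which the default `java.util.logging` configuration does not print, so command output stays empty.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ReflectionInitLogging {

  private static final Logger LOG =
      Logger.getLogger(ReflectionInitLogging.class.getName());

  // Hypothetical initializer: returns the Field handle, or null when
  // reflection is unavailable so callers can fall back to copying.
  public static Field lookupReadOnlyField() {
    try {
      // Assumption: the JDK in use declares this field on ByteBuffer.
      Field f = ByteBuffer.class.getDeclaredField("isReadOnly");
      f.setAccessible(true);
      return f;
    } catch (ReflectiveOperationException | RuntimeException e) {
      // Logged rather than printed, so stdout / stderr stay clean
      // under the default logging configuration.
      LOG.log(Level.FINE,
          "Cannot access ByteBuffer.isReadOnly; falling back to copy", e);
      return null;
    }
  }

  public static void main(String[] args) {
    Field f = lookupReadOnlyField();
    System.out.println(f != null ? "reflection available" : "using copy path");
  }
}
```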
Done.
sodonnel
left a comment
LGTM
ChecksumByteBufferImpl.update() copies the ByteBuffer before checksum calculation because the isReadOnly field of the ByteBuffer is true.
This is a quick and dirty hack, which uses reflection to reset the isReadOnly field of the ByteBuffer to false, so that ChecksumByteBufferImpl.update performs checksum calculation directly without memory copy.
Posting this as a draft. It's a dirty hack and I like it, and it does work without any visible overhead.
What changes were proposed in this pull request?
HDDS-7228
What is the link to the Apache JIRA
How was this patch tested?