-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-7072: [Java] Support concating validity bits efficiently #5782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
| output.setZero(numBytes1, numBytesOut); | ||
|
|
||
| if (numBits1 % 8 == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a mask? I think we have a utility method for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. We have bitIndex utility method.
| } | ||
| output.setZero(numBytes1, numBytesOut); | ||
|
|
||
| if (numBits1 % 8 == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please document what this case represents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Document added.
| } | ||
|
|
||
| // the number of bits to fill a full byte after the first input is processed | ||
| int numBitsToFill = 8 - (numBits1 % 8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mask instead of mod, don't we have a utility method for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revised. Thanks.
| byte curByte = input2.getByte(i); | ||
|
|
||
| // first fill the bits to a full byte | ||
| int byteToFill = (curByte & 0xff) << (8 - numBitsToFill); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the mask here necessary, isn't it already a byte?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we must make sure the 24 high bits in the int is zero.
Without the mask, the high bits will be 1, if the highest bits in the curByte happen to be 1.
| byte curByte = input2.getByte(i); | ||
|
|
||
| // first fill the bits to a full byte | ||
| int byteToFill = (curByte & 0xff) << (8 - numBitsToFill); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is upcast to an int necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the java language specification, a byte will be automatically promoted to an int, and all computations are performed as int. So we cast it to int, and store variables as int, to avoid unnecessary cast.
For details, please see:
https://stackoverflow.com/questions/27582233/why-byte-and-short-values-are-promoted-to-int-when-an-expression-is-evaluated
|
|
||
| // fill remaining bits in the current byte | ||
| int remByte = (curByte & 0xff) >>> numBitsToFill; | ||
| output.setByte(numBytes1 + i, remByte); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like you could avoid one memory access, by keeping this byte cached instead of accessing writing it here and then reading it back again at the beginning of the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Revised accordingly. Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it still seems like this is setting memory twice per loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I misunderstood your point at first. I have reivsed the code accordingly. Thanks for the great suggestion.
| int numBitsToFill = 8 - (numBits1 % 8); | ||
|
|
||
| // the number of extra bits for the second input, relative to full bytes | ||
| int numRemainingBits = numBits2 % 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this closer to where it is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Thanks.
| } | ||
| } | ||
|
|
||
| try (ArrowBuf buf1 = allocator.buffer(1024); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please have a separate test case of each block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I have revised the test case, so this problem no longer exists.
| buf1.setZero(0, buf1.capacity()); | ||
| buf2.setZero(0, buf2.capacity()); | ||
|
|
||
| final int count1 = 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these numbers seem arbitray it might be more obvious if you set bit patterns using hex or if you commented on the expected bit patterns instead of having loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. I have revised the test cases and give comments about the purpose of each case explicitly.
| if (input1 != output) { | ||
| PlatformDependent.copyMemory(input1.memoryAddress(), output.memoryAddress(), numBytes1); | ||
| } | ||
| output.setZero(numBytes1, numBytesOut); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems redundant except for the last byte if there is a remainder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. This is removed, and we pay special attention to set the last byte.
java/vector/src/main/java/org/apache/arrow/vector/BitVectorHelper.java
Outdated
Show resolved
Hide resolved
| BitVectorHelper.concatBits(buf1, count1, buf2, count2, output); | ||
| for (int i = 0; i < count1 + count2; i++) { | ||
| int result = BitVectorHelper.get(output, i); | ||
| if (i < count1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be clearer and less reliant on the input setup to write this as:
outputIdx =0;
for (int i = 0; i < count1; i++, outputIdx++) {
assertEquals(BitVectorHelper.get(output, outputIdx), BitVectorHelper.get(buf1, i))
}
for (int i = 0; i < count2; i++, outputIdx++) {
assertEquals(BitVectorHelper.get(output, outputIdx), BitVectorHelper.get(buf2, i))
}
with this setup, for instance you could "fuzz" by generating random bit strings in addition to the set pattern here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. Revised accordingly. Thank you.
| * @param numBits1 the number of bits in the first validity buffer. | ||
| * @param input2 the second validity buffer. | ||
| * @param numBits2 the number of bits in the second validity buffer. | ||
| * @param output the ouput validity buffer. It can be the same one as the first input. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs to be preallocated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I have stated this explicitly in the JavaDoc.
| prevByte = curByte >>> numBitsToFill; | ||
| } | ||
|
|
||
| // clear high bits for the previous byte, as it may be the last byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment doesn't seem to make sense anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. Thank you.
| int lastOutputByte = prevByte; | ||
|
|
||
| // the number of extra bits for the second input, relative to full bytes | ||
| int numRemainingBits = bitIndex(numBits2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numTrailingBits I think would be a better name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks better to me too. Thank you.
| int byteToFill = remByte << (8 - numBitsToFill); | ||
| lastOutputByte |= byteToFill; | ||
|
|
||
| if (numRemainingBits > numBitsToFill) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the logic would be clearer if you moved this if statement. as the last piece of logic after output.setByte on line 397. Otherwise it seems strange be setting the non-ending byte as the last byte.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you that it is weird that last two byte are written in reverse order.
I have revised the code to solve this problem. Please take a look. Thanks a lot.
emkornfield
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly looks good a few more comments to talk through then I think this can be merged.
|
+1 thank you. |
For scenarios when we need to concate vectors (like the scenario in ARROW-7048, and delta dictionary), we need a way to concat validity bits. Currently, we have bit level API to read/write individual validity bit. However, it is not efficient , and we need a way to copy more bits at a time. Closes apache#5782 from liyafan82/fly_1106_bits and squashes the following commits: 70c1173 <liyafan82> Resolve comments d7bcfaa <liyafan82> Further improve the performance 4f1b9af <liyafan82> Resolve comments 591cf74 <liyafan82> Support concating validity bits efficiently Authored-by: liyafan82 <fan_li_ya@foxmail.com> Signed-off-by: Micah Kornfield <emkornfield@gmail.com>
For scenarios when we need to concate vectors (like the scenario in ARROW-7048, and delta dictionary), we need a way to concat validity bits.
Currently, we have bit level API to read/write individual validity bit. However, it is not efficient , and we need a way to copy more bits at a time.