-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column #6985
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
39280c5 to
fe2f609
Compare
cpp/src/parquet/CMakeLists.txt
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
cpp/src/parquet/column_reader.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: AVx2 -> AVX2, o detect -> to detect, BIM2 -> BMI2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the comment as it was stale.
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't comment on the correctness of the levels handling (someone competent will have to review). I would like to see some code cleanup and organization changes, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem useful :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be removed now.
cpp/src/parquet/column_reader.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to move the low-level handling of levels to a dedicated C++ file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved all everything in the anonymous namespace to level_conversion.cc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this prevents inlining of the Scalar version which was happening before, but I'm not sure that is a big deal.
cpp/src/parquet/column_reader.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"instructions"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
cpp/src/parquet/level_conversion.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"corresponding"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. fixed.
cpp/src/parquet/level_conversion.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense for this to be a helper function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree now that the AppendWord is moved to FirstTimeBitmapWriter
cpp/src/parquet/level_conversion.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then please only define this function if ARROW_HAVE_BMI2 is defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, a more specific name (such as BitmapToString).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then just don't define the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. done.
cpp/src/parquet/column_reader.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: max_repitition_level -> max_repetition_level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
cpp/src/parquet/column_reader.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to move these two lines before line 186?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think just 1.
cpp/src/parquet/column_reader.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this path is executed only when #if defined(ARROW_HAVE_BMI2) is true, it would be good to add such as code
#if defined(ARROW_HAVE_BMI2)
...
#else
assert(false && "must not execute this without BMI2");
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: Most of these tests fail on big-endian platform
[ FAILED ] TestAppendBitmap.TestOffsetOverwritesCorrectBitsOnExistingByte
[ FAILED ] TestAppendBitmap.TestOffsetShiftBitsCorrectly
[ FAILED ] TestAppendBitmap.AllBytesAreWrittenWithEnoughSpace
[ FAILED ] TestAppendBitmap.OnlyApproriateBytesWrittenWhenLessThen8BytesAvailable
[ FAILED ] TestAppendToValidityBitmap.BasicOperation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a substantial refactoring, I tried to provide guards but without a big-endian machine in CI it will be very hard to to catch all of these issues.
cpp/src/parquet/level_conversion.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to use union? bytes[] is used only at line 106.
It looks simple to explicitly extract the lowest 8bit instead of using union.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally thought I would have to do more. I removed the union.
|
@pitrou I think I addressed your comments. One of them that went stale was the complexity for "AppendWord", I tried to remove parts that did not seem to affect performance on benchmarks for parquet column reading but I spent some time trying to maximize word level parallelism for the unaligned case, because I think at least for repeated fields I expect this case to be fairly common. As a point of comparison using the AppendWord implementation I put in place for BigEndian shows much smaller improvements. Really this should use CopyBitmap but I didn't want to start moving a move code then I needed to in bit_util.h for this PR. I opened If more comments are needed or you think there is a cleaner way of writing the code, I'm happy for input. @wesm do you want to look at the parquet specific logic/comments to make sure I captured them correctly? |
|
I'll try to have a closer look tomorrow or Tuesday |
|
@wesm wanted to make sure this is still on your radar |
|
Yep sorry thanks for the nudge, will look today |
wesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a high level this looks good.
One problem with introducing more SIMD code is that we won't yet have a runtime dispatching strategy. We will need to go through all of our SIMD accelerations in this library and refactor things so that we can build a "fat" binary that includes both AVX2/BMI2-accelerated versions and the non-SIMD versions. That way we can reap the benefits of this work in portable packages like Python wheels / conda packages.
cpp/src/arrow/util/bit_util.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
cpp/src/arrow/util/bit_util.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is a bit too much bit manipulation for me to carefully scrutinize but I trust you. There are some typos in the comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, if there is anything I can add for documentation please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cryptic. Can you rewrite it in a more understandable way? For example (untested):
if (bit_mask_ == 0x1) {
current_byte_ = 0;
} else {
current_byte_ = *(append_position + bytes_for_word - 1);
}|
I think this is fine to merge once most of the typos in the comments are fixed. A rebase will probably fix the Rust lint error |
emkornfield
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rebased.
cpp/src/arrow/util/bit_util.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
cpp/src/arrow/util/bit_util.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, if there is anything I can add for documentation please let me know.
This reverts commit 59e39794974ebade0cff0ba702b6cbc9cbd43b87.
|
@pitrou I think the C GLib build failure looks unrelated to me. I fixed the s390x build. |
|
@wesm I expect your PR to get merged first but if it doesn't consider using PopCount methods in this one. |
|
+1. I'm going to go ahead and merge this and then I'll rebase #7352 |
This change does the following:
With AVX2 enabled this seems to improve benchmarks by 20% for nullable columns (BM_Read..) on my box. I didn't see any impacts in other benchmarkmarks.
See checklist for what is remaining. If possible i'd like to get early feedback on naming and my approach to checking for BMI2.
Still needed: