-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-1073: C++: Adapative integer builder #723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
wesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be really nice to have!
We should do a performance shootout against https://github.com/wiseio/paratext/blob/master/src/util/widening_vector.hpp at some point
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static inline here maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See below; inline is mostly these days only about linkage and visibility and not so much about if this function should really be inlined.
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be replaced by a bit trick, like val & ~(std::numeric_limits<int32_t>::max() - 1)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be but I'm neither sure which pattern to use and benchmarks are showing that this is not really relevant performance-wise.
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is std::transform the same performance as a for loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the background it should actually be that. The only differentiation to a plain for-loop except the savings in lines of code is that with C++17 you could have the compiler/STL run the loop multi-threaded.
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unchecked status
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious if a helper template works here also
struct is_smaller {
constexpr bool value = sizeof(old_type) < sizeof(new_type);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is std::less but that also didn't work here.
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Release builds?
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dchecks
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DCHECK or only error status?
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static inline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is a private method, this should not be needed, the compiler should automatically take care of inlining.
cpp/src/arrow/builder.cc
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be overkill, but you could potentially collapse these classes with an extra template parameter that switches between signed and unsigned variants. So something like
class AdaptiveIntBuilder : AdaptiveIntBuilderBase<AdaptiveIntBuilder>
and then you could define various static attributes on AdaptiveIntBuilder
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I though about that and did not see yet how this would reduce code complexity while keeping it understandable.
|
Fixed an unchecked status and added a benchmark for unsigned integers. Performance-wise, the builders are clearly dominated by the expand & copy logic, thus I opened a JIRA to discuss if we should make |
cpp/src/arrow/builder.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does inlining scalar appends make much of a perf impact?
|
I think this is OK except for me comment about inlining. I suppose I can play with the benchmarks after this goes in |
|
Benchmarks about the scalar append: So it seems inlining it makes a significant difference. I will make a clean implementation of that. |
Change-Id: I5309b506174c2fcafd6c168069fa81a5af4122fb
|
Green Appveyor build: https://ci.appveyor.com/project/xhochy/arrow/build/1.0.372 |
wesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, thanks @xhochy!
| while (state.KeepRunning()) { | ||
| AdaptiveIntBuilder builder(default_memory_pool()); | ||
| for (int64_t i = 0; i < size; i++) { | ||
| builder.Append(data[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make this benchmark more "realistic" we should add an assertion about the result of Append being OK
No description provided.