GH-37055: [C++] Optimize hash kernels for Dictionary ChunkedArrays #38394
Conversation
Benchmark result: https://gist.github.com/js8544/d64f289313c814a2df6a87a945a0b382#file-pr38394-txt. There is also a significant improvement in the user's Python case: #37055 (comment)
It was mentioned in #9683 (comment) that we can compute the result of each chunk and then merge them. This PR saves calls of the dictionary unifier; we can further optimize by speeding up the unification process itself, which will be done once we have a faster hash table: #38372.
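For context, the merge being discussed is dictionary unification; below is a minimal, hedged sketch using the DictionaryUnifier API declared in cpp/src/arrow/array/array_dict.h. The helper name MergeChunkDictionaries is hypothetical and this is not the kernel's actual code; it only shows the idea of feeding every chunk's dictionary into one unifier and calling GetResult a single time at the end.

```cpp
#include <memory>
#include <arrow/api.h>

// Hypothetical helper, not the PR's code: absorb each chunk's dictionary into a
// single DictionaryUnifier instead of re-unifying the accumulated dictionary
// from scratch for every chunk.
arrow::Status MergeChunkDictionaries(const arrow::ArrayVector& chunk_dicts,
                                     std::shared_ptr<arrow::Array>* out_dict) {
  // Make() takes the dictionary's value type (e.g. utf8), not the dictionary type.
  ARROW_ASSIGN_OR_RAISE(auto unifier,
                        arrow::DictionaryUnifier::Make(chunk_dicts[0]->type()));
  for (const auto& dict : chunk_dicts) {
    // The transpose map tells the caller how to remap this chunk's indices into
    // the unified dictionary; the real kernel uses it, we simply discard it here.
    std::shared_ptr<arrow::Buffer> transpose_map;
    ARROW_RETURN_NOT_OK(unifier->Unify(*dict, &transpose_map));
  }
  std::shared_ptr<arrow::DataType> unified_type;
  return unifier->GetResult(&unified_type, out_dict);  // single GetResult at the end
}
```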
@felipecrv Do you mind having a look at this?
Soon! |
The diff hunk under review (the old lines call a per-append unifier, the new lines reuse the dictionary_unifier_ member and propagate errors):

-  ARROW_CHECK_OK(unifier->Unify(*arr_dict, &transpose_map));
-  ARROW_CHECK_OK(unifier->GetResult(&out_dict_type, &out_dict));
+  RETURN_NOT_OK(dictionary_unifier_->Unify(*arr_dict, &transpose_map));
+  RETURN_NOT_OK(dictionary_unifier_->GetResult(&out_dict_type, &out_dict));
   dictionary_ = out_dict;
Do we even need dictionary_ to be a member variable now? Wouldn't it suffice to perform a single DictionaryUnifier::GetResult call at the end?
Documentation for DictionaryUnifier::GetResult says the unifier can't be re-used after a call to GetResult [1].
My suggestion (that I think will work well):
Rename dictionary_ to first_dictionary_ and change Append(arr) to transition through this state machine on each call:
// -----------------------------------------------------------------------------------------------
// Current State                                    Next State
// -----------------------------------------------------------------------------------------------
// !first_dictionary_ && !dictionary_unifier_  -->  first_dictionary_ = arr_dict
//                                                  dictionary_unifier_ UNCHANGED
// -----------------------------------------------------------------------------------------------
// first_dictionary_ && !dictionary_unifier_   -->  if !first_dictionary_.Equals(arr_dict) then
//                                                    dictionary_unifier_ = unify(first_dictionary_, arr_dict)
//                                                    first_dictionary_ = nullptr
//                                                  else
//                                                    first_dictionary_, dictionary_unifier_ UNCHANGED
//                                                  end
// -----------------------------------------------------------------------------------------------
// dictionary_unifier_                          -->  dictionary_unifier_ = unify(dictionary_unifier_, arr_dict)
// -----------------------------------------------------------------------------------------------
You will then have to re-think how dictionary_value_type and dictionary work below. (A sketch of this transition logic is given after the reference below.)
[1] https://github.com/apache/arrow/blob/main/cpp/src/arrow/array/array_dict.h#L169
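A minimal sketch of how Append could implement the suggested state machine, assuming the DictionaryUnifier API linked above. The class and member names follow the review comment and are illustrative only, not the merged code:

```cpp
#include <memory>
#include <arrow/api.h>

// Illustrative only: Append() transitions through the three states described above.
class DictionaryHashState {
 public:
  arrow::Status Append(const std::shared_ptr<arrow::Array>& arr_dict) {
    if (!first_dictionary_ && !dictionary_unifier_) {
      // State 1: first chunk seen; just remember its dictionary.
      first_dictionary_ = arr_dict;
      return arrow::Status::OK();
    }
    if (first_dictionary_ && !dictionary_unifier_) {
      if (first_dictionary_->Equals(*arr_dict)) {
        // Same dictionary as the first chunk: nothing to unify yet.
        return arrow::Status::OK();
      }
      // Dictionaries differ for the first time: start a unifier, seed it with the
      // first dictionary, and drop first_dictionary_.
      ARROW_ASSIGN_OR_RAISE(
          dictionary_unifier_,
          arrow::DictionaryUnifier::Make(first_dictionary_->type()));
      ARROW_RETURN_NOT_OK(dictionary_unifier_->Unify(*first_dictionary_));
      first_dictionary_ = nullptr;
    }
    // A unifier exists: absorb the new chunk's dictionary. The real kernel would
    // also use the transpose map to remap this chunk's indices.
    std::shared_ptr<arrow::Buffer> transpose_map;
    return dictionary_unifier_->Unify(*arr_dict, &transpose_map);
  }

 private:
  std::shared_ptr<arrow::Array> first_dictionary_;
  std::unique_ptr<arrow::DictionaryUnifier> dictionary_unifier_;
};
```

Delaying creation of the unifier this way means chunks that all share one dictionary (the common case) never pay for unification at all.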
@js8544 to clarify, dictionary_unifier_ = unify(dictionary_unifier_, arr_dict) means something like unifier->unify(arr_dict) in the actual code :)
Force-pushed from 001a846 to f09ab5c.
felipecrv left a comment
I don't see any changes other than the rebase.
Still working on it :)
@felipecrv Updated as you suggested.
felipecrv left a comment
Thank you!
@js8544 I won't merge my own suggestions before you have a chance to reply and apply them yourself. If you're OK with them and CI passes with them applied, I will merge.
Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
The macOS build is failing for some unknown reason and this has been rebased very recently, so I'm merging.
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit ec41209. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them.
GH-37055: [C++] Optimize hash kernels for Dictionary ChunkedArrays (apache#38394)
Closes: apache#37055
Lead-authored-by: Jin Shang <shangjin1997@gmail.com>
Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Rationale for this change
When merging dictionaries across chunks, the hash kernels unnecessarily unify the existing dictionary, dragging down performance.
What changes are included in this PR?
Reuse the dictionary unifier across chunks.
Are these changes tested?
Yes, with a new benchmark for dictionary chunked arrays.
Are there any user-facing changes?
No.
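As a rough, hedged illustration of the workload this affects (not code from the PR; the helper name and sample data are made up), here is a hash kernel such as ValueCounts run over a dictionary-encoded ChunkedArray whose chunks carry different dictionaries:

```cpp
#include <memory>
#include <string>
#include <vector>
#include <arrow/api.h>
#include <arrow/compute/api.h>

// Hypothetical helper: build one dictionary-encoded chunk from string values.
arrow::Result<std::shared_ptr<arrow::Array>> MakeDictChunk(
    const std::vector<std::string>& values) {
  arrow::StringBuilder builder;
  ARROW_RETURN_NOT_OK(builder.AppendValues(values));
  std::shared_ptr<arrow::Array> plain;
  ARROW_RETURN_NOT_OK(builder.Finish(&plain));
  ARROW_ASSIGN_OR_RAISE(arrow::Datum encoded,
                        arrow::compute::DictionaryEncode(plain));
  return encoded.make_array();
}

arrow::Result<std::shared_ptr<arrow::StructArray>> CountChunkedDictionary() {
  // Chunks with overlapping but different dictionaries force the hash kernel
  // to unify dictionaries across chunks -- the path optimized by this PR.
  ARROW_ASSIGN_OR_RAISE(auto chunk1, MakeDictChunk({"a", "b", "a"}));
  ARROW_ASSIGN_OR_RAISE(auto chunk2, MakeDictChunk({"b", "c", "c"}));
  auto chunked = std::make_shared<arrow::ChunkedArray>(
      arrow::ArrayVector{chunk1, chunk2});
  return arrow::compute::ValueCounts(chunked);
}
```

Because the two chunks have overlapping but distinct dictionaries ({"a", "b"} and {"b", "c"}), the kernel must unify them; with this change the unifier is reused across chunks instead of being rebuilt for each one.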
value_counts extremely slow for chunked DictionaryArray #37055