re-use expression vector evaluation results for the same offset in expression vector selectors #10614
Merged
clintropolis merged 9 commits into apache:master on Jan 13, 2021
Conversation
jihoonson (Contributor) reviewed on Dec 1, 2020 and left a comment:
The code change LGTM, but I'm wondering if we can add some unit tests that verify the returned result is from cache. Maybe it could be easier to write such tests using mocks.
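A test along these lines might look like the following sketch. It does not use Druid's classes or a mocking library: `CachedById` is a hypothetical stand-in for the selector under test, and the counting lambda plays the role of a mock that records how often the expression was evaluated. The point is only that repeated reads against the same vector id should trigger a single evaluation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;
import java.util.function.Supplier;

public class CacheHitCheckSketch {
    /** Hypothetical id-keyed cache, standing in for the selector under test. */
    public static final class CachedById<T> {
        private final IntSupplier idSource;
        private final Supplier<T> delegate;
        private boolean primed = false;
        private int lastId;
        private T lastValue;

        public CachedById(IntSupplier idSource, Supplier<T> delegate) {
            this.idSource = idSource;
            this.delegate = delegate;
        }

        public T get() {
            int id = idSource.getAsInt();
            // Re-evaluate only on the first read or when the id has changed.
            if (!primed || id != lastId) {
                lastValue = delegate.get();
                lastId = id;
                primed = true;
            }
            return lastValue;
        }
    }

    /** Runs the scenario and returns how many times the delegate was invoked. */
    public static int countEvaluations(int readsPerVector, int vectors) {
        AtomicInteger calls = new AtomicInteger();
        AtomicInteger vectorId = new AtomicInteger();
        // The counting lambda acts like a mock that records each evaluation.
        CachedById<double[]> cached = new CachedById<>(vectorId::get, () -> {
            calls.incrementAndGet();
            return new double[]{1.0, 2.0};
        });
        for (int v = 0; v < vectors; v++) {
            for (int r = 0; r < readsPerVector; r++) {
                cached.get();
            }
            vectorId.incrementAndGet(); // simulate advancing to the next vector
        }
        return calls.get();
    }

    public static void main(String[] args) {
        // Two reads on each of three vectors: one evaluation per vector.
        System.out.println(countEvaluations(2, 3));
    }
}
```

A real test would assert on the evaluation count instead of printing it, and would exercise the actual selectors with mocked bindings.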
jihoonson (Contributor) approved these changes on Jan 13, 2021:
LGTM. Thanks @clintropolis.
clintropolis (Member, Author):
thanks for review @jihoonson
JulianJaffePinterest pushed a commit to JulianJaffePinterest/druid that referenced this pull request on Jan 22, 2021:
…pression vector selectors (apache#10614)
* cache expression selector results by associating vector expression bindings to underlying vector offset
* better coverage, fix floats
* style
* stupid bot
* stupid me
* more test
* intellij threw me under the bus when it generated those junit methods
* narrow interface instead of passing around offset
Description
This PR improves `ExpressionVectorValueSelector` and `ExpressionVectorObjectSelector` to re-use evaluation results for the same selector. I haven't measured this at all, but it should make a relatively big difference when `druid.generic.useDefaultValueForNull=false`, because the previous behavior meant that the expression was evaluated twice for numeric primitive expressions: once to get the value vector and again to get the null vector. I was aware of this issue but didn't get to it in the first round of additions I did for the vectorized expression stuff, and sort of forgot about it until now 😅.

It works by expanding `Expr.VectorInputBinding` to further mimic `ReadableVectorOffset`, which allows slightly modifying the backing of the `ExpressionVectorInputBinding` implementation to be based on `ReadableVectorOffset`, which has a `getId` method, instead of `VectorSizeInspector`, which only offers current and max vector size.

The size inspector provided by `VectorColumnSelectorFactory.getVectorSizeInspector` was already a `ReadableVectorOffset`, but to avoid exposing the entire offset interface, which isn't going to be useful for most callers, I have added a new interface, `ReadableVectorInspector`, pushed `getId` to it, and reworked the method into `getReadableVectorInspector`.

With an identifier in place, the selectors can cache evaluation results and re-use them as long as the underlying offset does not change.
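The idea can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the interfaces below only borrow the names mentioned in the description and their shapes are assumed, while `NumericVectorExpr`, `CachingValueSelector`, and the `NO_ID` sentinel are hypothetical names introduced for illustration. It shows a selector that evaluates once per vector id and serves both the value vector and the null vector from that single evaluation.

```java
public class ExpressionSelectorCacheSketch {
    /** Assumed shape of the size inspector: current and max vector size only. */
    public interface VectorSizeInspector {
        int getMaxVectorSize();
        int getCurrentVectorSize();
    }

    /** Assumed shape of the narrow interface: size inspection plus an id. */
    public interface ReadableVectorInspector extends VectorSizeInspector {
        int getId();
    }

    /** Hypothetical stand-in for a vectorized numeric expression. */
    public interface NumericVectorExpr {
        void eval(int size, double[] valuesOut, boolean[] nullsOut);
    }

    /** Hypothetical selector that re-evaluates only when the vector id changes. */
    public static final class CachingValueSelector {
        // Assumption: real vectors never report this id, so the first read always evaluates.
        private static final int NO_ID = -1;

        private final ReadableVectorInspector inspector;
        private final NumericVectorExpr expr;
        private final double[] values;
        private final boolean[] nulls;
        private int cachedId = NO_ID;

        public CachingValueSelector(ReadableVectorInspector inspector, NumericVectorExpr expr) {
            this.inspector = inspector;
            this.expr = expr;
            this.values = new double[inspector.getMaxVectorSize()];
            this.nulls = new boolean[inspector.getMaxVectorSize()];
        }

        private void evalIfNeeded() {
            int id = inspector.getId();
            if (id != cachedId) {
                // One evaluation fills both the value vector and the null vector.
                expr.eval(inspector.getCurrentVectorSize(), values, nulls);
                cachedId = id;
            }
        }

        public double[] getDoubleVector() {
            evalIfNeeded();
            return values;
        }

        public boolean[] getNullVector() {
            evalIfNeeded();
            return nulls;
        }
    }

    public static void main(String[] args) {
        int[] id = {0};
        int[] evaluations = {0};
        ReadableVectorInspector inspector = new ReadableVectorInspector() {
            @Override public int getId() { return id[0]; }
            @Override public int getMaxVectorSize() { return 4; }
            @Override public int getCurrentVectorSize() { return 4; }
        };
        CachingValueSelector selector = new CachingValueSelector(inspector, (size, v, n) -> {
            evaluations[0]++;
            for (int i = 0; i < size; i++) {
                v[i] = i * 2.0;
                n[i] = false;
            }
        });
        selector.getDoubleVector();
        selector.getNullVector(); // same id: served from the cached evaluation
        id[0] = 1;                // the underlying offset advanced
        selector.getDoubleVector();
        System.out.println("evaluations: " + evaluations[0]);
    }
}
```

Keying the cache on an id rather than on the vector contents keeps the check to a single integer comparison per read, at the cost of relying on the offset to report a new id whenever it moves.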
Key changed/added classes in this PR
* `ExpressionVectorValueSelector`
* `ExpressionVectorObjectSelector`
* `ExpressionVectorInputBinding`
* `VectorColumnSelectorFactory`
* `ReadableVectorInspector`