-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Directly rewrite filters on RHS join columns into LHS equivalents #9818
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
937f268
c24e612
64bfecb
d82c4f9
c312e05
cce3497
87ad569
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -28,6 +28,7 @@ | |
| import org.apache.druid.segment.ColumnSelectorFactory; | ||
| import org.apache.druid.segment.vector.VectorColumnSelectorFactory; | ||
|
|
||
| import java.util.Map; | ||
| import java.util.Set; | ||
|
|
||
| public interface Filter | ||
|
|
@@ -162,4 +163,29 @@ default boolean canVectorizeMatcher() | |
| * can be expected to have a bitmap index retrievable via {@link BitmapIndexSelector#getBitmapIndex(String)} | ||
| */ | ||
| Set<String> getRequiredColumns(); | ||
|
|
||
| /** | ||
| * Returns true is this filter is able to return a copy of this filter that is identical to this filter except that it | ||
| * operates on different columns, based on a renaming map. | ||
| */ | ||
| default boolean supportsRequiredColumnRewrite() | ||
| { | ||
| return false; | ||
| } | ||
|
|
||
| /** | ||
| * Return a copy of this filter that is identical to the this filter except that it operates on different columns, | ||
| * based on a renaming map where the key is the column to be renamed in the filter, and the value is the new | ||
| * column name. | ||
| * | ||
| * For example, if I have a filter (A = hello), and I have a renaming map (A -> B), | ||
| * this should return the filter (B = hello) | ||
| * | ||
| * @param columnRewrites Column rewrite map | ||
| * @return Copy of this filter that operates on new columns based on the rewrite map | ||
| */ | ||
| default Filter rewriteRequiredColumns(Map<String, String> columnRewrites) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this interface take a map for future expansion even though all callers pass a map containing only one pair of key and value?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, that was the intent |
||
| { | ||
| throw new UnsupportedOperationException("Required column rewrite is not supported by this filter."); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. note to self: re-think the interface here. Does this need to be de-coupled from the function above? Does this introduce more chances of developer error that's only detected at runtime? Is there a pattern that guarantees that this is only called if
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good question. A potential alternative could be having just this interface and its default implementation returns |
||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -363,5 +363,26 @@ public SuffixMatch getSuffixMatch() | |
| { | ||
| return suffixMatch; | ||
| } | ||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why can't a like dim filter be re-written?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Like filters can, the new interface is on |
||
| @Override | ||
| public boolean equals(Object o) | ||
| { | ||
| if (this == o) { | ||
| return true; | ||
| } | ||
| if (o == null || getClass() != o.getClass()) { | ||
| return false; | ||
| } | ||
| LikeMatcher that = (LikeMatcher) o; | ||
| return getSuffixMatch() == that.getSuffixMatch() && | ||
| Objects.equals(getPrefix(), that.getPrefix()) && | ||
| Objects.equals(pattern.toString(), that.pattern.toString()); | ||
| } | ||
|
|
||
| @Override | ||
| public int hashCode() | ||
| { | ||
| return Objects.hash(getSuffixMatch(), getPrefix(), pattern.toString()); | ||
| } | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Missing EqualsVerifierTest for this LikeDimFilter. Thanks for this adding an equals test for LikeMatcher 👍
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, it doesn't look like we have EqualsVerifierTest for any of the |
||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -24,6 +24,7 @@ | |
| import it.unimi.dsi.fastutil.ints.IntList; | ||
| import org.apache.druid.collections.bitmap.ImmutableBitmap; | ||
| import org.apache.druid.common.config.NullHandling; | ||
| import org.apache.druid.java.util.common.IAE; | ||
| import org.apache.druid.java.util.common.Pair; | ||
| import org.apache.druid.query.BitmapResultFactory; | ||
| import org.apache.druid.query.extraction.ExtractionFn; | ||
|
|
@@ -47,6 +48,7 @@ | |
| import org.apache.druid.segment.vector.VectorColumnSelectorFactory; | ||
|
|
||
| import java.util.Comparator; | ||
| import java.util.Map; | ||
| import java.util.Objects; | ||
| import java.util.Set; | ||
|
|
||
|
|
@@ -172,6 +174,39 @@ public Set<String> getRequiredColumns() | |
| return boundDimFilter.getRequiredColumns(); | ||
| } | ||
|
|
||
| @Override | ||
| public boolean supportsRequiredColumnRewrite() | ||
| { | ||
| return true; | ||
| } | ||
|
|
||
| @Override | ||
| public Filter rewriteRequiredColumns(Map<String, String> columnRewrites) | ||
| { | ||
| String rewriteDimensionTo = columnRewrites.get(boundDimFilter.getDimension()); | ||
|
|
||
| if (rewriteDimensionTo == null) { | ||
| throw new IAE( | ||
| "Received a non-applicable rewrite: %s, filter's dimension: %s", | ||
| columnRewrites, | ||
| boundDimFilter.getDimension() | ||
| ); | ||
| } | ||
| BoundDimFilter newDimFilter = new BoundDimFilter( | ||
| rewriteDimensionTo, | ||
| boundDimFilter.getLower(), | ||
| boundDimFilter.getUpper(), | ||
| boundDimFilter.isLowerStrict(), | ||
| boundDimFilter.isUpperStrict(), | ||
| null, | ||
| boundDimFilter.getExtractionFn(), | ||
| boundDimFilter.getOrdering() | ||
| ); | ||
| return new BoundFilter( | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. note to self: What is the overhead of creating a new object for the filter? |
||
| newDimFilter | ||
| ); | ||
| } | ||
|
|
||
| private static Pair<Integer, Integer> getStartEndIndexes( | ||
| final BoundDimFilter boundDimFilter, | ||
| final BitmapIndex bitmapIndex | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add a test that looks for all implementations of Filter and then validate what
supportsRequiredColumnRewrite()returns.Since the default is false, we should be explicit in the list of filters that we say do not support re-writes. This way, if someone adds a filter in the future, this test will fail and force the dev to think about what the correct implementation should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you elaborate more details of the test you think? I'm not sure how it could help. Since the default returns false, the new filter overrode this method to return true if it had to be. The author should be able to think what it means when it is overridden. If you mean new filters need a design review, that should be done during the design phase rather than testing phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that can be difficult for the author and the reviewer. There are many functions in an interface, and most editors will pull in methods that don't have a definition unless you explicitly tell them to. As a reviewer, nothing is warning me that the implementation/ design did not consider implementing this function.
I have a WIP change that does something similar to validate that the equals and hashCode implementations are not using the implementation provided by
ObjectI suspect we'd want something very similar, except with the ability to provide exceptions. https://github.com/suneet-s/druid/blob/b8bc17551361b20458b073c62b5fe8f5d6a85183/processing/src/test/java/org/apache/druid/query/filter/FilterTest.java#L33This way, the dev has to add the exception as part of the PR for the tests to pass, and the reviewer can see that an exception was added and ask themselves if that exception makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh thanks, I see what you are saying. But do we need that kind of tests? If I understand correctly, this interface is for temporarily marking what filters support the rewrite for now since some of filters don't implement it. This interface will be eventually removed in the future. I don't see any reason to not support the rewrite for some particular filter types.