Support filtering on long columns (including __time) #3180
fjy merged 11 commits into apache:master
since we don't automatically convert to strings for javascript, can we add a JavaScriptDimFilter test that operates on the time column values directly?
@xvrl I have one in the testTimeFilterAsLong() test, the JavascriptDimFilter there compares directly on longs
@jon-wei can you also add some docs with an example of filtering on a time column?
@nishantmonu51 I've added a section to the docs on filtering on __time
`__time` would probably mess with the syntax highlighting less
minor comments to be fixed, but 👍 after those are addressed
@fjy I've moved the common logic to functions in Filters, addressed the other comments as well
Ran the ValueMatcher-using benchmarks again, results haven't changed:
How about calling this getLongValueMatcher so it can be used with other long dims too, when the time comes?
seems like long predicate should extend druid predicate
also I don't understand the purpose of interfaces with no methods
@fjy To support separate String/Long ValueMatchers on the factory, the Filter side needs to pass in a different kind of predicate for each type, so this interface is used to combine the predicate implementations into a single object
Can we get a more descriptive name than DruidPredicate?
@fjy renamed this to DruidCompositePredicate
    /**
     * Composite predicate class that can accept all supported types
     *
     * The apply() method inherited from Predicate<Object> is intended for String values
Hmm, the Predicate<Object> seems like premature generalization. We aren't getting any use out of this now (we don't have non-primitive Object types that filters support), and generally the filters all call toString on this object anyway. How do you feel about making this a Predicate<String>?
going to change DruidCompositePredicate to DruidPredicateFactory, will update this PR
     *
     * A separate method is present for each supported primitive type to avoid boxing (for performance reasons)
     */
    public interface DruidCompositePredicate extends Predicate<Object>, DruidLongPredicate
This file is unused now and could be removed.
@gianm ah, my bad, should've put a WIP on the last commit, I wasn't quite done with the PR changes yet but wasn't expecting you to review so soon :D
haha, okay, I'll wait for you to finish :)
Updated patch benchmarks in original comment
    this.dimension = dimension;
    this.values = values;
    this.extractionFn = extractionFn;
    setLongValues();
Hmm, it might be worth only doing this if a long predicate is actually requested. I could see the parsing taking a while for long IN filters. But make sure we only do this once even if getLongPredicate is called many times.
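One way to read this suggestion is as lazy initialization guarded by a lock: parse the string values into longs the first time a long predicate is requested, and reuse the cached result afterwards. The sketch below is only an illustration of that idea, not the code in this PR. The class name `LazyLongValuesSketch`, the `getParseCount()` helper, and the sorted `long[]` lookup are assumptions made for the example; it uses `java.util.function.LongPredicate` rather than Druid's own predicate types.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical stand-in for an IN-style filter; not Druid's actual class.
public class LazyLongValuesSketch {
    private final List<String> values;
    private final Object initLock = new Object();
    // Stays null until a long predicate is first requested.
    private volatile long[] sortedLongValues;
    // Counts how many times the string values were parsed (for demonstration only).
    private volatile int parseCount = 0;

    public LazyLongValuesSketch(List<String> values) {
        this.values = values;
    }

    public LongPredicate getLongPredicate() {
        long[] parsed = sortedLongValues;
        if (parsed == null) {
            synchronized (initLock) {
                parsed = sortedLongValues;
                if (parsed == null) {
                    long[] buf = new long[values.size()];
                    int n = 0;
                    for (String v : values) {
                        try {
                            long p = Long.parseLong(v);
                            buf[n++] = p;
                        } catch (NumberFormatException e) {
                            // A non-numeric value can never match a long column; skip it.
                        }
                    }
                    parsed = Arrays.copyOf(buf, n);
                    Arrays.sort(parsed);
                    parseCount++;
                    sortedLongValues = parsed;
                }
            }
        }
        final long[] lookup = parsed;
        // Binary search on a primitive array avoids boxing each row value.
        return v -> Arrays.binarySearch(lookup, v) >= 0;
    }

    public int getParseCount() {
        return parseCount;
    }

    public static void main(String[] args) {
        LazyLongValuesSketch filter = new LazyLongValuesSketch(Arrays.asList("3", "abc", "1"));
        System.out.println(filter.getLongPredicate().test(1L));
        System.out.println(filter.getLongPredicate().test(2L));
        System.out.println(filter.getParseCount());
    }
}
```

With this shape the constructor stays cheap for queries that never touch a long column, and repeated calls to `getLongPredicate()` share one parsed array.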
    private final String value;
    private final ExtractionFn extractionFn;

    private Object initLock = new Object();
👍 after travis
Still 👍 from me
Fixes #2816
This PR adds support for filtering on long columns, including __time, using the non-bitmap indexed column filtering support added by #3018.
This patch changes the interface of the ValueMatcherFactory regarding predicate handling. Filters will now create a DruidPredicateFactory, an object that can create a predicate suitable for each filterable column type (currently String and long only).
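As a rough illustration of the factory idea described above, the following sketch shows a single object that can hand out either a String predicate or a primitive long predicate for the same filter value. It is a simplified assumption of the shape, not the actual Druid interface: the names `PredicateFactorySketch`, `PredicateFactory`, `makeStringPredicate`, `makeLongPredicate`, and `selector` are hypothetical, and it uses `java.util.function` types instead of Guava's `Predicate`.

```java
import java.util.function.LongPredicate;
import java.util.function.Predicate;

// Hypothetical sketch; names and signatures are illustrative, not Druid's API.
public class PredicateFactorySketch {

    /** One factory per filter; one predicate per filterable column type. */
    public interface PredicateFactory {
        Predicate<String> makeStringPredicate();

        // Primitive long predicate, so row values are not boxed into Long.
        LongPredicate makeLongPredicate();
    }

    /** Factory for an equality-style filter on a single value. */
    public static PredicateFactory selector(final String value) {
        return new PredicateFactory() {
            @Override
            public Predicate<String> makeStringPredicate() {
                return s -> value.equals(s);
            }

            @Override
            public LongPredicate makeLongPredicate() {
                try {
                    final long parsed = Long.parseLong(value);
                    return v -> v == parsed;
                } catch (NumberFormatException e) {
                    // A non-numeric filter value can never equal a long.
                    return v -> false;
                }
            }
        };
    }

    public static void main(String[] args) {
        PredicateFactory factory = selector("1467331200000");
        // A String column would ask for the String predicate...
        System.out.println(factory.makeStringPredicate().test("1467331200000"));
        // ...while a long column such as __time would ask for the long predicate.
        System.out.println(factory.makeLongPredicate().test(1467331200000L));
    }
}
```

The point of the split is that the column reader, not the filter, decides which predicate type it needs, so the filter only has to describe how to build each one.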
I have included benchmarks of predicate-matching performance. They combine a set of the affected filters in an OrFilter and apply it during an IncrementalIndex read, and also during a TimeseriesQuery with FilteredAggregators on both types of indexes.
Patch
Master
Benchmarks for basic queries are shown below:
Patch
Master