add bloom filter fallback aggregator when types are unknown #7719

Merged
gianm merged 1 commit into apache:master from clintropolis:bloom-filter-agg-fallback-object-aggregator
Jun 6, 2019
@clintropolis clintropolis commented May 21, 2019

While working on #7718 I discovered an issue similar to #7660 in the bloom filter aggregator: it is even stricter than the quantiles aggregator and does not work at all if ColumnCapabilities are not available. This PR fixes that by adding a fallback aggregator, ObjectBloomFilterAggregator, which examines the objects and aggregates to the best of its ability.
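To illustrate what "examines the objects" can mean, here is a minimal sketch of a fallback that decides how to hash each value from its runtime type instead of from column metadata. It is not the code in this PR: `TinyBloomFilter`, `ObjectFallbackSketch`, the hash functions, and the choice to record null under a fixed sentinel are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical, simplified bloom filter; a stand-in, not Druid's filter class. */
final class TinyBloomFilter {
    private final BitSet bits;
    private final int numBits;

    TinyBloomFilter(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    /** Sets two probe positions derived from one 64-bit hash. */
    void addHash(long hash) {
        bits.set(probe(hash, 0));
        bits.set(probe(hash, 1));
    }

    /** Returns true only if both probe positions are set. */
    boolean testHash(long hash) {
        return bits.get(probe(hash, 0)) && bits.get(probe(hash, 1));
    }

    private int probe(long hash, int i) {
        int h1 = (int) hash;
        int h2 = (int) (hash >>> 32);
        return Math.floorMod(h1 + i * h2, numBits);
    }
}

/** Sketch of a fallback that inspects each value's runtime type on every row. */
final class ObjectFallbackSketch {
    private final TinyBloomFilter filter = new TinyBloomFilter(1 << 16);

    void add(Object value) {
        filter.addHash(hashOf(value));
    }

    boolean mightContain(Object value) {
        return filter.testHash(hashOf(value));
    }

    static long hashOf(Object value) {
        if (value == null) {
            // Assumption: null is recorded under a fixed sentinel hash.
            return 0x6e756c6cL;
        } else if (value instanceof Long || value instanceof Integer) {
            // Integral values are widened to long before hashing.
            return mix(((Number) value).longValue());
        } else if (value instanceof Double || value instanceof Float) {
            // Floating point values are widened to double, then hashed by bit pattern.
            return mix(Double.doubleToLongBits(((Number) value).doubleValue()));
        } else {
            // Anything else is treated as a string (FNV-1a over its UTF-8 bytes).
            long h = 0xcbf29ce484222325L;
            for (byte b : value.toString().getBytes(StandardCharsets.UTF_8)) {
                h ^= (b & 0xff);
                h *= 0x100000001b3L;
            }
            return mix(h);
        }
    }

    /** 64-bit finalizer-style bit mixer. */
    private static long mix(long z) {
        z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL;
        z = (z ^ (z >>> 33)) * 0xc4ceb9fe1a85ec53L;
        return z ^ (z >>> 33);
    }
}
```

With this shape, a caller can `add` longs, doubles, strings, or nulls without declaring the column type up front, at the cost of an `instanceof` chain on every row.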

This aggregator (and many others) could perhaps be improved by using something like a functional interface inside bufferAdd: the initial version of the function would check types, then lock in a selector-specialized function after the first non-null value. However, since I'm unsure whether the cost of the `if` is significant relative to the rest of the work, and since this is not the only aggregator that uses this per-row check, I'm saving this optimization for future work that revisits complex value aggregators as a whole.
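A rough sketch of the lock-in idea described above, written against plain Java rather than Druid's selector and buffer APIs. `LockInAdderSketch`, its `added` list (standing in for writes to a bloom filter), and the `typeChecks` counter are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch: check the type once, then swap in a type-specialized add function. */
final class LockInAdderSketch {
    /** Stand-in for the bloom filter; records what would have been added. */
    final List<String> added = new ArrayList<>();

    /** Counts how many times the type-inspecting path actually ran. */
    int typeChecks = 0;

    /** Starts as the type-checking function and is replaced after the first non-null value. */
    private Consumer<Object> adder = this::checkTypeThenLockIn;

    void add(Object value) {
        adder.accept(value);
    }

    private void checkTypeThenLockIn(Object value) {
        if (value == null) {
            // Nothing can be learned from null, so keep the checking function in place.
            added.add("null");
            return;
        }
        typeChecks++;
        if (value instanceof Long) {
            adder = v -> added.add(v == null ? "null" : "long:" + (Long) v);
        } else if (value instanceof Double) {
            adder = v -> added.add(v == null ? "null" : "double:" + (Double) v);
        } else {
            adder = v -> added.add(v == null ? "null" : "string:" + v);
        }
        // Handle the current value with the newly locked-in function.
        adder.accept(value);
    }
}
```

The trade-off in this sketch is that the locked-in function assumes every later value has the same type as the first non-null one; a column with mixed types would hit a `ClassCastException` in the casts above, which the per-row check avoids.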

The added test only works for group by v2 because the bloom filter aggregator only has stub methods for its ComplexMetricSerde. Group by v1 requires a more complete implementation to perform nested queries, and currently fails with confusing "Bloom filter aggregators are query-time only" error messages, which should probably be fixed in a follow-up PR.

@gianm gianm merged commit ee0d4ea into apache:master Jun 6, 2019
@gianm gianm added this to the 0.16.0 milestone Jun 6, 2019
@clintropolis clintropolis deleted the bloom-filter-agg-fallback-object-aggregator branch June 6, 2019 22:42
gianm pushed a commit to implydata/druid-public that referenced this pull request Jul 3, 2019