Skip to content

fix group-by v2 BufferArrayGrouper for empty multi-value dimension row#7794

Merged
gianm merged 4 commits intoapache:masterfrom
clintropolis:fix-buffer-array-group-multi-val-empty-row
May 30, 2019
Merged

fix group-by v2 BufferArrayGrouper for empty multi-value dimension row#7794
gianm merged 4 commits intoapache:masterfrom
clintropolis:fix-buffer-array-group-multi-val-empty-row

Conversation

@clintropolis
Copy link
Copy Markdown
Member

@clintropolis clintropolis commented May 29, 2019

This PR fixes an 'off by 1' issue in BufferArrayGrouper for the scenario that when faced with multi-value dimension rows with 0 elements, and attempting to aggregate on GroupByColumnSelectorStrategy.GROUP_BY_MISSING_VALUE, the grouper would initialize incorrectly into a state where it's next value was -1, meaning hasNext evaluates to false, but next is called without checking resulting in something like:

java.lang.RuntimeException: java.util.NoSuchElementException
	at org.apache.druid.query.aggregation.AggregationTestHelper$9.run(AggregationTestHelper.java:678) ~[test-classes/:?]
	at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:243) [classes/:?]
	at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:232) [classes/:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_192]
	at org.apache.druid.java.util.common.concurrent.DirectExecutorService.execute(DirectExecutorService.java:82) [classes/:?]
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134) [?:1.8.0_192]
	at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:58) [guava-16.0.1.jar:?]
	at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1.apply(GroupByMergingQueryRunnerV2.java:230) [classes/:?]
	at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1.apply(GroupByMergingQueryRunnerV2.java:220) [classes/:?]
	at com.google.common.collect.Iterators$8.transform(Iterators.java:794) [guava-16.0.1.jar:?]
	at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) [guava-16.0.1.jar:?]
	at com.google.common.collect.Iterators.addAll(Iterators.java:357) [guava-16.0.1.jar:?]
	at com.google.common.collect.Lists.newArrayList(Lists.java:147) [guava-16.0.1.jar:?]
	at com.google.common.collect.Lists.newArrayList(Lists.java:129) [guava-16.0.1.jar:?]
	at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1.make(GroupByMergingQueryRunnerV2.java:216) [classes/:?]
	at org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1.make(GroupByMergingQueryRunnerV2.java:158) [classes/:?]
...

Also fixes an issue in GroupByEngineIterator#hasNext to return delegate.hasNext instead of assuming the delegate has a next after initialized. This is no longer an issue since the delegate in this case is now getting initialized correctly, but still seems like the right thing to do.

Finally, fixes a third issue with GroupByQueryEngineV2.ArrayAggregateIterator and sql compatible null handling, where it was always using "" as the key for GroupByColumnSelectorStrategy.GROUP_BY_MISSING_VALUE instead of what the group by strategies for the hash iterator was doing for string dimensions and using NullHandling.defaultStringValue().

The added test will trigger the NoSuchElementException without the modifications in this PR (test code was borrowed from #7588). Additionally, the tests are run with both the array grouper and hash grouper to ensure they produce the same results (the hash grouper did not have this issue). The tests would also fail for sql compatible null handling for the array grouper without the 3rd fix to ArrayAggregateIterator, which is how I stumbled into that issue.

@jihoonson
Copy link
Copy Markdown
Contributor

Would you please check the CI failure?

@clintropolis
Copy link
Copy Markdown
Member Author

Would you please check the CI failure?

Yeah, unfortunately the CI failure is legitimate, I'm trying to determine the issue. Interestingly, this appears to be another inconsistency between BufferHashGrouper and BufferArrayGrouper, as only the latter is failing this test.. though I haven't really determined how that is possible yet since neither of them directly do anything with NullHandling stuff.

Copy link
Copy Markdown
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @clintropolis.

@gianm gianm added this to the 0.16.0 milestone May 30, 2019
@gianm gianm merged commit aaefdb3 into apache:master May 30, 2019
clintropolis added a commit to implydata/druid-public that referenced this pull request May 30, 2019
apache#7794)

* fix groupby v2 BufferArrayGrouper

* better name test

* fix sql compatible null handling array grouper bug

* another test
@clintropolis clintropolis deleted the fix-buffer-array-group-multi-val-empty-row branch May 30, 2019 20:03
clintropolis added a commit to clintropolis/druid that referenced this pull request May 30, 2019
apache#7794)

* fix groupby v2 BufferArrayGrouper

* better name test

* fix sql compatible null handling array grouper bug

* another test
@clintropolis clintropolis modified the milestones: 0.16.0, 0.15.0 May 30, 2019
fjy pushed a commit that referenced this pull request May 31, 2019
#7794) (#7803)

* fix groupby v2 BufferArrayGrouper

* better name test

* fix sql compatible null handling array grouper bug

* another test
jihoonson pushed a commit to implydata/druid-public that referenced this pull request Jun 4, 2019
apache#7794) (apache#7803)

* fix groupby v2 BufferArrayGrouper

* better name test

* fix sql compatible null handling array grouper bug

* another test
jihoonson pushed a commit to implydata/druid-public that referenced this pull request Jun 26, 2019
apache#7794)

* fix groupby v2 BufferArrayGrouper

* better name test

* fix sql compatible null handling array grouper bug

* another test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants