Skip to content

Fix GroupBy type cast when ChainedExecutionQueryRunner merges results#4488

Merged
gianm merged 4 commits intoapache:masterfrom
jon-wei:groupby_type_merge_fix
Jul 1, 2017
Merged

Fix GroupBy type cast when ChainedExecutionQueryRunner merges results#4488
gianm merged 4 commits intoapache:masterfrom
jon-wei:groupby_type_merge_fix

Conversation

@jon-wei
Copy link
Copy Markdown
Contributor

@jon-wei jon-wei commented Jun 29, 2017

When using the "outputType" parameter of DimensionSpec to change the type of a dimension, a ClassCastException would occur when the ChainedExecutionQueryRunner merges results from multiple query runners:

java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Number

	at io.druid.query.groupby.GroupByQuery.compareDims(GroupByQuery.java:558)
	at io.druid.query.groupby.GroupByQuery.lambda$getRowOrdering$8(GroupByQuery.java:532)
	at io.druid.query.groupby.GroupByQuery$$Lambda$23/727666004.compare(Unknown Source)
	at com.google.common.collect.ComparatorOrdering.compare(ComparatorOrdering.java:38)
	at io.druid.query.groupby.GroupByQuery.lambda$getResultOrdering$5(GroupByQuery.java:328)
	at io.druid.query.groupby.GroupByQuery$$Lambda$24/1803093683.compare(Unknown Source)
	at com.google.common.collect.ComparatorOrdering.compare(ComparatorOrdering.java:38)
	at com.google.common.collect.NullsFirstOrdering.compare(NullsFirstOrdering.java:44)
	at io.druid.java.util.common.guava.MergeIterator$1.compare(MergeIterator.java:48)
	at io.druid.java.util.common.guava.MergeIterator$1.compare(MergeIterator.java:44)
	at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:669)
	at java.util.PriorityQueue.siftUp(PriorityQueue.java:645)
	at java.util.PriorityQueue.offer(PriorityQueue.java:344)
	at java.util.PriorityQueue.add(PriorityQueue.java:321)
	at io.druid.java.util.common.guava.MergeIterator.<init>(MergeIterator.java:57)
	at io.druid.java.util.common.guava.MergeIterable.iterator(MergeIterable.java:52)
	at io.druid.query.ChainedExecutionQueryRunner$1.make(ChainedExecutionQueryRunner.java:159)
	at io.druid.java.util.common.guava.BaseSequence.accumulate(BaseSequence.java:43)
	at io.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:233)
	at io.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:224)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
	at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:58)
	at io.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1.apply(GroupByMergingQueryRunnerV2.java:222)
	at io.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1.apply(GroupByMergingQueryRunnerV2.java:212)
	at com.google.common.collect.Iterators$8.transform(Iterators.java:794)
	at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
	at com.google.common.collect.Iterators.addAll(Iterators.java:357)
	at com.google.common.collect.Lists.newArrayList(Lists.java:147)
	at com.google.common.collect.Lists.newArrayList(Lists.java:129)
	at io.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1.make(GroupByMergingQueryRunnerV2.java:208)
	at io.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1.make(GroupByMergingQueryRunnerV2.java:156)
	at io.druid.java.util.common.guava.BaseSequence.accumulate(BaseSequence.java:43)
	at io.druid.common.guava.CombiningSequence.accumulate(CombiningSequence.java:64)
	at io.druid.java.util.common.guava.MappedSequence.accumulate(MappedSequence.java:43)
	at io.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50)
	at io.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55)
	at io.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45)
	at io.druid.java.util.common.guava.MappedSequence.accumulate(MappedSequence.java:43)
	at io.druid.java.util.common.guava.Sequences.toList(Sequences.java:150)

The unit test provides an example that would trigger the error.

This patch adds a type conversion step to results generated by GroupByEngineV2 before they are merged by the ChainedExecutionQueryRunner.

@jon-wei jon-wei added the Bug label Jun 29, 2017
@jon-wei jon-wei added this to the 0.10.1 milestone Jun 29, 2017
);
}

// Convert dimension types to specified output types
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this conversion a method

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this comment almost exactly repeats the method name so not needed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the unnecessary comment

private static void convertRowTypesToOutputTypes(List<DimensionSpec> dimensionSpecs, Map<String, Object> rowMap)
{
for (DimensionSpec dimSpec : dimensionSpecs) {
Object baseVal = rowMap.get(dimSpec.getOutputName());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use rowMap.compute(...)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to keep it as-is, the function for compute(...) would need to lookup the outputType associated with the dimension name key.

That info is only kept in the DimensionSpecs list from the query in that context, so it would have to scan the list for every key or keep some other mapping structure of dimName->outputType around

Copy link
Copy Markdown
Member

@leventov leventov Jun 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, why?

for (DimensionSpec dimSpec : dimensionSpecs) {
  ValueType outputType = dimSpec.getOutputType();
  rawMap.compute(dimSpec.getOutputName(), (name, baseVal) -> {
    switch (outputType) {
      case STRING:
        return baseVal == null ? "" : baseVal.toString();
      ...
    }
  });
}

@jon-wei jon-wei force-pushed the groupby_type_merge_fix branch from c3ce20b to 3ed1057 Compare June 30, 2017 00:56
@jon-wei jon-wei force-pushed the groupby_type_merge_fix branch from c1daa42 to 9dfd18d Compare June 30, 2017 20:16
(dimName, baseVal) -> {
switch (outputType) {
case STRING:
baseVal = baseVal == null ? "" : baseVal.toString();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Minor] Could return from switch, without reassigning baseVal and break

@gianm gianm merged commit 97a79f4 into apache:master Jul 1, 2017
gianm pushed a commit to gianm/druid that referenced this pull request Jul 1, 2017
…apache#4488)

* Fix GroupBy type cast error when ChainedExecutionQueryRunner merges multiple runners

* Move conversion step to separate method

* Remove unnecessary comment

* Use compute to update map
fjy pushed a commit that referenced this pull request Jul 1, 2017
…#4488) (#4497)

* Fix GroupBy type cast error when ChainedExecutionQueryRunner merges multiple runners

* Move conversion step to separate method

* Remove unnecessary comment

* Use compute to update map
@jon-wei jon-wei deleted the groupby_type_merge_fix branch October 6, 2017 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants