Affected Version
0.12.0+
Description
When a quantiles sketch aggregator (http://druid.io/docs/latest/development/extensions-core/datasketches-quantiles.html) is used in the outer query of a nested GroupBy and references a numeric column generated by a post-aggregator in the inner query, the following exception occurs:
java.lang.ClassCastException: java.lang.Double cannot be cast to com.yahoo.sketches.quantiles.DoublesSketch
at org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchMergeBufferAggregator.aggregate(DoublesSketchMergeBufferAggregator.java:65) ~[?:?]
at org.apache.druid.query.groupby.epinephelinae.AbstractBufferHashGrouper.aggregate(AbstractBufferHashGrouper.java:165) ~[druid-processing-0.14.2-incubating.jar:0.14.2-incubating]
at org.apache.druid.query.groupby.epinephelinae.SpillingGrouper.aggregate(SpillingGrouper.java:167) ~[druid-processing-0.14.2-incubating.jar:0.14.2-incubating]
at org.apache.druid.query.groupby.epinephelinae.Grouper.aggregate(Grouper.java:82) ~[druid-processing-0.14.2-incubating.jar:0.14.2-incubating]
at org.apache.druid.query.groupby.epinephelinae.RowBasedGrouperHelper$1.accumulate(RowBasedGrouperHelper.java:270) ~[druid-processing-0.14.2-incubating.jar:0.14.2-incubating]
at org.apache.druid.query.groupby.epinephelinae.RowBasedGrouperHelper$1.accumulate(RowBasedGrouperHelper.java:247) ~[druid-processing-0.14.2-incubating.jar:0.14.2-incubating]
at org.apache.druid.java.util.common.guava.FilteringAccumulator.accumulate(FilteringAccumulator.java:41) ~[druid-core-0.14.2-incubating.jar:0.14.2-incubating]
at org.apache.druid.java.util.common.guava.MappingAccumulator.accumulate(MappingAccumulator.java:40) ~[druid-core-0.14.2-incubating.jar:0.14.2-incubating]
This occurs because the factorizeBuffered method in DoublesSketchAggregatorFactory relies on metricFactory.getColumnCapabilities(fieldName) to determine whether an input column is numeric. If the column is not numeric, the aggregator assumes the input is a complex DoublesSketch object. For post-aggregator outputs, the type information is not available, so the column is treated as non-numeric and the Double value is cast to DoublesSketch, which fails.
This issue may also be present in other aggregator types; I have not searched through the other implementations.
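To make the failure mode concrete, here is a minimal, self-contained sketch of the dispatch pattern described above. It is not Druid code: SketchInputDispatch, FakeSketch, COLUMN_TYPES and both aggregate methods are hypothetical stand-ins invented for illustration. The first method decides how to treat a value purely from declared column type information, which is missing for a post-aggregator output. The second shows one possible alternative, checking the runtime type of the value, and is only an assumption about how this could be handled, not a proposed patch.

```java
import java.util.HashMap;
import java.util.Map;

public class SketchInputDispatch {
    // Hypothetical stand-in for a sketch object; NOT the real DoublesSketch class.
    public static final class FakeSketch {
        double sum;
        int count;

        void update(double value) {
            sum += value;
            count++;
        }
    }

    // Hypothetical type lookup. Only real columns have an entry; a column
    // produced by an inner post-aggregator (e.g. "innerMedian") has none.
    static final Map<String, String> COLUMN_TYPES = new HashMap<>();

    static {
        COLUMN_TYPES.put("added", "LONG");
    }

    static boolean isNumeric(String column) {
        String type = COLUMN_TYPES.get(column);
        return "LONG".equals(type) || "FLOAT".equals(type) || "DOUBLE".equals(type);
    }

    // Mirrors the behavior described in the report: anything without numeric
    // type information is assumed to already be a sketch.
    public static String aggregateByDeclaredType(String column, Object value) {
        if (isNumeric(column)) {
            new FakeSketch().update(((Number) value).doubleValue());
            return "built";
        }
        // Throws ClassCastException when value is actually a Double.
        FakeSketch sketch = (FakeSketch) value;
        return "merged";
    }

    // Assumed alternative: decide from the runtime type of the value itself.
    public static String aggregateByRuntimeType(Object value) {
        if (value instanceof Number) {
            new FakeSketch().update(((Number) value).doubleValue());
            return "built";
        }
        if (value instanceof FakeSketch) {
            return "merged";
        }
        throw new IllegalArgumentException("Unsupported input type: " + value);
    }

    public static void main(String[] args) {
        System.out.println(aggregateByDeclaredType("added", 10L));
        try {
            aggregateByDeclaredType("innerMedian", 3.5);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException for untyped numeric column");
        }
        System.out.println(aggregateByRuntimeType(3.5));
    }
}
```

In this toy model the "innerMedian" value is a plain Double with no declared type, so the declared-type path falls through to the sketch cast and fails in the same way as the stack trace above, while the runtime-type path handles it as a number.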
The following query structure will reproduce the issue:
{
  "queryType": "groupBy",
  "intervals": [
    "2015-09-12/2015-09-13"
  ],
  "dataSource": {
    "type": "query",
    "query": {
      "queryType": "groupBy",
      "dataSource": "wikipedia",
      "intervals": [
        "2015-09-12/2015-09-13"
      ],
      "dimensions": [
        "page"
      ],
      "aggregations": [
        {
          "type": "quantilesDoublesSketch",
          "name": "innerSketch",
          "fieldName": "added"
        },
        {
          "type": "count",
          "name": "sampleCount"
        }
      ],
      "postAggregations": [
        {
          "type": "quantilesDoublesSketchToQuantile",
          "name": "innerMedian",
          "field": {
            "type": "fieldAccess",
            "fieldName": "innerSketch"
          },
          "fraction": 0.5
        }
      ],
      "granularity": "all"
    }
  },
  "dimensions": [
    "page"
  ],
  "aggregations": [
    {
      "type": "quantilesDoublesSketch",
      "name": "outerSketch",
      "fieldName": "innerMedian"
    },
    {
      "type": "count",
      "name": "clientCount"
    }
  ],
  "postAggregations": [
    {
      "type": "quantilesDoublesSketchToQuantile",
      "name": "outerMedian",
      "field": {
        "type": "fieldAccess",
        "fieldName": "outerSketch"
      },
      "fraction": 0.5
    }
  ],
  "granularity": "all",
  "context": {
    "skipEmptyBuckets": "true"
  }
}