useApproximateCountDistinct implicitly makes all aggregates using distinct unplannable #16715

@kgyrtkirk

  • useApproximateCountDistinct is supposed to enable a special mode that handles COUNT(DISTINCT x) with sketches
  • when it is enabled, the rules are configured to remove the generic distinct handler rules here
  • so queries like select sum(distinct added) from wikipedia fail to plan when useApproximateCountDistinct is enabled
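
For context, a minimal sketch of the kind of rewrite a generic distinct handler is assumed to perform: expanding SUM(DISTINCT x) into a plain SUM over an inner GROUP BY x. This is not Druid planner code. It runs against an in-memory SQLite table with made-up values (the table contents are hypothetical, only the table and column names come from the query above) and merely checks that the two query shapes agree.

```python
import sqlite3

# Illustration only: SQLite stands in for the SQL engine; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wikipedia (added INTEGER)")
conn.executemany(
    "INSERT INTO wikipedia (added) VALUES (?)",
    [(5,), (5,), (7,), (None,), (12,), (7,)],
)

# The aggregate as written in the failing query.
direct = conn.execute(
    "SELECT SUM(DISTINCT added) FROM wikipedia"
).fetchone()[0]

# Hand-expanded form: deduplicate with an inner GROUP BY, then use a plain SUM.
expanded = conn.execute(
    "SELECT SUM(added) FROM (SELECT added FROM wikipedia GROUP BY added)"
).fetchone()[0]

print(direct, expanded)
```

Both queries sum each distinct non-null value once, so they return the same result. Whether the hand-expanded form plans in Druid with useApproximateCountDistinct enabled has not been verified here.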

quidem test:

!set useApproximateCountDistinct false
!use druidtest://?numMergeBuffers=3
!set outputformat mysql

select sum(distinct added) from wikipedia;
+---------+
| EXPR$0  |
+---------+
| 6455074 |
+---------+
(1 row)

!ok

!set useApproximateCountDistinct true
!use druidtest://?numMergeBuffers=3
select sum(distinct added) from wikipedia;
[...]
Missing conversion is LogicalAggregate[convention: NONE -> DRUID]
There is 1 empty subset: rel#105:RelSubset#2.DRUID.[], the relevant part of the original plan is as follows
103:LogicalAggregate(group=[{}], EXPR$0=[SUM(DISTINCT $0)])
  101:LogicalProject(subset=[rel#102:RelSubset#1.NONE.[]], added=[$18])
    74:LogicalTableScan(subset=[rel#100:RelSubset#0.NONE.[]], table=[[druid, wikipedia]])
[...]
QueryInterruptedException{msg=Query could not be planned. A possible reason is [Aggregation [SUM(DISTINCT $18)] is not supported2], code=Unknown exception, class=org.apache.druid.error.DruidException, host=null}
	at org.apache.druid.query.QueryInterruptedException.wrapIfNeeded(QueryInterruptedException.java:113)
[...]
