Skip to content

SQL: Excessive time and memory use while planning joins on subqueries #9646

@gianm

Description

@gianm

These queries will blow up the planner's rule queue and make it take forever (and also use a lot of memory).

SELECT *
  FROM foo
  WHERE dim1 IN (SELECT NULL FROM lookup.lookyloo);

SELECT l.k, l.v, SUM(f.m1), SUM(nf.m1)
  FROM lookup.lookyloo l
  INNER JOIN druid.foo f ON f.dim1 = l.k
  INNER JOIN druid.numfoo nf ON nf.dim1 = l.k
  GROUP BY 1, 2 ORDER BY 2;

SELECT COUNT(*)
  FROM foo, lookup.lookyloo l, numfoo
  WHERE foo.cnt = l.k AND l.k = numfoo.cnt;

SELECT COUNT(*)
  FROM foo
  INNER JOIN lookup.lookyloo l1 ON l1.k = foo.m1
  INNER JOIN lookup.lookyloo l2 ON l2.k = l1.k;

Something in common with all of them is type mismatches in the join condition yielding implicit casts. Removing JoinProjectTransposeRule calms down the rule queue and the planner is able to complete in a reasonable amount of time. This sort of makes sense, since the implicit casts would generate a Project beneath the Join, and the purpose of JoinProjectTransposeRule is to try to swap them.

However, this rule is important because it helps us avoid subqueries. So we'll need another solution to replace it.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions