Skip to content

PG18 - wrong result in join between distributed and local table #8217

@colm-mchugh

Description

@colm-mchugh

Seen in local_dist_join_mixed regress test. Repro:

CREATE TABLE distributed (id bigserial PRIMARY KEY, name text, created_at timestamptz DEFAULT now());
CREATE TABLE local (id bigserial PRIMARY KEY, title text);

SELECT create_distributed_table('distributed', 'id');

INSERT INTO distributed SELECT i,  i::text, now() FROM generate_series(0,100)i;
INSERT INTO local SELECT i,  i::text FROM generate_series(0,100)i;

SET client_min_messages TO DEBUG1; -- show worker queries

-- Returns 0, should return 101
SELECT count(*) FROM distributed INNER
JOIN local u1 USING (id) INNER
JOIN local u2 USING (id);

-- Returns 0, should return 101
SELECT count(*) FROM distributed d INNER
JOIN local u1 ON (d.id=u1.id) INNER
JOIN local u2 ON  (u1.id=u2.id);

The problem appears to be that Citus is generating a bad query for the subquery on the local table;

SELECT NULL::integer AS "dummy-1" FROM local u1 WHERE true;

In Citus 13.x on PG17, the corresponding generated query is:

SELECT id FROM local u1 WHERE true;

Suspect a PG18 change to the Query data structure is being overlooked by Citus, and confuses ruleutils.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions