Repo steps:
create table bt1(id int, accno text) with(appendonly=true);
create table st1(id int, val int);
set enable_hashjoin = off;
set enable_mergejoin = off;
set enable_nestloop = on;
set enable_seqscan = off;
create index on bt1 using btree(id);
explain select * from st1 left join bt1 on st1.id = bt1.id;
The execute plan for left join is:
vagrant=# explain select * from st1 left join bt1 on st1.id = bt1.id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Gather Motion 2:1 (slice1; segments: 2) (cost=1000000000000000019884624838656.00..1000000000000000019884624838656.00 rows=4270560 width=44)
-> Nested Loop Left Join (cost=1000000000000000019884624838656.00..1000000000000000019884624838656.00 rows=2135280 width=44)
Join Filter: (st1.id = bt1.id)
-> Seq Scan on st1 (cost=1000000000000000019884624838656.00..1000000000000000019884624838656.00 rows=43050 width=8)
-> Bitmap Heap Scan on bt1 (cost=654.37..952.37 rows=24800 width=36)
-> Bitmap Index Scan on bt1_id_idx (cost=0.00..648.16 rows=24800 width=0)
Optimizer: Postgres query optimizer
(7 rows)
The join condition is checked in the nestloop node. The plan scans all btree index, which returns all tuples to the upper execute node. It's unwanted behavior.
drop index bt1_id_idx;
create index on bt1 using brin(id) with(pages_per_range=1);
To compare, if we replace the index by brin index, the plan becomes:
vagrant=# explain select * from st1 left join bt1 on st1.id = bt1.id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Gather Motion 2:1 (slice1; segments: 2) (cost=1000000000000000019884624838656.00..1000000000000000019884624838656.00 rows=4270560 width=44)
-> Nested Loop Left Join (cost=1000000000000000019884624838656.00..1000000000000000019884624838656.00 rows=2135280 width=44)
-> Seq Scan on st1 (cost=1000000000000000019884624838656.00..1000000000000000019884624838656.00 rows=43050 width=8)
-> Bitmap Heap Scan on bt1 (cost=9384900.03..9385214.03 rows=25 width=36)
Recheck Cond: (st1.id = id)
-> Bitmap Index Scan on bt1_id_idx (cost=0.00..9384900.03 rows=24800 width=0)
Index Cond: (id = st1.id)
Optimizer: Postgres query optimizer
(8 rows)
Repo steps:
The execute plan for left join is:
The join condition is checked in the nestloop node. The plan scans all btree index, which returns all tuples to the upper execute node. It's unwanted behavior.
To compare, if we replace the index by brin index, the plan becomes: