Cloudberry Database version
No response
What happened
While spiking on materialized views (MV) recently, I found that parallel REFRESH MATERIALIZED VIEW is currently disabled for AO/AOCO storage.
I disabled it earlier, for the following reason:
Parallel worker processes must not perform write operations. PG enforces this with a check that reports:
cannot update tuples during a parallel operation
This is not a problem for PG, because workers are launched by a Gather node and only the SELECT part of the refresh runs in parallel.
However, AO/AOCS tables need to reserve batches of row numbers, which is a write (an increment) on a special table.
CBDB, in contrast, calls EnterParallelMode() in ExecutePlan on the QE whenever there is parallelism anywhere in the plan. This is a difference between PG and CBDB parallelism. I am not sure which approach is more reasonable: CBDB-style parallelism has no leader process, so every process can be treated as a parallel worker. The alternative is to determine each process's role from the plan, which is more complicated and more error-prone.
In parallel mode, updates are not allowed because PG has not enabled parallel update. REFRESH MATERIALIZED VIEW for an AO table has a special implementation in CBDB: it inserts into the AO table and needs row numbers generated from gp_fastsequence, which updates the catalog in place.
So parallelism is disabled when refreshing AO/AOCS materialized views, and CTAS is likewise not run in parallel mode when it creates AO/AOCS tables.
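The conflict described above can be illustrated with a small toy model. This is only a sketch: every `toy_` name and the `ToyFastSequence` struct are hypothetical stand-ins, not the real PostgreSQL or Cloudberry symbols. It shows the shape of the problem: once a process is in parallel mode, reserving a batch of row numbers (which needs an in-place write) is refused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins for illustration only; NOT the real PG/CBDB functions. */
static int toy_parallel_mode_level = 0;

static void toy_enter_parallel_mode(void) { toy_parallel_mode_level++; }

static void toy_exit_parallel_mode(void)
{
    assert(toy_parallel_mode_level > 0);
    toy_parallel_mode_level--;
}

static bool toy_in_parallel_mode(void) { return toy_parallel_mode_level > 0; }

/* Hypothetical model of a row-number counter kept in a catalog row. */
typedef struct ToyFastSequence
{
    long last_sequence;     /* last row number handed out so far */
} ToyFastSequence;

/*
 * Reserve `batch` row numbers. Returns the first row number of the batch,
 * or -1 if the write is refused because the process is in parallel mode.
 */
static long toy_reserve_row_numbers(ToyFastSequence *seq, long batch)
{
    if (toy_in_parallel_mode())
    {
        fprintf(stderr, "cannot update tuples during a parallel operation\n");
        return -1;
    }
    long first = seq->last_sequence + 1;
    seq->last_sequence += batch;    /* the in-place write */
    return first;
}
```

In this model, a process that has called `toy_enter_parallel_mode()` cannot reserve row numbers until it leaves parallel mode, which mirrors why the AO/AOCS insert path cannot run where the whole plan has been put into parallel mode.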
At first sight, the issue exists for similar operations such as:
- REFRESH MATERIALIZED VIEW ao;
- CREATE TABLE AO USING ao_row AS;
What you think should happen instead
Currently, calling EnterParallelMode() for the whole plan is too coarse for CBDB, at least for slices that only execute the SELECT part of REFRESH.
It should be possible to refresh an AO materialized view in parallel. This would help a lot when, for example, the query aggregates over a large amount of data. @my-ship-it
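One way to picture a finer-grained decision is a per-slice check. This is a minimal sketch under stated assumptions, not a proposed patch: `ToySlice`, its two fields, and `toy_slice_should_enter_parallel_mode` are hypothetical names that do not exist in Cloudberry, and how such flags would actually be derived from the plan is left open.

```c
#include <stdbool.h>

/* Hypothetical per-slice summary; not an actual Cloudberry structure. */
typedef struct ToySlice
{
    bool has_parallel_node;   /* slice contains parallel plan nodes */
    bool writes_ao_table;     /* slice inserts into an AO/AOCS table */
} ToySlice;

/*
 * Enter parallel mode only for slices that run in parallel and do not
 * write, so the writing slice can still reserve row numbers.
 */
static bool toy_slice_should_enter_parallel_mode(const ToySlice *slice)
{
    return slice->has_parallel_node && !slice->writes_ao_table;
}
```

Under this sketch, a slice that only runs the SELECT part would enter parallel mode, while the slice that inserts into the AO table would not, so its row-number reservation would not be blocked.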
How to reproduce
Refresh a materialized view that uses AO storage.
Operating System
Ubuntu
Anything else
No response
Are you willing to submit PR?
Code of Conduct