Skip to content

Conversation

@onurctirtir
Copy link
Member

@onurctirtir onurctirtir commented Jul 28, 2025

DESCRIPTION: Avoids automatically creating citus_columnar when there are no relations using it

Previously, we were always creating citus_columnar when creating citus with version >= 11.1. And how we were doing was as follows:

  • Detach SQL objects owned by old columnar, i.e., "drop" them from citus, but not actually drop them from the database
    • "old columnar" is the one that we had before Citus 11.1 as part of citus, i.e., before splitting the access method ands its catalog to citus_columnar.
  • Create citus_columnar and attach the SQL objects leftover from old columnar to it so that we can continue supporting the columnar tables that user had before Citus 11.1 with citus_columnar.

First part is unchanged, however, now we don't create citus_columnar automatically anymore if the user didn't have any relations using columnar. For this reason, as of Citus 13.2, when these SQL objects are not owned by an extension and there are no relations using columnar access method, we drop these SQL objects when updating Citus to 13.2.

The net effect is still the same as if we automatically created citus_columnar and user dropped citus_columnar later, so we should not have any issues with dropping them.

(Update: Seems we've made some assumptions in citus, e.g., citus_finish_pg_upgrade() still assumes columnar metadata exists and tries to apply some fixes for it, so this PR fixes them as well. See the last section of this PR description.)

Also, ideally I was hoping to just remove some lines of code from extension.c, where we decide automatically creating citus_columnar when creating citus, however, this didn't happen to be the case for two reasons:

  • We still need to automatically create it for the servers using columnar access method.
  • We need to clean-up the leftover SQL objects from old columnar when the above is not case otherwise we would have leftover SQL objects from old columnar for no reason, and that would confuse users too.
    • Old columnar cannot be used to create columnar tables properly, so we should clean them up and let the user decide whether they want to create citus_columnar when they really need it later.

Also made several changes in the test suite because similarly, we don't always want to have citus_columnar created in citus tests anymore:

  • Now, columnar specific test targets, which cover 41 test sql files, always install columnar by default, by using "--load-extension=citus_columnar".
  • "--load-extension=citus_columnar" is not added to citus specific test targets because by default we don't want to have citus_columnar created during citus tests.
    • Excluding citus_columnar specific tests, we have 601 sql files that we have as citus tests and in 27 of them we manually create citus_columnar at the very beginning of the test because these tests do test some functionalities of citus together with columnar tables.

Also, before and after schedules for PG upgrade tests are now duplicated so we have two versions of each: one with columnar tests and one without. To choose between them, check-pg-upgrade now supports a "test-with-columnar" option, which can be set to "true" or anything else to logically indicate "false". In CI, we run the check-pg-upgrade test target with both options. The purpose is to ensure we can test PG upgrades where citus_columnar is not created in the cluster before the upgrade as well.

Finally, added more tests to multi_extension.sql to test Citus upgrade scenarios with / without columnar tables / citus_columnar extension.


Also, seems citus_finish_pg_upgrade was assuming that citus_columnar is always created but actually we should have never made such an assumption. To fix that, moved columnar specific post-PG-upgrade work from citus to a new columnar UDF, which is columnar_finish_pg_upgrade. But to avoid breaking existing customer / managed service scripts, we continue to automatically perform post PG-upgrade work for columnar within citus_finish_pg_upgrade, but only if columnar access method exists this time.


Please also see the comments I dropped below starting with "Note to reviewer".

@codecov
Copy link

codecov bot commented Jul 28, 2025

Codecov Report

❌ Patch coverage is 59.52381% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.20%. Comparing base (649050c) to head (3ad12a2).
⚠️ Report is 1 commits behind head on main.

❌ Your patch status has failed because the patch coverage (59.52%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8081      +/-   ##
==========================================
- Coverage   89.20%   89.20%   -0.01%     
==========================================
  Files         285      285              
  Lines       61954    62005      +51     
  Branches     7760     7765       +5     
==========================================
+ Hits        55269    55310      +41     
- Misses       4470     4473       +3     
- Partials     2215     2222       +7     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@onurctirtir onurctirtir force-pushed the not-install-columnar-by-default branch 4 times, most recently from 09924d7 to 5cedb18 Compare July 31, 2025 09:19

-- re-create in newest version
DROP EXTENSION citus;
DROP EXTENSION citus_columnar;
Copy link
Member Author

@onurctirtir onurctirtir Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

citus_columnar is not automatically created when we create citus anymore (when there are not relations using it), so we don't need to drop it here.

\c - - - :worker_1_port

DROP EXTENSION citus;
DROP EXTENSION citus_columnar;
Copy link
Member Author

@onurctirtir onurctirtir Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

citus_columnar is not automatically created when we create citus anymore (when there are not relations using it), so we don't need to drop it here.

Comment on lines -1051 to -1052
-- Easiest way of verifying this is to drop and re-create columnar extension.
DROP EXTENSION citus_columnar;
Copy link
Member Author

@onurctirtir onurctirtir Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

citus_columnar is not automatically created when we create citus anymore (when there are not relations using it), so we don't need to drop it here.

/*alter extension citus update, need upgrade citus_columnar from Y to Z*/
int versionNumber = (int) (100 * strtod(CITUS_MAJORVERSION, NULL));
if (versionNumber >= 1110)
if (versionNumber >= 1110 && citusColumnarOid != InvalidOid)
Copy link
Member Author

@onurctirtir onurctirtir Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

Now that we might not always auto-create citus_columnar anymore, need to make sure that we alter citus_columnar only if it exists. The other relevant codepaths anyways had such checks.

{
if (get_extension_oid("citus_columnar", true) == InvalidOid)
if (get_extension_oid("citus_columnar", true) == InvalidOid &&
(versionNumber < 1320 || HasAnyRelationsUsingOldColumnar()))
Copy link
Member Author

@onurctirtir onurctirtir Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

The checks making use of "versionNumber < 1320" in this file are mostly introduced to avoid updating our downgrade tests and won't make a difference for real world use cases.

This is because, from this commit, we don't anyway allow creating citus with a sql version smaller than 13.2-1 (note that "13.2-1" corresponds to 1320 due to the way we translate it to an integer) thanks to citus.enable_version_checks being enabled by default. And the only usage of that GUC as being disabled is when we want to have a way of testing downgrades in multi_extension.sql.

So by checking "versionNumber < 1320" in these codepaths, we continue automatically creating citus_columnar together with citus in the codepaths that we were doing so before. And in real-world, this check will never hold (as as we don't really recommend disabling citus.enable_version_checks), so we will only auto-create citus_columnar when HasAnyRelationsUsingOldColumnar() holds, which is the desired effect.

Copy link
Member Author

@onurctirtir onurctirtir Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

Previously, as we were always creating citus_columnar when creating Citus with version >= 11.1, we were first detaching these objects (i.e., were "dropping" them from "citus", but we were not actually dropping them from the database), then we were creating citus_columnar and attaching them to it.

First part is unchanged, however, now we don't create citus_columnar automatically anymore if the user didn't have any relations using columnar. For this reason, as of Citus 13.2, when these SQL objects are not owned by an extension and there are no relations depending columnar access method, we drop these SQL objects here.

The net effect is still the same as if we created automatically citus_columnar and user dropped citus_columnar later, so we should not have any issues with dropping them.

@onurctirtir onurctirtir marked this pull request as ready for review July 31, 2025 09:55
@onurctirtir onurctirtir force-pushed the not-install-columnar-by-default branch 2 times, most recently from 9ab5f50 to 34044c2 Compare July 31, 2025 16:46
Comment on lines 206 to 208
-- set dependencies for columnar table access method
PERFORM columnar_internal.columnar_ensure_am_depends_catalog();

Copy link
Member Author

@onurctirtir onurctirtir Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

We should not assume presence of citus_columnar anymore.

Actually, these lines should've never been placed in a Citus UDF because even before it was possible that a user could drop citus_columnar later on, so should've never made such an assumption.

For context, #5456 introduced these lines for below reasons:

Note that in addition to inserting those records via installation script, we also
do the same in citus_finish_pg_upgrade(). This is because, pg_upgrade
rebuilds catalog tables in the new cluster and that means, we must insert
them in the new cluster too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, a new UDF is created for columnar at src/backend/columnar/sql/udfs/columnar_finish_pg_upgrade/13.2-1.sql and invocation of "columnar_internal.columnar_ensure_am_depends_catalog()" is moved there.

Also note that now "columnar_finish_pg_upgrade()" needs to be called by users after a PG upgrade, as it was the case for citus_finish_pg_upgrade().

@onurctirtir onurctirtir force-pushed the not-install-columnar-by-default branch 2 times, most recently from 002fe1c to 1c81c24 Compare August 5, 2025 10:33
partitioned_citus_table_exists_pre_11 | is_null
---------------------------------------------------------------------
| t
f | f
Copy link
Member Author

@onurctirtir onurctirtir Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

Given that now we have a few tests that upgrade Citus to 13.2 before this line, 'partitioned_citus_table_exists_pre_11' field remains in the one and only row of the pg_dist_node_metadata as "false", rather than not having it, which is fine.

@onurctirtir onurctirtir force-pushed the not-install-columnar-by-default branch 5 times, most recently from db67392 to b21ebb5 Compare August 5, 2025 15:07
@ihalatci ihalatci added this to the 13.2 Release milestone Aug 7, 2025
@onurctirtir onurctirtir force-pushed the not-install-columnar-by-default branch 9 times, most recently from cfbfc95 to 634a272 Compare August 12, 2025 13:12
onurctirtir added a commit to citusdata/test-automation that referenced this pull request Aug 12, 2025
Comment on lines +74 to +76
-- FIXME: Somehow, after we stopped creating citus extension for columnar tests,
-- we started observing %38 growth in TopMemoryContext here.
SELECT CASE WHEN 1.0 * TopMemoryContext / :top_post BETWEEN 0.98 AND 1.40 THEN 1 ELSE 1.0 * TopMemoryContext / :top_post END AS top_growth
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

No code changes are made for citus_columnar, so this doesn't actually indicate a bug but a result of the changes made in the test environment, although not ideal.

But in any case, I feel like I should re-open the related issue and leave this output like that.


-- Verify the partitions' access methods on the worker node
SELECT relname, relam
SELECT relname, amname
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

well, using oid of the access-method is flaky, so switched to using amname instead

# Split Shard tests.
# Include tests from 'minimal_schedule' for setup.
test: multi_test_helpers multi_test_helpers_superuser columnar_test_helpers
test: multi_test_helpers multi_test_helpers_superuser
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

Removed columnar_test_helpers from citus schedules so that we don't have to create citus_columnar in those schedules

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to reviewer:

Removed citus specific tests from columnar schedule so that we don't have to do --load-extension=citus for columnar_schedule.

First three lines were there to define some test helper UDFs for citus, so they were not needed at all.

columnar_citus_integration is moved to a citus schedule since it depends on citus.

check_mx is anyway only needed for citus tests, so removed.

@onurctirtir onurctirtir force-pushed the not-install-columnar-by-default branch 3 times, most recently from 98ff38b to f61d70f Compare August 15, 2025 09:02
@onurctirtir onurctirtir force-pushed the not-install-columnar-by-default branch from f61d70f to 8d02991 Compare August 15, 2025 14:01
@onurctirtir onurctirtir enabled auto-merge (squash) August 15, 2025 14:49
@eaydingol
Copy link
Collaborator

LGTM, another pair of eyes would be good

@tejeswarm
Copy link
Contributor

There are two columnar extensions

columnar (legacy)
citus_columnar (latest)

Let's check the functionality for all scenarios, installations with

  1. columnar created by default and no user objects with it.
  2. columnar created by default and user objects with it.
  3. citus_columnar created by default and no user objects with it.
  4. citus_columnar created by default and user objects with it.

Not sure if (1) and (2) are possible,, could there be old installation that are never upgraded? This PR should be no-op for (4), and bulk of the impact (and installations) will be on (3) because we attempt to drop not only extension but also objects. I think (re)check thoroughly the differentiating condition between (3) and (4) to avoid any accidental damage to customers' data with type (4).

@onurctirtir
Copy link
Member Author

There are two columnar extensions

columnar (legacy) citus_columnar (latest)

Let's check the functionality for all scenarios, installations with

  1. columnar created by default and no user objects with it.
  2. columnar created by default and user objects with it.
  3. citus_columnar created by default and no user objects with it.
  4. citus_columnar created by default and user objects with it.

Not sure if (1) and (2) are possible,, could there be old installation that are never upgraded? This PR should be no-op for (4), and bulk of the impact (and installations) will be on (3) because we attempt to drop not only extension but also objects. I think (re)check thoroughly the differentiating condition between (3) and (4) to avoid any accidental damage to customers' data with type (4).

All cases are possible right now.

  1. Then we drop columnar AM and any other related catalog tables.
  2. We replace old columnar with citus_columnar, as before.
  3. We don't do anything, citus_columnar stays there. It's entirely user's decision to drop citus_columnar later or to use it.
  4. Same as 3.

@colm-mchugh colm-mchugh self-requested a review August 18, 2025 07:27
Copy link
Contributor

@colm-mchugh colm-mchugh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, thanks.

@onurctirtir onurctirtir merged commit 87a1b63 into main Aug 18, 2025
209 of 275 checks passed
@onurctirtir onurctirtir deleted the not-install-columnar-by-default branch August 18, 2025 07:29
@tejeswarm
Copy link
Contributor

QQ: @onurctirtir Eventually we will clean up this one too, similar to columnar? - citus_columnar created by default and no user objects with it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants