Skip to content

Metadata changes to MVs cause downtime in prod #5653

@xardasos

Description

@xardasos

When metadata changes occur, materialized views (MVs) are deleted and recreated in prod environment. As a result, the data in production is unavailable during this recreation time, which can be very long in the case of larger MVs. Furthermore, rollbacks require an additional change in the physical layer.

This issue has been observed in RisingWave, but most likely the same problem affects also other engines.

How to reproduce

  1. Create a sqlmesh project with a dummy materialized view
MODEL (
  name my_mv,
  kind VIEW (materialized true),
);

SELECT 1 as my_column
  1. Execute sqlmesh plan
  2. Update metadata of my_mv by adding a tag tags "mytag=a"
  3. Execute sqlmesh plan
  4. Check the sqlmesh logs

Current bug behavior

In Step 2, a MV "sqlmesh__default"."my_mv__3295868668" in the physical layer and a view "default"."my_mv" is created in the virtual layer of the prod environment.

In Step 4:

  • The MV "sqlmesh__default"."my_mv__3295868668" is deleted. Downtime in prod starts.
  • The MV "sqlmesh__default"."my_mv__3295868668" is recreated in the physical layer (with the same definition as before) and the view "default"."my_mv" is recreated in the virtual layer of prod environment. Downtime ends.

Note: my_mv could be a large materialized view that takes days to create, resulting in extended downtime.

Step 4. log

2026-01-09 13:19:03,650 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage CreateSnapshotRecordsStage (evaluator.py:123)
2026-01-09 13:19:03,765 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage PhysicalLayerSchemaCreationStage (evaluator.py:123)
2026-01-09 13:19:03,766 - ThreadPoolExecutor-1_0 - sqlmesh.core.snapshot.evaluator - INFO - Creating schema 'sqlmesh__default' (evaluator.py:1449)
2026-01-09 13:19:03,767 - ThreadPoolExecutor-1_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /* SQLMESH_PLAN: 907bcf209da149ca9b48a756e14a780f / CREATE SCHEMA IF NOT EXISTS "sqlmesh__default" (base.py:2626)
2026-01-09 13:19:03,771 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage PhysicalLayerUpdateStage (evaluator.py:123)
2026-01-09 13:19:03,772 - ThreadPoolExecutor-2_0 - sqlmesh.core.snapshot.evaluator - INFO - Listing data objects in schema sqlmesh__default (evaluator.py:1616)
2026-01-09 13:19:03,773 - ThreadPoolExecutor-2_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /
SQLMESH_PLAN: 907bcf209da149ca9b48a756e14a780f / SELECT current_catalog (base.py:2626)
2026-01-09 13:19:03,788 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage BackfillStage (evaluator.py:123)
2026-01-09 13:19:03,789 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage EnvironmentRecordUpdateStage (evaluator.py:123)
2026-01-09 13:19:03,789 - MainThread - sqlmesh.core.state_sync.db.facade - INFO - Promoting environment 'prod' (facade.py:167)
2026-01-09 13:19:04,162 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage MigrateSchemasStage (evaluator.py:123)
2026-01-09 13:19:04,164 - ThreadPoolExecutor-3_0 - sqlmesh.core.snapshot.evaluator - INFO - Listing data objects in schema sqlmesh__default (evaluator.py:1616)
2026-01-09 13:19:04,165 - ThreadPoolExecutor-3_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /
SQLMESH_PLAN: 907bcf209da149ca9b48a756e14a780f / SELECT current_catalog (base.py:2626)
2026-01-09 13:19:04,185 - ThreadPoolExecutor-4_0 - sqlmesh.core.snapshot.evaluator - INFO - Migrating table schema from 'sqlmesh__default.my_mv__3295868668_schema_tmp' to 'sqlmesh__default.my_mv__3295868668' (evaluator.py:1214)
2026-01-09 13:19:04,185 - ThreadPoolExecutor-4_0 - sqlmesh.core.snapshot.evaluator - INFO - Migrating view 'sqlmesh__default.my_mv__3295868668' (evaluator.py:2762)
2026-01-09 13:19:04,190 - ThreadPoolExecutor-4_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /
SQLMESH_PLAN: 907bcf209da149ca9b48a756e14a780f / DROP MATERIALIZED VIEW IF EXISTS "sqlmesh__default"."my_mv__3295868668" CASCADE (base.py:2626)
2026-01-09 13:19:04,389 - ThreadPoolExecutor-4_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /
SQLMESH_PLAN: 907bcf209da149ca9b48a756e14a780f / CREATE MATERIALIZED VIEW "sqlmesh__default"."my_mv__3295868668" AS SELECT 1 AS "my_column" (base.py:2626)
2026-01-09 13:19:05,672 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage UnpauseStage (evaluator.py:123)
2026-01-09 13:19:06,003 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage VirtualLayerUpdateStage (evaluator.py:123)
2026-01-09 13:19:06,012 - ThreadPoolExecutor-5_0 - sqlmesh.core.snapshot.evaluator - INFO - Creating schema 'default' (evaluator.py:1449)
2026-01-09 13:19:06,013 - ThreadPoolExecutor-5_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /
SQLMESH_PLAN: 907bcf209da149ca9b48a756e14a780f / CREATE SCHEMA IF NOT EXISTS "default" (base.py:2626)
2026-01-09 13:19:06,017 - ThreadPoolExecutor-6_0 - sqlmesh.core.snapshot.evaluator - INFO - Listing data objects in schema default (evaluator.py:1616)
2026-01-09 13:19:06,017 - ThreadPoolExecutor-6_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /
SQLMESH_PLAN: 907bcf209da149ca9b48a756e14a780f / SELECT current_catalog (base.py:2626)
2026-01-09 13:19:06,039 - ThreadPoolExecutor-7_0 - sqlmesh.core.snapshot.evaluator - INFO - Updating view 'default.my_mv' to point at table 'sqlmesh__default.my_mv__3295868668' (evaluator.py:1988)
2026-01-09 13:19:06,042 - ThreadPoolExecutor-7_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /
SQLMESH_PLAN: 907bcf209da149ca9b48a756e14a780f / DROP VIEW IF EXISTS "default"."my_mv" CASCADE (base.py:2626)
2026-01-09 13:19:06,046 - ThreadPoolExecutor-7_0 - sqlmesh.core.engine_adapter.base - INFO - Executing SQL: /
SQLMESH_PLAN: 907bcf209da149ca9b48a756e14a780f */ CREATE VIEW "default"."my_mv" AS SELECT * FROM "sqlmesh__default"."my_mv__3295868668" (base.py:2626)
2026-01-09 13:19:06,113 - MainThread - sqlmesh.core.plan.evaluator - INFO - Evaluating plan stage FinalizeEnvironmentStage (evaluator.py:123)
2026-01-09 13:19:06,115 - MainThread - sqlmesh.core.state_sync.db.environment - INFO - Finalizing environment 'prod' (environment.py:141)

Affected SQLMesh versions

0.228.1

The downtime problem does not appear in 0.182.0.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions