-
Notifications
You must be signed in to change notification settings - Fork 358
Description
Following SQLMesh's initial (fast) plan of a FULL model on the mssql adapter, subsequent SQLMesh runs, which rebuild the model completely, are very slow (hours or more).
This is caused by using the pattern MERGE INTO ... ON (1 = 0) existing table, like this
MERGE INTO [schema].[schema__table__1460139764] AS [__MERGE_TARGET__]
USING ( ... ) AS [__MERGE_SOURCE__]
ON (1 = 0)
WHEN NOT MATCHED BY SOURCE THEN DELETE
WHEN NOT MATCHED THEN INSERT ( ... ) VALUES ( ... );
I can see this query is suspended in the MS SQL query engine with 100% CPU use for hours.
The materialisation strategy for FULL mssql models is not specified in the docs, but no other database uses MERGE for FULL.
https://sqlmesh.readthedocs.io/en/stable/concepts/models/model_kinds/?h=merge#materialization-strategy_2
I suggest using the Postgres method for mssql: DROP TABLE, CREATE TABLE, INSERT
How do I know the MERGE pattern is the problem?
When I terminate the stuck
runprocess and thenTRUNCATE TABLE [schema].[schema__table__1460139764];and re-run
sqlmesh runthen therunis as fast as the initialplan. So SQLMesh is still doing theMERGE INTOexcept now it's comparing0records.The problem is one of squares: if I have a table of 2 million records, and I'm doing a full rebuild, I have 2 million new records which must be compared, and that's 4 trillion comparisons. Even if each comparison only took 1 millisecond, the comparisons would take over 100 years to complete!