Skip to content

MSSQL: Failure on backfill for subqueries using ORDER BY #4649

@MarcusRisanger

Description

@MarcusRisanger

I believe this is the same issue as faced in #4048

Minimal example:

Seed data:

id,item_id,event_date
1,2,2020-01-01
2,1,2020-01-01
3,3,2020-01-03
4,1,2020-01-04
5,1,2020-01-05
6,1,2025-01-06
7,1,2025-01-07

Seed model:

MODEL (
  name sqlmesh_example.seed_model,
  kind SEED (
    path '../seeds/seed_data.csv'
  ),
  columns (
    id INTEGER,
    item_id INTEGER,
    event_date DATE
  ),
  grain (id, event_date)
);

Query with an OUTER APPLY subquery:

MODEL (
  name sqlmesh_example.output_model,
  kind INCREMENTAL_BY_PARTITION,
  partitioned_by item_id,
  cron '@daily',
);

WITH partitions_to_update AS (
  SELECT DISTINCT item_id
  FROM sqlmesh_example.seed_model
  WHERE event_date > DATEADD(YEAR, -1, GETDATE())
)

SELECT 
  item_id,
  last_id.id AS last_id
FROM 
  sqlmesh_example.seed_model
  OUTER APPLY (
    SELECT TOP 1 id
    FROM sqlmesh_example.seed_model AS s
    WHERE s.item_id = item_id
    ORDER BY event_date DESC
  ) AS last_id
WHERE
  item_id IN (SELECT item_id FROM partitions_to_update)
;

Query with a "regular" SELECT subquery:

MODEL (
  name sqlmesh_example.output_model_2,
  kind INCREMENTAL_BY_PARTITION,
  partitioned_by item_id,
  cron '@daily',
);

WITH partitions_to_update AS (
  SELECT DISTINCT item_id
  FROM sqlmesh_example.seed_model
  WHERE event_date > DATEADD(YEAR, -1, GETDATE())
)

SELECT 
  item_id,
  (SELECT TOP 1 id
    FROM sqlmesh_example.seed_model AS s
    WHERE s.item_id = item_id
    ORDER BY event_date DESC) AS last_id
FROM 
  sqlmesh_example.seed_model
WHERE
  item_id IN (SELECT item_id FROM partitions_to_update)
;

Run commands:
sqlmesh plan dev

Backfill:
sqlmesh plan dev --start '2020-01-01'

Results in the following error:

The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.DB-Lib error message 20018, severity 15:
General SQL Server
error: Check messages from the SQL Server
DB-Lib error message 20018, severity 15:
General SQL Server error: Check messages from the SQL Server'

From the logs it is apparent that the LIMIT 1 clause is retained in table creation but not in backfill. Table creation fix was addressed in #4050 , but the fix does not appear to cover the partition update scenario.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions