diff --git a/AGENTS.md b/AGENTS.md index 6850c3c..350e35c 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -642,6 +642,7 @@ To add/modify JSON schemas: | `dj.dataExplorer.autoRefresh` | boolean | `false` | Auto-refresh Data Explorer on file switch | | `dj.lightdashProjectPath` | string | — | Custom path to dbt project for Lightdash | | `dj.lightdashProfilesPath` | string | — | Custom path to dbt profiles for Lightdash | +| `dj.lightdash.defaultPartitionColumnCaseSensitive` | boolean | `false` | Auto-emit `meta.dimension.case_sensitive: true` on partition columns in generated YAML (stops Lightdash from wrapping partition columns in `UPPER()`, preserving Trino predicate pushdown). Requires `DJ: Sync to SQL and YML` to apply. | | `dj.materialization.defaultIncrementalStrategy` | string | `overwrite_existing_partitions` | Default incremental strategy: `append`, `delete+insert`, `merge`, or `overwrite_existing_partitions` | | `dj.autoGenerateTests` | object | — | (Experimental) Auto-generate tests on models | diff --git a/CHANGELOG.md b/CHANGELOG.md index 9044c0f..9356526 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,11 @@ # Change Log +## 1.3.5 + +- **`unique_key` no longer emitted for `overwrite_existing_partitions`** — this strategy requires a custom dbt macro in your project (typically `get_incremental_overwrite_existing_partitions_sql`); the DJ extension does not ship it and dbt-trino does not provide it natively. If your project does not define the macro, switch to `{ "type": "delete+insert" }` — it auto-derives `unique_key` from partition columns. +- **`dj.lightdash.defaultPartitionColumnCaseSensitive`** (default: `false`) — when `true`, partition columns in generated YAML get `meta.dimension.case_sensitive: true`. This stops Lightdash from wrapping them in `UPPER()` in queries, preserving Trino predicate pushdown on partitioned tables. Per-model and per-column `lightdash.case_sensitive` overrides in `.model.json` continue to apply. 
+- **Aggregation Validator Enhancements** — validation issues are now flagged as warnings rather than errors, so you can generate SQL and keep iterating even while columns are un-aggregated. The validator now ignores constant values (e.g. `0`, `null`, `'foo'`) and Jinja/dbt macros it cannot introspect, and window functions receive a tailored message about partition-column alignment instead of the generic aggregation error. + ## 1.3.4 - Partition columns automatically emit `case_sensitive: true` in YAML meta so Lightdash does not wrap them in `UPPER()`, preserving predicate pushdown diff --git a/docs/SETTINGS.md b/docs/SETTINGS.md index 86f3fd8..7df3216 100644 --- a/docs/SETTINGS.md +++ b/docs/SETTINGS.md @@ -4,24 +4,25 @@ Complete guide to configuring the DJ VS Code extension. ## Quick Reference -| Setting                     | Purpose                            | Takes Effect    | -| --------------------------- | ---------------------------------- | --------------- | -| `pythonVenvPath`            | Python virtual environment path    | Next command ⚡ | -| `trinoPath`                 | Trino CLI executable location      | Next query ⚡   | -| `dbtProjectNames`           | Filter which dbt projects to load  | Refresh 🔄      | -| `dbtMacroPath`              | Custom path for extension macros   | Refresh 🔄      | -| `dbtGenericTestsPath`       | Custom path for generic test files | Refresh 🔄      | -| `airflowGenerateDags`       | Enable Airflow DAG generation      | Refresh 🔄      | -| `airflowTargetVersion`      | Target Airflow version             | Refresh 🔄      | -| `airflowDagsPath`           | Custom path for Airflow DAGs       | Refresh 🔄      | -| `lightdashProjectPath`      | Custom Lightdash project path      | Next preview ⚡ | -| `lightdashProfilesPath`     | Custom Lightdash profiles path     | Next preview ⚡ | -| `aiHintTag`                 | Tag for AI-generated hints         | Next sync 🔄    | -| `codingAgent`               | Coding agent integration           | Refresh 🔄      | -| `autoGenerateTests`         | Auto-generate row count tests      | Varies 🔄       | -| `columnLineage.autoRefresh` | Auto-refresh column lineage        | File switch ✅  | -| `dataExplorer.autoRefresh`  | Auto-refresh data explorer         | File switch ✅  | -| `logLevel`                  | 
Extension logging level | Immediate ✅ | +| Setting | Purpose | Takes Effect | +| ----------------------------------------------- | ---------------------------------------------------------------- | --------------- | +| `pythonVenvPath` | Python virtual environment path | Next command ⚡ | +| `trinoPath` | Trino CLI executable location | Next query ⚡ | +| `dbtProjectNames` | Filter which dbt projects to load | Refresh 🔄 | +| `dbtMacroPath` | Custom path for extension macros | Refresh 🔄 | +| `dbtGenericTestsPath` | Custom path for generic test files | Refresh 🔄 | +| `airflowGenerateDags` | Enable Airflow DAG generation | Refresh 🔄 | +| `airflowTargetVersion` | Target Airflow version | Refresh 🔄 | +| `airflowDagsPath` | Custom path for Airflow DAGs | Refresh 🔄 | +| `lightdashProjectPath` | Custom Lightdash project path | Next preview ⚡ | +| `lightdashProfilesPath` | Custom Lightdash profiles path | Next preview ⚡ | +| `lightdash.defaultPartitionColumnCaseSensitive` | Set default `case_sensitive` value for partition columns in YAML | Next sync 🔄 | +| `aiHintTag` | Tag for AI-generated hints | Next sync 🔄 | +| `codingAgent` | Coding agent integration | Refresh 🔄 | +| `autoGenerateTests` | Auto-generate row count tests | Varies 🔄 | +| `columnLineage.autoRefresh` | Auto-refresh column lineage | File switch ✅ | +| `dataExplorer.autoRefresh` | Auto-refresh data explorer | File switch ✅ | +| `logLevel` | Extension logging level | Immediate ✅ | **Legend:** ✅ Immediate | ⚡ Next command/action | 🔄 Requires `DJ: Refresh Projects` or sync @@ -117,6 +118,16 @@ Complete guide to configuring the DJ VS Code extension. 
- Both optional (extension auto-detects by default) - See [Lightdash Configuration Guide](setup/lightdash-configuration.md) +**`dj.lightdash.defaultPartitionColumnCaseSensitive`** - Auto-emit `case_sensitive: true` on partition columns in generated YAML (default: `false`) + +```json +{ "dj.lightdash.defaultPartitionColumnCaseSensitive": true } +``` + +- When `true`, every generated partition column in `.yml` files gets `meta.dimension.case_sensitive: true`. This stops Lightdash from wrapping the column in `UPPER()` in queries, preserving Trino predicate pushdown on partitioned tables. +- When `false` (the default), partition columns are emitted without the auto-injected `case_sensitive` flag. Per-model and per-column `lightdash.case_sensitive` overrides in `.model.json` continue to work in either mode. +- Takes effect on next `DJ: Sync to SQL and YML`. + --- ### AI & Coding Agents @@ -230,6 +241,7 @@ Run this command (`Cmd/Ctrl+Shift+P` → `DJ: Refresh Projects`) after changing: ### 🔄 Requires `DJ: Sync to SQL and YML` - `aiHintTag` - Recompiles models with updated tags +- `lightdash.defaultPartitionColumnCaseSensitive` - Re-emits partition column YAML meta --- diff --git a/docs/models/README.md b/docs/models/README.md index 4b05429..c07245e 100644 --- a/docs/models/README.md +++ b/docs/models/README.md @@ -433,7 +433,7 @@ Set `materialization.strategy.type` to one of the following (or rely on the exte | `append` | Inserts new rows with no de-duplication. Fastest. | Upstream must guarantee no duplicates in the new slice. | | `delete+insert` | Partition-safe upsert. Safe default. | `unique_key` auto-derived from partition columns when omitted. Works on Delta Lake, Hive, and Iceberg. | | `merge` | Row-level upsert on `unique_key`. | **dbt-trino requires Iceberg format** on the target table. On Delta Lake / Hive use `delete+insert` instead. | -| `overwrite_existing_partitions` | Drops and rewrites only the partitions present in the new slice. 
| **Requires a custom dbt macro in your project** (e.g. `get_incremental_overwrite_existing_partitions_sql`). The DJ extension does NOT ship this macro and dbt-trino does NOT provide it natively. If your project does not define it, use `delete+insert` with a partition column as `unique_key` — it produces equivalent behavior for daily/monthly partitioned models. | +| `overwrite_existing_partitions` | Drops and rewrites only the partitions present in the new slice. `unique_key` is not applicable: the consumer macro derives the partition list from the new slice itself, so the schema rejects `unique_key` on this strategy. | **Requires a custom dbt macro in your project** (e.g. `get_incremental_overwrite_existing_partitions_sql`). The DJ extension does NOT ship this macro and dbt-trino does NOT provide it natively. If your project does not define it, use `delete+insert` with a partition column as `unique_key` — it produces equivalent behavior for daily/monthly partitioned models. | ### Data Quality diff --git a/docs/setup/lightdash-configuration.md b/docs/setup/lightdash-configuration.md index 31e49d7..3c2d05c 100644 --- a/docs/setup/lightdash-configuration.md +++ b/docs/setup/lightdash-configuration.md @@ -38,6 +38,19 @@ You can override the auto-detection by setting these VS Code configuration options: - **Description**: Custom path to the dbt profiles directory (relative to workspace root) - **Default**: Uses the project directory or workspace root if not specified +#### `dj.lightdash.defaultPartitionColumnCaseSensitive` + +- **Type**: `boolean` +- **Default**: `false` +- **Description**: When `true`, the extension automatically emits `meta.dimension.case_sensitive: true` on every partition column in generated YAML. This stops Lightdash from wrapping the column in `UPPER()` in queries, preserving Trino predicate pushdown on partitioned tables. 
+- **When to enable**: Lightdash projects that query partitioned tables and notice slow scans because partition predicates are wrapped in `UPPER()`. +- **Per-model override**: `lightdash.case_sensitive` on a specific model or column in `.model.json` continues to take precedence in either mode. +- **Takes effect**: Run `DJ: Sync to SQL and YML` (`Cmd+Shift+P`) to regenerate YAML with the new value. + +```json +{ "dj.lightdash.defaultPartitionColumnCaseSensitive": true } +``` + ## Examples ### Example 1: dbt project in subdirectory diff --git a/package-lock.json b/package-lock.json index 83cfcda..02b8c57 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,12 +1,12 @@ { "name": "dj-framework", - "version": "1.3.4", + "version": "1.3.5", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "dj-framework", - "version": "1.3.4", + "version": "1.3.5", "license": "Apache-2.0", "workspaces": [ "web" @@ -255,28 +255,27 @@ } }, "node_modules/@azure/msal-node": { - "version": "5.1.4", - "resolved": "https://registry.npmjs.org/@azure/msal-node/-/msal-node-5.1.4.tgz", - "integrity": "sha512-G4LXGGggok1QC48uKu64/SV2DPRDlddmV8EieK8pflsNYMj9/Zz+Y9OHoEBhT15h+zpdwXXLYA/7PJCR/yZ8aw==", + "version": "5.1.5", + "resolved": "https://registry.npmjs.org/@azure/msal-node/-/msal-node-5.1.5.tgz", + "integrity": "sha512-ObTeMoNPmq19X3z40et9Xvs4ZoWVeJg43PZMRLG5iwVL+2nCtAerG3YTDItqPp1CfXNwmCXBbg8jn1DOx65c3g==", "dev": true, "license": "MIT", "dependencies": { - "@azure/msal-common": "16.5.1", - "jsonwebtoken": "^9.0.0", - "uuid": "^8.3.0" + "@azure/msal-common": "16.5.2", + "jsonwebtoken": "^9.0.0" }, "engines": { "node": ">=20" } }, - "node_modules/@azure/msal-node/node_modules/uuid": { - "version": "8.3.2", - "resolved": "https://registry.npmjs.org/uuid/-/uuid-8.3.2.tgz", - "integrity": "sha512-+NYs2QeMWy+GWFOEm9xnn6HCDp0l7QBD7ml8zLUmJ+93Q5NF0NocErnwkTkXVFNiX3/fpC6afS8Dhb/gz7R7eg==", + "node_modules/@azure/msal-node/node_modules/@azure/msal-common": { + "version": 
"16.5.2", + "resolved": "https://registry.npmjs.org/@azure/msal-common/-/msal-common-16.5.2.tgz", + "integrity": "sha512-GkDEL6TYo3HgT3UuqakdgE9PZfc1hMki6+Hwgy1uddb/EauvAKfu85vVhuofRSo22D1xTnWt8Ucwfg4vSCVwvA==", "dev": true, "license": "MIT", - "bin": { - "uuid": "dist/bin/uuid" + "engines": { + "node": ">=0.8.0" } }, "node_modules/@babel/code-frame": { @@ -10815,9 +10814,9 @@ } }, "node_modules/postcss": { - "version": "8.5.6", - "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.6.tgz", - "integrity": "sha512-3Ybi1tAuwAP9s0r1UQ2J4n5Y0G05bJkpUIO0/bI9MhwmD70S5aTWbXGBwxHrelT+XM1k6dM0pk+SwNkpTRN7Pg==", + "version": "8.5.12", + "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.12.tgz", + "integrity": "sha512-W62t/Se6rA0Az3DfCL0AqJwXuKwBeYg6nOaIgzP+xZ7N5BFCI7DYi1qs6ygUYT6rvfi6t9k65UMLJC+PHZpDAA==", "dev": true, "funding": [ { diff --git a/package.json b/package.json index 8055d28..cc44c40 100644 --- a/package.json +++ b/package.json @@ -10,7 +10,7 @@ "type": "git", "url": "https://github.com/Workday/vscode-dbt-json.git" }, - "version": "1.3.4", + "version": "1.3.5", "workspaces": [ "web" ], @@ -321,6 +321,11 @@ "type": "string", "description": "Custom path to the dbt profiles directory for Lightdash preview (relative to workspace root). If not set, will use the project directory or workspace root. Takes effect on next command execution." }, + "dj.lightdash.defaultPartitionColumnCaseSensitive": { + "type": "boolean", + "default": false, + "description": "Automatically set case_sensitive: true on partition columns in generated YAML. Requires 'DJ: Sync to SQL and YML' to take effect." 
+ }, "dj.columnLineage.autoRefresh": { "type": "boolean", "default": true, @@ -784,4 +789,4 @@ "typescript": "^5.4.2", "typescript-eslint": "^8.50.1" } -} +} \ No newline at end of file diff --git a/schemas/model.incremental_strategy.schema.json b/schemas/model.incremental_strategy.schema.json index bdd289f..458b80b 100644 --- a/schemas/model.incremental_strategy.schema.json +++ b/schemas/model.incremental_strategy.schema.json @@ -76,8 +76,8 @@ } }, { - "description": "Overwrite existing partitions: drop and rewrite only the partitions present in the new slice. REQUIRES a custom dbt macro in your project (e.g. get_incremental_overwrite_existing_partitions_sql). The DJ extension does NOT ship this macro and dbt-trino does NOT provide it natively. If your project does not define this macro, use 'delete+insert' with a partition column as unique_key — it produces equivalent behavior for daily/monthly partitioned models.", - "markdownDescription": "**`overwrite_existing_partitions`** — drop & rewrite only partitions in the new slice.\n\n> **WARNING — CUSTOM MACRO REQUIRED.** This strategy is NOT shipped by the DJ extension and is NOT part of vanilla dbt-trino. Your dbt project must define the dispatch macro (typically `get_incremental_overwrite_existing_partitions_sql`) for it to compile.\n>\n> **Don't have the macro?** Use `{ \"type\": \"delete+insert\" }` instead — if a partition column exists, DJ auto-fills `unique_key` and the behavior is equivalent for partition-aligned daily/monthly incrementals.", + "description": "Overwrite existing partitions: drop and rewrite only the partitions present in the new slice. REQUIRES a custom dbt macro in your project (e.g. get_incremental_overwrite_existing_partitions_sql). The DJ extension does NOT ship this macro and dbt-trino does NOT provide it natively. The macro derives the partition list from the new slice itself, so unique_key is not applicable for this strategy and is rejected by the schema. 
If your project does not define this macro, use 'delete+insert' with a partition column as unique_key — it produces equivalent behavior for daily/monthly partitioned models.", + "markdownDescription": "**`overwrite_existing_partitions`** — drop & rewrite only partitions in the new slice.\n\n> **WARNING — CUSTOM MACRO REQUIRED.** This strategy is NOT shipped by the DJ extension and is NOT part of vanilla dbt-trino. Your dbt project must define the dispatch macro (typically `get_incremental_overwrite_existing_partitions_sql`) for it to compile.\n>\n> **`unique_key` is not applicable** — the macro derives the partition list from the new slice itself, so this strategy ignores `unique_key` and the schema rejects it.\n>\n> **Don't have the macro?** Use `{ \"type\": \"delete+insert\" }` instead — if a partition column exists, DJ auto-fills `unique_key` and the behavior is equivalent for partition-aligned daily/monthly incrementals.", "type": "object", "required": ["type"], "additionalProperties": false, @@ -85,17 +85,6 @@ "type": { "const": "overwrite_existing_partitions", "description": "Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one." - }, - "unique_key": { - "description": "Optional override for the unique key(s). 
Defaults to the model's partition column when omitted.", - "anyOf": [ - { "type": "string" }, - { - "type": "array", - "minItems": 1, - "items": { "type": "string" } - } - ] } } } diff --git a/src/services/config.ts b/src/services/config.ts index b85fac2..6cc0230 100644 --- a/src/services/config.ts +++ b/src/services/config.ts @@ -42,6 +42,10 @@ export function getDjConfig(): CoderConfig { trinoPath: config.get('trinoPath', undefined), lightdashProjectPath: config.get('lightdashProjectPath', undefined), lightdashProfilesPath: config.get('lightdashProfilesPath', undefined), + lightdashDefaultPartitionColumnCaseSensitive: config.get( + 'lightdash.defaultPartitionColumnCaseSensitive', + false, + ), // Settings with defaults from package.json. // All fallbacks (extension, webview, generator) route through the shared @@ -264,6 +268,12 @@ export function getSettingReloadRequirement( actionCommand: 'dj.command.jsonSync', description: "Requires 'DJ: Sync to SQL and YML' to take effect", }, + 'lightdash.defaultPartitionColumnCaseSensitive': { + requiresAction: true, + action: 'compile', + actionCommand: 'dj.command.jsonSync', + description: "Requires 'DJ: Sync to SQL and YML' to take effect", + }, // Complex settings autoGenerateTests: { @@ -761,6 +771,18 @@ export function registerConfigurationChangeHandler( ); } + // lightdash.defaultPartitionColumnCaseSensitive - notify about compile requirement + if ( + event.affectsConfiguration( + 'dj.lightdash.defaultPartitionColumnCaseSensitive', + ) + ) { + void vscode.window.setStatusBarMessage( + "DJ: Partition column case_sensitive updated - run 'DJ: Sync to SQL and YML' to apply", + 5000, + ); + } + // Lightdash paths - simple notifications if (event.affectsConfiguration('dj.lightdashProjectPath')) { void vscode.window.setStatusBarMessage( diff --git a/src/services/framework/__tests__/cte-validation.test.ts b/src/services/framework/__tests__/cte-validation.test.ts index a87540b..ca2292f 100644 --- 
a/src/services/framework/__tests__/cte-validation.test.ts +++ b/src/services/framework/__tests__/cte-validation.test.ts @@ -808,6 +808,104 @@ describe('validateMainModelAggregation (Gap 2)', () => { expect(errors).toHaveLength(0); }); + // Constants have no column dependency, so they can never violate GROUP BY. + // Covers the literal patterns we see in the wild: `0`, padded numerics, + // null in any case, quoted strings, and `CAST(<literal> AS <type>)`. + test.each([ + ['numeric zero', '0'], + ['numeric with surrounding whitespace', ' 0 '], + ['negative numeric', '-1'], + ['decimal numeric', '3.14'], + ['lowercase null', 'null'], + ['uppercase NULL', 'NULL'], + ['single-quoted string', "'foo'"], + ['double-quoted string', '"bar"'], + ['cast of null', 'CAST(NULL AS VARCHAR)'], + ['cast of null with parameterised type', 'CAST(NULL AS VARCHAR(50))'], + ['cast of numeric literal', 'CAST(0 AS DOUBLE)'], + ['cast of string literal', "CAST('5' AS INTEGER)"], + ])('no error for constant expr (%s)', (_desc, expr) => { + const errors = validateMainModelAggregation({ + group_by: 'dims', + select: [ + { name: 'region', type: 'dim' }, + { name: 'metric', type: 'fct', expr }, + ], + }); + expect(errors).toHaveLength(0); + }); + + // Jinja/dbt macros are opaque -- the expansion may wrap a real aggregate. + // We treat any `{{ ... }}` / `{% ... %}` `expr` as aggregated to avoid + // false-positive "un-aggregated" diagnostics. 
+ test.each([ + ['simple macro call', '{{ def_total_revenue() }}'], + ['macro with args', '{{ my_macro(col_a, col_b) }}'], + ['statement-style block', '{% if foo %}sum(x){% else %}0{% endif %}'], + ])('no error for Jinja expr (%s)', (_desc, expr) => { + const errors = validateMainModelAggregation({ + group_by: 'dims', + select: [ + { name: 'region', type: 'dim' }, + { name: 'metric', type: 'fct', expr }, + ], + }); + expect(errors).toHaveLength(0); + }); + + // Window functions are still flagged (they ARE problematic with GROUP BY) + // but with a tailored message that points to partition-column alignment + // instead of suggesting `agg`/`aggs` (which doesn't apply to a row-wise + // window function). + test.each([ + [ + 'max(...) over partition', + 'max(amount) over (partition by region, portal_partition_daily)', + ], + [ + 'count distinct over partition', + 'count(distinct event_id) over (partition by region, portal_partition_hourly)', + ], + [ + 'count over partition', + 'count(user_id) over (partition by region, portal_partition_daily)', + ], + ])('tailored window-function warning for %s', (_desc, expr) => { + const errors = validateMainModelAggregation({ + group_by: 'dims', + select: [ + { name: 'region', type: 'dim' }, + { name: 'metric', type: 'fct', expr }, + ], + }); + expect(errors).toHaveLength(1); + expect(errors[0].message).toContain('Window function'); + expect(errors[0].message).toContain('metric'); + expect(errors[0].message).not.toContain('agg"/"aggs'); + expect(errors[0].instancePath).toBe('/select/1'); + }); + + test('window-function warning for CTE scalar ref includes CTE name', () => { + const errors = validateMainModelAggregation({ + group_by: 'dims', + from: { cte: 'pre_agg' }, + select: [ + { cte: 'pre_agg', type: 'dims_from_cte' }, + { + cte: 'pre_agg', + name: 'rolling_sum', + type: 'fct', + expr: 'sum(amount) over (partition by region)', + }, + ], + }); + expect(errors).toHaveLength(1); + expect(errors[0].message).toContain('Window 
function'); + expect(errors[0].message).toContain('rolling_sum'); + expect(errors[0].message).toContain('pre_agg'); + expect(errors[0].instancePath).toBe('/select/1'); + }); + test('no error when no group_by is set', () => { const errors = validateMainModelAggregation({ select: [{ name: 'revenue', type: 'fct' }], diff --git a/src/services/framework/__tests__/index.test.ts b/src/services/framework/__tests__/index.test.ts index 1b0aacb..3e04677 100644 --- a/src/services/framework/__tests__/index.test.ts +++ b/src/services/framework/__tests__/index.test.ts @@ -949,7 +949,11 @@ describe('incremental strategy variants', () => { expect(config.unique_key).toBeUndefined(); }); - test('strategy: overwrite_existing_partitions auto-derives unique_key from partitions', () => { + test('strategy: overwrite_existing_partitions omits unique_key on partitioned models', () => { + // unique_key is dead config for overwrite_existing_partitions: the + // consumer macro derives the partition list from the new slice itself. + // Even when the model has partition columns, we intentionally never emit + // unique_key (and the JSON schema rejects it on this strategy). const modelJson: FrameworkModel = { type: 'int_select_model', group: 'swh', @@ -974,13 +978,10 @@ describe('incremental strategy variants', () => { expect(config.materialized).toBe('incremental'); expect(config.incremental_strategy).toBe('overwrite_existing_partitions'); - expect(config.unique_key).toBe('portal_partition_daily'); + expect(config.unique_key).toBeUndefined(); }); - test('strategy: overwrite_existing_partitions without partitions omits unique_key', () => { - // Mirrors delete+insert behavior: if the model has no partition columns, - // unique_key is left unset so dbt / the custom macro surfaces its own - // error rather than us emitting a phantom column. 
+ test('strategy: overwrite_existing_partitions omits unique_key on unpartitioned models', () => { const modelJson: FrameworkModel = { type: 'int_select_model', group: 'sales', @@ -1004,37 +1005,10 @@ describe('incremental strategy variants', () => { expect(config.unique_key).toBeUndefined(); }); - test('strategy: overwrite_existing_partitions honors explicit unique_key', () => { - const modelJson: FrameworkModel = { - type: 'int_select_model', - group: 'sales', - topic: 'orders', - name: 'enriched', - materialization: { - type: 'incremental', - strategy: { - type: 'overwrite_existing_partitions', - unique_key: 'dim_a', - }, - }, - select: [{ model: 'model_a', columns: ['dim_a'] }], - from: { model: 'model_a' }, - } as unknown as FrameworkModel; - - const { config } = frameworkGenerateModelOutput({ - dj: createTestDJ(), - modelJson, - project: unpartitionedProject, - }); - - expect(config.incremental_strategy).toBe('overwrite_existing_partitions'); - expect(config.unique_key).toBe('dim_a'); - }); - - test('default fallback overwrite_existing_partitions auto-derives unique_key from partitions', () => { - // When the extension default kicks in (no explicit strategy) and the - // model has partition columns, the partition-based unique_key should - // also fire for the overwrite_existing_partitions default path. + test('default fallback overwrite_existing_partitions omits unique_key', () => { + // When the extension default kicks in (no explicit strategy) and resolves + // to overwrite_existing_partitions, unique_key must NOT be emitted -- the + // consumer macro derives partitions from the new slice. 
const modelJson: FrameworkModel = { type: 'int_select_model', group: 'swh', @@ -1055,7 +1029,7 @@ describe('incremental strategy variants', () => { }); expect(config.incremental_strategy).toBe('overwrite_existing_partitions'); - expect(config.unique_key).toBe('portal_partition_daily'); + expect(config.unique_key).toBeUndefined(); }); test('Delta Lake / Hive partitioning uses properties.partitioned_by', () => { @@ -1124,6 +1098,10 @@ describe('incremental strategy variants', () => { }); test('Iceberg (via model-level format override) uses properties.partitioning', () => { + // Use delete+insert so we can assert unique_key defaulting alongside the + // Iceberg `partitioning` keyword. overwrite_existing_partitions never + // emits unique_key, so the partitioning assertion still works either way, + // but delete+insert gives us coverage of both behaviors at once. const modelJson: FrameworkModel = { type: 'int_select_model', group: 'swh', @@ -1132,7 +1110,7 @@ describe('incremental strategy variants', () => { materialization: { type: 'incremental', format: 'iceberg', - strategy: { type: 'overwrite_existing_partitions' }, + strategy: { type: 'delete+insert' }, }, select: [ 'portal_partition_daily', @@ -1148,7 +1126,7 @@ describe('incremental strategy variants', () => { }); const properties = config.properties as Record | undefined; - expect(config.incremental_strategy).toBe('overwrite_existing_partitions'); + expect(config.incremental_strategy).toBe('delete+insert'); expect(config.unique_key).toBe('portal_partition_daily'); expect(properties?.partitioning).toBe("ARRAY['portal_partition_daily']"); expect(properties?.partitioned_by).toBeUndefined(); @@ -1284,15 +1262,18 @@ describe('incremental unique_key defaulting', () => { // column from the child model's `columns`, matching the real monthly- // rollup shape in production. Without the fix, getDefaultUniqueKey would // still pick 'daily' from the inherited meta and emit a phantom unique_key. 
- // `createTestDJ()` leaves `materializationDefaultIncrementalStrategy` - // unset so the default (`overwrite_existing_partitions`) applies — the - // partition-based unique_key resolution fires for that strategy as well. + // Use explicit `delete+insert` so this test directly exercises the + // partition-based unique_key resolution (overwrite_existing_partitions + // never emits unique_key). const modelJson: FrameworkModel = { type: 'int_select_model', group: 'swh', topic: 'misc', name: 'monthly_rollup', - materialization: 'incremental', + materialization: { + type: 'incremental', + strategy: { type: 'delete+insert' }, + }, select: [{ name: 'dim_a', type: 'dim' }], from: { model: 'parent_daily', rollup: { interval: 'month' } }, } as unknown as FrameworkModel; @@ -1304,7 +1285,7 @@ describe('incremental unique_key defaulting', () => { }); expect(config.materialized).toBe('incremental'); - expect(config.incremental_strategy).toBe('overwrite_existing_partitions'); + expect(config.incremental_strategy).toBe('delete+insert'); expect(config.unique_key).toBe('portal_partition_monthly'); // Inherited meta is propagated as-is. The local intersection inside // `frameworkGenerateModelOutput` is what guarantees `unique_key` is @@ -1317,19 +1298,21 @@ describe('incremental unique_key defaulting', () => { ]); }); - test('unpartitioned incremental model omits unique_key entirely', () => { + test('unpartitioned incremental delete+insert model omits unique_key entirely', () => { // No parent meta, no partition columns on the model - getDefaultUniqueKey // previously fell back to the hardcoded default list and chose // portal_partition_daily, producing a phantom unique_key that Trino then // fails to resolve at runtime. After the fix, unique_key is simply omitted - // so dbt (or the consumer's `overwrite_existing_partitions` macro) raises - // its own clear error instead. + // so dbt raises its own clear "delete+insert requires unique_key" error. 
const modelJson: FrameworkModel = { type: 'int_select_model', group: 'capeng', topic: 'tenant', name: 'account', - materialization: 'incremental', + materialization: { + type: 'incremental', + strategy: { type: 'delete+insert' }, + }, select: [{ name: 'dim_a', type: 'dim' }], from: { model: 'parent_unpartitioned' }, } as unknown as FrameworkModel; @@ -1341,12 +1324,12 @@ describe('incremental unique_key defaulting', () => { }); expect(config.materialized).toBe('incremental'); - expect(config.incremental_strategy).toBe('overwrite_existing_partitions'); + expect(config.incremental_strategy).toBe('delete+insert'); expect(config.unique_key).toBeUndefined(); expect(properties.meta?.portal_partition_columns).toBeUndefined(); }); - test('daily model with portal_partition_daily column keeps daily unique_key', () => { + test('daily delete+insert model with portal_partition_daily column keeps daily unique_key', () => { // Regression guard: the common "daily model inherits daily meta and // actually has the daily column" case must still resolve to daily. 
const modelJson: FrameworkModel = { @@ -1354,7 +1337,10 @@ describe('incremental unique_key defaulting', () => { group: 'swh', topic: 'misc', name: 'daily_model', - materialization: 'incremental', + materialization: { + type: 'incremental', + strategy: { type: 'delete+insert' }, + }, select: ['portal_partition_daily', { name: 'dim_a', type: 'dim' }], from: { model: 'parent_daily' }, } as unknown as FrameworkModel; @@ -1366,7 +1352,7 @@ describe('incremental unique_key defaulting', () => { }); expect(config.materialized).toBe('incremental'); - expect(config.incremental_strategy).toBe('overwrite_existing_partitions'); + expect(config.incremental_strategy).toBe('delete+insert'); expect(config.unique_key).toBe('portal_partition_daily'); }); @@ -1488,7 +1474,7 @@ describe('partition column case_sensitive', () => { variables: {}, }; - test('partition columns automatically get case_sensitive: true', () => { + test('partition columns automatically get case_sensitive: true when setting enabled', () => { const modelJson: FrameworkModel = { type: 'int_select_model', group: 'sales', @@ -1499,8 +1485,15 @@ describe('partition column case_sensitive', () => { from: { model: 'parent_daily' }, } as unknown as FrameworkModel; + const dj: DJ = { + config: { + ...createTestDJ().config, + lightdashDefaultPartitionColumnCaseSensitive: true, + }, + }; + const { properties } = frameworkGenerateModelOutput({ - dj: createTestDJ(), + dj, modelJson, project, }); @@ -1515,6 +1508,29 @@ describe('partition column case_sensitive', () => { expect(dimCol?.meta?.dimension?.case_sensitive).toBeUndefined(); }); + test('partition columns do NOT get case_sensitive when setting is disabled', () => { + const modelJson: FrameworkModel = { + type: 'int_select_model', + group: 'sales', + topic: 'orders', + name: 'daily_model_disabled', + materialization: 'incremental', + select: ['portal_partition_daily', { name: 'dim_a', type: 'dim' }], + from: { model: 'parent_daily' }, + } as unknown as FrameworkModel; + 
+ const { properties } = frameworkGenerateModelOutput({ + dj: createTestDJ(), + modelJson, + project, + }); + + const partitionCol = properties.columns?.find( + (c) => c.name === 'portal_partition_daily', + ); + expect(partitionCol?.meta?.dimension?.case_sensitive).toBeUndefined(); + }); + test('explicit case_sensitive: false on partition column is preserved', () => { const modelJson: FrameworkModel = { type: 'int_select_model', @@ -1533,8 +1549,15 @@ describe('partition column case_sensitive', () => { from: { model: 'parent_daily' }, } as unknown as FrameworkModel; + const dj: DJ = { + config: { + ...createTestDJ().config, + lightdashDefaultPartitionColumnCaseSensitive: true, + }, + }; + const { properties } = frameworkGenerateModelOutput({ - dj: createTestDJ(), + dj, modelJson, project, }); diff --git a/src/services/framework/index.ts b/src/services/framework/index.ts index 7dd2f23..642fa41 100644 --- a/src/services/framework/index.ts +++ b/src/services/framework/index.ts @@ -1629,14 +1629,14 @@ export class Framework implements ApiEnabledService<'framework'> { this.diagnosticModelJson.set(uri, diagnostics); }, - onModelValidationWarning: (uri, message) => { - this.diagnosticModelJson.set(uri, [ - new vscode.Diagnostic( - new vscode.Range(0, 0, 0, 0), - message, - vscode.DiagnosticSeverity.Error, - ), - ]); + onModelValidationWarning: (uri, message, errors, jsonContent) => { + const diagnostics = resolveValidationDiagnostics( + message, + errors, + jsonContent, + vscode.DiagnosticSeverity.Warning, + ); + this.diagnosticModelJson.set(uri, diagnostics); }, // Handle generation errors @@ -1833,13 +1833,14 @@ function resolveValidationDiagnostics( fallbackMessage: string, errors?: ValidationErrorDetail[], jsonContent?: string, + severity: vscode.DiagnosticSeverity = vscode.DiagnosticSeverity.Error, ): vscode.Diagnostic[] { if (!errors?.length || !jsonContent) { return [ new vscode.Diagnostic( new vscode.Range(0, 0, 0, 0), fallbackMessage, - 
vscode.DiagnosticSeverity.Error, + severity, ), ]; } @@ -1874,10 +1875,6 @@ } } - return new vscode.Diagnostic( - range, - err.message, - vscode.DiagnosticSeverity.Error, - ); + return new vscode.Diagnostic(range, err.message, severity); }); } diff --git a/src/services/framework/utils/column-utils.ts b/src/services/framework/utils/column-utils.ts index b121d73..67b6f72 100644 --- a/src/services/framework/utils/column-utils.ts +++ b/src/services/framework/utils/column-utils.ts @@ -2385,6 +2385,67 @@ export function isAggregateExpr(expr?: string): boolean { }); } +// Numeric literal: `0`, `-1.5`, `42`. +const CONSTANT_NUMERIC_REGEX = /^-?\d+(\.\d+)?$/; +// SQL string literal: `'foo'` or `"foo"` (no embedded quotes). +const CONSTANT_STRING_REGEX = /^('[^']*'|"[^"]*")$/; +// `CAST(<literal> AS <type>)` where `<literal>` is null, numeric, or string, +// and `<type>` is an identifier optionally followed by `(...)` for length/ +// precision (e.g. `varchar(50)`, `decimal(10, 2)`). +const CONSTANT_CAST_REGEX = + /^cast\s*\(\s*(null|-?\d+(\.\d+)?|'[^']*'|"[^"]*")\s+as\s+\w+(\s*\([^)]*\))?\s*\)$/i; + +/** + * Returns true when `expr` is a SQL constant with no column dependencies: + * numeric literal, `null`/`NULL`, quoted string, or `CAST(<literal> AS <type>)`. + * Such expressions cannot conflict with `GROUP BY`, so they are safe to skip + * in main-model aggregation validation. + */ +export function isConstantExpr(expr?: string): boolean { + if (!expr) { + return false; + } + const haystack = expr.trim(); + if (!haystack) { + return false; + } + if (haystack.toLowerCase() === 'null') { + return true; + } + if (CONSTANT_NUMERIC_REGEX.test(haystack)) { + return true; + } + if (CONSTANT_STRING_REGEX.test(haystack)) { + return true; + } + return CONSTANT_CAST_REGEX.test(haystack); +} + +/** + * Returns true when `expr` contains Jinja templating (`{{ ... }}` or + * `{% ... %}`).
The framework cannot inspect macro expansions, so any Jinja + * `expr` is treated as opaque-aggregated to avoid false-positive + * "un-aggregated" diagnostics. + */ +export function isJinjaExpr(expr?: string): boolean { + if (!expr) { + return false; + } + return /\{\{[\s\S]*?\}\}|\{%[\s\S]*?%\}/.test(expr); +} + +/** + * Returns true when `expr` looks like a window function: a bare function call + * followed by `OVER (`. Used by validators to switch the diagnostic wording + * (window functions are row-wise, so `agg`/`aggs` aren't applicable). + */ +export function isWindowFunctionExpr(expr?: string): boolean { + if (!expr) { + return false; + } + return /\b\w+\s*\([\s\S]*?\)\s*over\s*\(/i.test(expr); +} + /** * Check if a model has aggregation * diff --git a/src/services/framework/utils/sql-utils.ts b/src/services/framework/utils/sql-utils.ts index 6e4e8f7..ca97627 100644 --- a/src/services/framework/utils/sql-utils.ts +++ b/src/services/framework/utils/sql-utils.ts @@ -1987,14 +1987,10 @@ export function frameworkGenerateModelOutput({ case 'overwrite_existing_partitions': { // Requires a custom dbt macro in the consumer project (the DJ // extension does not ship it and dbt-trino does not provide it - // natively). Auto-derive unique_key from partition columns when - // not explicitly supplied, mirroring delete+insert semantics. + // natively). The macro derives the partition list from the new + // slice itself, so unique_key is dead config for this strategy -- + // we intentionally never emit it (and the JSON schema rejects it). modelConfig.incremental_strategy = 'overwrite_existing_partitions'; - if (strategy.unique_key) { - modelConfig.unique_key = strategy.unique_key; - } else if (partitions.length) { - modelConfig.unique_key = getDefaultUniqueKey(partitions); - } break; } default: { @@ -2002,15 +1998,12 @@ export function frameworkGenerateModelOutput({ dj.config.materializationDefaultIncrementalStrategy ?? 
DEFAULT_INCREMENTAL_STRATEGY; modelConfig.incremental_strategy = defaultStrategy; - // Partition-based strategies auto-derive unique_key from the - // model's partition column(s) when one exists. Append never - // needs a unique_key; merge requires one and is not a valid - // default here (it has no reasonable partition-derived key). - if ( - (defaultStrategy === 'delete+insert' || - defaultStrategy === 'overwrite_existing_partitions') && - partitions.length - ) { + // Only delete+insert auto-derives unique_key from partitions. + // Append never needs one; merge requires a user-supplied key + // (and is not a valid default); overwrite_existing_partitions + // ignores unique_key entirely (the consumer macro derives + // partitions from the new slice itself). + if (defaultStrategy === 'delete+insert' && partitions.length) { modelConfig.unique_key = getDefaultUniqueKey(partitions); } } @@ -2433,7 +2426,10 @@ export function frameworkModelProperties({ // UPPER(), which breaks Trino-Iceberg predicate pushdown. if (explicitCaseSensitive !== undefined) { cleanedDimension.case_sensitive = explicitCaseSensitive; - } else if (partitionColumnNames.includes(column.name)) { + } else if ( + dj.config.lightdashDefaultPartitionColumnCaseSensitive && + partitionColumnNames.includes(column.name) + ) { cleanedDimension.case_sensitive = true; } diff --git a/src/services/modelValidation.ts b/src/services/modelValidation.ts index 8c0ffa8..b7214ef 100644 --- a/src/services/modelValidation.ts +++ b/src/services/modelValidation.ts @@ -1,6 +1,9 @@ import { filterBulkSelectColumns, isAggregateExpr, + isConstantExpr, + isJinjaExpr, + isWindowFunctionExpr, } from '@services/framework/utils/column-utils'; import { BULK_CTE_TYPES } from '@shared/framework/constants'; import type { FrameworkColumn } from '@shared/framework/types'; @@ -540,6 +543,14 @@ export function validateMainModelAggregation( // "summary" diagnostic. 
const hint = 'set "agg"/"aggs", wrap an aggregate in "expr", or set "exclude_from_group_by": true.'; + // Window-function-specific hint: `agg`/`aggs` are not applicable to a + // window function (it's a row-wise transform). The real failure mode is + // that the inner argument must resolve through GROUP BY. + const windowHint = + 'ensure window-partition columns are a superset of "group_by" (or wrap the inner argument in an aggregate / set "exclude_from_group_by": true).'; + + const isWindow = (sel: any): boolean => + typeof sel?.expr === 'string' && isWindowFunctionExpr(sel.expr); for (let j = 0; j < modelJson.select.length; j++) { const sel = modelJson.select[j]; @@ -556,8 +567,11 @@ if (columnIsAggregated(sel)) { continue; } + const message = isWindow(sel) + ? `Window function "${sel.name}" in main-model select with group_by — ${windowHint}` + : `Un-aggregated fct column "${sel.name}" with main-model group_by — ${hint}`; errors.push({ - message: `Un-aggregated fct column "${sel.name}" with main-model group_by — ${hint}`, + message, instancePath: `/select/${j}`, }); continue; @@ -570,8 +584,11 @@ sel.type === 'fct' && !columnIsAggregated(sel) ) { + const message = isWindow(sel) + ? `Window function "${sel.name}" from CTE "${sel.cte}" in main-model select with group_by — ${windowHint}` + : `Un-aggregated fct "${sel.name}" from CTE "${sel.cte}" with main-model group_by — ${hint}`; errors.push({ - message: `Un-aggregated fct "${sel.name}" from CTE "${sel.cte}" with main-model group_by — ${hint}`, + message, instancePath: `/select/${j}`, }); continue; @@ -622,10 +639,16 @@ function columnIsAggregated(sel: any): boolean { if ('aggs' in sel && Array.isArray(sel.aggs) && sel.aggs.length > 0) { return true; } + // Constants (literals, NULL, CAST(<literal> AS X)) have no column + // dependencies and can never conflict with GROUP BY.
Jinja expressions + // (`{{ macro() }}`) are opaque -- the macro may expand to an aggregate, + // so we conservatively treat them as aggregated to avoid false positives. if ( 'expr' in sel && typeof sel.expr === 'string' && - isAggregateExpr(sel.expr) + (isAggregateExpr(sel.expr) || + isConstantExpr(sel.expr) || + isJinjaExpr(sel.expr)) ) { return true; } diff --git a/src/services/sync/ModelProcessor.ts b/src/services/sync/ModelProcessor.ts index 3e9b43c..f834cfd 100644 --- a/src/services/sync/ModelProcessor.ts +++ b/src/services/sync/ModelProcessor.ts @@ -36,6 +36,7 @@ import type { SyncCallbacks, SyncConfig, SyncResource, + ValidationErrorDetail, } from './types'; const JSONC_FORMAT_OPTIONS = { @@ -122,8 +123,9 @@ export class ModelProcessor { fs.promises.readFile(oldYmlPath, 'utf8').catch(() => ''), ]); - // 4. Pre-generation registry-aware checks (fail fast to avoid writing - // .sql / .yml files that we already know are broken). + // 4. Pre-generation registry-aware checks. These now emit non-blocking + // warnings -- SQL/YML generation always proceeds. Users still see + // diagnostics in the Problems tab but are not blocked from iterating. const hasCtes = 'ctes' in modelJson && !!modelJson.ctes?.length; const cteColumnRegistry = hasCtes ? frameworkBuildCteColumnRegistry({ @@ -132,36 +134,20 @@ export class ModelProcessor { }) : undefined; + // Collect all non-blocking validation warnings here. Emitted as a single + // structured `onModelValidationWarning` call after generation so each + // detail can be pinned to its specific `select[]` JSON range. 
+ const validationWarnings: ValidationErrorDetail[] = []; + const aggErrors = validateMainModelAggregation( modelJson, cteColumnRegistry, ); if (aggErrors.length > 0) { - const summary = aggErrors.map((e) => e.message).join('\n'); - const message = `Main-model aggregation errors:\n${summary}`; - this.config.logger.error(`${modelName}: ${message}`); - // Route through the validation-error callback so each offending - // select[] item gets its own diagnostic pinned to the exact JSON - // range (instead of the fallback 0,0..100,0 generation-error range). - this.callbacks.onModelValidationError?.( - newJsonUri, - message, - aggErrors, - jsonContent, - ); - return { - modelId: resource.id, - modelName, - sql: '', - yml: '', - updatedProject: null, - operations: [], - skipped: false, - error: { - uri: newJsonUri, - message, - }, - }; + for (const err of aggErrors) { + this.config.logger.warn?.(`${modelName}: ${err.message}`); + } + validationWarnings.push(...aggErrors); } // 5. Generate SQL and YML @@ -180,19 +166,16 @@ export class ModelProcessor { newYml = generated.yml; // Validate exclude/include column references against the CTE registry - let hasColRefWarnings = false; if (cteColumnRegistry) { const colRefErrors = validateCteColumnReferences( modelJson, cteColumnRegistry, ); if (colRefErrors.length > 0) { - hasColRefWarnings = true; for (const msg of colRefErrors) { this.config.logger.warn?.(`${modelName}: ${msg}`); + validationWarnings.push({ message: msg, instancePath: '' }); } - const warningMessage = `CTE column reference warnings:\n${colRefErrors.join('\n')}`; - this.callbacks.onModelValidationWarning?.(newJsonUri, warningMessage); } } @@ -202,10 +185,21 @@ export class ModelProcessor { if (deadLayerWarnings.length > 0) { for (const w of deadLayerWarnings) { this.config.logger.warn?.(`${modelName}: ${w.message}`); + validationWarnings.push(w); } } - if (!hasColRefWarnings) { + if (validationWarnings.length > 0) { + const summary = `Model validation 
warnings:\n${validationWarnings + .map((w) => w.message) + .join('\n')}`; + this.callbacks.onModelValidationWarning?.( + newJsonUri, + summary, + validationWarnings, + jsonContent, + ); + } else { this.callbacks.onDiagnosticsClear?.(newJsonUri); } } catch (err: unknown) { diff --git a/src/services/sync/types.ts b/src/services/sync/types.ts index 1abbae6..0107659 100644 --- a/src/services/sync/types.ts +++ b/src/services/sync/types.ts @@ -123,8 +123,18 @@ export interface SyncCallbacks { jsonContent?: string, ) => void; - /** Called for non-blocking validation issues (e.g. CTE column reference warnings) */ - onModelValidationWarning?: (uri: vscode.Uri, message: string) => void; + /** + * Called for non-blocking validation issues (e.g. CTE column reference + * warnings, main-model aggregation soft-fails). Optionally carries + * structured per-error details (with `instancePath`) so the consumer can + * pin diagnostics to specific JSON locations. + */ + onModelValidationWarning?: ( + uri: vscode.Uri, + message: string, + errors?: ValidationErrorDetail[], + jsonContent?: string, + ) => void; /** Called when SQL/YML generation fails */ onGenerationError?: ( diff --git a/src/services/types/config.ts b/src/services/types/config.ts index 78d6063..8dd2b6f 100644 --- a/src/services/types/config.ts +++ b/src/services/types/config.ts @@ -24,6 +24,7 @@ export type CoderConfig = { trinoPath?: string; lightdashProjectPath?: string; lightdashProfilesPath?: string; + lightdashDefaultPartitionColumnCaseSensitive?: boolean; materializationDefaultIncrementalStrategy?: DefaultIncrementalStrategy; // Settings with defaults from package.json columnLineageAutoRefresh?: boolean; diff --git a/src/shared/index.ts b/src/shared/index.ts index d751490..cdc374b 100644 --- a/src/shared/index.ts +++ b/src/shared/index.ts @@ -18,6 +18,7 @@ import type { export type DJ = { config: { aiHintTag?: string; + lightdashDefaultPartitionColumnCaseSensitive?: boolean; 
materializationDefaultIncrementalStrategy?: DefaultIncrementalStrategy; }; }; diff --git a/src/shared/schema/types/model.incremental_strategy.schema.d.ts b/src/shared/schema/types/model.incremental_strategy.schema.d.ts index e4f6259..4d0ae7a 100644 --- a/src/shared/schema/types/model.incremental_strategy.schema.d.ts +++ b/src/shared/schema/types/model.incremental_strategy.schema.d.ts @@ -48,8 +48,4 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. - */ - unique_key?: string | [string, ...string[]]; }; diff --git a/src/shared/schema/types/model.materialization.schema.d.ts b/src/shared/schema/types/model.materialization.schema.d.ts index f892f12..f9832ee 100644 --- a/src/shared/schema/types/model.materialization.schema.d.ts +++ b/src/shared/schema/types/model.materialization.schema.d.ts @@ -80,8 +80,4 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. - */ - unique_key?: string | [string, ...string[]]; }; diff --git a/src/shared/schema/types/model.schema.d.ts b/src/shared/schema/types/model.schema.d.ts index 82f2b1f..c5a69ac 100644 --- a/src/shared/schema/types/model.schema.d.ts +++ b/src/shared/schema/types/model.schema.d.ts @@ -121,10 +121,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). 
Defaults to the model's partition column when omitted. - */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/src/shared/schema/types/model.type.int_join_column.schema.d.ts b/src/shared/schema/types/model.type.int_join_column.schema.d.ts index faaa8fa..7b7fadf 100644 --- a/src/shared/schema/types/model.type.int_join_column.schema.d.ts +++ b/src/shared/schema/types/model.type.int_join_column.schema.d.ts @@ -189,10 +189,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. - */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/src/shared/schema/types/model.type.int_join_models.schema.d.ts b/src/shared/schema/types/model.type.int_join_models.schema.d.ts index 7ccce5e..73608f8 100644 --- a/src/shared/schema/types/model.type.int_join_models.schema.d.ts +++ b/src/shared/schema/types/model.type.int_join_models.schema.d.ts @@ -189,10 +189,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. 
- */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/src/shared/schema/types/model.type.int_lookback_model.schema.d.ts b/src/shared/schema/types/model.type.int_lookback_model.schema.d.ts index 8560c48..032ea5f 100644 --- a/src/shared/schema/types/model.type.int_lookback_model.schema.d.ts +++ b/src/shared/schema/types/model.type.int_lookback_model.schema.d.ts @@ -189,10 +189,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. - */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/src/shared/schema/types/model.type.int_rollup_model.schema.d.ts b/src/shared/schema/types/model.type.int_rollup_model.schema.d.ts index 5d61273..5bddeb5 100644 --- a/src/shared/schema/types/model.type.int_rollup_model.schema.d.ts +++ b/src/shared/schema/types/model.type.int_rollup_model.schema.d.ts @@ -189,10 +189,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. - */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/src/shared/schema/types/model.type.int_select_model.schema.d.ts b/src/shared/schema/types/model.type.int_select_model.schema.d.ts index 0c1a682..cd1d2c3 100644 --- a/src/shared/schema/types/model.type.int_select_model.schema.d.ts +++ b/src/shared/schema/types/model.type.int_select_model.schema.d.ts @@ -189,10 +189,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. 
REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. - */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/src/shared/schema/types/model.type.int_union_models.schema.d.ts b/src/shared/schema/types/model.type.int_union_models.schema.d.ts index 374e628..a3256bd 100644 --- a/src/shared/schema/types/model.type.int_union_models.schema.d.ts +++ b/src/shared/schema/types/model.type.int_union_models.schema.d.ts @@ -189,10 +189,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. - */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/src/shared/schema/types/model.type.stg_select_model.schema.d.ts b/src/shared/schema/types/model.type.stg_select_model.schema.d.ts index 2e8a605..8b829af 100644 --- a/src/shared/schema/types/model.type.stg_select_model.schema.d.ts +++ b/src/shared/schema/types/model.type.stg_select_model.schema.d.ts @@ -106,10 +106,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. 
- */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/src/shared/schema/types/model.type.stg_select_source.schema.d.ts b/src/shared/schema/types/model.type.stg_select_source.schema.d.ts index 5bc23c2..3cabd09 100644 --- a/src/shared/schema/types/model.type.stg_select_source.schema.d.ts +++ b/src/shared/schema/types/model.type.stg_select_source.schema.d.ts @@ -106,10 +106,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. - */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/src/shared/schema/types/model.type.stg_union_sources.schema.d.ts b/src/shared/schema/types/model.type.stg_union_sources.schema.d.ts index c92ff54..74d37d8 100644 --- a/src/shared/schema/types/model.type.stg_union_sources.schema.d.ts +++ b/src/shared/schema/types/model.type.stg_union_sources.schema.d.ts @@ -106,10 +106,6 @@ export type IncrementalStrategy = * Overwrite only the partitions in the new slice. REQUIRES a custom macro in your dbt project — prefer 'delete+insert' if you do not have one. */ type: 'overwrite_existing_partitions'; - /** - * Optional override for the unique key(s). Defaults to the model's partition column when omitted. 
- */ - unique_key?: string | [string, ...string[]]; }; /** * Type of materialization diff --git a/templates/_AGENTS.md b/templates/_AGENTS.md index 6869b5e..c7a2eaf 100644 --- a/templates/_AGENTS.md +++ b/templates/_AGENTS.md @@ -699,7 +699,7 @@ Use the `materialization` field instead of the legacy `materialized` + `incremen | `append` | `{ "type": "append" }` | Fast insert-only; no de-dup | Upstream must guarantee no duplicates in the new slice | | `delete+insert` | `{ "type": "delete+insert", "unique_key": "..." }` | Partition-safe upsert (**safe default**) | `unique_key` is auto-derived from partitions when omitted | | `merge` | `{ "type": "merge", "unique_key": "id", "merge_update_columns": [...], "merge_exclude_columns": [...] }` | Row-level upsert on a primary key | **dbt-trino requires Iceberg format.** Set `materialization.format: "iceberg"` or the project var `storage_type: iceberg` | -| `overwrite_existing_partitions` | `{ "type": "overwrite_existing_partitions", "unique_key": "..." }` (unique_key optional) | Drop & rewrite only partitions present in the new slice | **Requires a custom dbt macro in your project** (e.g. `get_incremental_overwrite_existing_partitions_sql`). The DJ extension does NOT ship this macro and dbt-trino does NOT provide it natively. If your project does not define it, use `{ "type": "delete+insert" }` instead \u2014 behavior is equivalent for partition-aligned daily/monthly incrementals when `unique_key` is the partition column. | +| `overwrite_existing_partitions` | `{ "type": "overwrite_existing_partitions" }` | Drop & rewrite only partitions present in the new slice | **Requires a custom dbt macro in your project** (e.g. `get_incremental_overwrite_existing_partitions_sql`). The DJ extension does NOT ship this macro and dbt-trino does NOT provide it natively. `unique_key` is **not applicable** for this strategy \u2014 the macro derives partitions from the new slice itself, and the schema rejects `unique_key`. 
If your project does not define the macro, use `{ "type": "delete+insert" }` instead \u2014 behavior is equivalent for partition-aligned daily/monthly incrementals when `unique_key` is the partition column. | ### Legacy Incremental Configuration diff --git a/tests/fixtures/int__sales__analytics__rolling_30day.yml b/tests/fixtures/int__sales__analytics__rolling_30day.yml index 7c57fe4..cbddb16 100644 --- a/tests/fixtures/int__sales__analytics__rolling_30day.yml +++ b/tests/fixtures/int__sales__analytics__rolling_30day.yml @@ -68,5 +68,3 @@ models: description: Portal Partition Daily meta: type: dim - dimension: - case_sensitive: true diff --git a/web/src/features/ModelWizard/AdditionalFields.tsx b/web/src/features/ModelWizard/AdditionalFields.tsx index 1b7c92c..54c72cc 100644 --- a/web/src/features/ModelWizard/AdditionalFields.tsx +++ b/web/src/features/ModelWizard/AdditionalFields.tsx @@ -228,102 +228,102 @@ export function AdditionalFields({ '\u2022 append: inserts new rows without dedup (upstream must guarantee no duplicates).\n' + '\u2022 delete+insert: partition-safe upsert; unique_key auto-derived from partitions.\n' + '\u2022 merge: row-level upsert on unique_key (requires Iceberg format in dbt-trino).\n' + - '\u2022 overwrite_existing_partitions: drops & rewrites partitions in the new slice. Requires a custom dbt macro in your project \u2014 if unavailable, use delete+insert instead.\n' + + '\u2022 overwrite_existing_partitions: drops & rewrites partitions in the new slice; no unique_key needed (the macro derives partitions from the new slice). Requires a custom dbt macro in your project \u2014 if unavailable, use delete+insert instead.\n' + 'The default can be configured via dj.materialization.defaultIncrementalStrategy.' 
} /> )} /> - {/* Unique Key - not applicable for 'append' */} - {incrementalStrategy?.type !== 'append' && ( - { - // Get the current type value from formValues - const strategyType = formValues.incremental_strategy?.type; - - // If type is merge, unique_key must be provided - if (strategyType === 'merge') { - if ( - !value || - (typeof value === 'string' && value.trim() === '') || - (Array.isArray(value) && value.length === 0) - ) { - return 'At least one unique key is required for merge strategy'; + {/* Unique Key - not applicable for 'append' or 'overwrite_existing_partitions' */} + {incrementalStrategy?.type !== 'append' && + incrementalStrategy?.type !== 'overwrite_existing_partitions' && ( + { + // Get the current type value from formValues + const strategyType = + formValues.incremental_strategy?.type; + + // If type is merge, unique_key must be provided + if (strategyType === 'merge') { + if ( + !value || + (typeof value === 'string' && value.trim() === '') || + (Array.isArray(value) && value.length === 0) + ) { + return 'At least one unique key is required for merge strategy'; + } } - } - return true; - }, - }} - render={({ field }) => { - // Convert unique_key to array format for TagInput - const uniqueKeyArray = Array.isArray(field.value) - ? field.value - : field.value - ? [field.value] - : []; - - return ( - { - // Convert to proper format: string or tuple [string, ...string[]] - const uniqueKeyValue: - | string - | [string, ...string[]] - | undefined = - value.length === 0 - ? undefined - : value.length === 1 - ? 
value[0] - : (value as [string, ...string[]]); - - field.onChange(uniqueKeyValue); - const newStrategy = { - ...incrementalStrategy, - unique_key: uniqueKeyValue || '', - }; - setValue( - 'incremental_strategy', - newStrategy as ModelWizardFormValues['incremental_strategy'], - ); - setAdditionalField( - 'incremental_strategy', - newStrategy as IncrementalStrategy, - ); - }} - onBlur={field.onBlur} - label="Unique Key(s)" - placeholder="Type and press Enter to add unique keys" - tooltipText={ - incrementalStrategy?.type === 'merge' - ? 'Required for merge. The unique key(s) to upsert on.' - : incrementalStrategy?.type === 'delete+insert' || - incrementalStrategy?.type === - 'overwrite_existing_partitions' - ? 'Optional override. Defaults to the model\u2019s partition column when omitted.' - : 'The unique key(s) to use for incremental updates' - } - error={(() => { - // `incremental_strategy` is a discriminated union, so - // `unique_key` is not present on the `append` variant. - // Access via index so react-hook-form errors resolve - // correctly regardless of the active variant. - const uniqueKeyError = ( - errors.incremental_strategy as - | Record - | undefined - )?.['unique_key']; - return uniqueKeyError?.message; - })()} - /> - ); - }} - /> - )} + return true; + }, + }} + render={({ field }) => { + // Convert unique_key to array format for TagInput + const uniqueKeyArray = Array.isArray(field.value) + ? field.value + : field.value + ? [field.value] + : []; + + return ( + { + // Convert to proper format: string or tuple [string, ...string[]] + const uniqueKeyValue: + | string + | [string, ...string[]] + | undefined = + value.length === 0 + ? undefined + : value.length === 1 + ? 
value[0] + : (value as [string, ...string[]]); + + field.onChange(uniqueKeyValue); + const newStrategy = { + ...incrementalStrategy, + unique_key: uniqueKeyValue || '', + }; + setValue( + 'incremental_strategy', + newStrategy as ModelWizardFormValues['incremental_strategy'], + ); + setAdditionalField( + 'incremental_strategy', + newStrategy as IncrementalStrategy, + ); + }} + onBlur={field.onBlur} + label="Unique Key(s)" + placeholder="Type and press Enter to add unique keys" + tooltipText={ + incrementalStrategy?.type === 'merge' + ? 'Required for merge. The unique key(s) to upsert on.' + : incrementalStrategy?.type === 'delete+insert' + ? 'Optional override. Defaults to the model\u2019s partition column when omitted.' + : 'The unique key(s) to use for incremental updates' + } + error={(() => { + // `incremental_strategy` is a discriminated union, so + // `unique_key` is not present on the `append` variant. + // Access via index so react-hook-form errors resolve + // correctly regardless of the active variant. + const uniqueKeyError = ( + errors.incremental_strategy as + | Record + | undefined + )?.['unique_key']; + return uniqueKeyError?.message; + })()} + /> + ); + }} + /> + )} {/* Second Row: Merge strategy-specific fields (only shown when type is 'merge') */}
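For reviewers who want to sanity-check the new validator predicates without loading the extension, the sketch below re-implements the three helpers exactly as they are added to `src/services/framework/utils/column-utils.ts` in this diff (regexes copied verbatim; the standalone copy and the sample expressions, including the `money_fmt` macro name, are for illustration only):

```typescript
// Standalone re-implementation of the predicates added in this diff.
// Regexes are copied verbatim from column-utils.ts.
const CONSTANT_NUMERIC_REGEX = /^-?\d+(\.\d+)?$/;
const CONSTANT_STRING_REGEX = /^('[^']*'|"[^"]*")$/;
const CONSTANT_CAST_REGEX =
  /^cast\s*\(\s*(null|-?\d+(\.\d+)?|'[^']*'|"[^"]*")\s+as\s+\w+(\s*\([^)]*\))?\s*\)$/i;

function isConstantExpr(expr?: string): boolean {
  const haystack = expr?.trim();
  if (!haystack) return false;
  if (haystack.toLowerCase() === 'null') return true;
  return (
    CONSTANT_NUMERIC_REGEX.test(haystack) ||
    CONSTANT_STRING_REGEX.test(haystack) ||
    CONSTANT_CAST_REGEX.test(haystack)
  );
}

function isJinjaExpr(expr?: string): boolean {
  return !!expr && /\{\{[\s\S]*?\}\}|\{%[\s\S]*?%\}/.test(expr);
}

function isWindowFunctionExpr(expr?: string): boolean {
  return !!expr && /\b\w+\s*\([\s\S]*?\)\s*over\s*\(/i.test(expr);
}

// Constants are skipped by the aggregation validator (no GROUP BY conflict):
console.log(isConstantExpr("cast(null as varchar(50))")); // true
console.log(isConstantExpr("order_total + 1")); // false -- depends on a column
// Jinja is opaque: the macro may expand to an aggregate, so it is
// conservatively treated as aggregated ("money_fmt" is a hypothetical macro):
console.log(isJinjaExpr("{{ money_fmt(amount) }}")); // true
// Window functions get the partition-alignment hint instead of the generic
// "un-aggregated" message:
console.log(isWindowFunctionExpr("sum(amount) over (partition by day)")); // true
```

Note the deliberate asymmetry: constants are provably safe to skip, while Jinja is merely assumed safe; both route through `columnIsAggregated` so neither produces a false-positive warning.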