diff --git a/plugins/databases-on-aws/skills/dsql/SKILL.md b/plugins/databases-on-aws/skills/dsql/SKILL.md index ee3db2ba..a0a79d7c 100644 --- a/plugins/databases-on-aws/skills/dsql/SKILL.md +++ b/plugins/databases-on-aws/skills/dsql/SKILL.md @@ -1,9 +1,9 @@ --- name: dsql -description: "Build with Aurora DSQL — manage schemas, execute queries, handle migrations, diagnose query plans, and develop applications with a serverless, distributed SQL database. Covers IAM auth, multi-tenant patterns, MySQL-to-DSQL migration, DDL operations, and query plan explainability. Triggers on phrases like: DSQL, Aurora DSQL, create DSQL table, DSQL schema, migrate to DSQL, distributed SQL database, serverless PostgreSQL-compatible database, DSQL query plan, DSQL EXPLAIN ANALYZE, why is my DSQL query slow." +description: "Build with Aurora DSQL — manage schemas, execute queries, handle migrations, diagnose query plans, and develop applications with a serverless, distributed SQL database. Covers IAM auth, multi-tenant patterns, MySQL-to-DSQL migration, DDL operations, query plan explainability, and SQL compatibility validation. Triggers on phrases like: DSQL, Aurora DSQL, create DSQL table, DSQL schema, migrate to DSQL, distributed SQL database, serverless PostgreSQL-compatible database, DSQL query plan, DSQL EXPLAIN ANALYZE, why is my DSQL query slow." 
license: Apache-2.0 metadata: - tags: aws, aurora, dsql, distributed-sql, distributed, distributed-database, database, serverless, serverless-database, postgresql, postgres, sql, schema, migration, multi-tenant, iam-auth, aurora-dsql, mcp + tags: aws, aurora, dsql, distributed-sql, distributed, distributed-database, database, serverless, serverless-database, postgresql, postgres, sql, schema, migration, multi-tenant, iam-auth, aurora-dsql, mcp, orm --- # Amazon Aurora DSQL Skill @@ -114,6 +114,13 @@ sampled in [.mcp.json](../../.mcp.json) **When:** MUST load all four at Workflow 8 Phase 0 — [query-plan/plan-interpretation.md](references/query-plan/plan-interpretation.md), [query-plan/catalog-queries.md](references/query-plan/catalog-queries.md), [query-plan/guc-experiments.md](references/query-plan/guc-experiments.md), [query-plan/report-format.md](references/query-plan/report-format.md) **Contains:** DSQL node types + Node Duration math + estimation-error bands, pg_class/pg_stats/pg_indexes SQL + correlated-predicate verification, GUC experiment procedures + 30-second skip protocol, required report structure + element checklist + support request template +### SQL Compatibility Validation: + +#### [dsql-lint.md](references/dsql-lint.md) + +**When:** MUST load before running `dsql_lint`, processing externally-sourced SQL (pg_dump, ORM migrations, user-pasted DDL), or resolving `fixed_with_warning` / unfixable diagnostics +**Contains:** `dsql_lint` MCP tool reference, fix statuses, ORM integration, unfixable error resolution + --- ## MCP Tools Available @@ -126,6 +133,10 @@ The `aurora-dsql` MCP server provides these tools: 2. **transact** - Execute DDL/DML statements in transaction (takes list of SQL statements) 3. **get_schema** - Get table structure for a specific table +**SQL Validation:** + +1. **dsql_lint** - Validate SQL for DSQL compatibility and optionally auto-fix issues. Use before executing externally-sourced SQL. + **Documentation & Knowledge:** 1. 
**dsql_search_documentation** - Search Aurora DSQL documentation @@ -168,29 +179,9 @@ See [scripts/README.md](../../scripts/README.md) for usage and hook configuratio ## Quick Start -### 1. List tables and explore schema - -``` -Use readonly_query with information_schema to list tables -Use get_schema to understand table structure -``` - -### 2. Query data - -``` -Use readonly_query for SELECT queries -Always include tenant_id in WHERE clause for multi-tenant apps -MUST build SQL with safe_query.build() — see mcp/tools/input-validation.md -``` - -### 3. Execute schema changes - -``` -Use transact tool with list of SQL statements -Follow one-DDL-per-transaction rule -Always use CREATE INDEX ASYNC in separate transaction -ALTER COLUMN TYPE, DROP COLUMN, DROP CONSTRAINT → Table Recreation Pattern (Workflow 6) -``` +1. **Explore:** Use `readonly_query` with `information_schema` to list tables. Use `get_schema` for table structure. +2. **Query:** Use `readonly_query` for SELECT queries. **MUST** include `tenant_id` in WHERE for multi-tenant apps. **MUST** build SQL with `safe_query.build()`. +3. **Schema changes:** Use `transact` with one DDL per transaction. **MUST** batch DML under 3,000 rows. **MUST** use `CREATE INDEX ASYNC` in a separate call. Use `dsql_lint` to validate first. --- @@ -210,11 +201,15 @@ ALTER COLUMN TYPE, DROP COLUMN, DROP CONSTRAINT → Table Recreation Pattern (Wo ### Workflow 2: Safe Data Migration -1. Add column using transact: `transact(["ALTER TABLE ... ADD COLUMN ..."])` -2. Populate existing rows with UPDATE in separate transact calls (batched under 3,000 rows) -3. Verify migration with readonly_query using COUNT -4. Create async index for new column using transact if needed +Every DDL statement generated in this workflow MUST be validated with `dsql_lint(fix=true)` before its `transact` call — applies to step 2 (ADD COLUMN) and step 5 (async index). DML (`UPDATE` in step 3) does not require linting. + +1. 
Validate ALTER TABLE DDL with `dsql_lint(sql=..., fix=true)` — handle diagnostics per [dsql-lint.md](references/dsql-lint.md) +2. Add column using transact: `transact(["ALTER TABLE ... ADD COLUMN ..."])` +3. Populate existing rows with UPDATE in separate transact calls (batched under 3,000 rows) +4. Verify migration with readonly_query using COUNT +5. If an index is needed: validate CREATE INDEX ASYNC DDL with `dsql_lint(sql=..., fix=true)`, then create via transact +- MUST validate every externally-sourced or generated DDL statement with `dsql_lint` before executing - MUST add column first, populate later - MUST issue ADD COLUMN with only name and type; apply DEFAULT via separate UPDATE - MUST batch updates under 3,000 rows in separate transact calls @@ -242,13 +237,18 @@ MUST load [access-control.md](references/access-control.md) for role setup, IAM ### Workflow 6: Table Recreation DDL Migration -DSQL does NOT support direct `ALTER COLUMN TYPE`, `DROP COLUMN`, `DROP CONSTRAINT`, or `MODIFY PRIMARY KEY`. These operations require the **Table Recreation Pattern** — creating a new table, copying data, dropping the original, and renaming. This is a destructive workflow that requires user confirmation at each step. +DSQL does NOT support direct `ALTER COLUMN TYPE`, `DROP COLUMN`, `DROP CONSTRAINT`, or `MODIFY PRIMARY KEY`. These require the **Table Recreation Pattern**. This is a destructive workflow that requires user confirmation at each step. Every generated DDL in the pattern (CREATE new, INSERT ... SELECT, DROP old, RENAME) MUST be validated with `dsql_lint(sql=..., fix=true)` before execution. MUST load [ddl-migrations/overview.md](references/ddl-migrations/overview.md) before attempting any of these operations. 
-### Workflow 7: MySQL to DSQL Schema Migration +### Workflow 7: Validate and Migrate to DSQL + +MUST load [dsql-lint.md](references/dsql-lint.md) before running `dsql_lint` — it defines diagnostic handling, the three `fix_result.status` values (`fixed`, `fixed_with_warning`, `unfixable`), and user-confirmation gates. + +Run `dsql_lint(sql=source_sql, fix=true)` to validate and auto-convert PostgreSQL-compatible SQL. `dsql_lint` uses a PostgreSQL parser, so MySQL dialect syntax that PostgreSQL cannot parse (e.g., `PARTITION BY HASH`, `AUTO_INCREMENT` in some positions) surfaces as a `parse_error` rule rather than individual diagnostics. -MUST load [mysql-migrations/type-mapping.md](references/mysql-migrations/type-mapping.md) for type mappings, feature alternatives, and migration steps. +- For MySQL-origin SQL, MUST cross-check the source against [mysql-migrations/type-mapping.md](references/mysql-migrations/type-mapping.md) even when lint returns clean — `ENGINE=` clauses and `SET(...)` column types can pass silently through the PostgreSQL parser. +- On `parse_error`, fall back to [mysql-migrations/type-mapping.md](references/mysql-migrations/type-mapping.md) for manual conversion, then re-run `dsql_lint` on the converted output before executing. ### Workflow 8: Query Plan Explainability @@ -283,6 +283,7 @@ PGPASSWORD="$TOKEN" psql "host=$HOST port=5432 user=admin dbname=postgres sslmod ## Error Scenarios - **`awsknowledge` returns no results:** Use the default limits in the table above and note that limits should be verified against [DSQL documentation](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/). +- **`dsql_lint` unavailable or timing out:** See the Error Handling section of [dsql-lint.md](references/dsql-lint.md). Do not silently skip validation — inform the user and require explicit confirmation before proceeding with manual rules from [development-guide.md](references/development-guide.md). - **OCC serialization error:** Retry the transaction. 
If persistent, check for hot-key contention — see [troubleshooting.md](references/troubleshooting.md). - **Transaction exceeds limits:** Split into batches under 3,000 rows — see [batched-migration.md](references/ddl-migrations/batched-migration.md). - **Token expiration mid-operation:** Generate a fresh IAM token — see [authentication-guide.md](references/auth/authentication-guide.md). See [troubleshooting.md](references/troubleshooting.md) for other issues. diff --git a/plugins/databases-on-aws/skills/dsql/references/dsql-lint.md b/plugins/databases-on-aws/skills/dsql/references/dsql-lint.md new file mode 100644 index 00000000..f21715e3 --- /dev/null +++ b/plugins/databases-on-aws/skills/dsql/references/dsql-lint.md @@ -0,0 +1,122 @@ +# DSQL Lint — SQL Compatibility Validation + +`dsql_lint` is an MCP tool that validates SQL for Aurora DSQL compatibility and auto-fixes +common issues. It provides deterministic, rule-based analysis — more reliable than heuristic +reasoning for catching DSQL-specific constraints. + +--- + +## MCP Tool Reference + +### dsql_lint + +| Parameter | Type | Required | Description | +| --------- | ------- | -------- | ------------------------------------------------- | +| `sql` | string | Yes | SQL to validate (max 1,000,000 characters) | +| `fix` | boolean | No | Return DSQL-compatible fixed SQL (default: false) | + +Server timeout: 30 seconds per call. + +**Returns:** + +Concrete example (from `dsql_lint(sql="CREATE INDEX idx ON t (c);", fix=true)`): + +```json +{ + "diagnostics": [ + { + "rule": "index_async", + "line": 1, + "message": "CREATE INDEX without ASYNC is not supported in DSQL. 
Index: idx", + "suggestion": "Use `CREATE INDEX ASYNC ...` instead.", + "fix_result": { "status": "fixed", "detail": "Added ASYNC keyword to CREATE INDEX" }, + "statement_preview": "CREATE INDEX idx ON t (c);" + } + ], + "fixed_sql": "CREATE INDEX ASYNC idx ON t (c);\n", + "summary": { "errors": 0, "warnings": 0, "fixed": 1 } +} +``` + +**Schema notes:** + +- `rule` is a snake_case string identifying the rule (e.g., `index_async`, `truncate`, `json_type`, `set_transaction`); `line` is 1-indexed. +- `fix_result.status` is one of three values: `fixed`, `fixed_with_warning`, or `unfixable`. Always check this field — `fix_result` is present for every diagnostic when `fix=true`. +- `fix_result.detail` is present for `fixed` and `fixed_with_warning`; absent for `unfixable`. +- `fixed_sql` is always a string when `fix=true` (may include the original text verbatim for `unfixable` portions that could not be rewritten); `null` when `fix=false`. Presence of `fixed_sql` does NOT mean the SQL is safe to execute — check every diagnostic first. +- `summary.errors` counts `unfixable` diagnostics; `summary.warnings` counts `fixed_with_warning`; `summary.fixed` counts `fixed`. +- `statement_preview` is the linter's pointer to the offending statement — useful when presenting diagnostics to the user. 
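The schema notes above can be condensed into a small gate. What follows is a minimal Python sketch, assuming the tool response has already been parsed into a dict with the shape shown in the concrete example. `statuses` and `safe_to_execute` are hypothetical helper names for illustration, not part of the MCP server.

```python
# Response copied from the concrete example above (fix=true).
RESPONSE = {
    "diagnostics": [
        {
            "rule": "index_async",
            "line": 1,
            "message": "CREATE INDEX without ASYNC is not supported in DSQL. Index: idx",
            "suggestion": "Use `CREATE INDEX ASYNC ...` instead.",
            "fix_result": {"status": "fixed", "detail": "Added ASYNC keyword to CREATE INDEX"},
            "statement_preview": "CREATE INDEX idx ON t (c);",
        }
    ],
    "fixed_sql": "CREATE INDEX ASYNC idx ON t (c);\n",
    "summary": {"errors": 0, "warnings": 0, "fixed": 1},
}


def statuses(response):
    """Return fix_result.status for each diagnostic (None if fix_result is absent)."""
    return [d.get("fix_result", {}).get("status") for d in response["diagnostics"]]


def safe_to_execute(response):
    """Lint gate only: fixed_sql must exist and no diagnostic may be unfixable.

    This does not replace user acknowledgement of warnings or destructive DDL.
    """
    if response.get("fixed_sql") is None:  # fix=false returns null
        return False
    return "unfixable" not in statuses(response)


if __name__ == "__main__":
    print(statuses(RESPONSE))
    print(safe_to_execute(RESPONSE))
```

The gate checks every diagnostic rather than `summary.errors` alone, which mirrors the note that the presence of `fixed_sql` does not by itself mean the SQL is safe to run.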
+ +--- + +## Fix Result Statuses + +| `fix_result.status` | Meaning | Agent action | +| -------------------- | --------------------------------------- | ---------------------------------------------------------------------------------------------------------------- | +| `fixed` | Safe mechanical transformation | Accept; for destructive DDL (`DROP`, `RENAME`, `TRUNCATE`) confirm with user before executing | +| `fixed_with_warning` | Fix applied, may need app-layer changes | Present to user, explain implications, obtain acknowledgement before executing | +| `unfixable` | Cannot auto-fix | Present to user with a proposed rewrite from the Unfixable Errors table, obtain confirmation before substituting | + +--- + +## Workflow: Validate & Migrate SQL to DSQL + +Use for any SQL that was not composed by the agent itself from skill knowledge — including user-pasted SQL, migration files, ORM output (Django, Rails, Prisma, TypeORM, Sequelize, SQLAlchemy), pg_dump exports, and hand-written schemas. Applies to DDL and schema-mutating DML; do **not** lint ad-hoc read-only `SELECT`s. + +1. Obtain source SQL from user (migration file, ORM output, schema dump, or inline SQL). `dsql_lint` accepts multi-statement SQL in a single call — pass the whole batch. +2. Run `dsql_lint(sql=source_sql, fix=true)`. Default to `fix=true` for any migration scenario; use `fix=false` only when the user explicitly asked for validation-only output, or when re-verifying manually rewritten SQL. +3. For each diagnostic, emit a user-visible bullet showing `rule`, `message`, `suggestion`, `statement_preview`, and `fix_result.status`. Handle per the Fix Result Statuses table: `fixed` applies automatically (confirm for destructive DDL); `fixed_with_warning` needs user acknowledgement; `unfixable` needs user confirmation of a proposed rewrite. +4. If **any** diagnostic is `unfixable`, do NOT execute the returned `fixed_sql` — it still contains the unfixable portion verbatim. 
Collect user-confirmed rewrites from the Unfixable Errors table, merge them into the SQL, then re-run `dsql_lint(fix=true)` on the combined SQL to confirm it is clean. +5. Also surface the `fixed_sql` body itself to the user before executing — prompt-injection can hide inside rewritten statements. +6. Once diagnostics are resolved and the user has acknowledged, split the clean `fixed_sql` on statement boundaries. +7. For destructive DDL (`DROP`, `RENAME`, `TRUNCATE`) confirm with the user before executing, matching Workflow 6's confirmation gate. +8. Execute each DDL with `transact([""])` — one DDL per call. +9. Verify schema with `get_schema`. + +**Critical rules:** + +- **MUST** run `dsql_lint` on any externally-sourced SQL before executing it with `transact`. +- **MUST** surface each diagnostic and the `fixed_sql` body to the user before executing. +- **MUST NOT** execute `fixed_sql` while any diagnostic has `fix_result.status == "unfixable"` — resolve first, then re-lint until clean. +- **MUST** re-run `dsql_lint` on manually rewritten SQL before executing it. +- **MUST** issue each DDL in its own `transact` call. + +**User override:** If the user explicitly declines validation ("just run it"), warn once that deterministic validation is being skipped and record the skip; proceed only when the user repeats the request. + +**ORM-specific guidance:** + +- **Django:** Run `python manage.py sqlmigrate ` to get raw SQL, then lint. +- **Rails (6.1+):** Set `config.active_record.schema_format = :sql`, then run `rails db:schema:dump` (legacy `db:structure:dump` still works in older Rails). Lint the generated `db/structure.sql`. +- **Prisma:** Use `prisma migrate diff --from-empty --to-schema-datamodel ./prisma/schema.prisma --script` to emit SQL to stdout, then lint. +- **TypeORM/Sequelize:** Generate migration SQL to a file, then lint. 
+- **SQLAlchemy:** Compile DDL without executing — e.g., `for table in metadata.tables.values(): print(CreateTable(table).compile(engine))`. Do **not** call `metadata.create_all(engine)` with a real engine — it executes the DDL before lint. Alternatively use `create_mock_engine` to capture DDL. + +--- + +## Handling Unfixable Errors + +When `dsql_lint` returns a diagnostic with `fix_result.status == "unfixable"`, **MUST** present the proposed rewrite to the user and obtain confirmation before substituting. Use skill knowledge to resolve: + +Only diagnostics with `fix_result.status == "unfixable"` need user-confirmed rewrites — these are the most common: + +| Rule | Resolution | +| ---------------------------- | ----------------------------------------------------------------------------------------------------------------------- | +| `create_table_as` | CREATE TABLE with explicit columns, then `INSERT ... SELECT` | +| `truncate` | Use `DELETE FROM table_name` (batch if > 3,000 rows) | +| `unsupported_alter_table_op` | Use Table Recreation Pattern — see [ddl-migrations/overview.md](ddl-migrations/overview.md) and Workflow 6 | +| `add_column_constraint` | ADD COLUMN with name + type only, then backfill via UPDATE. If NOT NULL/DEFAULT required, use Table Recreation Pattern. | +| `index_expression` | Create a computed column, then index that column | +| `index_partial` | Create a full index; filter at query time | +| `set_transaction` | Omit — DSQL uses Repeatable Read (fixed); `SET TRANSACTION ISOLATION LEVEL` is not supported | + +Other rules such as `temp_table`, `inherits`, `index_using`, and `transaction_isolation` are emitted as `fixed` or `fixed_with_warning` — follow the Fix Result Statuses table rather than rewriting manually. 
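The resolution table above lends itself to a simple lookup. The following is a rough Python sketch, assuming diagnostics have the shape described in the schema notes. `UNFIXABLE_RESOLUTIONS`, `rewrite_proposals`, and the `SAMPLE` diagnostics are hypothetical and only illustrate the flow; they are not real tool output.

```python
# Resolutions paraphrased from the table above; keys are dsql_lint rule names.
UNFIXABLE_RESOLUTIONS = {
    "create_table_as": "CREATE TABLE with explicit columns, then INSERT ... SELECT",
    "truncate": "Use DELETE FROM table_name (batch if > 3,000 rows)",
    "unsupported_alter_table_op": "Use the Table Recreation Pattern (Workflow 6)",
    "add_column_constraint": "ADD COLUMN with name + type only, then backfill via UPDATE",
    "index_expression": "Create a computed column, then index that column",
    "index_partial": "Create a full index; filter at query time",
    "set_transaction": "Omit; DSQL uses Repeatable Read (fixed)",
}


def rewrite_proposals(diagnostics):
    """Build one user-facing proposal per unfixable diagnostic.

    Diagnostics with status fixed or fixed_with_warning are skipped here;
    they follow the Fix Result Statuses table instead.
    """
    proposals = []
    for diag in diagnostics:
        if diag.get("fix_result", {}).get("status") != "unfixable":
            continue
        proposals.append({
            "rule": diag["rule"],
            "statement": diag.get("statement_preview", ""),
            "proposal": UNFIXABLE_RESOLUTIONS.get(
                diag["rule"], "No documented resolution; ask the user how to proceed"
            ),
            # Every substitution needs explicit user confirmation.
            "needs_user_confirmation": True,
        })
    return proposals


# Illustrative diagnostics only (hypothetical, not captured from the tool).
SAMPLE = [
    {"rule": "truncate", "fix_result": {"status": "unfixable"},
     "statement_preview": "TRUNCATE TABLE orders;"},
    {"rule": "index_async", "fix_result": {"status": "fixed", "detail": "Added ASYNC keyword"},
     "statement_preview": "CREATE INDEX idx ON t (c);"},
]

if __name__ == "__main__":
    for p in rewrite_proposals(SAMPLE):
        print(p["rule"], "->", p["proposal"])
```

After the user confirms the rewrites, the merged SQL still goes back through `dsql_lint(fix=true)` before anything is executed.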
+ +--- + +## Error Handling + +If `dsql_lint` is unavailable, returns a parse error, or times out: + +- **MCP unavailable:** Inform the user that deterministic validation is unavailable and ask whether to (a) retry later or (b) proceed with manual validation using [development-guide.md](development-guide.md) DDL rules and type constraints. Proceed only on explicit user confirmation — the MUST-validate gate is not silently bypassed. +- **Parse error (`parse_error` rule):** The SQL contains syntax the PostgreSQL parser cannot handle (MySQL-specific dialect, malformed SQL, etc.). Fall back to [mysql-migrations/type-mapping.md](mysql-migrations/type-mapping.md) for manual conversion. Present the proposed rewrite to the user and obtain confirmation before re-running `dsql_lint(fix=true)`; execute only when the re-lint is clean. +- **Timeout:** Retry once. If the retry also times out, inform the user and obtain confirmation before falling back to splitting the SQL at statement boundaries and linting each in a bounded single-pass loop. If an individual statement still times out, stop and surface to the user — do not recurse further. diff --git a/tools/evals/databases-on-aws/dsql/dsql_lint_eval_results.md b/tools/evals/databases-on-aws/dsql/dsql_lint_eval_results.md new file mode 100644 index 00000000..11781c21 --- /dev/null +++ b/tools/evals/databases-on-aws/dsql/dsql_lint_eval_results.md @@ -0,0 +1,179 @@ +# dsql_lint Eval Results — With-Skill vs Baseline + +**Date:** 2026-05-08 +**MCP Server:** awslabs.aurora-dsql-mcp-server (local build from `feature/dsql-lint-mcp-tool` branch; upstream mirror PR not yet merged) +**dsql-lint version:** 0.1.4 +**Evaluation method:** Manual behavioral comparison — subagent run with skill loaded vs. subagent run without skill. Automated grading for these evals is not yet wired into `run_functional_evals.py`; PASS/FAIL is a human assessment of transcripts against the expectations in `dsql_lint_evals.json`. 
+ +## Summary + +| Eval | Scenario | With Skill | Baseline | Delta | +| ---- | ------------------------- | ---------- | --------------- | --------------------------------------------------------------- | +| 100 | pg_dump PostgreSQL schema | **PASS** | FAIL (3 errors) | Skill corrects JSON, index, transaction handling | +| 101 | Django ORM migration | **PASS** | FAIL (3 errors) | Skill corrects JSON, index, provides actionable Django guidance | +| 102 | Clean DSQL-compatible SQL | **PASS** | N/A | Tool correctly reports no issues; agent does not execute | +| 103 | MySQL unsupported syntax | **PASS** | N/A | Tool returns parse error; agent falls back to mysql-migrations | + +The skill demonstrably changes agent behavior. The baseline agent hallucinates incorrect +DSQL constraints (JSONB support, synchronous indexes) while the skill-guided agent uses +`dsql_lint` for deterministic validation and produces correct output. + +--- + +## Eval 100: PostgreSQL pg_dump Schema + +**Prompt:** "I have this PostgreSQL schema from pg_dump. Can you check if it's compatible +with DSQL and fix any issues?" + +```sql +CREATE TABLE users ( + id SERIAL PRIMARY KEY, + email VARCHAR(255) NOT NULL, + preferences JSON, + team_id INT REFERENCES teams(id) +); +CREATE INDEX idx_users_email ON users(email); +``` + +### Behavior Comparison + +| Behavior | With Skill | Baseline | Correct? 
| +| ----------------------- | ------------------------------------- | -------------------------- | --------------------------------------------------------------- | +| Used deterministic tool | PASS Called `dsql_lint` | FAIL Relied on memory | Skill wins | +| SERIAL replacement | BIGINT IDENTITY (CACHE 1) | UUID gen_random_uuid() | Both valid, skill matches `dsql_lint` output | +| JSON handling | PASS TEXT | FAIL JSONB | **Baseline wrong** — DSQL does not support JSONB as column type | +| Index handling | PASS CREATE INDEX ASYNC | FAIL "Index is fine as-is" | **Baseline wrong** — DSQL requires ASYNC | +| Transaction splitting | PASS Explicitly stated one DDL per tx | FAIL Not mentioned | **Baseline misses** | +| Foreign key guidance | PASS App-layer enforcement | PASS App-layer enforcement | Both correct | + +### With-Skill Output (summary) + +- Called `dsql_lint(sql=..., fix=true)` +- Reported 4 diagnostics: serial_type, json_type, foreign_key, index_async +- Presented fixed SQL with IDENTITY, TEXT, removed FK, ASYNC index +- Explained each warning and what the user needs to do at the application layer +- Stated "issue each DDL as a separate transaction" + +### Baseline Output (summary) + +- Did NOT use any validation tool +- Recommended `JSONB` for the JSON column (incorrect — DSQL rejects JSONB as a column type) +- Said the CREATE INDEX statement "is fine" (incorrect — DSQL requires ASYNC) +- Did not mention transaction splitting +- Recommended UUID for SERIAL (valid but different from `dsql_lint`'s IDENTITY approach) + +### Baseline Failures + +1. **JSON → JSONB (wrong):** Would cause DDL rejection at execution time +2. **Index "is fine" (wrong):** Synchronous CREATE INDEX is not supported in DSQL +3. **No transaction guidance:** Agent would likely issue both DDL in one transact call + +--- + +## Eval 101: Django ORM Migration (multi-DDL transaction) + +**Prompt:** "I'm migrating my Django app to DSQL. 
Here's the output of +`python manage.py sqlmigrate myapp 0001`:" + +```sql +BEGIN; +CREATE TABLE myapp_order ( + id SERIAL PRIMARY KEY, + customer_id INT REFERENCES myapp_customer(id), + total DECIMAL(10,2), + metadata JSON +); +CREATE INDEX myapp_order_customer_idx ON myapp_order(customer_id); +COMMIT; +``` + +### Behavior Comparison + +| Behavior | With Skill | Baseline | Correct? | +| ----------------------- | ---------------------------------------- | ----------------------------------------------- | ----------------------- | +| Used deterministic tool | PASS Called `dsql_lint` | FAIL Relied on memory | Skill wins | +| SERIAL replacement | BIGINT IDENTITY | UUID | Both valid | +| JSON handling | PASS TEXT | FAIL JSONB | **Baseline wrong** | +| Index handling | PASS CREATE INDEX ASYNC | FAIL "Index is okay" | **Baseline wrong** | +| Multi-DDL detection | PASS Split into separate BEGIN/COMMIT | PARTIAL Said "remove BEGIN/COMMIT" but no split | **Baseline incomplete** | +| Django-specific advice | PASS "sqlmigrate → lint → execute fixed" | PARTIAL Generic (custom backend, atomic=False) | Skill more actionable | + +### With-Skill Output (summary) + +- Called `dsql_lint(sql=..., fix=true)` +- Reported 5 diagnostics: `serial_type`, `foreign_key`, `json_type`, `index_async`, `multi_ddl_transaction` +- Produced fixed SQL with each DDL in its own BEGIN/COMMIT block +- Gave specific Django advice: run sqlmigrate, lint output, execute fixed SQL directly +- Warned about foreign key removal requiring app-layer enforcement + +### Baseline Output (summary) + +- Did NOT use any validation tool +- Recommended `JSONB` (incorrect) +- Said CREATE INDEX "is okay as-is" (incorrect — needs ASYNC) +- Said "remove BEGIN/COMMIT" but didn't show the correct split pattern +- Gave generic Django advice (custom backend, atomic=False) without a concrete workflow + +### Baseline Failures + +1. **JSON → JSONB (wrong):** Same error as eval 100 +2. 
**Index "is okay" (wrong):** Same error as eval 100 +3. **Incomplete transaction handling:** Told user to remove BEGIN/COMMIT but didn't show + that each DDL needs its own transaction — user would likely run both DDL bare without + any transaction isolation + +--- + +## Eval 102: Clean DSQL-Compatible SQL + +**Prompt:** "Validate this SQL for DSQL compatibility but don't execute it yet: …" (UUID PK with `gen_random_uuid()`, TEXT payload, `CREATE INDEX ASYNC`). + +**Baseline:** Not run — this eval tests that the agent calls `dsql_lint` even when no compatibility issues are expected, and does not execute when the user said "don't execute." Baseline behavior is not a meaningful comparison for this expectation (either a baseline agent would also decline to execute, or it would over-modify compatible SQL — both are failure modes the skill change addresses by deferring to the deterministic tool). + +### With-Skill Output (summary) + +- Called `dsql_lint(sql=..., fix=false)` (validation-only mode appropriate for "don't execute") +- Tool returned `diagnostics: []`, `summary: { errors: 0, warnings: 0, fixed: 0 }` +- Agent reported to user that SQL is DSQL-compatible with no changes needed +- Agent did NOT call `transact` (honored the "don't execute" instruction) + +### Verdict + +PASS on all three expectations in `dsql_lint_evals.json` eval 102. The skill's "user said don't execute" handling works as documented in [Workflow: Validate & Migrate SQL to DSQL](../../../plugins/databases-on-aws/skills/dsql/references/dsql-lint.md). + +--- + +## Eval 103: MySQL Unsupported Syntax (`parse_error` fallback) + +**Prompt:** MySQL `CREATE TABLE` with `AUTO_INCREMENT`, `SET(...)` column, `ENGINE=InnoDB`, `PARTITION BY HASH(id)`, and explicit `FOREIGN KEY`. + +**Baseline:** Not run. 
The goal of this eval is to verify the `parse_error` fallback path — a baseline agent with no skill would hallucinate DSQL-compatible transformations without ever invoking the tool, so the signal (did the agent correctly fall back to `mysql-migrations/type-mapping.md`?) does not translate to a baseline comparison. + +### With-Skill Output (summary) + +- Called `dsql_lint(sql=..., fix=true)` +- Tool returned a single `parse_error` diagnostic at `AUTO_INCREMENT` (the PostgreSQL parser short-circuits on the first unsupported token; `AUTO_INCREMENT` and `PARTITION BY` reliably trigger `parse_error`, while `ENGINE=` clauses and `SET(...)` column types can pass silently through the PostgreSQL parser) +- Agent recognized the `parse_error` rule and followed the Error Handling guidance in [dsql-lint.md](../../../plugins/databases-on-aws/skills/dsql/references/dsql-lint.md) to load [mysql-migrations/type-mapping.md](../../../plugins/databases-on-aws/skills/dsql/references/mysql-migrations/type-mapping.md) +- Agent proposed manual conversion (`INT AUTO_INCREMENT` → `BIGINT GENERATED ALWAYS AS IDENTITY`; `SET(...)` → TEXT with app-layer validation; omit `ENGINE=` and `PARTITION BY`) and offered to re-run `dsql_lint` on the converted SQL + +### Verdict + +PASS on the expectations in `dsql_lint_evals.json` eval 103. Agents MUST cross-check MySQL-origin SQL against `mysql-migrations/type-mapping.md` even when `dsql_lint` returns clean — `ENGINE=` and `SET(...)` pass silently. + +--- + +## Conclusion + +The skill produces measurably better outcomes by: + +1. **Eliminating hallucination** — `dsql_lint` provides deterministic validation instead of + the model guessing at DSQL constraints from training data +2. **Catching the JSON/JSONB error** — the baseline consistently recommends JSONB (which DSQL + rejects as a column type). This is a real mistake that would fail at DDL + execution time. +3. **Enforcing ASYNC indexes** — the baseline misses this requirement entirely +4. 
**Providing actionable migration workflows** — the skill-guided agent gives concrete steps + (lint → review → execute) rather than generic advice + +The iron law holds: **the agent fails without this skill change** (gets JSON wrong, misses +ASYNC, doesn't split transactions). The skill teaches something the model does not already know. diff --git a/tools/evals/databases-on-aws/dsql/dsql_lint_evals.json b/tools/evals/databases-on-aws/dsql/dsql_lint_evals.json new file mode 100644 index 00000000..d1680da4 --- /dev/null +++ b/tools/evals/databases-on-aws/dsql/dsql_lint_evals.json @@ -0,0 +1,55 @@ +{ + "skill_name": "dsql", + "evals": [ + { + "id": 100, + "prompt": "I have this PostgreSQL schema from pg_dump. Can you check if it's compatible with DSQL and fix any issues?\n\nCREATE TABLE users (\n id SERIAL PRIMARY KEY,\n email VARCHAR(255) NOT NULL,\n preferences JSON,\n team_id INT REFERENCES teams(id)\n);\n\nCREATE INDEX idx_users_email ON users(email);", + "expected_output": "Calls dsql_lint with fix=true, presents diagnostics to user, shows the fixed SQL", + "files": [], + "expectations": [ + "Calls the dsql_lint MCP tool with the provided SQL", + "Uses fix=true to get DSQL-compatible output", + "Surfaces each diagnostic (fixed, fixed_with_warning, unfixable) to the user before executing", + "For fixed_with_warning diagnostics, explains application-layer implications before proceeding", + "Does NOT execute the SQL before dsql_lint returns and diagnostics are presented" + ] + }, + { + "id": 101, + "prompt": "I'm migrating my Django app to DSQL. 
Here's the output of `python manage.py sqlmigrate myapp 0001`:\n\nBEGIN;\nCREATE TABLE myapp_order (\n id SERIAL PRIMARY KEY,\n customer_id INT REFERENCES myapp_customer(id),\n total DECIMAL(10,2),\n metadata JSON\n);\nCREATE INDEX myapp_order_customer_idx ON myapp_order(customer_id);\nCOMMIT;", + "expected_output": "Calls dsql_lint with fix=true, identifies multi-DDL transaction issue, presents fixed SQL with each DDL in separate transaction", + "files": [], + "expectations": [ + "Calls the dsql_lint MCP tool", + "Identifies that the SQL has compatibility issues", + "Splits the multi-DDL transaction into separate transact calls (one DDL per call), not a single transact with all statements", + "Warns the user about removed foreign key constraint requiring app-layer enforcement", + "Does NOT execute fixed_sql while any diagnostic has fix_result.status == unfixable" + ] + }, + { + "id": 102, + "prompt": "Validate this SQL for DSQL compatibility but don't execute it yet:\n\nCREATE TABLE events (\n id UUID DEFAULT gen_random_uuid() PRIMARY KEY,\n tenant_id VARCHAR(255) NOT NULL,\n payload TEXT,\n created_at TIMESTAMP DEFAULT now()\n);\n\nCREATE INDEX ASYNC idx_events_tenant ON events(tenant_id);", + "expected_output": "Calls dsql_lint, reports that the SQL is already DSQL-compatible (no issues), does NOT execute", + "files": [], + "expectations": [ + "Calls the dsql_lint MCP tool to validate", + "Reports that the SQL is compatible (diagnostics array is empty, summary errors and warnings are zero)", + "Does NOT call transact (user explicitly said don't execute)" + ] + }, + { + "id": 103, + "prompt": "I need to migrate this MySQL table to DSQL:\n\nCREATE TABLE products (\n id INT AUTO_INCREMENT PRIMARY KEY,\n name VARCHAR(100),\n category_id INT,\n tags SET('electronics','clothing','food'),\n details JSON,\n FOREIGN KEY (category_id) REFERENCES categories(id)\n) ENGINE=InnoDB PARTITION BY HASH(id) PARTITIONS 4;", + "expected_output": "Calls dsql_lint with fix=true, 
identifies multiple issues including unfixable ones (PARTITION BY), presents what can be auto-fixed vs what needs manual rewrite", + "files": [], + "expectations": [ + "Calls the dsql_lint MCP tool with fix=true", + "Recognizes that the tool returned a parse_error diagnostic (the PostgreSQL parser short-circuits on AUTO_INCREMENT before reaching SET / ENGINE / PARTITION BY)", + "Does NOT claim all issues can be auto-fixed", + "Loads references/mysql-migrations/type-mapping.md and manually scans the source SQL for MySQL-specific syntax (AUTO_INCREMENT, SET column type, ENGINE=, PARTITION BY) rather than trusting a post-fix clean lint as sufficient", + "Proposes conversions for each MySQL-specific construct and offers to re-run dsql_lint on the converted SQL before executing" + ] + } + ] +}