Skip to content

feat(wren): add dbt import workflow#2279

Open
douenergy wants to merge 9 commits into
mainfrom
codex/dbt-integration
Open

feat(wren): add dbt import workflow#2279
douenergy wants to merge 9 commits into
mainfrom
codex/dbt-integration

Conversation

@douenergy
Copy link
Copy Markdown
Collaborator

@douenergy douenergy commented May 14, 2026

Summary

Adds dbt import support to Wren and includes an agent-agnostic Spodbtify A/B eval harness for comparing schema-only and dbt-integrated semantic-layer workflows.

dbt profile import

  • Adds wren profile import dbt.
  • Reads dbt_project.yml and dbt profiles.yml.
  • Resolves dbt env_var() references.
  • Maps dbt adapters to Wren datasources.
  • Converts the active dbt target into a Wren profile.

dbt context import

  • Adds wren context import dbt.
  • Reads manifest.json, catalog.json, optional run_results.json, and compiled SQL artifacts.
  • Generates Wren project files including models, relationships, instructions, agent guidance, and seed queries.
  • Skips ephemeral models and catalog-missing nodes.
  • Uses catalog.json as the authoritative column list so manifest-only columns are not imported.

Semantic enrichment and memory

  • Maps dbt tests into Wren metadata for nullability, primary keys, accepted values, and relationships.
  • Keeps dbt relationship tests in relationships.yml instead of stamping FK columns with relationship dereferences.
  • Enriches memory schema text and seed query generation with dbt layers, accepted values, and test metadata.

Agent eval harness

  • Adds a Spodbtify A/B eval spec, prompt runner, and structured-output schema.
  • Keeps the eval agent-agnostic so it can run with Codex, Claude, or another CLI runner.
  • Uses environment-variable placeholders for dataset locations.
  • Excludes scored result fixtures, agent output files, and machine-specific paths from the PR.

Docs

  • Adds a dbt integration guide.
  • Updates CLI reference for the new profile/context import commands.
  • Documents the eval runner workflow and output contract.

Test plan

  • python3 evals/spodbtify_ab/run_eval.py validate
  • python3 -m py_compile evals/spodbtify_ab/run_eval.py
  • Verified staged eval files contain no scored baseline/results or personal local paths.

Summary by CodeRabbit

  • New Features

    • Added wren profile import dbt and wren context import dbt for importing dbt targets and generating Wren projects (supports dry-run and force).
    • New dbt-to-Wren conversion: imports manifests/catalogs, maps adapters, and generates project files, profiles, metadata, relationships, and seed queries.
    • New eval CLI and spec for A/B comparison workflows.
  • Documentation

    • Added dbt integration guide and updated CLI reference.
  • Tests

    • Extensive unit/integration tests for dbt import, profile import, context import, memory/seed behaviors, and CLI flows.

Review Change Stack

@github-actions github-actions Bot added documentation Improvements or additions to documentation python Pull requests that update Python code core labels May 14, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 14, 2026

Caution

Review failed

Failed to post review comments

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds full dbt import integration: a core dbt loader/converter, two CLI import commands (profile & context), schema/memory enrichment with dbt metadata, seed-query enhancements, expanded tests, docs, and a new Spodbtify A/B eval runner and assets.

Changes

dbt Integration Feature

Layer / File(s) Summary
Core dbt Loading and Conversion Module
core/wren/src/wren/dbt.py
Foundational module with DbtTarget/DbtArtifacts/DbtProjectImport dataclasses, DbtLoadError, YAML env_var resolution, dbt project/profile/artifact loading, compiled SQL ingestion, and end-to-end conversion to Wren profiles and project files across multiple datasource types with semantic metadata extraction.
Profile Import CLI Command
core/wren/src/wren/profile_cli.py
New wren profile import dbt command resolving dbt targets, converting to Wren profile config, validating connectivity, saving with optional custom names and activation.
Context Import CLI Command & write_project_files change
core/wren/src/wren/context_cli.py, core/wren/src/wren/context.py
New wren context import dbt command with --dry-run/--force support; write_project_files(force=True) now uses a mutable managed_paths set and conditionally includes queries.yml for deletion when present in inputs.
Schema Indexing with dbt Metadata
core/wren/src/wren/memory/schema_indexer.py
Extended describe_schema and extract_schema_items with optional dbt model properties (dbtLayer, dataScope) and column metadata (derivedFrom, acceptedValues, dbtTests, dbtTestStatus); refactored property extraction helpers.
Seed Query Generation with dbt Filtering
core/wren/src/wren/memory/seed_queries.py
Updated seed generation to skip raw models, filter relationships with raw endpoints, and generate WHERE-clause seeds from acceptedValues; added property helpers for normalization.
dbt Module Unit Tests
core/wren/tests/unit/test_dbt.py
Tests for adapter mapping, env_var resolution, profile/target resolution, conversion to Wren profiles, and artifact/compiled-SQL loading with error handling.
CLI Integration Tests
core/wren/tests/unit/test_context_cli.py, core/wren/tests/test_profile_cli.py
Integration tests for wren profile import dbt and wren context import dbt including dry-run, write, force overwrite behavior, custom naming, activation, and minimal dbt project fixtures.
Memory & Seed Tests
core/wren/tests/unit/test_memory.py, core/wren/tests/unit/test_seed_queries.py
Updated _MANIFEST to include dbt properties; adjusted schema_items count (10 → 11); tests for acceptedValues-driven seeds, raw model seed skipping, and relationship filtering involving raw models.
dbt Integration Documentation
docs/core/guides/dbt-integration.md, docs/core/README.md, docs/core/reference/cli.md
User guide and CLI reference covering prerequisites, import steps, generated artifacts, skip behaviors, dbt test mapping, build workflow, DuckDB example, and troubleshooting.
Spodbtify A/B Eval
evals/spodbtify_ab/*
New eval spec, runner CLI, agent-output schema, README, and .gitignore supporting an A/B evaluation comparing schema-only vs dbt-integrated workflows.

Sequence Diagram(s)

sequenceDiagram
  participant UserCLI
  participant ContextCLI
  participant DbtModule
  participant Filesystem
  UserCLI->>ContextCLI: run `wren context import dbt` args
  ContextCLI->>DbtModule: convert_dbt_project_to_wren_project(...)
  DbtModule->>ContextCLI: DbtProjectImport(files, counts)
  ContextCLI->>Filesystem: write_project_files(output_dir, files, force)
  Filesystem->>ContextCLI: report success / errors
  ContextCLI->>UserCLI: print summary & next steps
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • yichieh-lu
  • paopa
  • cyyeh

Poem

🐰 I nibble code from dawn to noon,
I turn dbt files into Wren's bright tune.
Profiles, seeds, and schemas bloom—so neat,
Raw rows skipped, accepted-values meet.
Tests and docs snug in tidy suite.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 21.55% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(wren): add dbt import workflow' clearly and concisely summarizes the main change—adding dbt import capabilities to Wren—which is the primary focus of the PR.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/dbt-integration

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/wren/src/wren/context.py (1)

291-299: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid unconditional queries.yml deletion in shared force-overwrite path.

Line 298 currently deletes queries.yml for every force=True write. Because this helper is shared, context init --from-mdl --force can now erase curated query pairs even when the conversion output does not include a replacement queries.yml.

Proposed fix
     if force and output_dir.exists():
         import shutil  # noqa: PLC0415
 
-        for managed in (
+        managed_paths = {
             "models",
             "views",
             "relationships.yml",
             "instructions.md",
             "wren_project.yml",
             "AGENTS.md",
-            "queries.yml",
-        ):
+        }
+        if any(f.relative_path == "queries.yml" for f in files):
+            managed_paths.add("queries.yml")
+
+        for managed in managed_paths:
             target = output_dir / managed
             if target.is_dir():
                 shutil.rmtree(target)
             elif target.exists():
                 target.unlink()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/src/wren/context.py` around lines 291 - 299, The loop that
unconditionally iterates over the tuple including "queries.yml" is causing
forced overwrites to always delete queries.yml; update the logic in the function
in context.py that contains the for managed in (... "queries.yml", ...) loop so
that "queries.yml" is not always deleted: remove "queries.yml" from the
unconditional managed-items tuple and instead perform a conditional deletion
only when the conversion/write operation actually includes a replacement (e.g.,
check the set/list of files the conversion will produce or an explicit flag like
files_to_write or created_files and only delete if "queries.yml" is present
there), or expose and use a caller-provided parameter to control removal; this
ensures context init --from-mdl --force does not erase curated queries unless a
new queries.yml is being written.
🧹 Nitpick comments (4)
core/wren/src/wren/dbt.py (4)

434-442: 💤 Low value

Postgres branch is inconsistent with sibling branches — password: None leaks through.

Unlike mysql/redshift/mssql/clickhouse/snowflake/trino/athena/bigquery, the postgres branch returns the dict directly instead of running it through _filter_none. When no password is configured (e.g., trust auth, IAM auth via PGPASSWORD elsewhere), the resulting profile contains "password": None, which can confuse profile validation/serialization that distinguishes "absent" from "null".

♻️ Align with sibling branches
-    if datasource == "postgres":
-        return {
-            "datasource": "postgres",
-            "host": str(_require_output_field(output, "host")),
-            "port": str(output.get("port", "5432")),
-            "database": str(_require_output_field(output, "dbname", "database")),
-            "user": str(_require_output_field(output, "user")),
-            "password": str(output["password"]) if output.get("password") else None,
-        }
+    if datasource == "postgres":
+        return _filter_none(
+            {
+                "datasource": "postgres",
+                "host": str(_require_output_field(output, "host")),
+                "port": str(output.get("port", "5432")),
+                "database": str(_require_output_field(output, "dbname", "database")),
+                "user": str(_require_output_field(output, "user")),
+                "password": str(output["password"]) if output.get("password") else None,
+            }
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/src/wren/dbt.py` around lines 434 - 442, The Postgres branch
currently builds and returns a dict containing "password": None when no password
is present; change it to mirror sibling branches by passing the constructed dict
through _filter_none before returning to remove None values. Locate the postgres
branch handling (datasource == "postgres") that uses _require_output_field and
output, construct the same dict as now but return _filter_none({...}) instead of
the raw dict so absent passwords are omitted.

1070-1077: 💤 Low value

Redundant outer guard in _aggregate_status.

The if any(...) check duplicates the inner loop's condition. If none of failing/error/warning are present, the loop simply doesn't return.

♻️ Drop the outer check
 def _aggregate_status(statuses: list[str]) -> str:
-    if any(status in {"failing", "error", "warning"} for status in statuses):
-        for priority in ("failing", "error", "warning"):
-            if priority in statuses:
-                return priority
-    if any(status == "verified" for status in statuses):
+    for priority in ("failing", "error", "warning"):
+        if priority in statuses:
+            return priority
+    if "verified" in statuses:
         return "verified"
     return "unknown"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/src/wren/dbt.py` around lines 1070 - 1077, The outer if any(...) in
_aggregate_status is redundant; remove that guard and instead directly iterate
the priority tuple ("failing","error","warning") and return the first priority
found in statuses, then keep the existing check for "verified" and finally
return "unknown"; update the function _aggregate_status to perform the priority
loop without the extra any(...) check so behavior and return order remain
unchanged.

793-802: ⚡ Quick win

Dead loop — remove or convert to a real branch.

The for name in manifest_columns: if name not in merged_names: pass block has no side effect; it iterates and discards. The intent comment ("manifest-only: skip") is fine, but the empty loop is misleading on review and obscures whether manifest-only columns were ever supposed to be included.

♻️ Simplify column merging
-    # When catalog data is available, restrict to columns that actually exist in
-    # the database. Manifest-only columns (documented in schema.yml but not
-    # materialized) are skipped to avoid referencing non-existent columns.
-    if catalog_columns:
-        merged_names = list(catalog_columns)
-        for name in manifest_columns:
-            if name not in merged_names:
-                pass  # manifest-only: skip
-    else:
-        merged_names = list(manifest_columns)
+    # When catalog data is available, restrict to columns that actually exist in
+    # the database; manifest-only columns (documented in schema.yml but not
+    # materialized) are intentionally skipped to avoid referencing non-existent
+    # columns.
+    merged_names = list(catalog_columns) if catalog_columns else list(manifest_columns)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/src/wren/dbt.py` around lines 793 - 802, The empty loop over
manifest_columns is dead code and should be removed; keep merged_names =
list(catalog_columns) when catalog_columns is present and drop the for name in
manifest_columns: ... pass block, or if you want to be explicit, replace it with
a short comment explaining that manifest-only columns are intentionally skipped;
update references to merged_names, catalog_columns, and manifest_columns in
dbt.py accordingly so the intent is clear without an empty iteration.

838-843: 💤 Low value

Simplify membership checks in infer_dbt_layer.

any("staging" == part for part in fqn) is "staging" in fqnfqn is already a list of lowercased strings on line 832. Same for "marts" and "intermediate".

♻️ Use direct membership
-    if any("staging" == part for part in fqn) or name.startswith("stg_"):
+    if "staging" in fqn or name.startswith("stg_"):
         return "staging"
-    if any("marts" == part for part in fqn) or name.startswith(("fct_", "dim_")):
+    if "marts" in fqn or name.startswith(("fct_", "dim_")):
         return "mart"
-    if any("intermediate" == part for part in fqn) or name.startswith("int_"):
+    if "intermediate" in fqn or name.startswith("int_"):
         return "intermediate"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/src/wren/dbt.py` around lines 838 - 843, In infer_dbt_layer,
replace the generator-based membership checks like any("staging" == part for
part in fqn) with direct membership tests ("staging" in fqn) (and do the same
for "marts" and "intermediate") since fqn is already a list of lowercased
strings; keep the existing name.startswith checks for "stg_", ("fct_", "dim_"),
and "int_" unchanged and ensure the function still returns "staging", "mart", or
"intermediate" as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/wren/src/wren/dbt.py`:
- Around line 161-175: load_dbt_profiles currently jumps straight to
"~/.dbt/profiles.yml" when profiles_path is None; update it to follow dbt
precedence: if profiles_path is provided use it (and if it’s a directory append
_DBT_PROFILES_FILE), otherwise check for DBT_PROFILES_DIR environment variable
and use that (append _DBT_PROFILES_FILE if it’s a directory), then check the
current working directory for a "profiles.yml" file, and finally fall back to
Path.home()/".dbt"/_DBT_PROFILES_FILE; ensure _load_yaml_file is called with the
resolved Path and raise the same DbtLoadError cases if the file is
missing/empty/invalid.
- Around line 707-717: The stored accepted_values in the column properties is
currently a comma-joined string, which loses commas inside values; change the
assignment in the accepted_values branch so column.setdefault("properties",
{})["accepted_values"] stores a list of strings (e.g., [str(v) for v in values])
to mirror event["values"] and preserve value fidelity; update any code paths in
this module that assumed a CSV string (references: the accepted_values branch
handling, the _record_column_test call and the event["values"] population, and
keep _build_verified_constraint_lines using event["values"] unchanged) and
ensure downstream consumers (seed_queries.py and schema_indexer.py) are adjusted
to accept a list rather than splitting a CSV string.

In `@core/wren/src/wren/memory/seed_queries.py`:
- Around line 100-117: The loop currently treats accepted_values (from
_prop_value) as CSV text and always calls .split(','), which breaks when
accepted_values is already a sequence; change the logic in the columns loop to
first detect sequence types (e.g., list/tuple or collections.abc.Sequence
excluding str/bytes) and extract the first non-empty element as first_value,
otherwise fall back to splitting the string by commas and picking the first
non-empty token; then continue to quote (replace "'" with "''") and append the
pair as before (references: _prop_value, accepted_values, first_value, quoted,
pairs.append in the columns loop).

---

Outside diff comments:
In `@core/wren/src/wren/context.py`:
- Around line 291-299: The loop that unconditionally iterates over the tuple
including "queries.yml" is causing forced overwrites to always delete
queries.yml; update the logic in the function in context.py that contains the
for managed in (... "queries.yml", ...) loop so that "queries.yml" is not always
deleted: remove "queries.yml" from the unconditional managed-items tuple and
instead perform a conditional deletion only when the conversion/write operation
actually includes a replacement (e.g., check the set/list of files the
conversion will produce or an explicit flag like files_to_write or created_files
and only delete if "queries.yml" is present there), or expose and use a
caller-provided parameter to control removal; this ensures context init
--from-mdl --force does not erase curated queries unless a new queries.yml is
being written.

---

Nitpick comments:
In `@core/wren/src/wren/dbt.py`:
- Around line 434-442: The Postgres branch currently builds and returns a dict
containing "password": None when no password is present; change it to mirror
sibling branches by passing the constructed dict through _filter_none before
returning to remove None values. Locate the postgres branch handling (datasource
== "postgres") that uses _require_output_field and output, construct the same
dict as now but return _filter_none({...}) instead of the raw dict so absent
passwords are omitted.
- Around line 1070-1077: The outer if any(...) in _aggregate_status is
redundant; remove that guard and instead directly iterate the priority tuple
("failing","error","warning") and return the first priority found in statuses,
then keep the existing check for "verified" and finally return "unknown"; update
the function _aggregate_status to perform the priority loop without the extra
any(...) check so behavior and return order remain unchanged.
- Around line 793-802: The empty loop over manifest_columns is dead code and
should be removed; keep merged_names = list(catalog_columns) when
catalog_columns is present and drop the for name in manifest_columns: ... pass
block, or if you want to be explicit, replace it with a short comment explaining
that manifest-only columns are intentionally skipped; update references to
merged_names, catalog_columns, and manifest_columns in dbt.py accordingly so the
intent is clear without an empty iteration.
- Around line 838-843: In infer_dbt_layer, replace the generator-based
membership checks like any("staging" == part for part in fqn) with direct
membership tests ("staging" in fqn) (and do the same for "marts" and
"intermediate") since fqn is already a list of lowercased strings; keep the
existing name.startswith checks for "stg_", ("fct_", "dim_"), and "int_"
unchanged and ensure the function still returns "staging", "mart", or
"intermediate" as before.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 09d7d1a4-b275-4ef0-a97a-a6cd0af7bdc3

📥 Commits

Reviewing files that changed from the base of the PR and between 6732d97 and 9c3084d.

📒 Files selected for processing (14)
  • core/wren/src/wren/context.py
  • core/wren/src/wren/context_cli.py
  • core/wren/src/wren/dbt.py
  • core/wren/src/wren/memory/schema_indexer.py
  • core/wren/src/wren/memory/seed_queries.py
  • core/wren/src/wren/profile_cli.py
  • core/wren/tests/test_profile_cli.py
  • core/wren/tests/unit/test_context_cli.py
  • core/wren/tests/unit/test_dbt.py
  • core/wren/tests/unit/test_memory.py
  • core/wren/tests/unit/test_seed_queries.py
  • docs/core/README.md
  • docs/core/guides/dbt-integration.md
  • docs/core/reference/cli.md

Comment thread core/wren/src/wren/dbt.py
Comment thread core/wren/src/wren/dbt.py Outdated
Comment thread core/wren/src/wren/memory/seed_queries.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/wren/tests/unit/test_context_cli.py`:
- Around line 572-575: The assertion on
relationships["relationships"][0]["models"] is order-sensitive and should be
made order-independent; modify the test in test_context_cli.py to compare the
models as sets (e.g., set(relationships["relationships"][0]["models"]) ==
{"fct_orders","stg_orders"}) or by sorting both lists before comparing, so the
test no longer depends on list order for the "models" field.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c563a6ad-ff05-4117-8d72-4bdccb01784f

📥 Commits

Reviewing files that changed from the base of the PR and between 9c3084d and 5ba7ac3.

📒 Files selected for processing (8)
  • core/wren/src/wren/context.py
  • core/wren/src/wren/dbt.py
  • core/wren/src/wren/memory/schema_indexer.py
  • core/wren/src/wren/memory/seed_queries.py
  • core/wren/tests/unit/test_context_cli.py
  • core/wren/tests/unit/test_dbt.py
  • core/wren/tests/unit/test_memory.py
  • core/wren/tests/unit/test_seed_queries.py
🚧 Files skipped from review as they are similar to previous changes (6)
  • core/wren/src/wren/memory/seed_queries.py
  • core/wren/src/wren/context.py
  • core/wren/src/wren/memory/schema_indexer.py
  • core/wren/tests/unit/test_seed_queries.py
  • core/wren/src/wren/dbt.py
  • core/wren/tests/unit/test_memory.py

Comment thread core/wren/tests/unit/test_context_cli.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@evals/spodbtify_ab/run_eval.py`:
- Around line 100-105: During score validation, replace the current set-based
check (scores -> seen_ids) with a check that first builds the list of question
IDs (e.g., ids_list = [entry.get("question_id") for entry in scores]) and
rejects runs that contain duplicates by comparing len(ids_list) to
len(set(ids_list)); if duplicates exist, append an error mentioning the
duplicate ids and run_index, otherwise compare the unique id set to expected_ids
as before. Apply the same change to the other similar blocks referenced (the
sections around lines 107-115 and 122-132) so duplicate question_id entries are
detected and rejected before computing totals (use variables seen_ids, scores,
expected_ids, errors, run_index to locate the exact spots).
- Around line 361-366: The current try/except around
load_json(AGENT_OUTPUT_SCHEMA_PATH) only catches ValueError and will crash with
FileNotFoundError when the optional agent_output.schema.json is missing; update
the block in run_eval.py so that if load_json raises FileNotFoundError you treat
it as "no schema" (set schema_errors = []), and still handle ValueError by
capturing the validation error string (schema_errors = [str(exc)]); reference
the load_json call and AGENT_OUTPUT_SCHEMA_PATH so you modify that exact
try/except and ensure errors = spec_errors + schema_errors continues to work.
- Around line 318-320: Add a new CLI argument named --timeout-seconds, parse it
into a variable (e.g., timeout_seconds) and pass it into the subprocess.run call
that currently invokes subprocess.run(command, shell=True, cwd=str(Path.cwd())).
Update the invocation to include timeout=timeout_seconds and handle
subprocess.TimeoutExpired around that call: on timeout, log/print an error and
exit with a non-zero status (similar to the existing returncode handling).
Ensure you reference the same local variables (command, result) and
import/handle subprocess.TimeoutExpired so the eval run won't hang indefinitely.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7b9c711e-55de-403d-9d13-86db6440b318

📥 Commits

Reviewing files that changed from the base of the PR and between 5ba7ac3 and 72123f6.

📒 Files selected for processing (5)
  • evals/spodbtify_ab/.gitignore
  • evals/spodbtify_ab/README.md
  • evals/spodbtify_ab/agent_output.schema.json
  • evals/spodbtify_ab/run_eval.py
  • evals/spodbtify_ab/spodbtify_ab_eval.json
✅ Files skipped from review due to trivial changes (3)
  • evals/spodbtify_ab/.gitignore
  • evals/spodbtify_ab/agent_output.schema.json
  • evals/spodbtify_ab/spodbtify_ab_eval.json

Comment thread evals/spodbtify_ab/run_eval.py
Comment thread evals/spodbtify_ab/run_eval.py Outdated
Comment thread evals/spodbtify_ab/run_eval.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
evals/spodbtify_ab/run_eval.py (1)

92-129: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard malformed score files before dereferencing nested entries.

validate_score_file() still crashes on malformed JSON instead of reporting validation errors. A non-object run, a non-list scores, or an unhashable question_id will blow up on run.get(...), entry.get(...), or seen_ids.add(qid) before this function can return errors.

Suggested fix
 def validate_score_file(score_file: dict[str, Any], spec: dict[str, Any]) -> list[str]:
     errors: list[str] = []
     expected_ids = {q["id"] for q in spec["questions"]}
-    for run_index, run in enumerate(score_file.get("runs", []), start=1):
+    runs = score_file.get("runs", [])
+    if not isinstance(runs, list):
+        return ["runs must be a list"]
+
+    for run_index, run in enumerate(runs, start=1):
+        if not isinstance(run, dict):
+            errors.append(f"run {run_index}: run must be an object")
+            continue
+
         workflow = run.get("workflow")
         if workflow not in WORKFLOWS:
             errors.append(f"run {run_index}: unknown workflow {workflow!r}")

         scores = run.get("scores", [])
+        if not isinstance(scores, list):
+            errors.append(f"run {run_index}: scores must be a list")
+            continue
+
         seen_ids: set[Any] = set()
         duplicate_ids: set[Any] = set()
         for entry in scores:
+            if not isinstance(entry, dict):
+                errors.append(f"run {run_index}: each score entry must be an object")
+                continue
             qid = entry.get("question_id")
+            if not isinstance(qid, int):
+                errors.append(
+                    f"run {run_index}: question_id must be an integer, got {qid!r}"
+                )
+                continue
             if qid in seen_ids:
                 duplicate_ids.add(qid)
             seen_ids.add(qid)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@evals/spodbtify_ab/run_eval.py` around lines 92 - 129, validate_score_file
currently dereferences nested fields assuming types; this can raise on malformed
input (non-dict run, non-list scores, non-dict entry, or unhashable
question_id). Update validate_score_file to defensive-check types: ensure each
run is a mapping (dict) before calling run.get (append an error and continue if
not), ensure scores is a list before iterating (append an error and continue),
ensure each score entry is a mapping before accessing entry.get (append an error
and continue), and guard seen_ids.add(qid) by checking hashability (or convert
qid to a string and record an error for unhashable ids). Keep the existing
validations (workflow, duplicates, expected ids, DIMENSIONS) but only run them
when the necessary types are valid; reference the validate_score_file function,
WORKFLOWS, DIMENSIONS, and the local variables run, scores, entry, qid when
making these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@evals/spodbtify_ab/run_eval.py`:
- Around line 92-129: validate_score_file currently dereferences nested fields
assuming types; this can raise on malformed input (non-dict run, non-list
scores, non-dict entry, or unhashable question_id). Update validate_score_file
to defensive-check types: ensure each run is a mapping (dict) before calling
run.get (append an error and continue if not), ensure scores is a list before
iterating (append an error and continue), ensure each score entry is a mapping
before accessing entry.get (append an error and continue), and guard
seen_ids.add(qid) by checking hashability (or convert qid to a string and record
an error for unhashable ids). Keep the existing validations (workflow,
duplicates, expected ids, DIMENSIONS) but only run them when the necessary types
are valid; reference the validate_score_file function, WORKFLOWS, DIMENSIONS,
and the local variables run, scores, entry, qid when making these changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: bdd615b8-8005-44e1-a514-fe1c25af545a

📥 Commits

Reviewing files that changed from the base of the PR and between 72123f6 and 2585bc6.

📒 Files selected for processing (1)
  • evals/spodbtify_ab/run_eval.py

@@ -0,0 +1,134 @@
# dbt Integration

Import a dbt project into Wren AI Core to query dbt models through the Wren semantic layer.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Import a dbt project into Wren AI Core to query dbt models through the Wren semantic layer.
Import a dbt project into Wren AI to query dbt models through the Wren semantic layer.

Comment thread core/wren/src/wren/dbt.py
Comment on lines +365 to +369
"dbt": {
"project_dir": dbt_binding_dir,
"profile": target.profile_name,
"target": target.target_name,
},
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also be documented in the doc.

Comment thread core/wren/src/wren/dbt.py
Comment on lines +440 to +450
if datasource == "postgres":
return _filter_none(
{
"datasource": "postgres",
"host": str(_require_output_field(output, "host")),
"port": str(output.get("port", "5432")),
"database": str(_require_output_field(output, "dbname", "database")),
"user": str(_require_output_field(output, "user")),
"password": str(output["password"]) if output.get("password") else None,
}
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard-coding the field name is a big maintenance effort. It's better to use the connection info pydanitc from wren.model.data_source to validate or build the connection info.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation already happens in profile_cli (via datasource.get_connection_info(...)), but switching the builder to a Pydantic model_dump() still has value.

Comment thread core/wren/src/wren/dbt.py
Comment on lines +821 to +823
data_type = (
catalog_col.get("type") or manifest_col.get("data_type") or "VARCHAR"
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use wren.type_mapping.parse_type to handle the dialect data type.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should provide more explanation of what we get from the DBT project, how we store it in the WREN project, and why we get it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core documentation Improvements or additions to documentation python Pull requests that update Python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants