feat(wren): add dbt import workflow #2279
Walkthrough
Adds full dbt import integration: a core dbt loader/converter, two CLI import commands (profile & context), schema/memory enrichment with dbt metadata, seed-query enhancements, expanded tests, docs, and a new Spodbtify A/B eval runner and assets.
Changes
dbt Integration Feature
Sequence Diagram(s)
sequenceDiagram
participant UserCLI
participant ContextCLI
participant DbtModule
participant Filesystem
UserCLI->>ContextCLI: run `wren context import dbt` args
ContextCLI->>DbtModule: convert_dbt_project_to_wren_project(...)
DbtModule->>ContextCLI: DbtProjectImport(files, counts)
ContextCLI->>Filesystem: write_project_files(output_dir, files, force)
Filesystem->>ContextCLI: report success / errors
ContextCLI->>UserCLI: print summary & next steps
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/wren/src/wren/context.py (1)
291-299: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Avoid unconditional `queries.yml` deletion in shared force-overwrite path.
Line 298 currently deletes `queries.yml` for every `force=True` write. Because this helper is shared, `context init --from-mdl --force` can now erase curated query pairs even when the conversion output does not include a replacement `queries.yml`.
Proposed fix
 if force and output_dir.exists():
     import shutil  # noqa: PLC0415

-    for managed in (
+    managed_paths = {
         "models",
         "views",
         "relationships.yml",
         "instructions.md",
         "wren_project.yml",
         "AGENTS.md",
-        "queries.yml",
-    ):
+    }
+    if any(f.relative_path == "queries.yml" for f in files):
+        managed_paths.add("queries.yml")
+
+    for managed in managed_paths:
         target = output_dir / managed
         if target.is_dir():
             shutil.rmtree(target)
         elif target.exists():
             target.unlink()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/wren/src/wren/context.py` around lines 291 - 299, The loop that unconditionally iterates over the tuple including "queries.yml" is causing forced overwrites to always delete queries.yml; update the logic in the function in context.py that contains the for managed in (... "queries.yml", ...) loop so that "queries.yml" is not always deleted: remove "queries.yml" from the unconditional managed-items tuple and instead perform a conditional deletion only when the conversion/write operation actually includes a replacement (e.g., check the set/list of files the conversion will produce or an explicit flag like files_to_write or created_files and only delete if "queries.yml" is present there), or expose and use a caller-provided parameter to control removal; this ensures context init --from-mdl --force does not erase curated queries unless a new queries.yml is being written.
🧹 Nitpick comments (4)
core/wren/src/wren/dbt.py (4)
434-442: 💤 Low value
Postgres branch is inconsistent with sibling branches: `password: None` leaks through.
Unlike `mysql`/`redshift`/`mssql`/`clickhouse`/`snowflake`/`trino`/`athena`/`bigquery`, the postgres branch returns the dict directly instead of running it through `_filter_none`. When no password is configured (e.g., trust auth, IAM auth via PGPASSWORD elsewhere), the resulting profile contains `"password": None`, which can confuse profile validation/serialization that distinguishes "absent" from "null".
♻️ Align with sibling branches
-    if datasource == "postgres":
-        return {
-            "datasource": "postgres",
-            "host": str(_require_output_field(output, "host")),
-            "port": str(output.get("port", "5432")),
-            "database": str(_require_output_field(output, "dbname", "database")),
-            "user": str(_require_output_field(output, "user")),
-            "password": str(output["password"]) if output.get("password") else None,
-        }
+    if datasource == "postgres":
+        return _filter_none(
+            {
+                "datasource": "postgres",
+                "host": str(_require_output_field(output, "host")),
+                "port": str(output.get("port", "5432")),
+                "database": str(_require_output_field(output, "dbname", "database")),
+                "user": str(_require_output_field(output, "user")),
+                "password": str(output["password"]) if output.get("password") else None,
+            }
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/wren/src/wren/dbt.py` around lines 434 - 442, The Postgres branch currently builds and returns a dict containing "password": None when no password is present; change it to mirror sibling branches by passing the constructed dict through _filter_none before returning to remove None values. Locate the postgres branch handling (datasource == "postgres") that uses _require_output_field and output, construct the same dict as now but return _filter_none({...}) instead of the raw dict so absent passwords are omitted.
1070-1077: 💤 Low value
Redundant outer guard in `_aggregate_status`.
The `if any(...)` check duplicates the inner loop's condition. If none of `failing`/`error`/`warning` are present, the loop simply doesn't return.
♻️ Drop the outer check
 def _aggregate_status(statuses: list[str]) -> str:
-    if any(status in {"failing", "error", "warning"} for status in statuses):
-        for priority in ("failing", "error", "warning"):
-            if priority in statuses:
-                return priority
-    if any(status == "verified" for status in statuses):
+    for priority in ("failing", "error", "warning"):
+        if priority in statuses:
+            return priority
+    if "verified" in statuses:
         return "verified"
     return "unknown"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/wren/src/wren/dbt.py` around lines 1070 - 1077, The outer if any(...) in _aggregate_status is redundant; remove that guard and instead directly iterate the priority tuple ("failing","error","warning") and return the first priority found in statuses, then keep the existing check for "verified" and finally return "unknown"; update the function _aggregate_status to perform the priority loop without the extra any(...) check so behavior and return order remain unchanged.
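As a quick sanity check on the claim that behavior and return order stay unchanged, the sketch below compares the two variants from the diff over every short combination of status labels. The function names and the extra `"skipped"` label are hypothetical, used only to exercise the `"unknown"` fallback:

```python
from itertools import product


def aggregate_status_original(statuses: list[str]) -> str:
    # Variant with the outer any(...) guard, as in the removed lines.
    if any(status in {"failing", "error", "warning"} for status in statuses):
        for priority in ("failing", "error", "warning"):
            if priority in statuses:
                return priority
    if any(status == "verified" for status in statuses):
        return "verified"
    return "unknown"


def aggregate_status_simplified(statuses: list[str]) -> str:
    # Variant without the guard, as in the added lines.
    for priority in ("failing", "error", "warning"):
        if priority in statuses:
            return priority
    if "verified" in statuses:
        return "verified"
    return "unknown"


LABELS = ("failing", "error", "warning", "verified", "skipped")

# Exhaustively compare both variants on all label lists of length 0 to 3.
mismatches = [
    combo
    for length in range(4)
    for combo in product(LABELS, repeat=length)
    if aggregate_status_original(list(combo)) != aggregate_status_simplified(list(combo))
]
print(len(mismatches))
```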
793-802: ⚡ Quick win
Dead loop: remove or convert to a real branch.
The `for name in manifest_columns: if name not in merged_names: pass` block has no side effect; it iterates and discards. The intent comment ("manifest-only: skip") is fine, but the empty loop is misleading on review and obscures whether manifest-only columns were ever supposed to be included.
♻️ Simplify column merging
-    # When catalog data is available, restrict to columns that actually exist in
-    # the database. Manifest-only columns (documented in schema.yml but not
-    # materialized) are skipped to avoid referencing non-existent columns.
-    if catalog_columns:
-        merged_names = list(catalog_columns)
-        for name in manifest_columns:
-            if name not in merged_names:
-                pass  # manifest-only: skip
-    else:
-        merged_names = list(manifest_columns)
+    # When catalog data is available, restrict to columns that actually exist in
+    # the database; manifest-only columns (documented in schema.yml but not
+    # materialized) are intentionally skipped to avoid referencing non-existent
+    # columns.
+    merged_names = list(catalog_columns) if catalog_columns else list(manifest_columns)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/wren/src/wren/dbt.py` around lines 793 - 802, The empty loop over manifest_columns is dead code and should be removed; keep merged_names = list(catalog_columns) when catalog_columns is present and drop the for name in manifest_columns: ... pass block, or if you want to be explicit, replace it with a short comment explaining that manifest-only columns are intentionally skipped; update references to merged_names, catalog_columns, and manifest_columns in dbt.py accordingly so the intent is clear without an empty iteration.
838-843: 💤 Low value
Simplify membership checks in `infer_dbt_layer`.
`any("staging" == part for part in fqn)` is `"staging" in fqn`; `fqn` is already a list of lowercased strings on line 832. Same for `"marts"` and `"intermediate"`.
♻️ Use direct membership
-    if any("staging" == part for part in fqn) or name.startswith("stg_"):
+    if "staging" in fqn or name.startswith("stg_"):
         return "staging"
-    if any("marts" == part for part in fqn) or name.startswith(("fct_", "dim_")):
+    if "marts" in fqn or name.startswith(("fct_", "dim_")):
         return "mart"
-    if any("intermediate" == part for part in fqn) or name.startswith("int_"):
+    if "intermediate" in fqn or name.startswith("int_"):
         return "intermediate"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/wren/src/wren/dbt.py` around lines 838 - 843, In infer_dbt_layer, replace the generator-based membership checks like any("staging" == part for part in fqn) with direct membership tests ("staging" in fqn) (and do the same for "marts" and "intermediate") since fqn is already a list of lowercased strings; keep the existing name.startswith checks for "stg_", ("fct_", "dim_"), and "int_" unchanged and ensure the function still returns "staging", "mart", or "intermediate" as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@core/wren/src/wren/dbt.py`:
- Around line 161-175: load_dbt_profiles currently jumps straight to
"~/.dbt/profiles.yml" when profiles_path is None; update it to follow dbt
precedence: if profiles_path is provided use it (and if it’s a directory append
_DBT_PROFILES_FILE), otherwise check for DBT_PROFILES_DIR environment variable
and use that (append _DBT_PROFILES_FILE if it’s a directory), then check the
current working directory for a "profiles.yml" file, and finally fall back to
Path.home()/".dbt"/_DBT_PROFILES_FILE; ensure _load_yaml_file is called with the
resolved Path and raise the same DbtLoadError cases if the file is
missing/empty/invalid.
- Around line 707-717: The stored accepted_values in the column properties is
currently a comma-joined string, which loses commas inside values; change the
assignment in the accepted_values branch so column.setdefault("properties",
{})["accepted_values"] stores a list of strings (e.g., [str(v) for v in values])
to mirror event["values"] and preserve value fidelity; update any code paths in
this module that assumed a CSV string (references: the accepted_values branch
handling, the _record_column_test call and the event["values"] population, and
keep _build_verified_constraint_lines using event["values"] unchanged) and
ensure downstream consumers (seed_queries.py and schema_indexer.py) are adjusted
to accept a list rather than splitting a CSV string.
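The lookup order described in the first comment above could be sketched roughly as follows. This is not the PR's `load_dbt_profiles`: `resolve_profiles_path` and its `env`, `cwd`, and `home` parameters are hypothetical names added so the precedence can be exercised without touching the real environment, and the value of `_DBT_PROFILES_FILE` is assumed to be `"profiles.yml"`:

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path

_DBT_PROFILES_FILE = "profiles.yml"  # assumed value of the constant in dbt.py


def resolve_profiles_path(
    profiles_path: str | Path | None = None,
    env: Mapping[str, str] | None = None,
    cwd: str | Path | None = None,
    home: str | Path | None = None,
) -> Path:
    """Resolve profiles.yml: explicit path, DBT_PROFILES_DIR, cwd, then ~/.dbt."""
    env = os.environ if env is None else env
    cwd_path = Path.cwd() if cwd is None else Path(cwd)
    home_path = Path.home() if home is None else Path(home)

    # 1. An explicit path wins; a directory gets the file name appended.
    if profiles_path is not None:
        candidate = Path(profiles_path).expanduser()
        return candidate / _DBT_PROFILES_FILE if candidate.is_dir() else candidate

    # 2. DBT_PROFILES_DIR, handled the same way as an explicit path.
    env_dir = env.get("DBT_PROFILES_DIR")
    if env_dir:
        candidate = Path(env_dir).expanduser()
        return candidate / _DBT_PROFILES_FILE if candidate.is_dir() else candidate

    # 3. A profiles.yml in the current working directory.
    local = cwd_path / _DBT_PROFILES_FILE
    if local.is_file():
        return local

    # 4. Fall back to ~/.dbt/profiles.yml.
    return home_path / ".dbt" / _DBT_PROFILES_FILE
```

The sketch only resolves the path; the missing, empty, and invalid file cases would still be reported by the existing loader.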
In `@core/wren/src/wren/memory/seed_queries.py`:
- Around line 100-117: The loop currently treats accepted_values (from
_prop_value) as CSV text and always calls .split(','), which breaks when
accepted_values is already a sequence; change the logic in the columns loop to
first detect sequence types (e.g., list/tuple or collections.abc.Sequence
excluding str/bytes) and extract the first non-empty element as first_value,
otherwise fall back to splitting the string by commas and picking the first
non-empty token; then continue to quote (replace "'" with "''") and append the
pair as before (references: _prop_value, accepted_values, first_value, quoted,
pairs.append in the columns loop).
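A minimal sketch of the sequence-or-CSV handling described above, assuming `accepted_values` may arrive either as a list or as a legacy comma-joined string. `first_accepted_value` and `quote_sql_literal` are hypothetical helper names, not functions from `seed_queries.py`:

```python
from __future__ import annotations

from collections.abc import Sequence
from typing import Any


def first_accepted_value(accepted_values: Any) -> str | None:
    """Return the first non-empty accepted value, or None if there is none."""
    if isinstance(accepted_values, Sequence) and not isinstance(
        accepted_values, (str, bytes)
    ):
        # List/tuple form: each element is one value, commas included.
        candidates = [str(value).strip() for value in accepted_values if value is not None]
    elif isinstance(accepted_values, str):
        # Legacy CSV form: fall back to splitting on commas.
        candidates = [token.strip() for token in accepted_values.split(",")]
    else:
        return None
    return next((candidate for candidate in candidates if candidate), None)


def quote_sql_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


print(first_accepted_value(["New York, NY", "Boston"]))
print(quote_sql_literal("O'Brien"))
```

Checking for a sequence first is what keeps a value such as `New York, NY` intact; the CSV branch would split it in two.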
📒 Files selected for processing (14)
- core/wren/src/wren/context.py
- core/wren/src/wren/context_cli.py
- core/wren/src/wren/dbt.py
- core/wren/src/wren/memory/schema_indexer.py
- core/wren/src/wren/memory/seed_queries.py
- core/wren/src/wren/profile_cli.py
- core/wren/tests/test_profile_cli.py
- core/wren/tests/unit/test_context_cli.py
- core/wren/tests/unit/test_dbt.py
- core/wren/tests/unit/test_memory.py
- core/wren/tests/unit/test_seed_queries.py
- docs/core/README.md
- docs/core/guides/dbt-integration.md
- docs/core/reference/cli.md
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@core/wren/tests/unit/test_context_cli.py`:
- Around line 572-575: The assertion on
relationships["relationships"][0]["models"] is order-sensitive and should be
made order-independent; modify the test in test_context_cli.py to compare the
models as sets (e.g., set(relationships["relationships"][0]["models"]) ==
{"fct_orders","stg_orders"}) or by sorting both lists before comparing, so the
test no longer depends on list order for the "models" field.
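Both options mentioned above can be sketched as follows. The `relationships` payload is hypothetical sample data shaped like the structure the test indexes into, not output from the actual conversion:

```python
# Hypothetical sample shaped like the parsed relationships.yml used in the test.
relationships = {
    "relationships": [
        {"models": ["stg_orders", "fct_orders"]},
    ]
}

models = relationships["relationships"][0]["models"]

# Option 1: compare as sets, ignoring order (and duplicates).
assert set(models) == {"fct_orders", "stg_orders"}

# Option 2: compare sorted lists, ignoring order but still catching duplicates.
assert sorted(models) == ["fct_orders", "stg_orders"]
```

The sorted-list form is slightly stricter, since a duplicated model name would make it fail while the set comparison would still pass.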
📒 Files selected for processing (8)
- core/wren/src/wren/context.py
- core/wren/src/wren/dbt.py
- core/wren/src/wren/memory/schema_indexer.py
- core/wren/src/wren/memory/seed_queries.py
- core/wren/tests/unit/test_context_cli.py
- core/wren/tests/unit/test_dbt.py
- core/wren/tests/unit/test_memory.py
- core/wren/tests/unit/test_seed_queries.py
🚧 Files skipped from review as they are similar to previous changes (6)
- core/wren/src/wren/memory/seed_queries.py
- core/wren/src/wren/context.py
- core/wren/src/wren/memory/schema_indexer.py
- core/wren/tests/unit/test_seed_queries.py
- core/wren/src/wren/dbt.py
- core/wren/tests/unit/test_memory.py
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@evals/spodbtify_ab/run_eval.py`:
- Around line 100-105: During score validation, replace the current set-based
check (scores -> seen_ids) with a check that first builds the list of question
IDs (e.g., ids_list = [entry.get("question_id") for entry in scores]) and
rejects runs that contain duplicates by comparing len(ids_list) to
len(set(ids_list)); if duplicates exist, append an error mentioning the
duplicate ids and run_index, otherwise compare the unique id set to expected_ids
as before. Apply the same change to the other similar blocks referenced (the
sections around lines 107-115 and 122-132) so duplicate question_id entries are
detected and rejected before computing totals (use variables seen_ids, scores,
expected_ids, errors, run_index to locate the exact spots).
- Around line 361-366: The current try/except around
load_json(AGENT_OUTPUT_SCHEMA_PATH) only catches ValueError and will crash with
FileNotFoundError when the optional agent_output.schema.json is missing; update
the block in run_eval.py so that if load_json raises FileNotFoundError you treat
it as "no schema" (set schema_errors = []), and still handle ValueError by
capturing the validation error string (schema_errors = [str(exc)]); reference
the load_json call and AGENT_OUTPUT_SCHEMA_PATH so you modify that exact
try/except and ensure errors = spec_errors + schema_errors continues to work.
- Around line 318-320: Add a new CLI argument named --timeout-seconds, parse it
into a variable (e.g., timeout_seconds) and pass it into the subprocess.run call
that currently invokes subprocess.run(command, shell=True, cwd=str(Path.cwd())).
Update the invocation to include timeout=timeout_seconds and handle
subprocess.TimeoutExpired around that call: on timeout, log/print an error and
exit with a non-zero status (similar to the existing returncode handling).
Ensure you reference the same local variables (command, result) and
import/handle subprocess.TimeoutExpired so the eval run won't hang indefinitely.
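The timeout handling requested in the last comment could look roughly like this. It is a sketch under assumptions: `run_agent_command` and `build_parser` are hypothetical names, the exit status `1` on timeout is an arbitrary non-zero choice, and the rest of `run_eval.py` is not reproduced:

```python
from __future__ import annotations

import argparse
import subprocess
import sys
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # No timeout by default, which keeps the current behavior.
    parser.add_argument("--timeout-seconds", type=float, default=None)
    return parser


def run_agent_command(command: str, timeout_seconds: float | None) -> int:
    """Run the command through the shell and return an exit status."""
    try:
        result = subprocess.run(
            command,
            shell=True,
            cwd=str(Path.cwd()),
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        print(
            f"agent command timed out after {timeout_seconds} seconds",
            file=sys.stderr,
        )
        return 1
    return result.returncode
```

A caller would pass `args.timeout_seconds` from the parsed arguments and hand the returned status to `sys.exit`, mirroring the existing return-code handling.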
📒 Files selected for processing (5)
- evals/spodbtify_ab/.gitignore
- evals/spodbtify_ab/README.md
- evals/spodbtify_ab/agent_output.schema.json
- evals/spodbtify_ab/run_eval.py
- evals/spodbtify_ab/spodbtify_ab_eval.json
✅ Files skipped from review due to trivial changes (3)
- evals/spodbtify_ab/.gitignore
- evals/spodbtify_ab/agent_output.schema.json
- evals/spodbtify_ab/spodbtify_ab_eval.json
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
evals/spodbtify_ab/run_eval.py (1)
92-129: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Guard malformed score files before dereferencing nested entries.
`validate_score_file()` still crashes on malformed JSON instead of reporting validation errors. A non-object `run`, a non-list `scores`, or an unhashable `question_id` will blow up on `run.get(...)`, `entry.get(...)`, or `seen_ids.add(qid)` before this function can return `errors`.
Suggested fix
 def validate_score_file(score_file: dict[str, Any], spec: dict[str, Any]) -> list[str]:
     errors: list[str] = []
     expected_ids = {q["id"] for q in spec["questions"]}
-    for run_index, run in enumerate(score_file.get("runs", []), start=1):
+    runs = score_file.get("runs", [])
+    if not isinstance(runs, list):
+        return ["runs must be a list"]
+
+    for run_index, run in enumerate(runs, start=1):
+        if not isinstance(run, dict):
+            errors.append(f"run {run_index}: run must be an object")
+            continue
+
         workflow = run.get("workflow")
         if workflow not in WORKFLOWS:
             errors.append(f"run {run_index}: unknown workflow {workflow!r}")
         scores = run.get("scores", [])
+        if not isinstance(scores, list):
+            errors.append(f"run {run_index}: scores must be a list")
+            continue
+
         seen_ids: set[Any] = set()
         duplicate_ids: set[Any] = set()
         for entry in scores:
+            if not isinstance(entry, dict):
+                errors.append(f"run {run_index}: each score entry must be an object")
+                continue
             qid = entry.get("question_id")
+            if not isinstance(qid, int):
+                errors.append(
+                    f"run {run_index}: question_id must be an integer, got {qid!r}"
+                )
+                continue
             if qid in seen_ids:
                 duplicate_ids.add(qid)
             seen_ids.add(qid)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@evals/spodbtify_ab/run_eval.py` around lines 92 - 129, validate_score_file currently dereferences nested fields assuming types; this can raise on malformed input (non-dict run, non-list scores, non-dict entry, or unhashable question_id). Update validate_score_file to defensive-check types: ensure each run is a mapping (dict) before calling run.get (append an error and continue if not), ensure scores is a list before iterating (append an error and continue), ensure each score entry is a mapping before accessing entry.get (append an error and continue), and guard seen_ids.add(qid) by checking hashability (or convert qid to a string and record an error for unhashable ids). Keep the existing validations (workflow, duplicates, expected ids, DIMENSIONS) but only run them when the necessary types are valid; reference the validate_score_file function, WORKFLOWS, DIMENSIONS, and the local variables run, scores, entry, qid when making these changes.
📒 Files selected for processing (1)
evals/spodbtify_ab/run_eval.py
@@ -0,0 +1,134 @@
# dbt Integration

Import a dbt project into Wren AI Core to query dbt models through the Wren semantic layer.
Suggested change
- Import a dbt project into Wren AI Core to query dbt models through the Wren semantic layer.
+ Import a dbt project into Wren AI to query dbt models through the Wren semantic layer.
"dbt": {
    "project_dir": dbt_binding_dir,
    "profile": target.profile_name,
    "target": target.target_name,
},
This should also be documented in the doc.
if datasource == "postgres":
    return _filter_none(
        {
            "datasource": "postgres",
            "host": str(_require_output_field(output, "host")),
            "port": str(output.get("port", "5432")),
            "database": str(_require_output_field(output, "dbname", "database")),
            "user": str(_require_output_field(output, "user")),
            "password": str(output["password"]) if output.get("password") else None,
        }
    )
Hard-coding the field names is a big maintenance burden. It's better to use the connection info Pydantic models from wren.model.data_source to validate or build the connection info.
Validation already happens in profile_cli (via datasource.get_connection_info(...)), but switching the builder to a Pydantic model_dump() still has value.
data_type = (
    catalog_col.get("type") or manifest_col.get("data_type") or "VARCHAR"
)
Use wren.type_mapping.parse_type to handle the dialect data type.
We should explain in more detail what we get from the dbt project, how we store it in the Wren project, and why we get it.
Summary
Adds dbt import support to Wren and includes an agent-agnostic Spodbtify A/B eval harness for comparing schema-only and dbt-integrated semantic-layer workflows.
dbt profile import
- `wren profile import dbt`.
- `dbt_project.yml` and dbt `profiles.yml`.
- `env_var()` references.
dbt context import
- `wren context import dbt`.
- `manifest.json`, `catalog.json`, optional `run_results.json`, and compiled SQL artifacts.
- `catalog.json` as the authoritative column list so manifest-only columns are not imported.
Semantic enrichment and memory
- `relationships.yml` instead of stamping FK columns with relationship dereferences.
Agent eval harness
Docs
Test plan
- `python3 evals/spodbtify_ab/run_eval.py validate`
- `python3 -m py_compile evals/spodbtify_ab/run_eval.py`
Summary by CodeRabbit
New Features
- `wren profile import dbt` and `wren context import dbt` for importing dbt targets and generating Wren projects (supports dry-run and force).
Documentation
Tests