feat: cube PyO3 binding + wren cube CLI + memory indexing by goldmedal · Pull Request #2277 · Canner/WrenAI

goldmedal · 2026-05-14T05:03:17Z

Summary

Phase D + E of the cube rollout (impl-cube-cli.md): expose the Rust cube translator to Python, surface it through the wren CLI, and teach the memory indexer about cubes.

Phase D — `wren-core-py` binding (#0674865c)

New cube_query_to_sql(cube_query_json, manifest_json) -> str pyfunction in core/wren-core-py/src/cube.rs. Serde-driven: both inputs are JSON, output is the generated SQL. Bad JSON / unknown cube / cyclic measures all raise ValueError.
8 pytest cases in tests/test_cube.py covering the happy path plus error modes.

Phase E1+E2 — YAML pipeline (#d05a3efe)

core/wren/src/wren/context.py gains load_cubes(), wires cubes into build_manifest() / build_json() (snake → camel), and extends validate_project() with structural cube checks (name uniqueness, baseObject exists, hierarchy levels reference declared dimensions). Deep validation (measure cycles) still happens Rust-side at analyze time.
wren context init now scaffolds an empty cubes/ directory.
8 new unit tests.

Phase E3+E4+E6 — `wren cube` CLI (#25fcfa1c)

New core/wren/src/wren/cube_cli.py with three commands:
- wren cube list
- wren cube describe <name>
- wren cube query — build a CubeQuery from CLI flags (--cube, --measures, --dimensions, --time-dimension name:granularity[:start,end], --filter dim:op[:value], --limit, --offset) or load it from JSON via --from <file|->. --sql-only prints the generated SQL without executing.
Registered via app.add_typer(cube_app) in cli.py.
10 new unit tests.

Phase E5 — Memory indexing (#3b73847c)

extract_schema_items() now emits records for cubes, measures, cube dimensions, and time dimensions so semantic search surfaces them alongside models and views.
describe_schema() gains a cube section printing measures/dimensions/time dimensions/hierarchies.
7 new unit tests covering the new record types and describe_schema output.

Test plan

cargo test --lib -p wren-core cube:: — 33 cube unit tests pass (no regressions)
cd core/wren-core-py && just test-py — 33 pass (8 new + 25 existing)
cd core/wren && uv run pytest tests/unit/ — 167 pass for cube-related modules (test_cube_cli, test_memory, test_context); a single unrelated test_validate_strict_warns continues to fail on this branch and on feat/wasm-cube HEAD, so the regression is pre-existing.
cd core/wren && just lint — clean
cd core/wren-core-py && cargo clippy -- -D warnings — clean
CI on the wren / wren-core-py path-filtered workflows passes

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added cube management commands: wren cube list, describe, and query for working with cubes in your project.
- Cube definitions are now integrated into project structure with automatic validation.
- Ability to convert cube queries to SQL and execute them directly.

PyO3 binding for the Rust cube translator added in PR #2265. The function takes CubeQuery JSON and manifest JSON, returns the generated SQL, and raises ValueError on bad input or translation errors (unknown cube/measure/dimension, cyclic derived measures, …). Includes 8 pytest cases covering the basic happy path plus error modes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

context build now picks up cube definitions from <project>/cubes/*.yml and includes them in the generated manifest (snake_case for build_manifest, camelCase via build_json for the engine). context init scaffolds an empty cubes/ directory next to models/ and views/, and context validate gains structural checks for cube name uniqueness, baseObject reference, and hierarchy levels — deep semantic checks (cycle detection, measure resolution) still happen Rust-side at analyze time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds three commands under `wren cube`: * list — show all cubes and their measures/dimensions * describe NAME — pretty-print the full cube schema (JSON) * query — build a CubeQuery from CLI flags (--cube, --measures, --dimensions, --time-dimension, --filter, --limit, --offset) or load it from JSON via --from <file|->, translate it to SQL through the new wren_core.cube_query_to_sql binding, then run it through WrenEngine. --sql-only prints the generated SQL without executing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

extract_schema_items now emits records for cubes, measures, cube dimensions, and time dimensions so semantic search surfaces them alongside models and views. describe_schema gains a cube section that prints measures (with expressions), dimensions, time dimensions, and hierarchies. Both walks tolerate manifests without a cubes key. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-14T05:03:23Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: bbf673af-ea75-495e-b015-7d433871d723

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

✅ Review completed - (🔄 Check again to review again)

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/cube-cli

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (2)

core/wren/tests/unit/test_context.py (1)

876-889: ⚡ Quick win

Add a regression test for non-string hierarchy levels.

Given the hierarchy validation path, a malformed level entry (e.g., object/list) should be reported as a validation error rather than crashing. A targeted test here would lock that behavior.

Proposed test addition

+def test_validate_cube_hierarchy_level_type(tmp_path):
+    _make_v3_cube_project(tmp_path)
+    cubes_dir = tmp_path / "cubes"
+    cubes_dir.mkdir()
+    (cubes_dir / "bad_hierarchy_type.yml").write_text(
+        "name: order_metrics\n"
+        "base_object: orders\n"
+        "measures: [{name: c, expression: 'COUNT(*)', type: BIGINT}]\n"
+        "dimensions: [{name: status, expression: o_orderstatus, type: VARCHAR}]\n"
+        "hierarchies:\n"
+        "  drill: [status, {bad: level}]\n"
+    )
+    errors = validate_project(tmp_path)
+    assert any("hierarchy levels must be strings" in e.message for e in errors)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/tests/unit/test_context.py` around lines 876 - 889, Add a
regression test that ensures malformed hierarchy levels (non-string types) are
reported as validation errors instead of causing a crash: extend or create a
test similar to test_validate_cube_bad_hierarchy that writes a cube YAML where a
hierarchy level is a non-string value (e.g., a list or dict) and then calls
validate_project (the same function used in the diff) and asserts that the
returned errors include a message referencing the bad level (e.g., check
any("nonexistent_dim" or an appropriate indicator in e.message for e in
errors)); ensure the test targets the hierarchy validation path so
validate_project handles the non-string level gracefully.

core/wren/tests/unit/test_memory.py (1)

205-259: ⚡ Quick win

Add malformed cube-shape coverage for extract/describe.

The current suite is strong on happy paths; adding one malformed-shape case would guard against runtime regressions in cube parsing.

Proposed test addition

+    def test_extract_tolerates_malformed_cube_entries(self):
+        malformed = {
+            **_CUBE_MANIFEST,
+            "cubes": [
+                {
+                    "name": "order_metrics",
+                    "baseObject": "orders",
+                    "measures": ["not-an-object"],
+                    "dimensions": [123],
+                    "timeDimensions": [{"name": "created_at", "expression": "o_orderdate"}],
+                    "hierarchies": {"time_drill": ["created_at", {"bad": "node"}]},
+                }
+            ],
+        }
+        items = extract_schema_items(malformed)
+        assert any(i["item_type"] == "cube" for i in items)
+
+    def test_describe_schema_tolerates_malformed_hierarchy_levels(self):
+        malformed = {
+            **_CUBE_MANIFEST,
+            "cubes": [{**_CUBE_MANIFEST["cubes"][0], "hierarchies": {"time_drill": ["created_at", 1]}}],
+        }
+        text = describe_schema(malformed)
+        assert "### Cube: order_metrics (base: orders)" in text

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren/tests/unit/test_memory.py` around lines 205 - 259, Add a unit test
that supplies a deliberately malformed cube-shaped manifest (e.g., missing
"cubes" entry, a cube with missing "measures" or malformed measure entries, or a
cube with non-dict children) to extract_schema_items and describe_schema and
assert the functions do not raise and either skip the malformed cube or return a
safe/empty representation; specifically create a new test similar to
TestCubeSchemaItems that calls extract_schema_items(_MALFORMED_CUBE_MANIFEST)
and asserts no items with item_type in
{"cube","measure","cube_dimension","time_dimension"} are produced (or that
malformed entries are excluded), and call
describe_schema(_MALFORMED_CUBE_MANIFEST) to assert it returns a string without
throwing (optionally contains an explicit error or notice text if your
implementation emits one). Ensure the test references extract_schema_items and
describe_schema and uses a local _MALFORMED_CUBE_MANIFEST fixture constructed in
the test file.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/wren/src/wren/context.py`:
- Around line 997-1005: The validation error message in context.py uses
camelCase "baseObject" which mismatches the YAML field name; update the
ValidationError invocation that builds the message f"{src_path} > {name}" ...
f"baseObject '{base}' is not a defined model or view" (and the earlier "cube
missing 'baseObject'") to use snake_case "base_object" instead so users see the
YAML field name they edit (refer to the ValidationError calls around the
variables base, name, src_path and all_entity_names).
- Around line 1022-1038: The loop over hierarchies may perform "level not in
known_dims" on non-string values causing TypeError; in the block that iterates
hierarchies (use symbols hierarchies, hname, levels, level, known_dims, errors),
first guard that each level is a string (e.g., isinstance(level, str)); if it is
not, append a validation error to errors describing the invalid hierarchy level
type (include hname and the offending level) and continue, otherwise perform the
membership check and append the existing "unknown dimension" error when
appropriate.

In `@core/wren/src/wren/cube_cli.py`:
- Around line 110-119: The helper _load_cube_query_from currently calls
json.loads on stdin or a file without handling malformed JSON; wrap the
json.loads calls in try/except catching json.JSONDecodeError (from json) and
when caught call typer.echo with a clear "Error: malformed JSON in CubeQuery:
<error msg>" to stderr and raise typer.Exit(1); apply the same pattern to the
other JSON-parsing sites referenced (the other functions that call json.loads at
the same module lines), so all malformed JSON returns a user-friendly error and
exit code instead of a traceback.
- Around line 59-63: The parser that handles filter values in cube_cli.py
currently sets f["value"] = [] when op is "in" or "not_in" (using variables op,
raw, f) which allows empty lists to proceed; change that block so after
computing vals = [v.strip() for v in raw.split(",") if v.strip()] you validate
that vals is non-empty and, if empty, raise a CLI validation error (e.g.,
argparse.ArgumentTypeError or the CLI framework's appropriate error) with a
clear message like "filter '<field>:in:' requires at least one value"; otherwise
set f["value"] = vals.

In `@core/wren/src/wren/memory/schema_indexer.py`:
- Around line 125-175: In _describe_cube, guard against malformed shapes by
checking types before accessing dict keys: confirm cube is a dict, and for
measures/dimensions/timeDimensions only iterate items that are dicts (skip
otherwise); when reading fields (name, expression, type) coerce to str or use
safe defaults instead of assuming .get exists; for hierarchies ensure it's a
dict and that each levels value is a list, and include only stringifiable level
entries when joining (skip non-strings); apply the same defensive checks and
safe string coercions to the other similar describer blocks referenced (lines
208-217, 328-429) so malformed YAML won't raise AttributeError/TypeError.

---

Nitpick comments:
In `@core/wren/tests/unit/test_context.py`:
- Around line 876-889: Add a regression test that ensures malformed hierarchy
levels (non-string types) are reported as validation errors instead of causing a
crash: extend or create a test similar to test_validate_cube_bad_hierarchy that
writes a cube YAML where a hierarchy level is a non-string value (e.g., a list
or dict) and then calls validate_project (the same function used in the diff)
and asserts that the returned errors include a message referencing the bad level
(e.g., check any("nonexistent_dim" or an appropriate indicator in e.message for
e in errors)); ensure the test targets the hierarchy validation path so
validate_project handles the non-string level gracefully.

In `@core/wren/tests/unit/test_memory.py`:
- Around line 205-259: Add a unit test that supplies a deliberately malformed
cube-shaped manifest (e.g., missing "cubes" entry, a cube with missing
"measures" or malformed measure entries, or a cube with non-dict children) to
extract_schema_items and describe_schema and assert the functions do not raise
and either skip the malformed cube or return a safe/empty representation;
specifically create a new test similar to TestCubeSchemaItems that calls
extract_schema_items(_MALFORMED_CUBE_MANIFEST) and asserts no items with
item_type in {"cube","measure","cube_dimension","time_dimension"} are produced
(or that malformed entries are excluded), and call
describe_schema(_MALFORMED_CUBE_MANIFEST) to assert it returns a string without
throwing (optionally contains an explicit error or notice text if your
implementation emits one). Ensure the test references extract_schema_items and
describe_schema and uses a local _MALFORMED_CUBE_MANIFEST fixture constructed in
the test file.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: eb3df530-d0ad-47d5-a523-f8d6794f4ce5

📥 Commits

Reviewing files that changed from the base of the PR and between 2ace7cc and 3b73847.

📒 Files selected for processing (11)

core/wren-core-py/src/cube.rs
core/wren-core-py/src/lib.rs
core/wren-core-py/tests/test_cube.py
core/wren/src/wren/cli.py
core/wren/src/wren/context.py
core/wren/src/wren/context_cli.py
core/wren/src/wren/cube_cli.py
core/wren/src/wren/memory/schema_indexer.py
core/wren/tests/unit/test_context.py
core/wren/tests/unit/test_cube_cli.py
core/wren/tests/unit/test_memory.py

* `context validate` now uses the YAML field name (`base_object`) in error messages so users edit the same identifier they see. * Hierarchy-level type check rejects non-string entries up front, avoiding a TypeError when nested lists / null leak through YAML. * `wren cube query --filter dim:in:` (empty value list) is now rejected with a CLI BadParameter instead of producing an empty `IN ()` clause. * `--from <file|->` and MDL JSON loading wrap json.loads in try/except so malformed input surfaces as `Error: ...` + exit 1 rather than a raw traceback. The JSON must also be an object. * Memory schema indexer (`extract_schema_items` / `describe_schema`) skips non-dict cube/measure/dimension entries and non-string hierarchy levels instead of crashing — keeps memory rebuilds resilient. Adds regression tests for each of the five paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

goldmedal and others added 4 commits May 14, 2026 12:16

github-actions Bot added python Pull requests that update Python code core labels May 14, 2026

coderabbitai Bot reviewed May 14, 2026

View reviewed changes

Comment thread core/wren/src/wren/context.py Outdated

Comment thread core/wren/src/wren/context.py

Comment thread core/wren/src/wren/cube_cli.py

Comment thread core/wren/src/wren/cube_cli.py

Comment thread core/wren/src/wren/memory/schema_indexer.py

goldmedal merged commit 889dd3f into feat/wasm-cube May 14, 2026
11 checks passed

goldmedal deleted the feat/cube-cli branch May 14, 2026 06:25

coderabbitai Bot mentioned this pull request May 14, 2026

feat(wren-core-wasm): cubeQuery + listCubes browser API #2278

Merged

6 tasks

This was referenced May 14, 2026

docs(wren): cube skills and user-facing docs for phase G #2280

Merged

docs(mdl-schema): align with Cube types (PR #1574 follow-up) #2281

Merged

feat(wasm): full Cube support — validate, translate, PyO3, CLI, WASM, docs #2282

Merged

coderabbitai Bot mentioned this pull request May 15, 2026

feat(wren): add dbt import workflow #2279

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: cube PyO3 binding + wren cube CLI + memory indexing#2277

feat: cube PyO3 binding + wren cube CLI + memory indexing#2277
goldmedal merged 5 commits into
feat/wasm-cubefrom
feat/cube-cli

goldmedal commented May 14, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 14, 2026 •

edited

Loading

Review skipped

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

goldmedal commented May 14, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Phase D — wren-core-py binding (#0674865c)

Phase E1+E2 — YAML pipeline (#d05a3efe)

Phase E3+E4+E6 — wren cube CLI (#25fcfa1c)

Phase E5 — Memory indexing (#3b73847c)

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

goldmedal commented May 14, 2026 •

edited by coderabbitai Bot

Loading

Phase D — `wren-core-py` binding (#0674865c)

Phase E3+E4+E6 — `wren cube` CLI (#25fcfa1c)

coderabbitai Bot commented May 14, 2026 •

edited

Loading