feat(wren-core): add cube_query_to_sql translator #2265

Merged: goldmedal merged 2 commits into feat/wasm-cube from feat/cube-sql on May 14, 2026

goldmedal (Collaborator) commented May 14, 2026

Summary

  • Port the wren-engine-saas Python translator (ibis-server/app/mdl/cube.py) to Rust so the wren CLI (via PyO3) and the upcoming WASM build share one implementation.
  • New module core/wren-core/core/src/mdl/cube.rs exposes CubeQuery, cube_query_to_sql(&CubeQuery, &Manifest) -> Result<String>, plus the supporting TimeDimensionFilter / Granularity / CubeFilter / FilterOperator / FilterValue types with camelCase / snake_case serde to match saas JSON.
  • Derived-measure inlining mirrors saas: word-boundary regex replacement, longest dependency name first, topological resolution.
  • dateRange JSON shape ["start","end"] is accepted via a custom deserializer that rejects lists of length ≠ 2.
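
The derived-measure inlining described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code in cube.rs: it assumes measures are modeled as a plain name → expression map, it swaps the word-boundary regex for a hand-rolled identifier scanner (so the longest-name-first ordering is not needed here), and it wraps each inlined expression in parentheses. The names `chunks` and `resolve_measures` and the error text are hypothetical.

```rust
use std::collections::HashMap;

/// Characters that may appear inside an identifier (assumption: ASCII only).
fn is_ident_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Split `expr` into alternating identifier / non-identifier chunks, so that
/// a measure name only ever matches a whole word (never a substring).
/// Note: like a plain regex pass, this does not skip SQL string literals.
fn chunks(expr: &str) -> Vec<&str> {
    let mut out = Vec::new();
    let mut start = 0;
    let mut prev: Option<bool> = None;
    for (i, c) in expr.char_indices() {
        let kind = is_ident_char(c);
        if prev.is_some() && prev != Some(kind) {
            out.push(&expr[start..i]);
            start = i;
        }
        prev = Some(kind);
    }
    if start < expr.len() {
        out.push(&expr[start..]);
    }
    out
}

/// Resolve every measure to an expression with all measure references inlined.
/// A measure is resolved once all measures it references are resolved; if a
/// full pass makes no progress, the remaining measures form a cycle
/// (a measure that references itself is reported the same way).
fn resolve_measures(
    measures: &HashMap<String, String>,
) -> Result<HashMap<String, String>, String> {
    let mut resolved: HashMap<String, String> = HashMap::new();
    let mut remaining: Vec<&String> = measures.keys().collect();
    while !remaining.is_empty() {
        let before = remaining.len();
        let mut still = Vec::new();
        for name in remaining {
            let parts = chunks(&measures[name]);
            let ready = parts
                .iter()
                .all(|c| !measures.contains_key(*c) || resolved.contains_key(*c));
            if ready {
                let inlined: String = parts
                    .into_iter()
                    .map(|c| match resolved.get(c) {
                        // Parenthesize to preserve operator precedence.
                        Some(r) => format!("({r})"),
                        None => c.to_string(),
                    })
                    .collect();
                resolved.insert(name.clone(), inlined);
            } else {
                still.push(name);
            }
        }
        if still.len() == before {
            return Err(format!("possible cycle among measures: {still:?}"));
        }
        remaining = still;
    }
    Ok(resolved)
}
```

With `revenue = SUM(price)`, `cost = SUM(unit_cost)` and `profit = revenue - cost`, the sketch resolves `profit` to `(SUM(price)) - (SUM(unit_cost))`, while a column such as `revenue_net` is left untouched because it is a different whole word.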

Test plan

  • RUST_MIN_STACK=8388608 cargo test --lib --tests --bins — 115 tests pass (31 new + 84 prior)
  • cargo clippy --all-targets --all-features -- -D warnings — clean
  • cargo fmt --all -- --check — clean
  • 31 unit tests transcribed from saas tests/mdl/test_cube.py covering SELECT/FROM, time dimensions, derived-measure inlining, filters, validation errors, JSON deserialization, and quote_value
  • Verify CI passes on the wren-core path-filtered workflow

Notes

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added cube query functionality enabling selection of measures, dimensions, and time dimensions
    • Introduced date range filtering for time-based queries with start (inclusive) and end (exclusive) boundaries
    • Supported multiple filter operators including equality, comparisons, pattern matching, null checks, and list-based filters
    • Added temporal granularity options (year, quarter, month, week, day, hour, minute)
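
To make the half-open date range concrete, here is a small sketch of how a two-element `dateRange` could become a SQL predicate with an inclusive start and an exclusive end. It is an illustration under stated assumptions, not the translator's actual output: `quote_literal` and `date_range_predicate` are hypothetical names, the column is inserted unquoted, and the real module may cast or quote values differently.

```rust
/// Hypothetical helper: render a string as a SQL literal by doubling
/// embedded single quotes.
fn quote_literal(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

/// Build a half-open range predicate: `start` is inclusive, `end` is exclusive.
/// Any slice whose length is not exactly 2 is rejected, mirroring the
/// validation the PR describes for the `["start","end"]` JSON shape.
fn date_range_predicate(column: &str, range: &[String]) -> Result<String, String> {
    match range {
        [start, end] => Ok(format!(
            "{column} >= {} AND {column} < {}",
            quote_literal(start),
            quote_literal(end)
        )),
        other => Err(format!(
            "dateRange must contain exactly 2 elements, got {}",
            other.len()
        )),
    }
}
```

A half-open interval lets adjacent ranges (for example consecutive months) tile without double-counting rows that fall exactly on a boundary.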

Review Change Stack

Port the wren-engine-saas Python cube SQL translator (ibis-server
app/mdl/cube.py) to Rust so the CLI (via PyO3) and the upcoming WASM
build share one implementation. The new module accepts a CubeQuery
(JSON/serde) and emits SELECT … GROUP BY SQL that references the
cube's baseObject; wren-core MDL analysis then resolves the underlying
model or view as usual.

Includes 31 unit tests transcribed from saas tests/mdl/test_cube.py
covering SELECT/FROM, time dimensions, derived-measure inlining,
filters, validation errors, JSON deserialization, and quote_value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
coderabbitai Bot (Contributor) commented May 14, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b4ca2438-9681-4c38-950e-55cb2e751eca

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

github-actions Bot added the rust and core labels May 14, 2026
coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
core/wren-core/core/src/mdl/mod.rs (2)

15-15: 💤 Low value

Consider placing re-exports after module declarations.

The pub use cube::{...} appears before pub mod cube; (line 42). While Rust allows forward references, the conventional pattern is to declare modules before re-exporting from them for better readability.

♻️ Suggested reordering

Move this line to after line 42 (the pub mod cube; declaration) to follow the pattern used elsewhere in the file where re-exports follow their module declarations.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren-core/core/src/mdl/mod.rs` at line 15, Move the re-export statement
pub use cube::{cube_query_to_sql, CubeQuery}; so it appears after the module
declaration pub mod cube; (i.e., place the pub use below the pub mod cube; line)
to follow the file's convention of declaring modules before re-exporting their
items.

42-42: ⚡ Quick win

Clarify intended API surface for the cube module.

The cube module exposes 8 public items (CubeQuery, TimeDimensionFilter, Granularity, CubeFilter, FilterOperator, FilterValue, cube_query_to_sql, quote_value), but only 2 are re-exported at the crate level (cube_query_to_sql and CubeQuery). This creates inconsistent API visibility.

Either:

  1. Make the module pub(crate) mod cube; and rely solely on re-exports (aligning with the dataset pattern):
-pub mod cube;
+pub(crate) mod cube;
  2. Or accept that all 8 items are part of the public API and potentially add missing re-exports if needed.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren-core/core/src/mdl/mod.rs` at line 42, The cube module currently
exposes many public items but only re-exports CubeQuery and cube_query_to_sql at
crate level; pick one approach: either change the module declaration to
pub(crate) mod cube so only the intended crate-level API (CubeQuery and
cube_query_to_sql) is visible, or keep pub mod cube and add explicit crate-level
re-exports for the other public types/functions (pub use
crate::mdl::cube::{TimeDimensionFilter, Granularity, CubeFilter, FilterOperator,
FilterValue, quote_value, ...}) so the public API is consistent; update the mod
declaration or add the missing pub use lines accordingly and ensure CubeQuery
and cube_query_to_sql remain re-exported.
core/wren-core/core/src/mdl/cube.rs (2)

207-254: ⚡ Quick win

Resolve only the measures the query needs.

resolve_measures walks the entire measure_map (every measure defined on the cube). A single broken or cyclic derived measure that the current query does not reference will fail every query against the cube with a "possible cycle" error, even though the offending measure is unused. Restrict resolution to the transitive closure of query.measures to make failures local to what the query actually requests.

♻️ Sketch
fn resolve_measures(
    requested: &[String],
    measure_map: &HashMap<&str, &Measure>,
) -> Result<HashMap<String, String>> {
    // Seed `remaining` with the transitive closure of `requested`
    // by BFS over find_measure_refs, then run the existing fixpoint loop.
}

Then pass &query.measures from cube_query_to_sql.
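
The closure step in the sketch above could look roughly like this. It is a std-only illustration rather than the PR's code: measures are modeled as a name → expression map instead of `HashMap<&str, &Measure>`, `find_refs` is a hypothetical stand-in for `find_measure_refs` that matches whole identifiers without a regex, and `needed_measures` is a hypothetical name for the function that would seed `remaining`.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical stand-in for `find_measure_refs`: names of measures that
/// appear as whole identifiers in `expr`.
fn find_refs<'a>(expr: &str, measures: &'a HashMap<String, String>) -> Vec<&'a str> {
    expr.split(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .filter_map(|word| measures.get_key_value(word).map(|(k, _)| k.as_str()))
        .collect()
}

/// Breadth-first walk from the requested measures over their references,
/// returning only the measures the query actually depends on. Measures that
/// are broken or cyclic but unreachable from `requested` are never visited.
fn needed_measures(
    requested: &[String],
    measures: &HashMap<String, String>,
) -> Result<HashSet<String>, String> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<&str> = requested.iter().map(String::as_str).collect();
    while let Some(name) = queue.pop_front() {
        // `insert` returns false when the name was already visited, which
        // also keeps the walk finite if the reachable measures form a cycle.
        if !seen.insert(name.to_string()) {
            continue;
        }
        let expr = measures
            .get(name)
            .ok_or_else(|| format!("unknown measure: {name}"))?;
        for dep in find_refs(expr, measures) {
            if !seen.contains(dep) {
                queue.push_back(dep);
            }
        }
    }
    Ok(seen)
}
```

The returned set would then seed the existing fixpoint loop, so a cycle is still reported when the query can reach it, but not when it lives in an unrelated part of the cube.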

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren-core/core/src/mdl/cube.rs` around lines 207 - 254, The current
resolve_measures(measure_map: &HashMap<&str,&Measure>) traverses every measure
and fails on unused cycles; change its signature to accept the requested
measures (e.g. resolve_measures(requested: &[String], measure_map:
&HashMap<&str,&Measure>) ) and compute the transitive closure of requested by
BFS using find_measure_refs to collect only measures the query actually needs,
seed remaining with that closure, then run the existing fixpoint substitution
loop exactly as before on that restricted set; finally update the call site in
cube_query_to_sql to pass &query.measures (or an appropriate slice of measure
names) into resolve_measures.

256-272: 💤 Low value

Recompiling regex per measure on every call.

find_measure_refs compiles a fresh Regex for each candidate name on every invocation, and the function itself is called once per remaining measure in each iteration of resolve_measures's fixpoint loop. For typical cube sizes this is negligible, but caching the compiled patterns (e.g. build a Vec<(name, Regex)> once in resolve_measures and reuse it) is a straightforward win if cubes grow.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/wren-core/core/src/mdl/cube.rs` around lines 256 - 272,
find_measure_refs currently recompiles a Regex for each candidate name on every
call; instead, precompile the patterns once in resolve_measures and reuse them:
build a Vec<(name: &str, regex: Regex)> (or HashMap<&str, Regex>) in
resolve_measures before the fixpoint loop, then change find_measure_refs (or add
a helper) to accept the precompiled collection and match using the stored
Regexes rather than calling Regex::new each time; update calls from
resolve_measures to pass the precompiled patterns so regex compilation is done
only once per measure.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/wren-core/core/src/mdl/cube.rs`:
- Around line 222-236: The replacement string passed to Regex::replace_all in
the loop (using variables resolved_expr, sorted_deps, dep, replacement and re)
must be wrapped with regex::NoExpand to avoid `$`-style template expansion
corrupting SQL (e.g., `$1`, `$$tag$$`); change the call from
re.replace_all(&resolved_expr, replacement.as_str()) to using
re.replace_all(&resolved_expr, regex::NoExpand(replacement.as_str())) (adjust
imports if needed) so replacements are treated literally and then call
.into_owned() as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 46d60ebd-85c7-4178-a332-b89c8b24e87c

📥 Commits

Reviewing files that changed from the base of the PR and between 9baada5 and 0f1e1d2.

📒 Files selected for processing (2)
  • core/wren-core/core/src/mdl/cube.rs
  • core/wren-core/core/src/mdl/mod.rs

Comment thread core/wren-core/core/src/mdl/cube.rs
Wrap the Regex::replace_all replacement in regex::NoExpand so that
strings like `$1` (Postgres parameter placeholders) and `$$tag$$`
(dollar-quoted literals) survive measure inlining instead of being
silently consumed as capture-group templates. Adds a regression test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>