
Add semantic-model subsystem (sema) with scope resolution#243

Merged
leynos merged 11 commits into main from implement-symbol-table-scope-resolution-1frlbh
Mar 20, 2026

Conversation

@leynos
Owner

@leynos commented Mar 13, 2026

Summary

  • Introduced a semantic-model subsystem under src/sema providing owned symbol tables and scope resolution to enable rule analysis without per-rule CST traversal.
  • Shared per-run semantic-models are computed once per lint run and exposed to rules via Arc, enabling efficient reasoning about declarations, bindings, and uses.
  • Expanded the lint integration: RuleCtx now carries an Arc<SemanticModel>; Runner builds a single model and passes it to all rules.

Changes

  • New modules under src/sema:
    • sema/build.rs: Public semantic-model construction entrypoints.
    • sema/builder.rs: Owned semantic-model builder state and collection passes.
    • sema/model.rs: Owned semantic-model data structures (Scope, Symbol, UseSite, etc.).
    • sema/mod.rs: Public exports and top-level semantic-analysis API.
    • sema/resolve.rs: Resolution helpers and name lookup logic for the semantic model.
    • sema/traverse.rs: Traversal utilities for AST/rule surfaces to collect semantic data.
    • sema/variables.rs: Variable-use traversal for semantic-model collection.
  • Public API and integration:
    • src/sema/build.rs exposes build, build_from_root, and build_from_parts.
    • src/lib.rs publicly exports sema.
    • src/linter/rule.rs updated to hold an Arc<SemanticModel> in RuleCtx and to construct/use the semantic model via sema::build_from_root / sema::build.
    • src/linter/runner.rs updated to compute and share a single SemanticModel per lint run, passing it into RuleCtx via with_semantic_model.
  • Tests:
    • src/sema/tests.rs: Unit tests for semantic-model construction and predicates.
    • tests/semantic_scope_resolution.rs: Behavioural tests covering scope resolution, rule-local bindings, and use-site resolution.
  • Documentation:
    • docs/ddlint-design.md updated to reflect semantic-model contract.
    • docs/parser-implementation-notes.md updated to describe the sema module integration.
    • docs/roadmap.md updated to reference semantic-model documentation.
    • Added docs/execplans/3-3-1-symbol-table-and-scope-resolution.md describing the ExecPlan for the semantic analysis milestone.
  • Build and docs integration:
    • Added/modified files across sema modules and linter integration to support the new model and APIs.

Rationale

  • Establishes an owned, thread-safe semantic model reused across lint-rule evaluations, enabling rules to reason about declarations, bindings, and uses efficiently and deterministically.

Validation plan

  • Run fmt, lint, test, and behavioural tests as per repository pipelines.
  • Verify unit tests and behavioural tests cover ordered scope visibility and name-use resolution.

Milestones

  • 3.3.x: Introduce semantic-model skeleton and public API.
  • 3.3.x: Implement declaration collection, scope creation, and rule-local bindings.
  • 3.3.x: Implement name-use recording.
  • 3.3.x: Integrate with RuleCtx and Runner.

📎 Task: https://www.devboxer.com/task/10e9fe05-916b-443b-a6d4-1f514314c9e9

📎 Task: https://www.devboxer.com/task/ad0b1c25-dc5a-4143-bc44-1427fd95bf12

@sourcery-ai
Contributor

sourcery-ai Bot commented Mar 13, 2026

Reviewer's Guide

Introduces a new owned, thread-safe semantic-model subsystem under src/sema and wires it into RuleCtx and Runner so lint rules share a single SemanticModel per run, while adding resolution logic, traversal helpers, tests, and documentation to describe the model’s contract.

Sequence diagram for Runner, SemanticModel, and RuleCtx during a lint run

sequenceDiagram
  actor Developer
  participant Parser as Parser_parse
  participant Runner as Linter_Runner
  participant Sema as sema_build
  participant Model as SemanticModel
  participant RuleCtx as RuleCtx
  participant Rule as CstRule

  Developer->>Parser: parse(source_text) Parsed
  Parser-->>Developer: Parsed

  Developer->>Runner: Runner_new(store, parsed, config)
  activate Runner
  Runner->>Sema: build(parsed)
  activate Sema
  Sema-->>Runner: SemanticModel
  deactivate Sema
  Runner->>Runner: semantic_model = Arc_SemanticModel
  Runner-->>Developer: Runner
  deactivate Runner

  Developer->>Runner: run()
  activate Runner
  Runner->>Runner: rules = store.all_rules()
  loop for_each_rule_in_parallel
    Runner->>Runner: run_single_rule(rule)
    activate Rule
    Runner->>Runner: Root_from_green(green)
    Runner->>RuleCtx: with_semantic_model(source_text, root, Arc_SemanticModel, config)
    RuleCtx-->>Runner: RuleCtx

    Runner->>Root: syntax().descendants_with_tokens()
    loop for_each_matching_node_or_token
      alt Node
        Runner->>Rule: check_node(node, RuleCtx, diagnostics)
      else Token
        Runner->>Rule: check_token(token, RuleCtx, diagnostics)
      end
    end
    deactivate Rule
  end
  Runner-->>Developer: Vec_LintDiagnostic
  deactivate Runner

Class diagram for the new semantic-model core types

classDiagram
  class SemanticModel {
    +ScopeId program_scope()
    +Scope[] scopes()
    +Symbol[] symbols()
    +UseSite[] uses()
    +Scope scope(ScopeId)
    +Symbol symbol(SymbolId)
    +Symbol resolved_symbol(UseSite)
  }

  class Scope {
    +ScopeKind kind()
    +Option_ScopeId parent()
    +ScopeOrigin origin()
    +Span span()
  }

  class Symbol {
    +String name()
    +DeclarationKind kind()
    +SymbolOrigin origin()
    +ScopeId scope()
    +Span span()
    +usize source_order()
    +usize visible_from_rule_order()
  }

  class UseSite {
    +String name()
    +UseKind kind()
    +ScopeId scope()
    +Span span()
    +usize source_order()
    +Resolution resolution()
  }

  class ScopeId {
    +usize value
  }

  class SymbolId {
    +usize value
  }

  class ScopeKind {
    <<enumeration>>
    Program
    Rule
    ForLoop
  }

  class ScopeOrigin {
    <<enumeration>>
    Program
    AstRule
    SemanticRule
    ForLoop
  }

  class DeclarationKind {
    <<enumeration>>
    Relation
    Function
    Type
    RuleBinding
  }

  class SymbolOrigin {
    <<enumeration>>
    RelationDeclaration
    FunctionDeclaration
    TypeDeclaration
    RuleHead
    SemanticRuleHead
    AssignmentPattern
    ForPattern
  }

  class UseKind {
    <<enumeration>>
    Relation
    Variable
  }

  class Resolution {
    <<enumeration>>
    Resolved
    Unresolved
    Ignored
  }

  class SemanticModelBuilder {
    -Scope[] scopes
    -Symbol[] symbols
    -UseSite[] uses
    -ScopeId program_scope
    +new(SyntaxNode) SemanticModelBuilder
    +collect_program_declarations(Root) void
    +collect_ast_rules(Root) void
    +collect_semantic_rules(SemanticRule[]) void
    +collect_rule_heads(RuleHeadContext, Rule) void
    +collect_rule_term(VariableUseContext, RuleBodyTerm) void
    +collect_expression_term(Expr, VariableUseContext) void
    +finish() SemanticModel
    +declare_symbol(SymbolSpec) SymbolId
    +new_scope(ScopeSpec) ScopeId
    +walk_variable_uses(Expr, VariableUseContext) void
    +resolve_name(VariableUseContext, UseKind, String) Resolution
  }

  class RuleHeadContext {
    +ScopeId scope
    +Span span
    +SymbolOrigin origin
  }

  class VariableUseContext {
    +ScopeId current_scope()
    +usize literal_index()
    +Span span()
    +usize rule_order_limit()
  }

  SemanticModel --> Scope : contains
  SemanticModel --> Symbol : contains
  SemanticModel --> UseSite : contains

  Scope --> ScopeKind : kind
  Scope --> ScopeOrigin : origin
  Scope --> "0..1" Scope : parent

  Symbol --> ScopeId : scope
  Symbol --> DeclarationKind : kind
  Symbol --> SymbolOrigin : origin

  UseSite --> ScopeId : scope
  UseSite --> UseKind : kind
  UseSite --> Resolution : resolution

  Resolution --> SymbolId : Resolved

  SemanticModelBuilder --> SemanticModel : builds
  SemanticModelBuilder --> Scope : manages
  SemanticModelBuilder --> Symbol : manages
  SemanticModelBuilder --> UseSite : manages
  SemanticModelBuilder --> RuleHeadContext : uses
  SemanticModelBuilder --> VariableUseContext : uses

Flow diagram for parse, semantic build, and lint rule execution

flowchart TD
  A[Parse_source_text] --> B[Parsed]
  B --> C[sema_build_from_parts root, semantic_rules]
  C --> D[SemanticModel]

  B --> E[Runner_new store, parsed, config]
  D --> E

  E --> F[Runner_run]
  F --> G[Compute_Root_from_green]
  F --> H[Create_RuleCtx_with_semantic_model]
  H --> I[Parallel_rule_iteration]

  I --> J[rule_check_node_or_token with RuleCtx]
  J --> K[Diagnostics_collected]
  K --> L[Sorted_diagnostics_returned]

File-Level Changes

Change Details Files
Add owned semantic-model data structures and public build APIs for semantic analysis.
  • Define SemanticModel, scopes, symbols, and use-site records with opaque IDs and query helpers.
  • Provide public builders build, build_from_root, and build_from_parts that drive a SemanticModelBuilder over AST roots and parse-time SemanticRule values.
  • Ensure the model is Send + Sync by storing only owned data and Span-based provenance.
src/sema/mod.rs
src/sema/model.rs
src/sema/build.rs
src/sema/builder.rs
Implement semantic collection passes: declaration collection, rule/for-loop scopes, variable and relation uses, and ordered name resolution.
  • Collect program-scope declarations for relations, functions, and types from the AST root into the SemanticModelBuilder.
  • Traverse rule heads, body terms, assignments, aggregations, and for-loops to record rule-local bindings, nested scopes, and variable/relation uses with literal-order metadata.
  • Resolve uses against visible symbols with tri-state Resolution (Resolved/Unresolved/Ignored), including special handling for wildcards and for-loop scope boundaries.
src/sema/resolve.rs
src/sema/traverse.rs
src/sema/variables.rs
src/sema/builder.rs
Integrate the semantic model into RuleCtx and Runner so rules receive shared semantic context.
  • Extend RuleCtx to hold Arc<SemanticModel>, add a semantic_model accessor, and introduce with_semantic_model plus updated new/from_parsed constructors that call sema::build or sema::build_from_root.
  • Update Runner to compute a single Arc<SemanticModel> in Runner::new via crate::sema::build(parsed) and to call an instance method run_single_rule that constructs RuleCtx with with_semantic_model.
  • Add a semantic-aware rule test that exercises ctx.semantic_model() through Runner and asserts diagnostics based on unresolved variable uses.
src/linter/rule.rs
src/linter/runner.rs
tests/semantic_scope_resolution.rs
src/linter/rule/tests.rs
Document the semantic-model design, invariants, and roadmap linkage, including an execution plan.
  • Add a new semantic-analysis design section describing SemanticModel, scope shapes, resolution modes, and RuleCtx integration.
  • Document current semantic-analysis invariants and ddlog-sema precursor status in parser-implementation-notes.
  • Update roadmap item 3.3.1 references to point at the new design section and add a detailed execplan for symbol-table and scope-resolution passes.
docs/ddlint-design.md
docs/parser-implementation-notes.md
docs/roadmap.md
docs/execplans/3-3-1-symbol-table-and-scope-resolution.md
Add unit and behavioural tests validating semantic-model construction and RuleCtx exposure.
  • Create src/sema/tests.rs with unit tests for program-scope declarations, head vs assignment bindings, wildcard handling, for-loop scoping, and inclusion of parse-time SemanticRule values.
  • Add tests/semantic_scope_resolution.rs to validate end-to-end declarations and resolutions, unresolved-name preservation, and Runner-to-rule semantic context.
  • Expand RuleCtx tests to assert that semantic_model() is populated and has expected program scope and symbol count, and expose RuleBodyExpression::span for span-based semantics.
src/sema/tests.rs
tests/semantic_scope_resolution.rs
src/linter/rule/tests.rs
src/parser/ast/rule.rs


@coderabbitai
Contributor

coderabbitai Bot commented Mar 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ec479720-d5c8-4632-9590-eeca27cd0e09

📥 Commits

Reviewing files that changed from the base of the PR and between 872905f and 7e3c8c1.

📒 Files selected for processing (2)
  • src/linter/rule/tests.rs
  • src/parser/top_level_for/lowering.rs

Summary by CodeRabbit

  • New Features

    • Semantic model built once per lint run and exposed to the runner and lint rules; tri-state name resolution (Resolved / Unresolved / Ignored); scoped symbol visibility (program, per-rule, for-loop) with ordered binding visibility and non‑leaking for-loop bindings; top-level for-desugaring participates in semantic analysis.
  • Documentation

    • Design, execplan, roadmap and implementation notes updated to describe model shape, semantics and testing strategy.
  • Tests

    • New unit and end‑to‑end tests plus runner integration covering model construction, resolution, scope visibility, assignment timing and top‑level for diagnostics.

Walkthrough

Introduce an owned, thread‑safe SemanticModel under src/sema/, build it once per Runner invocation, share it as Arc<SemanticModel> via RuleCtx, implement deterministic scope/symbol collection and tri‑state name resolution (Resolved/Unresolved/Ignored) across program, rule and for‑loop scopes, add traversals, tests, parser lowering and documentation.

Changes

Cohort / File(s) Summary
Semantic model core
src/sema/model.rs, src/sema/mod.rs, src/sema/build.rs, src/sema/builder.rs
Add immutable SemanticModel types, opaque IDs (ScopeId, SymbolId), provenance and Resolution enums, builder primitives, and public constructors (build, build_from_root, build_from_parts).
Resolution & binding logic
src/sema/resolve.rs, src/sema/traverse.rs, src/sema/variables.rs
Implement tri‑state resolution (Resolved/Unresolved/Ignored), head/assignment/for pattern binding collection, scope visibility rules, relation extraction, and variable‑use recording across expressions and rule terms.
Builder public API examples
src/sema/build.rs
Add documented rustdoc examples for the build* entry points.
Runner & RuleCtx integration
src/linter/runner.rs, src/linter/rule.rs, src/lib.rs
Build SemanticModel once in Runner::new, store as Arc<SemanticModel>, adapt runner to pass the shared model into RuleCtx::with_semantic_model; expose sema module from crate root.
Tests — unit & integration
src/sema/tests.rs, tests/semantic_scope_resolution.rs, src/linter/rule/tests.rs
Add unit and integration tests covering declaration collection, head/assignment/for scoping and resolution semantics; add runner test verifying rules can access the semantic model and inspect unresolved uses.
Parser surface & lowering
src/parser/ast/rule.rs, src/parser/ast/semantic_rule.rs, src/parser/top_level_for.rs, src/parser/top_level_for/lowering.rs
Make RuleBodyExpression::span() public; change SemanticRule constructor to accept SemanticRuleSpec including patterns; move top‑level for lowering into lowering.rs that emits SemanticRule values and new diagnostics.
Top‑level adjustments & re‑exports
src/lib.rs, src/parser/ast/mod.rs, src/parser/top_level_for.rs
Export sema from crate root; re‑export SemanticRuleSpec; re‑export lowered lower_top_level_for; restrict some diagnostic constants to test visibility.
Docs & roadmap
docs/ddlint-design.md, docs/execplans/3-3-1-symbol-table-and-scope-resolution.md, docs/parser-implementation-notes.md, docs/roadmap.md
Document SemanticModel design, execplan 3.3.1 deliverables, parser invariants and mark roadmap tasks completed.
Tests — parser/top_level_for update
src/parser/tests/rules/top_level_for.rs, tests/top_level_for_desugaring.rs
Adjust tests to the new head/body diagnostic constants and add a new head‑specific diagnostic test.

Sequence Diagram(s)

sequenceDiagram
    actor Runner
    participant Semabuild as SemanticBuild
    participant SemanticModel as SemanticModel
    participant RuleCtx
    participant Rule as LintRule

    Runner->>Semabuild: call crate::sema::build(parsed)
    Semabuild->>SemanticModel: construct owned model
    SemanticModel-->>Runner: return Arc<SemanticModel>

    loop For each rule
        Runner->>RuleCtx: RuleCtx::with_semantic_model(..., Arc clone, ...)
        Note right of RuleCtx: holds Arc<SemanticModel>
        Rule->>RuleCtx: call semantic_model()
        RuleCtx-->>Rule: &SemanticModel
        Rule->>SemanticModel: query scopes / symbols / uses / resolution
        SemanticModel-->>Rule: semantic facts
        Rule-->>Runner: emit diagnostics
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

Build the model once, wrap it tight in Arc,
Lay out scopes from program down to for‑loop heart,
Resolve or mark unknown, ignore the underscore,
Linters probe the facts and point where names fall short,
Ship the map of symbols — let the rules take part.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title clearly and concisely summarises the main change: introduction of a semantic-model subsystem (sema) with scope resolution as the primary deliverable.
  • Description check: ✅ Passed. The description is comprehensive and directly related to the changeset, covering the semantic-model subsystem, new modules, integration points, tests, and documentation updates.
  • Docstring coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.


codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos changed the title from "Plan: symbol-table and scope-resolution (3.3.1)" to "Implement symbol-table and scope-resolution (3.3.1)" on Mar 13, 2026
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Mar 17, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response.

Excess Number of Function Arguments

src/sema/variables.rs: SemanticModelBuilder.walk_for_loop_uses
src/sema/traverse.rs: SemanticModelBuilder.record_top_level_relation_use
src/sema/traverse.rs: SemanticModelBuilder.collect_for_loop_term
src/sema/traverse.rs: SemanticModelBuilder.collect_aggregation_term
src/sema/traverse.rs: SemanticModelBuilder.collect_assignment_term
src/sema/traverse.rs: SemanticModelBuilder.collect_expression_term
src/sema/traverse.rs: SemanticModelBuilder.collect_rule_term
src/sema/traverse.rs: SemanticModelBuilder.collect_head_expr
src/sema/traverse.rs: SemanticModelBuilder.collect_rule_heads
src/sema/resolve.rs: SemanticModelBuilder.resolve_name
src/sema/builder.rs: SemanticModelBuilder.new_scope

What led to degradation?

SemanticModelBuilder.walk_for_loop_uses has 5 arguments, max arguments = 4
SemanticModelBuilder.record_top_level_relation_use has 6 arguments, max arguments = 4
SemanticModelBuilder.collect_for_loop_term has 6 arguments, max arguments = 4
SemanticModelBuilder.collect_aggregation_term has 6 arguments, max arguments = 4
SemanticModelBuilder.collect_assignment_term has 6 arguments, max arguments = 4
SemanticModelBuilder.collect_expression_term has 6 arguments, max arguments = 4
SemanticModelBuilder.collect_rule_term has 6 arguments, max arguments = 4
SemanticModelBuilder.collect_head_expr has 5 arguments, max arguments = 4
SemanticModelBuilder.collect_rule_heads has 5 arguments, max arguments = 4
SemanticModelBuilder.resolve_name has 5 arguments, max arguments = 4
SemanticModelBuilder.new_scope has 5 arguments, max arguments = 4

Why does this problem occur?

Functions with many arguments indicate either a) low cohesion where the function has too many responsibilities, or b) a missing abstraction that encapsulates those arguments.
The threshold for the Rust language is 4 function arguments.

How to fix it?

Start by investigating the responsibilities of the function. Make sure it doesn't do too many things, in which case it should be split into smaller and more cohesive functions. Consider the refactoring INTRODUCE PARAMETER OBJECT to encapsulate arguments that refer to the same logical concept.

@leynos
Owner Author

leynos commented Mar 17, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response.

Code Duplication

src/sema/variables.rs:
src/sema/tests.rs:

What led to degradation?

The module contains 2 functions with similar structure: SemanticModelBuilder.walk_match_uses, SemanticModelBuilder.walk_method_call_uses
The module contains 2 functions with similar structure: symbols_named, uses_named

Why does this problem occur?

Duplicated code often leads to code that's harder to change since the same logical change has to be done in multiple functions. More duplication gives lower code health.

How to fix it?

A certain degree of duplicated code might be acceptable. The problems start when it is the same behavior that is duplicated across the functions in the module, i.e. a violation of the Don't Repeat Yourself (DRY) principle. DRY violations lead to code that is changed together in predictable patterns, which is both expensive and risky. DRY violations can be identified using CodeScene's X-Ray analysis to detect clusters of change-coupled functions with high code similarity.
Once you have identified the similarities across functions, look to extract and encapsulate the concept that varies into its own function(s). These shared abstractions can then be re-used, which minimizes the amount of duplication and simplifies change.

@leynos
Owner Author

leynos commented Mar 17, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response.

Complex Method

src/sema/resolve.rs: collect_head_bindings

What led to degradation?

collect_head_bindings has a cyclomatic complexity of 10, threshold = 9

Why does this problem occur?

A Complex Method has a high cyclomatic complexity. The recommended threshold for the Rust language is a cyclomatic complexity lower than 9.

How to fix it?

There are many reasons for Complex Method. Sometimes, another design approach is beneficial such as a) modeling state using an explicit state machine rather than conditionals, or b) using table lookup rather than long chains of logic. In other scenarios, the function can be split using EXTRACT FUNCTION. Just make sure you extract natural and cohesive functions. Complex Methods can also be addressed by identifying complex conditional expressions and then using the DECOMPOSE CONDITIONAL refactoring.

Helpful refactoring examples

To get a general understanding of what this code health issue looks like - and how it might be addressed - we have prepared some diffs for illustrative purposes.

SAMPLE

```diff
# complex_method.js
 function postItem(item) {
   if (!item.id) {
-    if (item.x != null && item.y != null) {
-      post(item);
-    } else {
-      throw Error("Item must have x and y");
-    }
+    // extract a separate function for creating new item
+    postNew(item);
   } else {
-    if (item.x < 10 && item.y > 25) {
-      put(item);
-    } else {
-      throw Error("Item must have an x and y value between 10 and 25");
-    }
+    // and one for updating existing items
+    updateItem(item);
   }
 }
+
+function postNew(item) {
+  validateNew(item);
+  post(item);
+}
+
+function updateItem(item) {
+  validateUpdate(item);
+  put(item);
+}
```

@coderabbitai

This comment was marked as resolved.

@coderabbitai

This comment was marked as resolved.

@coderabbitai

This comment was marked as resolved.

@leynos changed the title from "Implement symbol-table and scope-resolution (3.3.1)" to "Reduce cyclomatic complexity in collect_head_bindings" on Mar 17, 2026
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Mar 18, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response.

Excess Number of Function Arguments

src/linter/runner.rs: run_single_rule
src/sema/traverse.rs: SemanticModelBuilder.collect_rule_heads
src/sema/traverse.rs: SemanticModelBuilder.collect_rule_term

What led to degradation?

run_single_rule has 5 arguments, max arguments = 4
SemanticModelBuilder.collect_rule_heads has 5 arguments, max arguments = 4
SemanticModelBuilder.collect_rule_term has 6 arguments, max arguments = 4

Why does this problem occur?

Functions with many arguments indicate either a) low cohesion where the function has too many responsibilities, or b) a missing abstraction that encapsulates those arguments.
The threshold for the Rust language is 4 function arguments.

How to fix it?

Start by investigating the responsibilities of the function. Make sure it doesn't do too many things, in which case it should be split into smaller and more cohesive functions. Consider the refactoring INTRODUCE PARAMETER OBJECT to encapsulate arguments that refer to the same logical concept.

@leynos
Owner Author

leynos commented Mar 18, 2026

@coderabbitai As CodeScene is still reporting this concern, please supply a clear explanatory one to two paragraph markdown message I can paste into the CodeScene web ui's diagnostic suppression function so this diagnostic can be silenced.

Code Duplication

src/sema/tests.rs:

What lead to degradation?

The module contains 2 functions with similar structure: symbols_named, uses_named

Why does this problem occur?

Duplicated code often leads to code that's harder to change since the same logical change has to be done in multiple functions. More duplication gives lower code health.

How to fix it?

A certain degree of duplicated code might be acceptable. The problems start when it is the same behavior that is duplicated across the functions in the module, i.e. a violation of the Don't Repeat Yourself (DRY) principle. DRY violations lead to code that is changed together in predictable patterns, which is both expensive and risky. DRY violations can be identified using CodeScene's X-Ray analysis to detect clusters of change-coupled functions with high code similarity.
Once you have identified the similarities across functions, look to extract and encapsulate the concept that varies into its own function(s). These shared abstractions can then be re-used, which minimizes the amount of duplication and simplifies change.

@coderabbitai
Contributor

coderabbitai Bot commented Mar 18, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
cat src/sema/tests.rs

Length of output: 5686


@leynos — this issue is already resolved in the current branch. src/sema/tests.rs already contains the filter_named_by_kind helper, and both uses_named and symbols_named delegate to it:

```rust
fn filter_named_by_kind<'a, T, K: PartialEq>(
    items: &'a [T],
    name: &str,
    kind: &K,
    name_fn: impl Fn(&T) -> &str,
    kind_fn: impl Fn(&T) -> K,
) -> Vec<&'a T> {
    // Body reconstructed for illustration; the original comment elided it.
    items
        .iter()
        .filter(|item| name_fn(item) == name && &kind_fn(item) == kind)
        .collect()
}
```

No further action is needed for this issue.

@coderabbitai

This comment was marked as resolved.

@coderabbitai

This comment was marked as resolved.

@leynos changed the title from "Reduce cyclomatic complexity in collect_head_bindings" to "Add semantic-model infrastructure with symbol tables and scopes" on Mar 18, 2026
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos changed the title from "Add semantic-model infrastructure with symbol tables and scopes" to "Refactor collect_head_bindings to reduce complexity; semantic-model" on Mar 18, 2026
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Mar 18, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response. Ensure that this is validated against the current version of the codegraph.

If further refinement to address this finding would be deleterious, please supply a clear explanatory one to two paragraph markdown message I can paste into the CodeScene web ui's diagnostic suppression function so this diagnostic can be silenced.

Code Duplication

src/sema/tests.rs:

What led to degradation?

The module contains 2 functions with similar structure: symbols_named, uses_named

Why does this problem occur?

Duplicated code often leads to code that's harder to change since the same logical change has to be done in multiple functions. More duplication gives lower code health.

How to fix it?

A certain degree of duplicated code might be acceptable. The problems start when it is the same behavior that is duplicated across the functions in the module, i.e. a violation of the Don't Repeat Yourself (DRY) principle. DRY violations lead to code that is changed together in predictable patterns, which is both expensive and risky. DRY violations can be identified using CodeScene's X-Ray analysis to detect clusters of change-coupled functions with high code similarity.
Once you have identified the similarities across functions, look to extract and encapsulate the concept that varies into its own function(s). These shared abstractions can then be re-used, which minimizes the amount of duplication and simplifies change.
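The extraction the tool recommends can be sketched in isolation. The snippet below is a hypothetical, simplified stand-in for the real `symbols_named`/`uses_named` helpers in src/sema/tests.rs (whose actual signatures are not shown in this thread): both become thin wrappers over one shared filtering function.

```rust
// Hypothetical, simplified stand-ins for the real sema test types.
#[derive(Debug, PartialEq)]
struct Symbol {
    name: String,
}

#[derive(Debug, PartialEq)]
struct UseSite {
    name: String,
}

// Shared helper: filter any slice with a predicate, so the two lookalike
// functions collapse into thin wrappers over one implementation.
fn items_where<'a, T>(items: &'a [T], pred: impl Fn(&T) -> bool) -> Vec<&'a T> {
    items.iter().filter(|item| pred(*item)).collect()
}

fn symbols_named<'a>(symbols: &'a [Symbol], name: &str) -> Vec<&'a Symbol> {
    items_where(symbols, |s| s.name == name)
}

fn uses_named<'a>(uses: &'a [UseSite], name: &str) -> Vec<&'a UseSite> {
    items_where(uses, |u| u.name == name)
}
```

With the varying concept (the name accessor) pushed into a closure, a future change to the filtering behaviour happens in one place rather than two.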

@leynos leynos changed the title Refactor collect_head_bindings to reduce complexity; semantic-model Introduce semantic-model subsystem; integrate with RuleCtx and Runner Mar 18, 2026



@leynos leynos marked this pull request as ready for review March 18, 2026 18:32
sourcery-ai[bot]

This comment was marked as resolved.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7ba0a16cec

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/sema/builder.rs
leynos and others added 3 commits March 20, 2026 13:04
…versal context

- Moved run_single_rule to be a private method of Runner with access to self fields.
- Reduced parameters for run_single_rule by using Runner instance fields.
- Simplified lint diagnostics collection in Runner to call internal method.
- Refactored semantic model builder to use a RuleHeadContext struct instead of multiple parameters.
- Updated collect_rule_heads to accept RuleHeadContext, improving clarity and reducing param count.

These changes improve code organization by encapsulating related data and functionality, and reduce parameter passing complexity in semantic model traversal.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
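The parameter-object refactor this commit describes can be sketched as follows. The real fields of `RuleHeadContext` are not shown in this thread, so the names here are illustrative stand-ins for "several related traversal parameters".

```rust
// Illustrative only: the real RuleHeadContext fields are not visible in this
// thread, so these are stand-ins for the bundled traversal parameters.
type Span = std::ops::Range<usize>;

#[derive(Debug, Clone, Copy, PartialEq)]
struct ScopeId(usize);

// Before: collect_rule_heads(scope, rule_order, span, ...) took each value
// separately. After: one context struct travels through the traversal.
struct RuleHeadContext {
    scope: ScopeId,
    rule_order: usize,
    span: Span,
}

impl RuleHeadContext {
    fn new(scope: ScopeId, rule_order: usize, span: Span) -> Self {
        Self { scope, rule_order, span }
    }
}

fn collect_rule_heads(ctx: &RuleHeadContext) -> String {
    // A real implementation would walk the rule head; this only shows the
    // reduced call shape enabled by the bundled parameters.
    format!("scope={:?} order={} span={:?}", ctx.scope, ctx.rule_order, ctx.span)
}
```

Adding a field later then changes the struct and its construction sites, not every function signature along the traversal.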
Refactor the collect_rule_term method to accept a VariableUseContext struct
instead of separate parameters for scope, literal index, span, and rule order limit.
This simplifies parameter passing and improves consistency when collecting terms
in semantic analysis.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
Added a detailed Mermaid class diagram to the 3-3-1-symbol-table-and-scope-resolution.md documentation illustrating the structure of the semantic model and the relationships between its core types (SemanticModel, Scope, Symbol, UseSite, etc.). The diagram clarifies scopes, symbols, uses, and their identifiers, improving understanding of name resolution, symbol provenance, and scope hierarchy within the semantic model.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
@devboxerhub devboxerhub Bot force-pushed the implement-symbol-table-scope-resolution-1frlbh branch from 7ba0a16 to f25ca3a Compare March 20, 2026 13:06

@leynos leynos changed the title Introduce semantic-model subsystem; integrate with RuleCtx and Runner Introduce semantic-model subsystem; refactor collect_head_bindings Mar 20, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

♻️ Duplicate comments (7)
src/linter/rule/tests.rs (1)

161-162: ⚠️ Potential issue | 🟡 Minor

Assert semantic facts instead of internal allocation details.

Lines 161-162 bind this test to the current ScopeId backing index and the current total symbol-record count. Assert the program scope kind and the presence of relation R instead, so harmless builder refactors do not break the test.

🩹 Suggested change
-    assert_eq!(rule_ctx_fixture.ctx.semantic_model().program_scope().0, 0);
-    assert_eq!(rule_ctx_fixture.ctx.semantic_model().symbols().len(), 1);
+    let sema = rule_ctx_fixture.ctx.semantic_model();
+    let program_scope = sema.program_scope();
+    assert_eq!(
+        sema.scope(program_scope).kind(),
+        crate::sema::ScopeKind::Program
+    );
+    assert!(sema.symbols().iter().any(|symbol| {
+        symbol.name() == "R"
+            && symbol.kind() == crate::sema::DeclarationKind::Relation
+    }));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/linter/rule/tests.rs` around lines 161 - 162, The test is asserting
internal allocation details (ScopeId index and total symbol count) which is
brittle; update the assertions to check semantic facts instead by using
rule_ctx_fixture.ctx.semantic_model() to get sema, then assert that
sema.program_scope() refers to a scope whose kind() is
crate::sema::ScopeKind::Program and assert sema.symbols() contains a symbol with
name() == "R" and kind() == crate::sema::DeclarationKind::Relation so the test
validates behavior (program scope and relation R) rather than internal indexes.
docs/roadmap.md (1)

340-347: ⚠️ Potential issue | 🟡 Minor

Mark delivered semantic milestones as complete.

Lines 340-347 still leave 3.3.1–3.3.4 unchecked even though this PR adds symbol-table construction, scope resolution, declaration recording, per-rule bindings, and use-site tracking. Update these checkboxes, or split any unfinished remainder into explicit follow-up tasks.

As per coding guidelines, "Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/roadmap.md` around lines 340 - 347, Update docs/roadmap.md to mark items
3.3.1 through 3.3.4 as completed (or split remaining work into new explicit
follow-up tasks) because this PR implements symbol-table construction, scope
resolution, declaration recording, per-rule bindings and use-site tracking;
specifically check off 3.3.1, 3.3.2, 3.3.3, 3.3.4 in the checklist and, if any
subtask truly remains, add a new bullet with a concise description and link to
the PR/issue and docs/ddlint-design.md §3.2.1 for context.
docs/execplans/3-3-1-symbol-table-and-scope-resolution.md (1)

386-389: ⚠️ Potential issue | 🟡 Minor

Align the integration checklist with the shipped RuleCtx API.

Step 3 states that RuleCtx::new and RuleCtx::from_parsed accept Arc<SemanticModel>, but the implementation adds RuleCtx::with_semantic_model(...) for precomputed semantic-model injection while keeping the convenience constructors additive. Update this step to prevent documentation drift.

📝 Proposed fix
-3. Change `RuleCtx::new` and `RuleCtx::from_parsed` to accept an
-   `Arc<SemanticModel>`.
+3. Add `RuleCtx::with_semantic_model(source_text, ast_root, semantic_model, config)`
+   for precomputed semantic-model injection, while keeping `RuleCtx::new` and
+   `RuleCtx::from_parsed` as additive convenience constructors.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/execplans/3-3-1-symbol-table-and-scope-resolution.md` around lines 386 -
389, Update the checklist to reflect the actual RuleCtx API: change step 3 to
state that instead of modifying RuleCtx::new and RuleCtx::from_parsed to accept
Arc<SemanticModel>, the code introduces RuleCtx::with_semantic_model(...) to
accept a precomputed Arc<SemanticModel> while keeping the existing convenience
constructors; keep step 4 as-is (Runner::new should still build an
Arc<SemanticModel> from a Parsed and reuse it for worker-thread contexts) and
mention both RuleCtx::with_semantic_model and the unchanged
RuleCtx::new/from_parsed to prevent documentation drift.
src/sema/resolve.rs (1)

86-90: 🛠️ Refactor suggestion | 🟠 Major

Remove duplicate text_range_to_span and reuse the existing helper.

This function duplicates crate::parser::ast::rule::text_range_to_span (lines 649–653 in src/parser/ast/rule.rs). Consolidate to a single source of truth to prevent drift.

♻️ Proposed fix
-pub(super) fn text_range_to_span(range: rowan::TextRange) -> Span {
-    let start: usize = range.start().into();
-    let end: usize = range.end().into();
-    start..end
-}

Then update usages in src/sema/builder.rs and elsewhere to import and call crate::parser::ast::rule::text_range_to_span directly, or re-export it from a shared utility module.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sema/resolve.rs` around lines 86 - 90, The duplicate helper
text_range_to_span in sema::resolve should be removed and callers should use the
single existing implementation crate::parser::ast::rule::text_range_to_span
instead; delete the pub(super) fn text_range_to_span in src/sema/resolve.rs,
update imports where that local symbol was used (e.g., references in
sema::builder.rs and any other modules) to import or fully-qualify
crate::parser::ast::rule::text_range_to_span, or add a small re-export in a
shared module if you prefer a local name.
docs/ddlint-design.md (1)

734-736: ⚠️ Potential issue | 🟡 Minor

Insert a comma before "so" in the compound sentence.

Line 735 joins two independent clauses with "so" but lacks the required comma.

📝 Proposed fix
 This pass consumes both AST-backed rules and parse-time `SemanticRule` values
-from top-level `for` desugaring so the semantic view matches the parser's
+from top-level `for` desugaring, so the semantic view matches the parser's
 existing rule surface.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/ddlint-design.md` around lines 734 - 736, In the sentence "This pass
consumes both AST-backed rules and parse-time `SemanticRule` values from
top-level `for` desugaring so the semantic view matches the parser's existing
rule surface." insert a comma before "so" to read "...desugaring, so the
semantic view..." to correctly separate the two independent clauses; update that
sentence in the docs content where the phrase "top-level `for` desugaring so the
semantic view" appears.
src/sema/traverse.rs (1)

135-145: ⚠️ Potential issue | 🟠 Major

Preserve per-term ordering inside for-loop bodies.

The loop at lines 135–145 reuses context.literal_index() for every nested term. This breaks ordered visibility for assignment bindings within the loop body—each term should have a distinct index so visible_from_rule_order calculations remain correct.

🐛 Proposed fix
-        for nested_term in &for_loop.body_terms {
+        for (nested_index, nested_term) in for_loop.body_terms.iter().enumerate() {
             self.collect_rule_term(
                 VariableUseContext::new(
                     child_scope,
-                    context.literal_index(),
+                    nested_index,
                     context.span(),
                     context.rule_order_limit(),
                 ),
                 nested_term,
             );
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sema/traverse.rs` around lines 135 - 145, The loop over
for_loop.body_terms reuses context.literal_index() for every nested term which
breaks per-term ordering (affecting visible_from_rule_order); fix it by
assigning a distinct literal index for each term—capture a starting index from
context.literal_index(), then inside the loop create the VariableUseContext::new
with that starting index + term_offset (or otherwise increment a mutable
term_index) for each nested_term passed to collect_rule_term so each body term
gets a unique literal index.
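The indexing scheme the comment asks for can be shown in miniature. This is a hedged sketch, not the real `collect_rule_term` plumbing: it only demonstrates giving each nested term `base_index + offset` instead of reusing one literal index for the whole loop body.

```rust
// Sketch of per-term index assignment inside a for-loop body: each nested
// term gets a distinct index derived from the enclosing literal index, so
// ordered-visibility calculations can distinguish the terms.
fn nested_indices(base_index: usize, body_terms: &[&str]) -> Vec<(usize, String)> {
    body_terms
        .iter()
        .enumerate()
        .map(|(offset, term)| (base_index + offset, term.to_string()))
        .collect()
}
```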
src/sema/tests.rs (1)

60-201: 🧹 Nitpick | 🔵 Trivial

Convert repeated test scaffolding to rstest fixtures.

Multiple tests duplicate the parse_ok → build → assert pattern. Extract a shared fixture and use #[rstest(...)] parameterised cases to reduce boilerplate and ease future scenario expansion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sema/tests.rs` around lines 60 - 201, Several tests repeat the same
"parse_ok -> build/build_from_root -> model" scaffolding (see tests like
collects_program_scope_declarations, head_bindings_are_visible_from_rule_start,
assignment_binding_becomes_visible_after_its_literal,
wildcard_variable_uses_are_ignored, for_loop_bindings_do_not_escape_loop_scope,
top_level_for_semantic_rules_participate_in_analysis) — refactor by adding an
rstest fixture (e.g., a function named test_model_fixture that calls parse_ok
and build/build_from_root and returns the model or (model, parsed.root()) as
needed) and replace duplicated setup in each test with a #[rstest] parameter
that injects that fixture; keep specific assertions in each test but remove
repeated setup code and update tests that need build_from_root to request the
appropriate fixture variant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sema/builder.rs`:
- Around line 65-124: In collect_program_declarations extract the repeated logic
that wraps a child into an AST node, checks name(), and calls declare_symbol
into a small helper (e.g. a closure or private fn) to avoid duplication:
implement something like let emit = |name: Smth, kind: DeclarationKind, origin:
SymbolOrigin, span| { self.declare_symbol(SymbolSpec { name, kind, origin,
scope: self.program_scope, span, source_order: order, visible_from_rule_order: 0
}); order += 1; }; then in each match arm construct the specific AST wrapper
(ast::Relation, ast::Function, ast::TypeDef), call .name(), and pass the name
plus appropriate DeclarationKind/ SymbolOrigin and
text_range_to_span(child.text_range()) to emit; keep declare_symbol, SymbolSpec,
program_scope, text_range_to_span and visible_from_rule_order logic unchanged.
- Line 14: Duplicate text_range_to_span implementations (used by resolve, rule,
and builder) should be consolidated into a single shared utility in the parser
module hierarchy: create a new parser-level helper (e.g.,
parser::span::text_range_to_span or parser::util::text_range_to_span) that
implements the rowan::TextRange -> Span conversion, remove the duplicate
definitions from resolve and rule, and update callers (including builder.rs
which currently imports text_range_to_span from resolve and other modules that
import it from rule) to import the new shared function; ensure the function
signature and behavior remain identical so no caller logic changes are required.

In `@src/sema/model.rs`:
- Around line 270-303: Change the public lookup APIs to return Option instead of
panicking: modify scope(&self, id: ScopeId) -> Option<&Scope> to return
self.scopes.get(id.0), and symbol(&self, id: SymbolId) -> Option<&Symbol> to
return self.symbols.get(id.0); update resolved_symbol(&self, use_site: &UseSite)
-> Option<&Symbol> to forward the Option from symbol (i.e., when
Resolution::Resolved(symbol_id) return self.symbol(symbol_id) rather than
Some(self.symbol(symbol_id))). Ensure references to ScopeId, SymbolId, scope(),
symbol(), resolved_symbol(), self.scopes and self.symbols are updated and adjust
callers to handle the returned Option.
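The non-panicking lookups the comment proposes can be sketched with simplified stand-ins for the real sema types (the actual `SemanticModel` fields and variants are not shown here). The key move is returning what `slice::get` already gives you, and forwarding that `Option` from `resolved_symbol` instead of wrapping an indexing operation in `Some(...)`.

```rust
// Minimal stand-ins for the real sema types; field shapes are assumptions.
#[derive(Debug, PartialEq)]
struct Scope {
    name: String,
}

#[derive(Debug, PartialEq)]
struct Symbol {
    name: String,
}

#[derive(Clone, Copy)]
struct ScopeId(usize);

#[derive(Clone, Copy)]
struct SymbolId(usize);

enum Resolution {
    Resolved(SymbolId),
    Unresolved,
}

struct UseSite {
    resolution: Resolution,
}

struct SemanticModel {
    scopes: Vec<Scope>,
    symbols: Vec<Symbol>,
}

impl SemanticModel {
    // slice::get returns Option, so an out-of-range id no longer panics.
    fn scope(&self, id: ScopeId) -> Option<&Scope> {
        self.scopes.get(id.0)
    }

    fn symbol(&self, id: SymbolId) -> Option<&Symbol> {
        self.symbols.get(id.0)
    }

    // Forward the Option from symbol() rather than wrapping it in Some(...).
    fn resolved_symbol(&self, use_site: &UseSite) -> Option<&Symbol> {
        match use_site.resolution {
            Resolution::Resolved(symbol_id) => self.symbol(symbol_id),
            Resolution::Unresolved => None,
        }
    }
}
```

Callers then handle absence explicitly (`?`, `expect`, or a match) instead of relying on indexed access staying in range.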

In `@src/sema/resolve.rs`:
- Around line 34-38: The wildcard name check ("if name == \"_\" {
Resolution::Ignored }") must be moved before the scope-chain resolution loop so
"_" short-circuits immediately; update the function that performs the scope walk
(the resolver function containing the resolution loop and returning
Resolution::Unresolved/Resolution::Ignored) to check for name == "_" at the top
and return Resolution::Ignored before iterating scopes, removing the post-loop
special case.
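The control flow being requested looks like this in miniature. The scope representation below is a deliberate simplification (each scope as a list of name → symbol-id pairs, walked innermost-first); the real resolver's data structures are not shown in this thread.

```rust
// Sketch of the suggested resolver shape: short-circuit the wildcard before
// any scope-chain work. Scope representation is a simplified assumption.
#[derive(Debug, PartialEq)]
enum Resolution {
    Resolved(usize),
    Ignored,
    Unresolved,
}

fn resolve_name(scopes: &[Vec<(&str, usize)>], name: &str) -> Resolution {
    // Wildcards never resolve to a symbol, so bail out before walking scopes.
    if name == "_" {
        return Resolution::Ignored;
    }
    // Walk the scope chain innermost-first looking for a matching binding.
    for scope in scopes.iter().rev() {
        if let Some(&(_, symbol_id)) = scope.iter().find(|(n, _)| *n == name) {
            return Resolution::Resolved(symbol_id);
        }
    }
    Resolution::Unresolved
}
```

With the check hoisted to the top, there is no post-loop special case to keep in sync with the loop's exit paths.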

In `@src/sema/variables.rs`:
- Around line 161-166: The wildcard "_" handling is redundant here: remove the
explicit if name == "_" branch from record_variable_use and call
self.resolve_name(context, UseKind::Variable, name) directly so resolve_name
supplies Resolution::Ignored for "_" (ensure resolve_name's logic still returns
Resolution::Ignored for "_"); update any callers or tests if they relied on the
old branch and run the existing sema tests to verify behavior.

In `@tests/semantic_scope_resolution.rs`:
- Around line 90-114: Clarify that check_node intentionally checks unresolved
uses across the whole semantic model (not the specific node) by adding a short
comment above the logic that calls RuleCtx::semantic_model().uses() and filters
by UseKind::Variable and Resolution::Unresolved; mention this is a test-only
behavior to demonstrate access to RuleCtx::semantic_model() rather than a
recommended production pattern so future readers won't assume node-scoped
resolution is intended.

---

Duplicate comments:
In `@docs/ddlint-design.md`:
- Around line 734-736: In the sentence "This pass consumes both AST-backed rules
and parse-time `SemanticRule` values from top-level `for` desugaring so the
semantic view matches the parser's existing rule surface." insert a comma before
"so" to read "...desugaring, so the semantic view..." to correctly separate the
two independent clauses; update that sentence in the docs content where the
phrase "top-level `for` desugaring so the semantic view" appears.

In `@docs/execplans/3-3-1-symbol-table-and-scope-resolution.md`:
- Around line 386-389: Update the checklist to reflect the actual RuleCtx API:
change step 3 to state that instead of modifying RuleCtx::new and
RuleCtx::from_parsed to accept Arc<SemanticModel>, the code introduces
RuleCtx::with_semantic_model(...) to accept a precomputed Arc<SemanticModel>
while keeping the existing convenience constructors; keep step 4 as-is
(Runner::new should still build an Arc<SemanticModel> from a Parsed and reuse it
for worker-thread contexts) and mention both RuleCtx::with_semantic_model and
the unchanged RuleCtx::new/from_parsed to prevent documentation drift.

In `@docs/roadmap.md`:
- Around line 340-347: Update docs/roadmap.md to mark items 3.3.1 through 3.3.4
as completed (or split remaining work into new explicit follow-up tasks) because
this PR implements symbol-table construction, scope resolution, declaration
recording, per-rule bindings and use-site tracking; specifically check off
3.3.1, 3.3.2, 3.3.3, 3.3.4 in the checklist and, if any subtask truly remains,
add a new bullet with a concise description and link to the PR/issue and
docs/ddlint-design.md §3.2.1 for context.

In `@src/linter/rule/tests.rs`:
- Around line 161-162: The test is asserting internal allocation details
(ScopeId index and total symbol count) which is brittle; update the assertions
to check semantic facts instead by using rule_ctx_fixture.ctx.semantic_model()
to get sema, then assert that sema.program_scope() refers to a scope whose
kind() is crate::sema::ScopeKind::Program and assert sema.symbols() contains a
symbol with name() == "R" and kind() == crate::sema::DeclarationKind::Relation
so the test validates behavior (program scope and relation R) rather than
internal indexes.

In `@src/sema/resolve.rs`:
- Around line 86-90: The duplicate helper text_range_to_span in sema::resolve
should be removed and callers should use the single existing implementation
crate::parser::ast::rule::text_range_to_span instead; delete the pub(super) fn
text_range_to_span in src/sema/resolve.rs, update imports where that local
symbol was used (e.g., references in sema::builder.rs and any other modules) to
import or fully-qualify crate::parser::ast::rule::text_range_to_span, or add a
small re-export in a shared module if you prefer a local name.

In `@src/sema/tests.rs`:
- Around line 60-201: Several tests repeat the same "parse_ok ->
build/build_from_root -> model" scaffolding (see tests like
collects_program_scope_declarations, head_bindings_are_visible_from_rule_start,
assignment_binding_becomes_visible_after_its_literal,
wildcard_variable_uses_are_ignored, for_loop_bindings_do_not_escape_loop_scope,
top_level_for_semantic_rules_participate_in_analysis) — refactor by adding an
rstest fixture (e.g., a function named test_model_fixture that calls parse_ok
and build/build_from_root and returns the model or (model, parsed.root()) as
needed) and replace duplicated setup in each test with a #[rstest] parameter
that injects that fixture; keep specific assertions in each test but remove
repeated setup code and update tests that need build_from_root to request the
appropriate fixture variant.

In `@src/sema/traverse.rs`:
- Around line 135-145: The loop over for_loop.body_terms reuses
context.literal_index() for every nested term which breaks per-term ordering
(affecting visible_from_rule_order); fix it by assigning a distinct literal
index for each term—capture a starting index from context.literal_index(), then
inside the loop create the VariableUseContext::new with that starting index +
term_offset (or otherwise increment a mutable term_index) for each nested_term
passed to collect_rule_term so each body term gets a unique literal index.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fc45a965-00a8-44b4-a06c-cb0814fd59e6

📥 Commits

Reviewing files that changed from the base of the PR and between 7ba0a16 and f25ca3a.

📒 Files selected for processing (18)
  • docs/ddlint-design.md
  • docs/execplans/3-3-1-symbol-table-and-scope-resolution.md
  • docs/parser-implementation-notes.md
  • docs/roadmap.md
  • src/lib.rs
  • src/linter/rule.rs
  • src/linter/rule/tests.rs
  • src/linter/runner.rs
  • src/parser/ast/rule.rs
  • src/sema/build.rs
  • src/sema/builder.rs
  • src/sema/mod.rs
  • src/sema/model.rs
  • src/sema/resolve.rs
  • src/sema/tests.rs
  • src/sema/traverse.rs
  • src/sema/variables.rs
  • tests/semantic_scope_resolution.rs

Comment thread src/sema/builder.rs Outdated
Comment thread src/sema/builder.rs
Comment thread src/sema/model.rs
Comment thread src/sema/resolve.rs Outdated
Comment thread src/sema/variables.rs Outdated
Comment thread tests/semantic_scope_resolution.rs
…ntic model

- Extended SemanticRule to carry patterns representing lowered `for`-loop bindings.
- SemanticModelBuilder now declares symbols for all pattern-bound names in rules.
- Scope resolution now correctly resolves and indexes bindings introduced by these patterns.
- Updated variable usage tracking to walk all expression kinds, ensuring accurate semantic analysis.
- Added tests covering top-level `for` scope bindings and name resolution.
- Documentation and roadmap updated to mark semantic analysis phases 3.3.1-3.3.4 as complete.

This allows the semantic model to fully track variables introduced by `for` statements, supporting later lint passes and more precise scope resolution.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
@leynos leynos changed the title Introduce semantic-model subsystem; refactor collect_head_bindings Add semantic-model subsystem (sema) with scope resolution Mar 20, 2026

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/parser/top_level_for.rs (1)

227-288: 🛠️ Refactor suggestion | 🟠 Major

Split src/parser/top_level_for.rs into smaller modules now.

Extract the lowering and token-scanning helpers into focused sibling modules; the file is now over the 400-line cap, which breaches the repository Rust size constraint.

As per coding guidelines, "Files must not exceed 400 lines in length."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/parser/top_level_for.rs` around lines 227 - 288, This file exceeds the
400-line limit—extract the lowering helpers into a new sibling module (for
example a top_level_for::lowering) by moving the functions lower_top_level_for
and collect_lowered_terms (and any small dependent constants like
UNSUPPORTED_TOP_LEVEL_FOR_STATEMENT and types imported only for these functions)
into that module, keep the original public API by re-exporting
lower_top_level_for (pub use top_level_for::lowering::lower_top_level_for) from
src/parser/top_level_for.rs, and update imports to reference SemanticRule,
SemanticRuleOrigin, Expr, Span, Simple<SyntaxKind>, and any pattern types from
their current modules so the moved functions compile unchanged.
♻️ Duplicate comments (2)
src/sema/builder.rs (1)

70-129: 🧹 Nitpick | 🔵 Trivial

Extract a helper to reduce repetition in declaration collection.

Lines 75-90, 92-107, and 109-124 share identical structure: wrap the child node, extract the name, and declare a symbol. Extract a helper closure to eliminate this repetition.

♻️ Suggested refactor
     pub(crate) fn collect_program_declarations(&mut self, root: &ast::Root) {
         let mut order = 0usize;
 
+        let mut declare_if_named = |name: Option<String>, kind: DeclarationKind, origin: SymbolOrigin, span: Span| {
+            if let Some(name) = name {
+                self.declare_symbol(SymbolSpec {
+                    name,
+                    kind,
+                    origin,
+                    scope: self.program_scope,
+                    span,
+                    source_order: order,
+                    visible_from_rule_order: 0,
+                });
+                order += 1;
+            }
+        };
+
         for child in root.syntax().children() {
+            let span = text_range_to_span(child.text_range());
             match child.kind() {
                 SyntaxKind::N_RELATION_DECL => {
-                    let relation = ast::Relation {
-                        syntax: child.clone(),
-                    };
-                    if let Some(name) = relation.name() {
-                        self.declare_symbol(SymbolSpec {
-                            name,
-                            kind: DeclarationKind::Relation,
-                            origin: SymbolOrigin::RelationDeclaration,
-                            scope: self.program_scope,
-                            span: text_range_to_span(child.text_range()),
-                            source_order: order,
-                            visible_from_rule_order: 0,
-                        });
-                        order += 1;
-                    }
+                    let name = ast::Relation { syntax: child.clone() }.name();
+                    declare_if_named(name, DeclarationKind::Relation, SymbolOrigin::RelationDeclaration, span);
                 }
                 SyntaxKind::N_FUNCTION => {
-                    // ... similar reduction
+                    let name = ast::Function { syntax: child.clone() }.name();
+                    declare_if_named(name, DeclarationKind::Function, SymbolOrigin::FunctionDeclaration, span);
                 }
                 SyntaxKind::N_TYPE_DEF => {
-                    // ... similar reduction
+                    let name = ast::TypeDef { syntax: child.clone() }.name();
+                    declare_if_named(name, DeclarationKind::Type, SymbolOrigin::TypeDeclaration, span);
                 }
                 _ => {}
             }
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sema/builder.rs` around lines 70 - 129, The three almost-identical
branches in collect_program_declarations (handling SyntaxKind::N_RELATION_DECL,
N_FUNCTION, N_TYPE_DEF) should be refactored to a shared helper that wraps the
child into the appropriate AST node, extracts the name, constructs the
SymbolSpec and calls declare_symbol; create a small local closure or private
helper (e.g., declare_from_child) used by collect_program_declarations to accept
the child node, the constructor/mapper for the AST wrapper
(Relation/Function/TypeDef), the DeclarationKind and SymbolOrigin values, and
reuse program_scope and text_range_to_span(child.text_range()) so you eliminate
duplicated code around declare_symbol and incrementing order in that single
helper.
tests/semantic_scope_resolution.rs (1)

100-124: 🧹 Nitpick | 🔵 Trivial

Add a clarifying comment that this test rule intentionally queries the entire model.

The check_node implementation counts unresolved uses across the entire SemanticModel rather than scoping to the passed node. This is acceptable for proving RuleCtx::semantic_model() access, but add a brief comment so future readers do not mistake it for a production pattern.

📝 Suggested clarification
     fn check_node(
         &self,
         node: &rowan::SyntaxNode<ddlint::DdlogLanguage>,
         ctx: &ddlint::linter::RuleCtx,
         diagnostics: &mut Vec<LintDiagnostic>,
     ) {
+        // NOTE: This test rule intentionally queries the entire model (not scoped to `node`)
+        // to verify that `RuleCtx::semantic_model()` exposes the full semantic context.
         let unresolved_variable_uses = ctx
             .semantic_model()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/semantic_scope_resolution.rs` around lines 100 - 124, Add a brief
clarifying comment inside the check_node implementation (the method named
check_node that calls ctx.semantic_model().uses()) stating that the rule
intentionally queries the entire SemanticModel rather than scoping to the
provided node to demonstrate RuleCtx::semantic_model() access; reference the use
of SemanticModel::uses(), UseKind::Variable, and Resolution::Unresolved so
readers know this is deliberate and not a recommended production pattern.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/execplans/3-3-1-symbol-table-and-scope-resolution.md`:
- Around line 177-178: The docs reference `src/sema/tests/*.rs` is out of sync
with the repository layout; update the exec plan in
docs/execplans/3-3-1-symbol-table-and-scope-resolution.md to either use the real
shipped path `src/sema/tests.rs` or explicitly describe the intended split into
multiple test files (e.g., declaration collection, rule-local binding order,
resolution behaviour) so the test-paths in the doc are accurate and internally
consistent; ensure the wording mentions the specific test targets
(`src/sema/tests.rs` or the explicit split plan) to satisfy the guideline about
reflecting API/behaviour changes in docs.

In `@src/sema/tests.rs`:
- Around line 134-136: Replace manual panic! branches in the test module with
.expect(...) calls: for the snippet that extracts maybe_after_use (let
Some(after_use) = maybe_after_use else { panic!(...) };) change to
maybe_after_use.expect("expected use in the third literal") and similarly
replace other manual panic branches and resolution checks noted (around the
regions referencing maybe_after_use and any Option or Result lookups in the
ranges 152-153, 179-187, 238-252) with .expect(...) that include concise failure
messages; update any resolution assertions to use .expect on Option/Result
values so tests uniformly fail via expect messages rather than manual panic!
blocks.
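The before/after shape of that cleanup can be shown with a toy helper (the real test values and messages from src/sema/tests.rs are not reproduced here). Both forms fail with the same message; `expect` just says it in one line.

```rust
// Toy stand-in for the Option-returning lookup used by the tests.
fn third_literal_use(uses: &[&str]) -> Option<String> {
    uses.get(2).map(|s| s.to_string())
}

fn before(uses: &[&str]) -> String {
    // Manual panic branch the review asks to replace.
    let Some(after_use) = third_literal_use(uses) else {
        panic!("expected use in the third literal");
    };
    after_use
}

fn after(uses: &[&str]) -> String {
    // Same failure message, less ceremony.
    third_literal_use(uses).expect("expected use in the third literal")
}
```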

---

Outside diff comments:
In `@src/parser/top_level_for.rs`:
- Around line 227-288: This file exceeds the 400-line limit—extract the lowering
helpers into a new sibling module (for example a top_level_for::lowering) by
moving the functions lower_top_level_for and collect_lowered_terms (and any
small dependent constants like UNSUPPORTED_TOP_LEVEL_FOR_STATEMENT and types
imported only for these functions) into that module, keep the original public
API by re-exporting lower_top_level_for (pub use
top_level_for::lowering::lower_top_level_for) from src/parser/top_level_for.rs,
and update imports to reference SemanticRule, SemanticRuleOrigin, Expr, Span,
Simple<SyntaxKind>, and any pattern types from their current modules so the
moved functions compile unchanged.
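The extract-and-re-export shape described above can be sketched in miniature; the real helpers take parser types rather than `u32`, and the signature below is illustrative only:

```rust
// Hedged sketch of the proposed layout; the real lowering helpers take
// parser types and live in src/parser/top_level_for/lowering.rs.
mod top_level_for {
    pub mod lowering {
        // Moved helper; the signature here is a placeholder.
        pub fn lower_top_level_for(input: u32) -> u32 {
            input + 1
        }
    }

    // Keep the original public API stable via a re-export.
    pub use self::lowering::lower_top_level_for;
}

fn main() {
    // Callers keep using the old path unchanged.
    assert_eq!(top_level_for::lower_top_level_for(1), 2);
}
```

Because the re-export preserves the original path, call sites need no changes when the function bodies move.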

---

Duplicate comments:
In `@src/sema/builder.rs`:
- Around line 70-129: The three almost-identical branches in
collect_program_declarations (handling SyntaxKind::N_RELATION_DECL, N_FUNCTION,
N_TYPE_DEF) should be refactored to a shared helper that wraps the child into
the appropriate AST node, extracts the name, constructs the SymbolSpec and calls
declare_symbol; create a small local closure or private helper (e.g.,
declare_from_child) used by collect_program_declarations to accept the child
node, the constructor/mapper for the AST wrapper (Relation/Function/TypeDef),
the DeclarationKind and SymbolOrigin values, and reuse program_scope and
text_range_to_span(child.text_range()) so you eliminate duplicated code around
declare_symbol and incrementing order in that single helper.
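A simplified mirror of that refactor is sketched below; the real `SymbolSpec`, `DeclarationKind`, and `declare_symbol` live in `src/sema` and take AST nodes rather than plain strings:

```rust
// Simplified mirror of the suggested refactor: one helper constructs the
// spec and bumps the order counter, replacing three near-identical arms.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DeclarationKind {
    Relation,
    Function,
    TypeDef,
}

struct SymbolSpec {
    name: String,
    kind: DeclarationKind,
    order: usize,
}

struct Builder {
    symbols: Vec<SymbolSpec>,
    next_order: usize,
}

impl Builder {
    fn declare_from_child(&mut self, name: &str, kind: DeclarationKind) {
        let order = self.next_order;
        self.next_order += 1;
        self.symbols.push(SymbolSpec {
            name: name.to_owned(),
            kind,
            order,
        });
    }
}

fn main() {
    let mut b = Builder { symbols: Vec::new(), next_order: 0 };
    b.declare_from_child("User", DeclarationKind::Relation);
    b.declare_from_child("project", DeclarationKind::Function);
    b.declare_from_child("UserId", DeclarationKind::TypeDef);
    assert_eq!(b.symbols.len(), 3);
    assert_eq!(b.symbols[0].name, "User");
    assert_eq!(b.symbols[0].kind, DeclarationKind::Relation);
    assert_eq!(b.symbols[2].order, 2);
}
```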

In `@tests/semantic_scope_resolution.rs`:
- Around line 100-124: Add a brief clarifying comment inside the check_node
implementation (the method named check_node that calls
ctx.semantic_model().uses()) stating that the rule intentionally queries the
entire SemanticModel rather than scoping to the provided node to demonstrate
RuleCtx::semantic_model() access; reference the use of SemanticModel::uses(),
UseKind::Variable, and Resolution::Unresolved so readers know this is deliberate
and not a recommended production pattern.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9a417d98-0c9a-46ae-905d-72d6da19ce9b

📥 Commits

Reviewing files that changed from the base of the PR and between f25ca3a and 64d145e.

📒 Files selected for processing (13)
  • docs/ddlint-design.md
  • docs/execplans/3-3-1-symbol-table-and-scope-resolution.md
  • docs/roadmap.md
  • src/linter/rule.rs
  • src/linter/rule/tests.rs
  • src/parser/ast/semantic_rule.rs
  • src/parser/top_level_for.rs
  • src/sema/builder.rs
  • src/sema/resolve.rs
  • src/sema/tests.rs
  • src/sema/traverse.rs
  • src/sema/variables.rs
  • tests/semantic_scope_resolution.rs

Comment thread docs/execplans/3-3-1-symbol-table-and-scope-resolution.md Outdated
Comment thread src/sema/tests.rs Outdated
@leynos

Owner Author

leynos commented Mar 20, 2026

@coderabbitai Have the following now been resolved?

Annul any requirements that violate the en-GB-oxendict spelling (-ize / -yse / -our) conventions (for example a request to replace "normalize" with "normalise" or "artefact" with "artifact"), or where the requirement unnecessarily increases cyclomatic complexity.

Please address the comments from this code review:

## Overall Comments
- Name resolution currently scans all symbols in reverse for every use site; consider indexing symbols per scope and/or by name to keep resolution closer to O(depth-of-scope) rather than O(total-symbols) for large files.
- RuleCtx::new eagerly recomputes a semantic model from an AST root, which is easy to call accidentally in hot paths; you might want to de-emphasize or deprecate it in favour of with_semantic_model/from_parsed to avoid repeated semantic builds.

## Individual Comments

### Comment 1
<location path="src/sema/resolve.rs" line_range="34-38" />
<code_context>
+            current_scope = self.parent_scope(scope_id);
+        }
+
+        if name == "_" {
+            Resolution::Ignored
+        } else {
</code_context>
<issue_to_address>
**suggestion:** Avoid duplicating special-case handling for `_` between `resolve_name` and `record_variable_use`.

Right now `_` is handled here and earlier in `record_variable_use` (which returns `Resolution::Ignored` without calling this). Duplicating this logic risks the two paths diverging. Please centralize the `_` handling in a single place (either always via `resolve_name` or always via an early return) to keep the semantics consistent and easier to maintain.

```suggestion
        Resolution::Unresolved
```
</issue_to_address>
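The centralized handling described above amounts to a single early return; in this sketch, `Resolution` and the scope walk are simplified stand-ins for the real types in `src/sema/resolve.rs`:

```rust
// Sketch of centralized `_` handling: the single early return lives in
// record_variable_use, so resolve_name never needs its own `_` branch
// and the two paths cannot diverge.
#[derive(Debug, PartialEq)]
enum Resolution {
    Ignored,
    Unresolved,
}

fn record_variable_use(name: &str) -> Resolution {
    if name == "_" {
        return Resolution::Ignored;
    }
    resolve_name(name)
}

fn resolve_name(_name: &str) -> Resolution {
    // Scope walk elided; an unmatched name falls through to Unresolved.
    Resolution::Unresolved
}

fn main() {
    assert_eq!(record_variable_use("_"), Resolution::Ignored);
    assert_eq!(record_variable_use("x"), Resolution::Unresolved);
}
```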

### Comment 2
<location path="src/linter/rule/tests.rs" line_range="161-162" />
<code_context>
         rule_ctx_fixture.ctx.cst_root().kind(),
         SyntaxKind::N_DATALOG_PROGRAM
     );
+    assert_eq!(rule_ctx_fixture.ctx.semantic_model().program_scope().0, 0);
+    assert_eq!(rule_ctx_fixture.ctx.semantic_model().symbols().len(), 1);
 }

</code_context>
<issue_to_address>
**suggestion (testing):** Avoid brittle assumptions about the exact contents of the semantic model in this RuleCtx test

These assertions rely on the program scope having raw id `0` and the model containing exactly one symbol, which couples the test to internal implementation details (e.g. `ScopeId` allocation or the number of top-level declarations in the fixture). Instead, assert higher-level properties, such as:
- `ctx.semantic_model().scopes().first().map(Scope::kind) == Some(ScopeKind::Program)`
- `ctx.semantic_model().symbols().len() >= 1`
- optionally, that the expected relation/function/type is present by name/kind.
This keeps the test verifying that `RuleCtx` wires a valid semantic model without depending on specific indexing details.

Suggested implementation:

```rust
    let semantic_model = rule_ctx_fixture.ctx.semantic_model();
    let scopes = semantic_model.scopes();

    // Ensure the first scope is the program scope, without relying on its raw id.
    assert!(matches!(
        scopes.first().map(|scope| scope.kind()),
        Some(ScopeKind::Program)
    ));

    // Ensure the semantic model contains at least one symbol, without depending on the exact count.
    assert!(!semantic_model.symbols().is_empty());

```

You will likely need to import `ScopeKind` at the top of `src/linter/rule/tests.rs`, for example:

- If it lives in a `semantic` module within the crate:
  `use crate::semantic::ScopeKind;`

Adjust the import path to match where `ScopeKind` is defined in your codebase.
</issue_to_address>

### Comment 3
<location path="src/sema/tests.rs" line_range="60-61" />
<code_context>
+    )
+}
+
+#[test]
+fn collects_program_scope_declarations() {
+    let parsed = parse_ok(concat!(
+        "typedef UserId = u32\n",
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for name-resolution compatibility between relations and other top-level declarations

Existing tests cover declarations and scoping, but not that relation uses *avoid* resolving to non-relation declarations with the same name. Please add a test that:
- defines both `input relation Foo(...)` and `function Foo(...)` in one program,
- uses `Foo` in a relation position, and
- asserts that `UseKind::Relation` resolves to the `DeclarationKind::Relation` symbol, not the function.
This will guard the intended cross-kind name-resolution behavior in `resolve.rs`.

Suggested implementation:

```rust
#[test]
fn collects_program_scope_declarations() {
    let parsed = parse_ok(concat!(
        "typedef UserId = u32\n",
        "function project(id: UserId): UserId {}\n",
        "input relation User(id: UserId)"
    ));
    let model = build(&parsed);

    assert_eq!(model.scopes().len(), 1);
    assert_eq!(
        model.scopes().first().map(crate::sema::Scope::kind),
        Some(ScopeKind::Program)
    );
}

#[test]
fn relation_use_prefers_relation_over_function_with_same_name() {
    use super::model::{DeclarationKind, UseKind};

    let parsed = parse_ok(concat!(
        "input relation Foo(x: u32)\n",
        "function Foo(x: u32): u32 { x }\n",
        "relation Bar(x: u32)\n",
        "Bar(x) :- Foo(x).\n",
    ));
    let model = build(&parsed);

    // Find the relation and function declarations named `Foo`
    let foo_relations = filter_named_by_kind(
        model.symbols(),
        "Foo",
        &DeclarationKind::Relation,
        |s| s.name(),
        super::model::Symbol::kind,
    );
    let foo_functions = filter_named_by_kind(
        model.symbols(),
        "Foo",
        &DeclarationKind::Function,
        |s| s.name(),
        super::model::Symbol::kind,
    );

    assert_eq!(foo_relations.len(), 1, "expected exactly one relation Foo");
    assert_eq!(foo_functions.len(), 1, "expected exactly one function Foo");

    let foo_relation = foo_relations[0];

    // Find the use of `Foo` in relation position in the rule `Bar(x) :- Foo(x).`
    let foo_relation_uses: Vec<_> = model
        .uses()
        .iter()
        .filter(|u| u.name() == "Foo" && u.kind() == UseKind::Relation)
        .collect();

    assert!(
        !foo_relation_uses.is_empty(),
        "expected at least one relation-position use of Foo"
    );

    // Every relation-position use of `Foo` must resolve to the relation declaration,
    // not the function declaration with the same name.
    for u in foo_relation_uses {
        let resolved = u
            .resolved_symbol()
            .expect("relation-position use of Foo should be resolved");
        assert_eq!(
            resolved.kind(),
            DeclarationKind::Relation,
            "relation-position use of Foo should resolve to the relation declaration"
        );
        assert!(
            std::ptr::eq(resolved, foo_relation),
            "relation-position use of Foo should resolve to the Foo relation symbol"
        );
    }
}

```

This test assumes the following APIs on your semantic model, which you may need to adjust to match your existing `resolve.rs` and `model` definitions:

1. `super::model::{DeclarationKind, UseKind}`:
   - `DeclarationKind::Relation` and `DeclarationKind::Function` are the declaration kinds used by `super::model::Symbol::kind()`.
   - `UseKind::Relation` is the kind for relation-position uses.

2. `model.uses()`:
   - Returns an iterable collection of use sites with methods:
     - `name() -> &str`
     - `kind() -> UseKind`
     - `resolved_symbol() -> Option<&super::model::Symbol>` (or equivalent).

3. If your actual API names differ (e.g. `SymbolKind` instead of `DeclarationKind`, or `Use::resolved()` instead of `resolved_symbol()`), update the imports and method calls accordingly.

4. If you don’t have a generic `model.uses()` accessor, use whatever existing mechanism you already employ in other tests to iterate over uses and inspect their resolved symbols, and adapt the `foo_relation_uses` collection logic to that.
</issue_to_address>

### Comment 4
<location path="src/sema/variables.rs" line_range="57" />
<code_context>
+}
+
+impl SemanticModelBuilder {
+    pub(crate) fn walk_variable_uses(&mut self, expr: &Expr, context: VariableUseContext<'_>) {
+        match expr {
+            Expr::Variable(name) => self.record_variable_use(name, context),
</code_context>
<issue_to_address>
**issue (complexity):** Consider inlining the recursive traversal in each match arm to avoid building temporary expression lists and helper calls.

You can simplify this traversal and avoid temporary `Vec<&Expr>` allocations by recursing directly in each match arm and inlining the “list” helpers.

For example, `Apply` / `Call` can iterate directly instead of collecting:

```rust
Expr::Apply { args, .. } | Expr::Call { args, .. } => {
    for arg in args {
        self.walk_variable_uses(arg, context);
    }
}
```

The same applies to `Tuple` / `VecLit`:

```rust
Expr::Tuple(items) | Expr::VecLit(items) => {
    for item in items {
        self.walk_variable_uses(item, context);
    }
}
```

You can also inline `walk_one_variable_use`, `walk_variable_use_list`, and `walk_leading_and_rest` by spelling out the recursion, which makes control flow clearer and removes indirection:

```rust
Expr::MethodCall { recv, args, .. } => {
    self.walk_variable_uses(recv, context);
    for arg in args {
        self.walk_variable_uses(arg, context);
    }
}

Expr::Match { scrutinee, arms } => {
    self.walk_variable_uses(scrutinee, context);
    for arm in arms {
        self.walk_variable_uses(&arm.body, context);
    }
}

Expr::BitSlice { expr, hi, lo } => {
    self.walk_variable_uses(expr, context);
    self.walk_variable_uses(hi, context);
    self.walk_variable_uses(lo, context);
}

Expr::Struct { fields, .. } => {
    for (_, value) in fields {
        self.walk_variable_uses(value, context);
    }
}
```

`walk_for_loop_uses` can be kept as a small helper (since it has optional guard) but still avoid `walk_one_variable_use`:

```rust
fn walk_for_loop_uses(&mut self, exprs: ForLoopExprs<'_>, context: VariableUseContext<'_>) {
    self.walk_variable_uses(exprs.iterable, context);
    if let Some(guard) = exprs.guard {
        self.walk_variable_uses(guard, context);
    }
    self.walk_variable_uses(exprs.body, context);
}
```

Similarly, `walk_map_entries_uses` doesn’t need a collected `Vec`:

```rust
fn walk_map_entries_uses(&mut self, entries: &[(Expr, Expr)], context: VariableUseContext<'_>) {
    for (key, value) in entries {
        self.walk_variable_uses(key, context);
        self.walk_variable_uses(value, context);
    }
}
```

Concretely, you can:

1. Remove `walk_one_variable_use`, `walk_variable_use_list`, and `walk_leading_and_rest`.
2. Replace each `walk_variable_use_list(...)` / `walk_one_variable_use(...)` call with direct recursion as above.
3. Keep `walk_for_loop_uses` and `walk_map_entries_uses` if you like the semantic grouping, but have them recurse directly instead of building intermediate collections.

This preserves all semantics while reducing allocations and making the traversal a straightforward recursive descent over `Expr`.
</issue_to_address>

### Comment 5
<location path="src/sema/variables.rs" line_range="57" />
<code_context>
+}
+
+impl SemanticModelBuilder {
+    pub(crate) fn walk_variable_uses(&mut self, expr: &Expr, context: VariableUseContext<'_>) {
+        match expr {
+            Expr::Variable(name) => self.record_variable_use(name, context),
</code_context>
<issue_to_address>
**suggestion (review_instructions):** `walk_variable_uses` has a large match with many arms, likely pushing cyclomatic complexity over 12; consider refactoring into smaller helpers.

This function matches on `Expr` with a fairly large number of arms (`Variable`, `Literal/Break/Continue`, `Apply/Call`, `MethodCall`, field access, grouping/unary variants, `BitSlice`, `Struct`, `Tuple/VecLit`, `IfElse`, `Binary`, `ForLoop`, `Match`, `MapLit`). That many branches likely exceeds the stated cyclomatic complexity limit of 12.

You could reduce complexity by extracting some groups into dedicated helpers, e.g.:
- a helper for the “single subexpression” cases (`FieldAccess`/`TupleIndex`/`Group`/`AtomDiff`/`AtomDelay`/`Unary`/`Closure`/`Return`),
- a helper for list-like cases (`Apply`/`Call`/`Tuple`/`VecLit`/`Struct`/`MapLit`),
- and keep the more structurally complex ones (`IfElse`, `ForLoop`, `Match`) in the main function.

That should bring `walk_variable_uses` back under the desired complexity cap while keeping behaviour unchanged.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.rs`

**Instructions:**
Seek to keep the cyclomatic complexity of functions no more than 12.

</details>
</issue_to_address>
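The suggested split can be sketched with a tiny stand-in `Expr`; the point is that grouping arms into helpers caps each function's branch count while leaving traversal behaviour unchanged:

```rust
// Skeleton of the suggested decomposition; Expr is a toy stand-in for the
// real AST in src/parser.
enum Expr {
    Variable(String),
    Group(Box<Expr>),
    Call(Vec<Expr>),
    Tuple(Vec<Expr>),
}

struct Builder {
    uses: Vec<String>,
}

impl Builder {
    fn walk_variable_uses(&mut self, expr: &Expr) {
        match expr {
            Expr::Variable(name) => self.uses.push(name.clone()),
            // Single-subexpression arms recurse directly...
            Expr::Group(inner) => self.walk_variable_uses(inner),
            // ...and list-like arms share one helper.
            Expr::Call(items) | Expr::Tuple(items) => self.walk_list(items),
        }
    }

    fn walk_list(&mut self, items: &[Expr]) {
        for item in items {
            self.walk_variable_uses(item);
        }
    }
}

fn main() {
    let expr = Expr::Call(vec![
        Expr::Variable("x".into()),
        Expr::Group(Box::new(Expr::Variable("y".into()))),
    ]);
    let mut b = Builder { uses: Vec::new() };
    b.walk_variable_uses(&expr);
    assert_eq!(b.uses, vec!["x".to_owned(), "y".to_owned()]);
}
```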

### Comment 6
<location path="docs/execplans/3-3-1-symbol-table-and-scope-resolution.md" line_range="55" />
<code_context>
+owned semantic state:
+
+- `src/parser/mod.rs` parses source into `Parsed`.
+- `src/parser/cst_builder/mod.rs` stores the `GreenNode`, typed AST `Root`,
+  parse-time `SemanticRule` values from top-level `for` desugaring, and parser
+  errors.
</code_context>
<issue_to_address>
**issue (review_instructions):** AST is introduced without expansion; please define it (for example, “abstract syntax tree (AST)”) at first use.

To follow the documentation guideline about defining uncommon acronyms, expand AST the first time it appears in this document, such as “typed abstract syntax tree (AST) `Root`.” Subsequent uses can then safely use the acronym alone.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.md`

**Instructions:**
Define uncommon acronyms on first use.

</details>
</issue_to_address>

@leynos
Owner Author

leynos commented Mar 20, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response.

src/parser/ast/semantic_rule.rs

Comment on file

```rust
    pub fn new(
        origin: SemanticRuleOrigin,
        source_span: Span,
        patterns: Vec<Pattern>,
        head: Expr,
        body: Vec<Expr>,
    ) -> Self {
```

❌ New issue: Excess Number of Function Arguments
SemanticRule.new has 5 arguments, max arguments = 4
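One conventional fix for an over-long argument list is a parameter object, which is the shape the later `SemanticRuleSpec` commit adopts. In this hedged sketch, `Span`, `Pattern`, and `Expr` are placeholder types, not the real parser AST:

```rust
// Hypothetical parameter-object refactor: bundling the five constructor
// arguments into a spec struct keeps `new` at one argument.
struct Span {
    start: usize,
    end: usize,
}
type Pattern = String;
type Expr = String;

enum SemanticRuleOrigin {
    TopLevelFor,
}

struct SemanticRuleSpec {
    origin: SemanticRuleOrigin,
    source_span: Span,
    patterns: Vec<Pattern>,
    head: Expr,
    body: Vec<Expr>,
}

struct SemanticRule {
    spec: SemanticRuleSpec,
}

impl SemanticRule {
    fn new(spec: SemanticRuleSpec) -> Self {
        Self { spec }
    }
}

fn main() {
    let rule = SemanticRule::new(SemanticRuleSpec {
        origin: SemanticRuleOrigin::TopLevelFor,
        source_span: Span { start: 0, end: 10 },
        patterns: vec!["x".to_owned()],
        head: "Out(x)".to_owned(),
        body: vec!["In(x)".to_owned()],
    });
    assert!(matches!(rule.spec.origin, SemanticRuleOrigin::TopLevelFor));
    assert_eq!(rule.spec.patterns.len(), 1);
    assert_eq!(rule.spec.source_span.end, 10);
}
```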

@coderabbitai
Contributor

coderabbitai Bot commented Mar 20, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
echo "=== src/sema/resolve.rs ==="
cat src/sema/resolve.rs

echo ""
echo "=== src/linter/rule/tests.rs ==="
cat src/linter/rule/tests.rs

echo ""
echo "=== src/sema/tests.rs ==="
cat src/sema/tests.rs

echo ""
echo "=== src/sema/variables.rs ==="
cat src/sema/variables.rs

echo ""
echo "=== docs/execplans/3-3-1-symbol-table-and-scope-resolution.md (first 80 lines) ==="
head -80 docs/execplans/3-3-1-symbol-table-and-scope-resolution.md
```

Length of output: 31756


@leynos — here is the status of each item against the current codebase.


Annulled requirement

Comment 4 is annulled. It asks to inline all helper dispatch back into a single match arm inside walk_variable_uses, which would undo the very decomposition that Comment 5 mandates to keep cyclomatic complexity below 12. The two requirements are in direct conflict; Comment 4 loses.


Status of remaining comments

| # | Location | Resolved? | Evidence |
| --- | --- | --- | --- |
| 1 | `src/sema/resolve.rs` — `_` deduplication | ✅ Yes | `resolve_name` guards `name == "_"` at its very first statement; `record_variable_use` delegates to `resolve_name` unconditionally — one code path for `_` throughout. |
| 2 | `src/linter/rule/tests.rs` — brittle assertions | ✅ Yes | Assertions now use `semantic_model.scope(program_scope).kind() == ScopeKind::Program` and `!semantic_model.symbols().is_empty()`, with no raw `ScopeId(0)` or exact symbol count. |
| 3 | `src/sema/tests.rs` — relation/function disambiguation test | ✅ Yes | `relation_use_prefers_relation_over_function_with_same_name` is present and asserts that a `UseKind::Relation` use resolves to `DeclarationKind::Relation` and not to the same-named function. |
| 4 | `src/sema/variables.rs` — inline traversal | ⛔ Annulled | Would increase cyclomatic complexity. |
| 5 | `src/sema/variables.rs` — cyclomatic complexity | ✅ Yes | `walk_variable_uses` now dispatches to `walk_variable_uses_call_like`, `_unary_simple`, `_composite`, `_conditional`, and `_loop_and_match`; no intermediate `Vec<&Expr>` allocations; the main function is well within the limit. |
| 6 | `docs/execplans/3-3-1-symbol-table-and-scope-resolution.md` — AST acronym | ✅ Yes | Line ~56 now reads "typed abstract syntax tree (AST) `Root`". |

The two overall comments (per-scope/name symbol indexing and RuleCtx::new deprecation) are advisory suggestions, not blocking requirements. No change is needed for them unless you wish to act on them proactively.

@coderabbitai

This comment was marked as resolved.

- Moved lowering logic of top-level `for` statements from `top_level_for.rs` into a new module `top_level_for/lowering.rs`.
- Re-exported lowering function and diagnostics from `top_level_for.rs`.
- Improved code organization and separation of concerns for top-level `for` parsing and lowering.
- Refactored semantic declaration collection in `sema/builder.rs` to reduce repetition by introducing a helper closure.
- Enhanced test code for clarity by replacing manual `Option` handling with `expect` calls carrying descriptive messages.
- Minor documentation and test improvements related to semantic scope resolution.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
codescene-delta-analysis[bot]

This comment was marked as outdated.

Changed SemanticModel methods `scope` and `symbol` to return `Option` instead of panicking on invalid IDs. Updated call sites to handle Option accordingly, improving error handling and API ergonomics for semantic scopes and symbols.

Also refactored SemanticRule to use a new SemanticRuleSpec struct for construction, enhancing code clarity and encapsulation.

Added a minor annotation to suppress the clippy `expect_used` lint in test code.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
codescene-delta-analysis[bot]

This comment was marked as outdated.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/linter/rule/tests.rs`:
- Around line 154-174: Remove the clippy suppression attribute and stop using
.expect(...) in the test rule_ctx_exposes_source_text_and_ast_roots: delete the
#[expect(clippy::expect_used,...)] line and replace the
semantic_model.scope(program_scope).expect("program scope id should resolve")
usage with an assertion on the Option (e.g.,
assert!(semantic_model.scope(program_scope).is_some()) and then unwrap via a
matched binding or if let to check .kind() equals ScopeKind::Program) so the
test asserts the Option presence explicitly without silencing the lint.

In `@src/parser/top_level_for/lowering.rs`:
- Around line 23-36: Differentiate the two diagnostic paths: when
collect_lowered_terms(expression, &mut patterns, &mut body) returns None, push
the UNSUPPORTED_TOP_LEVEL_FOR_BODY diagnostic (use statement_span and
errors.push as currently done); when is_supported_head_expression(&head) returns
false, push UNSUPPORTED_TOP_LEVEL_FOR_HEAD instead of the current duplicate
message. Locate the two branches around collect_lowered_terms and
is_supported_head_expression in lowering.rs and swap the diagnostic constant
used in the first branch to UNSUPPORTED_TOP_LEVEL_FOR_BODY and the second branch
to UNSUPPORTED_TOP_LEVEL_FOR_HEAD so callers can distinguish body vs head
validation failures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 794d6921-1b10-4bf1-a68d-87c888f60abb

📥 Commits

Reviewing files that changed from the base of the PR and between 80ccadc and 108a072.

📒 Files selected for processing (5)
  • src/linter/rule/tests.rs
  • src/parser/ast/mod.rs
  • src/parser/ast/semantic_rule.rs
  • src/parser/top_level_for/lowering.rs
  • src/sema/model.rs

Comment thread src/linter/rule/tests.rs Outdated
Comment thread src/parser/top_level_for/lowering.rs
…and body

The diagnostic for unsupported top-level `for` statements has been refined by splitting it into two distinct messages:
- UNSUPPORTED_TOP_LEVEL_FOR_HEAD for invalid head expressions (non-atom-like)
- UNSUPPORTED_TOP_LEVEL_FOR_BODY for unsupported control flow in the body

This change updates the constants, diagnostics, test cases, and error matching to improve diagnostic precision and clarity.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
codescene-delta-analysis[bot]

This comment was marked as outdated.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
src/linter/rule/tests.rs (1)

164-168: 🧹 Nitpick | 🔵 Trivial

Remove duplicate scope lookups in this assertion block.

Fetch the scope once and assert through Option mapping to reduce repetition and keep the test tighter.

♻️ Proposed simplification
```diff
-    assert!(semantic_model.scope(program_scope).is_some());
-    let Some(program_scope_ref) = semantic_model.scope(program_scope) else {
-        panic!("program scope id should resolve");
-    };
-    assert_eq!(program_scope_ref.kind(), ScopeKind::Program);
+    assert_eq!(
+        semantic_model.scope(program_scope).map(|scope| scope.kind()),
+        Some(ScopeKind::Program)
+    );
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/linter/rule/tests.rs` around lines 164 - 168, The test currently calls
semantic_model.scope(program_scope) twice; fetch the Option once (e.g., via if
let / match) into a single binding (replace the repeated calls with a single let
Some(program_scope_ref) = semantic_model.scope(program_scope) else {
panic!("program scope id should resolve"); };) and then
assert_eq!(program_scope_ref.kind(), ScopeKind::Program) to remove duplication
and tighten the assertion around program_scope_ref and ScopeKind::Program.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/parser/top_level_for/lowering.rs`:
- Line 1: Expand the module doc comment at the top of lowering.rs to explain
what "lowering" means here and what this module produces: describe that it
transforms high-level top-level `for` statements into a lower-level/desugared
representation (e.g., control-flow/IR nodes or lowered AST) suitable for
subsequent analysis or code generation, mention any key invariants or outputs
consumers should expect (like desugared loop constructs, canonicalized
iterators, or emitted temporary bindings), and reference the module's
responsibility for handling scope and bindings for top-level `for` lowering so
readers know where to look for related helpers.

---

Duplicate comments:
In `@src/linter/rule/tests.rs`:
- Around line 164-168: The test currently calls
semantic_model.scope(program_scope) twice; fetch the Option once (e.g., via if
let / match) into a single binding (replace the repeated calls with a single let
Some(program_scope_ref) = semantic_model.scope(program_scope) else {
panic!("program scope id should resolve"); };) and then
assert_eq!(program_scope_ref.kind(), ScopeKind::Program) to remove duplication
and tighten the assertion around program_scope_ref and ScopeKind::Program.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e65b6938-ea01-4f2d-b456-c56563e27ae7

📥 Commits

Reviewing files that changed from the base of the PR and between 108a072 and 872905f.

📒 Files selected for processing (5)
  • src/linter/rule/tests.rs
  • src/parser/tests/rules/top_level_for.rs
  • src/parser/top_level_for.rs
  • src/parser/top_level_for/lowering.rs
  • tests/top_level_for_desugaring.rs

Comment thread src/parser/top_level_for/lowering.rs
Add detailed documentation for the `lowering` module in the `top_level_for` parser folder, explaining the purpose and invariants of the lowering helpers that translate parser syntax into desugared semantic rules. This clarifies module responsibilities and expected behavior for maintainers.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
@leynos merged commit 10e5e01 into main Mar 20, 2026
4 checks passed
@leynos deleted the implement-symbol-table-scope-resolution-1frlbh branch March 20, 2026 19:15
