
Extend Symbol with precise identifier span and two-provenance model #256

Merged
leynos merged 10 commits into main from add-symbol-name-span-5p0bo1
Apr 11, 2026

Conversation

@leynos
Owner

@leynos leynos commented Apr 8, 2026

Summary

  • Adds precise identifier span support to Symbol via an optional name_span field.
  • When present, name_span points at the exact identifier token; span stays coarse and remains the fallback when name_span is absent.
  • AST nodes (Function, Relation, TypeDef) gain a name_span() accessor; parser helpers are added.
  • Semantic model construction and traversal propagate name_span to new symbols where CST data is present.
  • Lint diagnostics (e.g., unused-variable) prefer name_span when available.
  • Documentation and tests are updated to describe and exercise the two-provenance model (span and name_span).

Changes

  • Core

    • Symbol: add name_span: Option<Span> and name_span() accessor
  • AST

    • Function, Relation, TypeDef: implement name_span() using shared helper
    • Add find_identifier_span helper in parser/ast/mod.rs
  • Parser

    • New helpers find_identifier_span and find_identifier_span_in_range; the latter restricts the lookup to a search range
  • Semantic model builder/traversal

    • Wire name_span into SymbolSpec creation for Declarations
    • Collect and attach name_span for rule-head and binding symbols
    • Propagate name_span through collect_* where CST available
  • Lint rules

    • unused_variable uses name_span for diagnostics when present
  • Tests

    • New tests in src/sema/tests/name_span.rs, wired into src/sema/tests.rs
  • Documentation

    • Update docs/ddlint-design.md to describe the new two-provenance model (span and name_span)
    • Update the parsing notes in docs/parser-implementation-notes.md

  • Test plan

    • Unit tests for identifier spans
    • Regression checks

Notes

  • name_span is optional to preserve backward compatibility; existing spans are preserved for diagnostics when precise spans are unavailable (e.g., parse-time SemanticRule values).
  • The semantic test suite gains additive, dedicated tests asserting the presence and correctness of name_span-derived spans.
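
A minimal sketch of the two-provenance model described in these notes. It assumes Span is a byte range and that both accessors return references; the field set is pared down and the `new` constructor is hypothetical, so treat this as an illustration rather than the code in src/sema/model.rs.

```rust
use std::ops::Range;

/// Byte range into the source text (an assumed representation for this sketch).
type Span = Range<usize>;

/// Simplified stand-in for the semantic `Symbol`; most real fields are omitted.
#[derive(Debug, Clone)]
struct Symbol {
    name: String,
    /// Coarse provenance: the enclosing declaration or rule literal.
    span: Span,
    /// Precise provenance: the identifier token, when CST data was available.
    name_span: Option<Span>,
}

impl Symbol {
    /// Hypothetical constructor used only by this sketch.
    fn new(name: &str, span: Span, name_span: Option<Span>) -> Self {
        Self { name: name.to_string(), span, name_span }
    }

    fn name(&self) -> &str {
        &self.name
    }

    /// Behaviour is unchanged by the new field: always the coarse span.
    fn span(&self) -> &Span {
        &self.span
    }

    /// `None` for symbols that were built without CST data.
    fn name_span(&self) -> Option<&Span> {
        self.name_span.as_ref()
    }
}
```

Because name_span is an Option, callers that only know span() keep working, and callers that want precision opt in explicitly.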

Closes #253

📎 Task: https://www.devboxer.com/task/e372029c-5fe5-45f1-840c-7a52066a4fcf

codescene-delta-analysis[bot]

This comment was marked as outdated.

@sourcery-ai
Contributor

sourcery-ai Bot commented Apr 8, 2026

Reviewer's Guide

Extends the semantic Symbol model with an optional precise identifier span (name_span), wires it through AST, semantic builder/traversal, and lints so diagnostics like unused-variable can highlight the exact identifier token while preserving existing coarse spans for provenance.

Sequence diagram for diagnostics using Symbol.name_span in unused-variable

sequenceDiagram
    actor Developer
    participant Editor
    participant Linter
    participant Parser
    participant Ast as AstNodes
    participant Builder as SemanticModelBuilder
    participant SymbolTable
    participant Rule as unused_variable_rule

    Developer->>Editor: Save ddlog source file
    Editor->>Linter: Run ddlog lints

    Linter->>Parser: parse(source)
    Parser-->>Linter: AstRoot

    Linter->>Builder: build_semantic_model(AstRoot)
    Builder->>Ast: iterate_declarations()
    Ast-->>Builder: Function/Relation/TypeDef nodes

    Builder->>Ast: function.name()
    Ast-->>Builder: name
    Builder->>Ast: function.name_span()
    Ast->>Ast: find_identifier_span(syntax, name)
    Ast-->>Builder: Option name_span
    Builder->>SymbolTable: declare_symbol(SymbolSpec span, name_span)

    Builder->>Ast: rule.body_expression_nodes()
    Ast-->>Builder: nodes
    Builder->>Ast: rule.body_terms()
    Ast-->>Builder: terms
    loop for each term
        Builder->>Builder: collect_rule_term(context, Some(node.syntax), term)
        Builder->>SymbolTable: declare_symbol(SymbolSpec span, name_span from find_identifier_span)
    end

    Linter->>Rule: run(unused_variable, SymbolTable)
    loop for each rule_local_symbol
        Rule->>SymbolTable: get_symbol()
        SymbolTable-->>Rule: Symbol
        Rule->>Rule: span = symbol.name_span().unwrap_or(symbol.span())
        Rule-->>Linter: LintDiagnostic(span)
    end

    Linter-->>Editor: diagnostics with precise identifier ranges
    Editor-->>Developer: highlight unused variable identifiers

Class diagram for Symbol, SymbolSpec, and AST name_span propagation

classDiagram
    class Span

    class Symbol {
        +String name
        +DeclarationKind kind
        +SymbolOrigin origin
        +ScopeId scope
        +Span span
        +Option~Span~ name_span
        +usize source_order
        +usize visible_from_rule_order
        +fn span() Span
        +fn name_span() Option~Span~
        +fn source_order() usize
        +fn visible_from_rule_order() usize
    }

    class SymbolSpec {
        +String name
        +DeclarationKind kind
        +SymbolOrigin origin
        +ScopeId scope
        +Span span
        +Option~Span~ name_span
        +usize source_order
        +usize visible_from_rule_order
    }

    class SemanticModelBuilder {
        -Vec~SymbolSpec~ symbols
        +fn declare_symbol(spec SymbolSpec) void
        +fn collect_rule_term(context VariableUseContext, syntax Option~SyntaxNode~, term RuleBodyTerm) void
        +fn collect_assignment_term(assign RuleAssignment, syntax Option~SyntaxNode~, context VariableUseContext) void
        +fn collect_for_loop_term(for_loop RuleForLoop, syntax Option~SyntaxNode~, context VariableUseContext) void
        +fn collect_head_expr(expr RelationHeadExpr, ctx RuleContext) void
    }

    class Function {
        +SyntaxNode syntax
        +fn name() Option~String~
        +fn name_span() Option~Span~
        +fn is_extern() bool
    }

    class Relation {
        +SyntaxNode syntax
        +fn name() Option~String~
        +fn name_span() Option~Span~
        +fn is_input() bool
        +fn is_output() bool
    }

    class TypeDef {
        +SyntaxNode syntax
        +fn name() Option~String~
        +fn name_span() Option~Span~
        +fn is_extern() bool
    }

    class RuleBodyTerm {
    }

    class RuleAssignment {
        +Pattern pattern
        +Expression value
    }

    class RuleForLoop {
        +Pattern pattern
        +Expression iterable
        +RuleBodyTerm body
    }

    class DdlogRule {
        +SyntaxNode syntax
        +fn heads() Result~Vec~RuleHead~~
        +fn body_terms() Result~Vec~RuleBodyTerm~~
        +fn body_expression_nodes() Vec~AstNode~
    }

    class RuleHead {
        +RelationHeadExpr atom
        +Option~LocationExpr~ location
    }

    class VariableUseContext {
        +fn new(scope ScopeId, span Span, rule_order_limit usize, literal_span Span, literal_index usize) VariableUseContext
        +fn current_scope() ScopeId
        +fn span() Span
        +fn rule_order_limit() usize
        +fn literal_index() usize
    }

    class SyntaxNode {
    }

    class find_identifier_span {
        <<function>>
        +fn find_identifier_span(syntax SyntaxNode, name &str) Option~Span~
    }

    class unused_variable_rule {
        <<lint_rule>>
        +fn check(symbol Symbol) Option~LintDiagnostic~
    }

    SymbolSpec ..> Span
    Symbol ..> Span
    Function ..> Span
    Relation ..> Span
    TypeDef ..> Span

    SemanticModelBuilder o--> SymbolSpec : builds
    SemanticModelBuilder o--> Symbol : stores

    DdlogRule --> RuleHead : has
    DdlogRule --> RuleBodyTerm : has

    RuleBodyTerm <|-- RuleAssignment
    RuleBodyTerm <|-- RuleForLoop

    Function ..> SyntaxNode
    Relation ..> SyntaxNode
    TypeDef ..> SyntaxNode
    DdlogRule ..> SyntaxNode

    find_identifier_span ..> SyntaxNode : traverses
    find_identifier_span ..> Span : returns

    Function ..> find_identifier_span : uses
    Relation ..> find_identifier_span : uses
    TypeDef ..> find_identifier_span : uses

    SemanticModelBuilder ..> DdlogRule : reads
    SemanticModelBuilder ..> VariableUseContext : uses

    unused_variable_rule ..> Symbol : reads
    unused_variable_rule ..> Span : uses span_or_name_span

File-Level Changes

Change: Add optional precise identifier span to Symbol and plumb it through semantic symbol creation.
Details:
  • Extend SymbolSpec and Symbol structs with an optional name_span field and accessor.
  • Pass name_span into Symbol construction in the semantic model builder.
  • Update top-level declaration collection to extract identifier name and name_span from AST nodes when declaring symbols.
  • Ensure rule-head and rule-local binding symbols are created with appropriate span and optional name_span values.
Files:
  • src/sema/builder.rs
  • src/sema/model.rs

Change: Capture identifier-token spans from CST/AST for declarations and rule-local bindings.
Details:
  • Introduce a recursive find_identifier_span helper over syntax trees to locate identifier tokens by text and return their Span.
  • Add name_span() helpers on Function, Relation, and TypeDef AST nodes using find_identifier_span.
  • Thread optional SyntaxNode context into rule term collection so assignment and for-loop bindings can derive name_span from CST nodes.
  • Update traversal of rule heads to declare RuleBinding symbols with identifier-level spans when CST is available.
Files:
  • src/parser/ast/mod.rs
  • src/parser/ast/function.rs
  • src/parser/ast/relation.rs
  • src/parser/ast/type_def.rs
  • src/sema/traverse.rs

Change: Prefer precise identifier spans in unused-variable diagnostics and validate behavior with tests.
Details:
  • Change unused-variable lint to use symbol.name_span().unwrap_or(symbol.span()) when mapping to text ranges for diagnostics.
  • Extend unused-variable tests to assert the diagnostic span covers only the identifier token.
  • Add a dedicated semantic test module validating name_span behavior for top-level declarations and rule-local bindings.
  • Wire the new name_span tests into the existing semantic test suite.
Files:
  • src/linter/rules/correctness/unused_variable.rs
  • src/sema/tests.rs
  • src/sema/tests/name_span.rs

Change: Document the dual-span provenance model for symbols and lint behavior.
Details:
  • Clarify in the design doc that symbol span and name_span carry distinct provenance (coarse declaration/literal vs. precise identifier).
  • Update the unused-variable rule description to explicitly mention preference for identifier-token spans when available.
  • Adjust provenance wording around rules and rule-local facts to account for additive precise spans.
Files:
  • docs/ddlint-design.md

Assessment against linked issues

Issue Objective Addressed Explanation
#253 Extend the Symbol model with an additive precise identifier span field (e.g., name_span) populated during semantic collection without changing the existing span() behavior.
#253 Update at least one lint rule (e.g., unused-variable) to use the precise identifier span for diagnostics when available, with tests verifying that name_span points at the identifier token rather than the enclosing span.
#253 Update documentation (docs/ddlint-design.md) to describe the new identifier span field on Symbol and clarify its semantics relative to the existing coarse span.
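
The second objective above, a lint that prefers the precise identifier span, can be sketched as follows. The unwrap_or fallback mirrors the expression quoted in the file-level changes; the pared-down Symbol, the diagnostic_span and highlighted helpers, and the byte-range Span are assumptions made for this illustration.

```rust
use std::ops::Range;

/// Byte range into the source text (an assumed representation).
type Span = Range<usize>;

/// Pared-down stand-in for the semantic `Symbol`.
struct Symbol {
    span: Span,
    name_span: Option<Span>,
}

impl Symbol {
    fn span(&self) -> &Span {
        &self.span
    }

    fn name_span(&self) -> Option<&Span> {
        self.name_span.as_ref()
    }
}

/// Hypothetical helper: the range a diagnostic should highlight.
/// Prefers the identifier token and falls back to the coarse span.
fn diagnostic_span(symbol: &Symbol) -> &Span {
    symbol.name_span().unwrap_or(symbol.span())
}

/// Hypothetical helper: the source text an editor would underline.
fn highlighted<'a>(source: &'a str, symbol: &Symbol) -> &'a str {
    &source[diagnostic_span(symbol).clone()]
}
```

For the rule text `Out(x) :- In(x), var y = x.`, a symbol for `y` with span 17..26 and name_span 21..22 highlights only `y`; the same symbol without name_span highlights the whole `var y = x` literal.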


@coderabbitai
Contributor

coderabbitai Bot commented Apr 8, 2026

Walkthrough

Adds name_span, an additive provenance field that records the precise identifier token and is captured during parsing and semantic collection. The PR surfaces name_span() on the relevant AST nodes and on Symbol, threads name_span through the semantic builder and traversal, and updates unused-variable diagnostics to prefer name_span with a fallback to the coarse span. Docs and tests are updated to match.

Changes

Cohort / File(s) and Summary

  • Documentation
    Files: docs/ddlint-design.md, docs/parser-implementation-notes.md
    Summary: Document coarse vs precise provenance policy; describe name_span semantics and diagnostic preference; add implementation notes for identifier-span sourcing.
  • Parser: Core utilities & tests
    Files: src/parser/ast/mod.rs, src/parser/ast/tests/identifier_span.rs
    Summary: Add find_identifier_span / find_identifier_span_in_range, traversal/span utilities and tests validating identifier-token discovery and range-scoped searches.
  • Parser: Node APIs & tests
    Files: src/parser/ast/function.rs, src/parser/ast/relation.rs, src/parser/ast/type_def.rs, src/parser/ast/rule_head/tests.rs
    Summary: Add name_span() accessors on AST wrappers, implement rule-head binding span extraction and tests to verify identifier token spans and skipping logic.
  • Parser: Rule head type
    Files: src/parser/ast/rule_head.rs
    Summary: Extend RuleHead with span and binding_spans: Vec<(String, Span)>; compute binding identifier spans, prune non-binding idents and deduplicate.
  • Semantic builder & traversal
    Files: src/sema/builder.rs, src/sema/traverse.rs
    Summary: Introduce SymbolSpec.name_span and persist into Symbol; thread optional SyntaxNode into rule-term collection; compute/store name_span for head/assignment/for-loop bindings when CST available; adjust traversal signatures.
  • Semantic model & tests
    Files: src/sema/model.rs, src/sema/tests.rs, src/sema/tests/name_span.rs
    Summary: Add Symbol.name_span: Option<Span> with accessor name_span(); add comprehensive tests asserting presence/absence and exact identifier-text for various declaration kinds and bindings.
  • Linter rule & tests
    Files: src/linter/rules/correctness/unused_variable.rs
    Summary: Prefer symbol.name_span() for diagnostic spans, falling back to symbol.span(); update unit test expected diagnostic range.
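
To illustrate why the parser cohort pairs find_identifier_span with a range-scoped variant, here is a small sketch over a flat token list. The real helpers walk a CST; the tokenizer, Token type and Kind enum below are hypothetical stand-ins, and only the two function names come from this PR.

```rust
use std::ops::Range;

type Span = Range<usize>;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Ident,
    Other,
}

struct Token<'a> {
    kind: Kind,
    text: &'a str,
    span: Span,
}

/// Toy tokenizer for the sketch: identifiers start with a letter and continue
/// with letters, digits or `_`; every other character is its own token.
fn tokenize(source: &str) -> Vec<Token<'_>> {
    let mut tokens = Vec::new();
    let mut chars = source.char_indices().peekable();
    while let Some((start, ch)) = chars.next() {
        let mut end = start + ch.len_utf8();
        let kind = if ch.is_alphabetic() {
            while let Some(&(i, next)) = chars.peek() {
                if !(next.is_alphanumeric() || next == '_') {
                    break;
                }
                end = i + next.len_utf8();
                chars.next();
            }
            Kind::Ident
        } else {
            Kind::Other
        };
        tokens.push(Token { kind, text: &source[start..end], span: start..end });
    }
    tokens
}

/// First identifier token with the given text, anywhere in the input.
fn find_identifier_span(tokens: &[Token<'_>], name: &str) -> Option<Span> {
    tokens
        .iter()
        .find(|t| t.kind == Kind::Ident && t.text == name)
        .map(|t| t.span.clone())
}

/// Same search, but only tokens fully inside `search` are considered.
fn find_identifier_span_in_range(tokens: &[Token<'_>], search: &Span, name: &str) -> Option<Span> {
    tokens
        .iter()
        .filter(|t| search.start <= t.span.start && t.span.end <= search.end)
        .find(|t| t.kind == Kind::Ident && t.text == name)
        .map(|t| t.span.clone())
}
```

On `Output(Output) :- Output(_).` the unscoped search for `Output` returns 0..6, the relation name, whereas scoping the search to the argument range 7..13 returns the binding occurrence.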

Sequence Diagram

sequenceDiagram
    participant Parser as Parser (CST / AST)
    participant Builder as SemanticBuilder
    participant Model as SemanticModel
    participant Linter as UnusedVariableRule

    Parser->>Builder: Provide AST nodes and optional SyntaxNode handle
    Builder->>Parser: Call find_identifier_span(syntax, name)
    Builder->>Builder: Create SymbolSpec { span, name_span? }
    Builder->>Model: Declare Symbol including optional name_span
    Linter->>Model: Query Symbol (span, name_span)
    Linter->>Linter: Choose name_span if present else span
    Linter->>Linter: Emit diagnostic at chosen span

Suggested reviewers

  • codescene-delta-analysis

Poem

✨ Pinpoint the token where the name will dwell,
Builders harvest letters and store the spell.
Symbols keep the coarse and a finer trace,
Lints point at the token to show the right place —
Provenance refined, diagnostics well.

🚥 Pre-merge checks | ✅ 7 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Developer Documentation ⚠️ Warning Pull request introduces undocumented internal APIs (PatternBindingSpec, declare_pattern_binding(), nodes_align check) and architectural changes lacking detail in parser-implementation-notes.md; ambiguous phrasing unresolved. Expand documentation with PatternBindingSpec description, parameter bundling rationale, collect_*() method details, nodes_align check purpose, and correct ambiguous phrasing to 'either syntax or semantic collection'.
Module-Level Documentation ⚠️ Warning Module declarations lack docstrings: mod name_span; and mod identifier_span; lack preceding documentation required by the coding standard. Add docstring comments immediately preceding each module declaration using the descriptive text already present in their source files for consistency.
✅ Passed checks (7 passed)
Check name Status Explanation
Linked Issues check ✅ Passed The PR fully addresses all acceptance criteria from #253: Symbol gains additive name_span field [symbol/builder/model], Symbol::span() unchanged [traverse/linter], name_span propagated through builder and traversal [builder/traverse], unused-variable rule uses it [rules/correctness], tests added [sema/tests/name_span.rs], and docs updated [ddlint-design.md/parser-notes.md].
Out of Scope Changes check ✅ Passed All changes directly support the objective of adding precise identifier spans to Symbol: AST accessor methods, parser helpers, semantic builder/traversal threading, lint rule updates, comprehensive tests, and design documentation—no unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Testing ✅ Passed All new functionality is guarded by comprehensive unit and behavioural tests across three dedicated test modules with 15 total test cases.
User-Facing Documentation ✅ Passed The PR appropriately documents user-facing behaviour changes within the project's current development phase by updating the design document and inline rule documentation.
Title check ✅ Passed The title accurately summarises the main changeset: extending Symbol with a precise identifier span and establishing a two-provenance model (span and name_span).
Description check ✅ Passed The description thoroughly relates to the changeset, detailing the addition of name_span to Symbol, AST node implementations, semantic model propagation, lint rule updates, and comprehensive test and documentation changes.


@leynos leynos marked this pull request as ready for review April 8, 2026 20:00
sourcery-ai[bot]

This comment was marked as resolved.

codescene-delta-analysis[bot]

This comment was marked as outdated.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7783aabdc5


Comment thread src/sema/traverse.rs Outdated
@coderabbitai coderabbitai Bot added the Issue label Apr 8, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sema/builder.rs`:
- Around line 156-160: Add a defensive invariant check before zipping body nodes
and terms: verify rule.body_expression_nodes().len() == rule.body_terms().len()
(e.g., with debug_assert_eq!) so node/term provenance cannot silently
desynchronize; locate the block around the if let Ok(terms) = rule.body_terms()
{ ... } that obtains body_nodes and insert the assertion before the for
(literal_index, (node, term)) in body_nodes.into_iter().zip(terms).enumerate()
loop (affecting rule.body_expression_nodes(), rule.body_terms(), span(), and
collect_rule_term()).

In `@src/sema/tests/name_span.rs`:
- Around line 70-88: The test currently validates that a head binding's
name_span points to the identifier token, but doesn't cover the collision case
where the relation name and the head binding share the same text; update the
test function rule_head_bindings_capture_identifier_name_spans to use a source
where the relation and head binding are the same token (e.g. "Output(Output) :-
Output(_)."), then ensure you still call find_symbol with
DeclarationKind::RuleBinding and SymbolOrigin::RuleHead to locate the binding
symbol and assert span_text(source, name_span) equals the binding token text and
span_text(source, symbol.span()) equals the full source so the name_span
resolves to the binding occurrence (not the relation occurrence).

In `@src/sema/traverse.rs`:
- Around line 29-37: The code currently derives name_span by searching the whole
rule text via find_identifier_span(rule.syntax(), &binding_name) which can pick
the wrong token; instead determine the precise identifier token from the binding
site syntax and pass that to find_identifier_span. Concretely, iterate or access
the binding syntax node for each binding produced by
collect_head_binding_names(&head.atom) (i.e., the head/atom binding node), then
call find_identifier_span(binding_syntax_node, &binding_name) and use that
result as SymbolSpec.name_span when calling self.declare_symbol(SymbolSpec { ...
}), keeping SymbolSpec.span as the broader span and leaving name_span optional
if the precise token cannot be found; also ensure diagnostics prefer the precise
name_span when present.
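
The first finding above, guarding the node/term zip with a length invariant, can be sketched like this. The generic helper name is hypothetical; only the debug_assert_eq! idea and the zip-then-enumerate shape come from the review comment.

```rust
/// Pairs each body node with its parsed term and records the literal index.
/// The assertion fires in debug builds if the two sequences have drifted apart,
/// because `zip` on its own would silently stop at the shorter one.
fn pair_body_nodes_with_terms<N, T>(nodes: Vec<N>, terms: Vec<T>) -> Vec<(usize, N, T)> {
    debug_assert_eq!(
        nodes.len(),
        terms.len(),
        "body nodes and body terms must stay aligned"
    );
    nodes
        .into_iter()
        .zip(terms)
        .enumerate()
        .map(|(literal_index, (node, term))| (literal_index, node, term))
        .collect()
}
```

debug_assert_eq! is compiled out of release builds by default, so this catches desynchronisation during development and testing without adding a release-time panic path.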
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d74ef601-245f-400d-8ac4-de62eb8263aa

📥 Commits

Reviewing files that changed from the base of the PR and between ae5f0bf and 7377cdc.

📒 Files selected for processing (11)
  • docs/ddlint-design.md
  • src/linter/rules/correctness/unused_variable.rs
  • src/parser/ast/function.rs
  • src/parser/ast/mod.rs
  • src/parser/ast/relation.rs
  • src/parser/ast/type_def.rs
  • src/sema/builder.rs
  • src/sema/model.rs
  • src/sema/tests.rs
  • src/sema/tests/name_span.rs
  • src/sema/traverse.rs

Comment thread src/sema/builder.rs Outdated
Comment thread src/sema/tests/name_span.rs
Comment thread src/sema/traverse.rs Outdated
codescene-delta-analysis[bot]

This comment was marked as outdated.

Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sema/traverse.rs (1)

147-158: 🧹 Nitpick | 🔵 Trivial

Clarify the intentional loss of precision for nested for-loop bindings.

Passing None for syntax in the recursive collect_rule_term call means nested for-loop body bindings will lack precise name_span. Add a brief comment explaining this is intentional (CST handles are not preserved through nested traversal) to prevent future maintainers from "fixing" this.

Suggested clarification
         for (term_offset, nested_term) in for_loop.body_terms.iter().enumerate() {
             self.collect_rule_term(
                 VariableUseContext::new(
                     child_scope,
                     context.literal_index() + term_offset,
                     context.span(),
                     context.rule_order_limit(),
                 ),
+                // Nested body terms do not retain CST handles; name_span unavailable.
                 None,
                 nested_term,
             );
         }
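
A small sketch of why passing None loses precision, assuming the binding span is derived with Option::and_then as in the refactor sketch later in this thread. The Node type, its offset field and the text-scanning find_identifier_span below are hypothetical stand-ins for the CST-based helper.

```rust
use std::ops::Range;

type Span = Range<usize>;

/// Hypothetical stand-in for a CST handle: a slice of source plus its offset.
struct Node<'a> {
    text: &'a str,
    offset: usize,
}

/// Finds the first whole identifier equal to `name` inside the node's text.
fn find_identifier_span(node: &Node<'_>, name: &str) -> Option<Span> {
    let mut start: Option<usize> = None;
    // A trailing sentinel flushes an identifier that ends at the end of the text.
    let sentinel = std::iter::once((node.text.len(), ' '));
    for (i, ch) in node.text.char_indices().chain(sentinel) {
        let is_ident_char = ch.is_alphanumeric() || ch == '_';
        match (start, is_ident_char) {
            (None, true) => start = Some(i),
            (Some(s), false) => {
                if &node.text[s..i] == name {
                    return Some((node.offset + s)..(node.offset + i));
                }
                start = None;
            }
            _ => {}
        }
    }
    None
}

/// With no CST handle there is nothing to search, so the result is `None`
/// and diagnostics fall back to the coarse span.
fn binding_name_span(syntax: Option<&Node<'_>>, name: &str) -> Option<Span> {
    syntax.and_then(|node| find_identifier_span(node, name))
}
```

The None branch is the nested for-loop case from the comment above: the binding symbol is still declared, it just carries no name_span.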
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sema/traverse.rs` around lines 147 - 158, The recursive call to
collect_rule_term from the for-loop (using VariableUseContext::new with
context.literal_index() + term_offset) intentionally passes None for the syntax
parameter, which causes nested for-loop bindings to lack precise name_span; add
a concise comment above this call explaining that this is deliberate because CST
handles/name-span information are not preserved across nested traversal and so
we intentionally drop syntax here to avoid misleading precision, referencing
collect_rule_term and VariableUseContext::new so future maintainers know this
was a conscious choice.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 281716ce-f2d4-4217-9e68-488aeec27394

📥 Commits

Reviewing files that changed from the base of the PR and between 7377cdc and fe9aff3.

📒 Files selected for processing (5)
  • src/parser/ast/mod.rs
  • src/parser/ast/rule_head.rs
  • src/sema/builder.rs
  • src/sema/tests/name_span.rs
  • src/sema/traverse.rs

@leynos
Owner Author

leynos commented Apr 9, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response.

src/parser/ast/mod.rs

Comment on file

pub(crate) fn find_identifier_span_in_range(
    syntax: &SyntaxNode<DdlogLanguage>,
    search_span: &Span,
    name: &str,
) -> Option<Span> {
    for child in syntax.children_with_tokens() {
        match child {
            NodeOrToken::Token(token)
                if token.kind() == SyntaxKind::T_IDENT && token.text() == name =>
            {
                let token_span = text_range_to_span(token.text_range());
                if span_contains(search_span, &token_span) {
                    return Some(token_span);
                }
            }
            NodeOrToken::Node(node) => {
                if !ranges_overlap(search_span, &text_range_to_span(node.text_range())) {
                    continue;
                }
                if let Some(span) = find_identifier_span_in_range(&node, search_span, name) {
                    return Some(span);
                }
            }
            NodeOrToken::Token(_) => {}
        }
    }

    None
}

❌ New issue: Deep, Nested Complexity
find_identifier_span_in_range has a nested complexity depth of 4, threshold = 4
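
One way to flatten the nesting flagged here is an explicit stack with match guards. The sketch below runs on a toy Element tree rather than the real syntax tree; Element, the constructor helpers and sample_rule are hypothetical, and no claim is made about the complexity score a given tool would assign.

```rust
use std::ops::Range;

type Span = Range<usize>;

/// Toy stand-in for a CST element: either a token or an interior node.
enum Element {
    Token { is_ident: bool, text: &'static str, span: Span },
    Node { span: Span, children: Vec<Element> },
}

fn span_contains(outer: &Span, inner: &Span) -> bool {
    outer.start <= inner.start && inner.end <= outer.end
}

fn ranges_overlap(a: &Span, b: &Span) -> bool {
    a.start < b.end && b.start < a.end
}

/// Depth-first, left-to-right search using an explicit stack instead of recursion.
fn find_identifier_span_in_range(root: &Element, search: &Span, name: &str) -> Option<Span> {
    let mut stack: Vec<&Element> = vec![root];
    while let Some(element) = stack.pop() {
        match element {
            Element::Token { is_ident: true, text, span }
                if *text == name && span_contains(search, span) =>
            {
                return Some(span.clone());
            }
            // Subtrees that cannot contain a match are pruned here.
            Element::Node { span, children } if ranges_overlap(search, span) => {
                // Reverse so the leftmost child is popped first.
                stack.extend(children.iter().rev());
            }
            _ => {}
        }
    }
    None
}

fn ident(text: &'static str, start: usize) -> Element {
    Element::Token { is_ident: true, text, span: start..start + text.len() }
}

fn punct(text: &'static str, start: usize) -> Element {
    Element::Token { is_ident: false, text, span: start..start + text.len() }
}

/// Hand-built tree for `Output(Output) :- Output(_).`
fn sample_rule() -> Element {
    let args = Element::Node { span: 7..13, children: vec![ident("Output", 7)] };
    let head = Element::Node {
        span: 0..14,
        children: vec![ident("Output", 0), punct("(", 6), args, punct(")", 13)],
    };
    let body = Element::Node {
        span: 18..27,
        children: vec![ident("Output", 18), punct("(", 24), punct("_", 25), punct(")", 26)],
    };
    Element::Node { span: 0..28, children: vec![head, punct(":-", 15), body, punct(".", 27)] }
}
```

Searching the whole rule (0..28) yields the first occurrence in document order, while narrowing the range to 7..13 or 18..27 selects the occurrence inside that subtree.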

codescene-delta-analysis[bot]

This comment was marked as outdated.

@coderabbitai

This comment was marked as resolved.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sema/traverse.rs (1)

90-101: 🧹 Nitpick | 🔵 Trivial

Extract one helper for pattern-binding symbol declaration.

Remove duplicated declare_symbol(SymbolSpec { ... }) construction across assignment and for-loop paths to prevent drift in provenance fields.

Refactor sketch
+    fn declare_pattern_bindings(
+        &mut self,
+        pattern: &ast::Pattern,
+        syntax: Option<&SyntaxNode<crate::DdlogLanguage>>,
+        origin: SymbolOrigin,
+        scope: ScopeId,
+        span: &Span,
+        visible_from_rule_order: usize,
+    ) {
+        for binding_name in collect_pattern_binding_names(pattern) {
+            let name_span = syntax.and_then(|node| find_identifier_span(node, &binding_name));
+            self.declare_symbol(SymbolSpec {
+                name: binding_name,
+                kind: DeclarationKind::RuleBinding,
+                origin,
+                scope,
+                span: span.clone(),
+                name_span,
+                source_order: self.symbols.len(),
+                visible_from_rule_order,
+            });
+        }
+    }
-
-        for binding_name in collect_pattern_binding_names(&assign.pattern) {
-            self.declare_symbol(SymbolSpec { ... });
-        }
+        self.declare_pattern_bindings(
+            &assign.pattern,
+            syntax,
+            SymbolOrigin::AssignmentPattern,
+            context.current_scope(),
+            context.span(),
+            context.literal_index() + 1,
+        );
-
-        for binding_name in collect_pattern_binding_names(&for_loop.pattern) {
-            self.declare_symbol(SymbolSpec { ... });
-        }
+        self.declare_pattern_bindings(
+            &for_loop.pattern,
+            syntax,
+            SymbolOrigin::ForPattern,
+            child_scope,
+            context.span(),
+            0,
+        );

As per coding guidelines, "Where a Rust function is too long, extract meaningfully named helper functions adhering to separation of concerns and the Command Query Responsibility Segregation pattern."

Also applies to: 134-145

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sema/traverse.rs` around lines 90 - 101, Duplicate construction and call
to declare_symbol(SymbolSpec { ... }) for pattern bindings should be extracted
into a single helper; create a new function (e.g., declare_pattern_binding or
declare_pattern_binding_symbols) that takes the binding name, context and
relevant metadata and builds the SymbolSpec (using DeclarationKind::RuleBinding
and SymbolOrigin::AssignmentPattern where appropriate) then calls
self.declare_symbol. Replace both the loop using
collect_pattern_binding_names(&assign.pattern) and the other similar site (lines
~134-145) to call this helper so provenance fields (scope, span, name_span,
source_order, visible_from_rule_order) are set in one place and cannot drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sema/traverse.rs`:
- Line 7: Refactor the traversal inside the function
find_identifier_span_in_range to replace its nested recursive branching with an
explicit DFS using a Vec<SyntaxElement> stack: keep the public
find_identifier_span unchanged, add small helpers token_overlaps_range(token,
range)->bool, node_overlaps_range(node, range)->bool and
is_matching_ident(token, name)->bool, push the root SyntaxElement(s) then loop
popping elements, skip tokens/nodes that don't overlap via the helpers
(preserving pruning), for nodes push their children in deterministic order and
for tokens check is_matching_ident and return the first matching identifier span
exactly as current behaviour; update parser AST unit tests to cover repeated
identifier names in separate subtrees, non-overlapping ranges, and exact nested
matches, then rustfmt and fix clippy warnings.
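
The explicit-stack traversal requested above can be sketched as follows. This is a minimal illustration over a toy tree, not the project's rowan-based code: `Element`, `ident`, and `sample_tree` are hypothetical stand-ins, a single `overlaps` helper replaces the separate token and node overlap checks, and requiring the matching token to lie fully inside the range is an assumption.

```rust
use std::ops::Range;

/// Toy stand-in for a CST element (the real code walks syntax nodes and tokens).
#[derive(Debug)]
enum Element {
    Node { range: Range<usize>, children: Vec<Element> },
    Ident { range: Range<usize>, text: String },
}

impl Element {
    fn range(&self) -> &Range<usize> {
        match self {
            Element::Node { range, .. } | Element::Ident { range, .. } => range,
        }
    }
}

/// True when two half-open ranges share at least one offset.
fn overlaps(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}

/// Returns the first identifier named `name` inside `within`, in source order.
fn find_identifier_span_in_range(
    root: &Element,
    name: &str,
    within: &Range<usize>,
) -> Option<Range<usize>> {
    let mut stack = vec![root];
    while let Some(element) = stack.pop() {
        if !overlaps(element.range(), within) {
            continue; // Prune subtrees that cannot contain a match.
        }
        match element {
            // Push children reversed so the leftmost child is popped first.
            Element::Node { children, .. } => stack.extend(children.iter().rev()),
            Element::Ident { range, text } => {
                let contained = within.start <= range.start && range.end <= within.end;
                if contained && text.as_str() == name {
                    return Some(range.clone());
                }
            }
        }
    }
    None
}

/// Hypothetical helper: builds an identifier token starting at `start`.
fn ident(start: usize, text: &str) -> Element {
    Element::Ident { range: start..start + text.len(), text: text.to_string() }
}

/// Toy tree for the source text `R(x) :- S(x).` (head at 0..4, body at 8..12).
fn sample_tree() -> Element {
    Element::Node {
        range: 0..13,
        children: vec![
            Element::Node { range: 0..4, children: vec![ident(0, "R"), ident(2, "x")] },
            Element::Node { range: 8..12, children: vec![ident(8, "S"), ident(10, "x")] },
        ],
    }
}
```

Pushing children in reverse keeps the first-match ordering of a recursive pre-order walk, which is what lets the repeated name `x` resolve to the head occurrence for the whole rule and to the body occurrence when the range is narrowed.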


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ca06b703-0a25-496c-8158-c24a613409cf

📥 Commits

Reviewing files that changed from the base of the PR and between fe9aff3 and 590b15d.

📒 Files selected for processing (1)
  • src/sema/traverse.rs

Comment thread src/sema/traverse.rs
leynos and others added 4 commits April 9, 2026 14:28
This introduces an optional `name_span` field on symbols, capturing the exact identifier token span when available. The semantic model builder and AST nodes now extract and store precise name spans for top-level declarations (relations, functions, types) and for rule-local bindings (head bindings, assignments, for-loop patterns).

The unused-variable lint and diagnostics have been updated to use these precise spans where possible for improved diagnostic highlighting.

Also includes new semantic-model tests verifying that the precise identifier spans are captured and differ from the coarser span previously used.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
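
As an illustration of the two spans this commit describes, here is a minimal Rust sketch rather than the project's actual code. It assumes `Span` is a byte range and that `name_span()` returns an optional reference; `diagnostic_span` is a hypothetical helper showing how a lint such as unused-variable might prefer the precise span and fall back to the coarse one.

```rust
use std::ops::Range;

/// Assumption: spans are byte ranges into the source text.
type Span = Range<usize>;

/// Simplified stand-in for the semantic model's `Symbol`.
struct Symbol {
    name: String,
    /// Coarse span, for example the whole rule or declaration.
    span: Span,
    /// Exact identifier token span, present only when CST data was available.
    name_span: Option<Span>,
}

impl Symbol {
    fn span(&self) -> &Span {
        &self.span
    }

    fn name_span(&self) -> Option<&Span> {
        self.name_span.as_ref()
    }

    /// Hypothetical helper: the span a diagnostic should highlight.
    fn diagnostic_span(&self) -> &Span {
        self.name_span().unwrap_or(&self.span)
    }
}
```

Keeping `name_span` optional means callers that only know the coarse span keep working unchanged, while callers that want precision opt in through the fallback.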
… function

Refactor collect_program_declarations by removing inlined closure and
pattern matching, replacing it with a new helper method
program_declaration_spec. This improves code readability and modularity
without changing behavior.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
…tic model

This change augments the RuleHead AST with span information for the head atom and its bindings. It introduces functionality in the parser to collect and store identifier-token spans for bindings introduced by the head atom. The semantic model builder now uses these spans to accurately track declaration name spans for rule head bindings.

Enhancements include:
- New functions to find identifier spans within given spans for precise range matching.
- Collection of binding spans in parsed rule heads, excluding non-bindings such as those following dots or certain punctuation.
- Usage of binding spans in semantic symbol declarations for improved source mapping.
- Additional tests to verify correct capturing and handling of name spans for rule head bindings.

This feature improves diagnostic accuracy and tooling support by providing detailed source location information for rule head bindings.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
Nested bodies recurse through `collect_rule_term`, but the `VariableUseContext::new(...)` path no longer retains CST handles for those inner terms. Passing `None` deliberately prevents nested bindings from claiming precise `name_span`s we cannot source faithfully.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
@devboxerhub devboxerhub Bot force-pushed the add-symbol-name-span-5p0bo1 branch from 590b15d to 03a0ad5 April 9, 2026 14:30
codescene-delta-analysis[bot]

This comment was marked as outdated.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sema/builder.rs`:
- Around line 203-214: Add a brief comment above the loop that declares
rule-head bindings explaining the intentional asymmetry: in the AST traversal
(traverse.rs) rule-head bindings get name_span: Some(...) from binding_spans,
but here in the semantic builder we set SymbolSpec.name_span = None for
DeclarationKind::RuleBinding / SymbolOrigin::SemanticRuleHead to mirror the same
asymmetry noted elsewhere (see the adjacent comment pattern around the other
rule-binding handling). Mention traverse.rs and SymbolSpec.name_span in the
comment so readers know where the name_span is captured for AST vs semantic
phases.

In `@src/sema/tests/name_span.rs`:
- Around line 137-158: Add a new test named
nested_for_loop_bindings_lack_name_span that parses the rule with a nested for
(e.g. "for (outer in Items(x)) for (inner in Nested(outer)) ..."), builds the
semantic model via build(parse_ok(...)), locates the inner binding using
find_symbol with DeclarationKind::RuleBinding and SymbolOrigin::ForPattern, and
assert that symbol.name_span().is_none() to lock in the deliberate behaviour
from traverse::collect_rule_term where nested for-loop term syntax handles are
passed as None.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: eacbfeb5-125a-448a-92ee-94af67ed5511

📥 Commits

Reviewing files that changed from the base of the PR and between 590b15d and 03a0ad5.

📒 Files selected for processing (12)
  • docs/ddlint-design.md
  • src/linter/rules/correctness/unused_variable.rs
  • src/parser/ast/function.rs
  • src/parser/ast/mod.rs
  • src/parser/ast/relation.rs
  • src/parser/ast/rule_head.rs
  • src/parser/ast/type_def.rs
  • src/sema/builder.rs
  • src/sema/model.rs
  • src/sema/tests.rs
  • src/sema/tests/name_span.rs
  • src/sema/traverse.rs

Comment thread src/sema/builder.rs
Comment thread src/sema/tests/name_span.rs
Refactored `SemanticModelBuilder` to use a new `declare_pattern_binding` helper
method when declaring pattern bindings for assignment and for-patterns. This
removes duplication and centralizes symbol declaration logic.

Additionally, improved identifier span search in AST by rewriting
`find_identifier_span_in_range` to use iterative traversal with helper functions.
Added related tests for identifier search correctness.

Minor doc formatting fixes also included.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Apr 10, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response.

src/sema/traverse.rs

Comment on file

    fn declare_pattern_binding(
        &mut self,
        binding_name: &str,
        origin: SymbolOrigin,
        scope: ScopeId,
        span: &Span,
        syntax: Option<&SyntaxNode<crate::DdlogLanguage>>,
        visible_from_rule_order: usize,
    ) {
        self.declare_symbol(SymbolSpec {
            name: binding_name.to_string(),
            kind: DeclarationKind::RuleBinding,
            origin,
            scope,
            span: span.clone(),
            name_span: syntax.and_then(|syntax| find_identifier_span(syntax, binding_name)),
            source_order: self.symbols.len(),
            visible_from_rule_order,
        });
    }

❌ New issue: Excess Number of Function Arguments
SemanticModelBuilder.declare_pattern_binding has 7 arguments, max arguments = 4

@coderabbitai

This comment was marked as resolved.

Add a new test verifying that nested for-loop bindings lack precise name_span
information due to missing CST handles, ensuring semantic model correctly
represents symbol spans in this case.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Apr 10, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response. Ensure that this is validated against the current version of the codegraph.

If further refinement to address this finding would be deleterious, please supply a clear explanatory one to two paragraph markdown message I can paste into the CodeScene web ui's diagnostic suppression function so this diagnostic can be silenced.

src/sema/tests/name_span.rs

Comment on file

fn rule_head_bindings_capture_identifier_name_spans() {
    let source = "Output(Output) :- Output(_).";
    let parsed = parse_ok(source);
    let semantic_model = super::super::build(&parsed);
    let symbol = find_symbol(
        &semantic_model,
        "Output",
        DeclarationKind::RuleBinding,
        SymbolOrigin::RuleHead,
    );

    let name_span = symbol
        .name_span()
        .unwrap_or_else(|| panic!("missing head binding name_span"));

    assert_eq!(span_text(source, name_span), "Output");
    assert_eq!(name_span.start, 7);
    assert_eq!(span_text(source, symbol.span()), source);
}

❌ New issue: Code Duplication
The module contains 3 functions with similar structure: assignment_bindings_capture_identifier_name_spans, for_loop_bindings_capture_identifier_name_spans, rule_head_bindings_capture_identifier_name_spans

@coderabbitai

This comment was marked as resolved.

…g parameters

Refactored declare_pattern_binding method to accept a PatternBindingSpec struct instead of multiple separate parameters. This consolidates related data into a single struct for improved readability and maintainability in traverse.rs.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
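
The parameter-object refactor described in this commit might look roughly like the sketch below. It is an assumption-laden illustration, not the merged code: the field set of `PatternBindingSpec` is inferred from the seven-argument signature quoted earlier, `Builder` and `Symbol` are simplified stand-ins, the CST handle is replaced by plain source text, and a naive substring search stands in for `find_identifier_span`.

```rust
use std::ops::Range;

type Span = Range<usize>;
type ScopeId = usize;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SymbolOrigin {
    AssignmentPattern,
    ForPattern,
}

/// Groups the provenance shared by every binding declared from one pattern.
struct PatternBindingSpec<'a> {
    origin: SymbolOrigin,
    scope: ScopeId,
    span: &'a Span,
    /// Stand-in for the optional CST handle: here simply the source text.
    syntax: Option<&'a str>,
    visible_from_rule_order: usize,
}

#[derive(Debug)]
struct Symbol {
    name: String,
    origin: SymbolOrigin,
    scope: ScopeId,
    span: Span,
    name_span: Option<Span>,
    source_order: usize,
    visible_from_rule_order: usize,
}

#[derive(Default)]
struct Builder {
    symbols: Vec<Symbol>,
}

impl Builder {
    /// Two arguments instead of seven: the binding name plus the shared spec.
    fn declare_pattern_binding(&mut self, binding_name: &str, spec: &PatternBindingSpec) {
        // Naive substring search, standing in for the real identifier lookup.
        let name_span = spec.syntax.and_then(|text| {
            text.find(binding_name)
                .map(|start| start..start + binding_name.len())
        });
        let source_order = self.symbols.len();
        self.symbols.push(Symbol {
            name: binding_name.to_string(),
            origin: spec.origin,
            scope: spec.scope,
            span: spec.span.clone(),
            name_span,
            source_order,
            visible_from_rule_order: spec.visible_from_rule_order,
        });
    }
}
```

Because the spec is borrowed, one value can be reused for every binding name collected from the same pattern, so the provenance fields are written once per pattern rather than once per call site.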
@leynos leynos changed the title from "Extend Symbol with precise identifier span (name_span) for diagnostics" to "Add precise identifier spans (name_span) for AST, sema, and lint rules" Apr 10, 2026
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Apr 10, 2026

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix. Include the file and symbol names indicated in the issue at the head of your response. Ensure that this is validated against the current version of the codegraph.

If further refinement to address this finding would be deleterious, please supply a clear explanatory one to two paragraph markdown message I can paste into the CodeScene web ui's diagnostic suppression function so this diagnostic can be silenced.

Code Duplication

src/sema/tests/name_span.rs:

What led to degradation?

The module contains 3 functions with similar structure: assignment_bindings_capture_identifier_name_spans, for_loop_bindings_capture_identifier_name_spans, rule_head_bindings_capture_identifier_name_spans

Why does this problem occur?

Duplicated code often leads to code that's harder to change since the same logical change has to be done in multiple functions. More duplication gives lower code health.

How to fix it?

A certain degree of duplicated code might be acceptable. The problems start when it is the same behavior that is duplicated across the functions in the module, i.e. a violation of the Don't Repeat Yourself (DRY) principle. DRY violations lead to code that is changed together in predictable patterns, which is both expensive and risky. DRY violations can be identified using CodeScene's X-Ray analysis to detect clusters of change coupled functions with high code similarity.
Once you have identified the similarities across functions, look to extract and encapsulate the concept that varies into its own function(s). These shared abstractions can then be re-used, which minimizes the amount of duplication and simplifies change.

@coderabbitai

This comment was marked as resolved.

Refactor binding declaration tests into a single parameterized test
function with multiple cases to improve readability and maintainability.
The new test covers rule head bindings, assignment pattern bindings, and
for loop bindings uniformly.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
@leynos leynos changed the title from "Add precise identifier spans (name_span) for AST, sema, and lint rules" to "Extend Symbol with precise identifier span" Apr 10, 2026
codescene-delta-analysis[bot]

This comment was marked as outdated.

…ions and bindings

This change enhances diagnostic precision by capturing exact spans of declaration and rule-local binding identifiers during parsing and semantic collection rather than re-walking the CST during linting.

- Added `find_identifier_span` helpers in `src/parser/ast/mod.rs` for locating identifier tokens within syntax nodes or ranges.
- Exposed `name_span()` accessors in AST nodes for functions, relations, and typedefs.
- Implemented `RuleHead::binding_spans` in `src/parser/ast/rule_head.rs` to record declaration-like binding identifiers from rule heads.
- Modified semantic model builder and traverser to thread and record precise `name_span` information for top-level declarations and rule-local bindings.
- Added extensive tests to validate correct span extraction and binding recognition.
- Documented implementation details in `docs/parser-implementation-notes.md`.

This improves diagnostics by providing accurate source spans that tools and linters can utilize consistently.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Owner Author

leynos commented Apr 11, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented Apr 11, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@leynos
Owner Author

leynos commented Apr 11, 2026

@coderabbitai Have the following now been resolved?

❌ Failed checks (3 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Testing | ⚠️ Warning | Pull request introduces significant new functionality with incomplete test coverage for critical components including binding span collection logic and new AST accessor methods. | Add unit tests for `collect_head_binding_spans` and `should_skip_binding_ident` helper functions; test new `name_span()` methods on Function, Relation, and TypeDef AST nodes; expand `find_identifier_span` tests for edge cases and nested structures. |
| Developer Documentation | ⚠️ Warning | PR introduces significant internal APIs and architectural changes without documenting implementation-level details in the designated implementation guide. | Update `docs/parser-implementation-notes.md` to document the `find_identifier_span` mechanism, `RuleHead::binding_spans` computation, and syntax parameter threading through semantic builder functions. |

@coderabbitai
Contributor

coderabbitai Bot commented Apr 11, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check 1: Docstring coverage – look for doc comments on public/pub(crate) fns added in this PR
echo "=== find_identifier_span doc comments ==="
rg -n '///|//!' src/parser/ast/mod.rs | head -40

echo "=== Function::name_span doc ==="
rg -n '///|//!' src/parser/ast/function.rs | head -20

echo "=== Relation::name_span doc ==="
rg -n '///|//!' src/parser/ast/relation.rs | head -20

echo "=== TypeDef::name_span doc ==="
rg -n '///|//!' src/parser/ast/type_def.rs | head -20

echo "=== Symbol::name_span doc ==="
rg -n '///|//!' src/sema/model.rs | head -20

echo "=== declare_pattern_binding doc ==="
rg -n '///|//!' src/sema/traverse.rs | head -30

echo "=== PatternBindingSpec doc ==="
rg -n 'PatternBindingSpec' src/sema/traverse.rs | head -10

# Check 2: Tests for collect_head_binding_spans, should_skip_binding_ident, name_span() on AST nodes
echo "=== Tests for collect_head_binding_spans ==="
rg -rn 'collect_head_binding_spans\|should_skip_binding_ident' --type rust | head -20

echo "=== Tests for Function/Relation/TypeDef name_span ==="
rg -rn 'Function.*name_span\|Relation.*name_span\|TypeDef.*name_span\|name_span.*Function\|name_span.*Relation\|name_span.*TypeDef' --type rust | head -20

echo "=== find_identifier_span edge case tests ==="
rg -n '#\[test\]\|fn.*test\|#\[rstest\]' src/parser/ast/mod.rs | head -30

# Check 3: docs/parser-implementation-notes.md
echo "=== parser-implementation-notes.md existence ==="
fd 'parser-implementation-notes' docs/

echo "=== parser-implementation-notes.md content (if exists) ==="
cat docs/parser-implementation-notes.md 2>/dev/null || echo "FILE NOT FOUND"

Length of output: 19872


🏁 Script executed:

#!/bin/bash
# Check tests inside src/parser/ast/mod.rs (the #[cfg(test)] block)
echo "=== mod.rs test module ==="
awk '/^#\[cfg\(test\)\]/,0' src/parser/ast/mod.rs | head -80

# Check Relation name_span doc comment specifically
echo "=== Relation name_span method ==="
rg -n 'name_span' src/parser/ast/relation.rs -A 5 -B 3

# Check Symbol name_span doc comment
echo "=== Symbol name_span method ==="
rg -n 'name_span' src/sema/model.rs -A 5 -B 3

# Check if there are any tests for name_span on AST nodes in test files
echo "=== AST name_span tests anywhere ==="
rg -rn 'name_span' --type rust src/parser/ | head -30

# Check rule_head.rs for collect_head_binding_spans test presence
echo "=== rule_head.rs test block ==="
awk '/^#\[cfg\(test\)\]/,0' src/parser/ast/rule_head.rs | head -60

Length of output: 3057


🏁 Script executed:

#!/bin/bash
# Locate and cat the identifier_span test file referenced in mod.rs
echo "=== identifier_span test file ==="
fd 'identifier_span' src/parser/ --exec cat {}

# Locate and cat the rule_head tests file
echo "=== rule_head/tests.rs ==="
fd 'tests.rs' src/parser/ast/rule_head/ --exec cat {}

# Check function.rs tests for name_span
echo "=== function.rs name_span test ==="
rg -n 'name_span\|fn function' src/parser/ast/function.rs | head -20

# Check type_def.rs tests for name_span  
echo "=== type_def.rs name_span test ==="
rg -n 'name_span\|fn typedef' src/parser/ast/type_def.rs | head -20

Length of output: 5142


@leynos Here is the status of each warning based on the current codebase state:


⚠️ Docstring Coverage (66.67% → 80% threshold)

Status: Likely improved, but unverifiable without running the coverage tool.

All new pub and pub(crate) symbols introduced in this PR now carry doc comments:

| Symbol | Doc comment |
| --- | --- |
| `find_identifier_span` | ✓ (src/parser/ast/mod.rs line 24) |
| `find_identifier_span_in_range` | ✓ (line 30) |
| `Function::name_span` | ✓ (function.rs line 32) |
| `Relation::name_span` | ✓ (relation.rs line 61) |
| `TypeDef::name_span` | ✓ (type_def.rs line 28) |
| `Symbol::name_span` | ✓ (model.rs line 214) |
| `PatternBindingSpec` | ✓ (traverse.rs line 23) |
| `declare_pattern_binding` | ✓ (traverse.rs line 195) |

Whether the 80% threshold is now crossed depends on the denominator (the total number of functions across the entire codebase). To confirm, run the project's coverage tool, or run `cargo doc --no-deps 2>&1 | grep -c 'missing documentation'`, which only reports anything when the `missing_docs` lint is enabled.


✅ Testing — RESOLVED

All three sub-items are now covered:

| Sub-item | Resolution |
| --- | --- |
| `collect_head_binding_spans` | 1 test in src/parser/ast/rule_head/tests.rs: `collect_head_binding_spans_skips_relation_callee_collision` |
| `should_skip_binding_ident` | 2 tests in the same file: `should_skip_binding_ident_rejects_callee_but_keeps_argument_binding`, `should_skip_binding_ident_rejects_dotted_and_constructor_positions` |
| `name_span()` on Function / Relation / TypeDef | Per-file inline tests present (`relation_name_span_points_to_declaration_identifier` confirmed; analogous tests confirmed in function.rs and type_def.rs) ✓ |
| `find_identifier_span` edge cases | 5 tests in src/parser/ast/identifier_span.rs (repeated-name subtree, non-overlapping range, exact nested token, absent name, first-match ordering) ✓ |

✅ Developer Documentation — RESOLVED

docs/parser-implementation-notes.md now contains a "Precise identifier span sourcing" section that covers all three required items:

  • The find_identifier_span / find_identifier_span_in_range helpers in src/parser/ast/mod.rs.
  • RuleHead::binding_spans computation and the collect_head_binding_spans filter logic.
  • Threading of the syntax parameter through collect_rule_term, collect_assignment_term, and collect_for_loop_term, including the deliberate syntax: None sentinel for nested traversals.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/parser-implementation-notes.md`:
- Around line 218-220: Reword the sentence that currently reads "Precise
diagnostic spans for declarations and rule-local bindings are captured once
during syntax or semantic collection rather than by re-walking the CST in lint
rules" to make the parallel explicit; replace "syntax or semantic collection"
with a clearer construction such as "either during syntax collection or during
semantic collection" (or "during syntax collection or during semantic analysis")
and ensure the phrase parallels "captured once" (e.g., "are captured once,
either during syntax collection or during semantic collection, rather than by
re-walking the CST in lint rules") so readers cannot misread the intended
meaning.

In `@src/parser/ast/function.rs`:
- Around line 87-94: The span_text helper is duplicated across function.rs,
relation.rs, type_def.rs, and tests/identifier_span.rs; extract it into a shared
test utility (e.g., add pub fn span_text(...) into src/test_util/mod.rs or
src/test_util/span.rs) and replace the local copies by importing that function
(use crate::test_util::span_text or appropriate module path) in each file
(function.rs, relation.rs, type_def.rs, tests/identifier_span.rs), removing the
duplicate definitions so all tests call the single shared span_text helper.

In `@src/sema/builder.rs`:
- Around line 155-176: The loop currently indexes body_nodes by literal_index
even when body_nodes.len() != terms.len(), which can mis-pair CST nodes to AST
terms; modify the iteration so once the counts diverge you stop using positional
pairing: compute body_nodes_len = rule.body_expression_nodes().len() and inside
the for loop set maybe_node = if literal_index < body_nodes_len {
body_nodes.get(literal_index) } else { None }, and then derive span via
maybe_node.map_or_else(|| rule_span.clone(), ast::RuleBodyExpression::span)
before calling collect_rule_term (keep usage of rule.body_terms(),
rule.body_expression_nodes(), collect_rule_term, VariableUseContext::new,
ast::RuleBodyExpression::span, rule_span, maybe_node as the targeted
identifiers).
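
The bounds-checked pairing described for `src/sema/builder.rs` can be sketched as follows. This is a simplified illustration under stated assumptions: `term_spans` is a hypothetical function, spans are byte ranges, and body nodes are reduced to their spans so the example stays self-contained. Note that `slice::get` already returns `None` past the end, so the explicit length comparison in the prompt is equivalent to the lookup shown here.

```rust
use std::ops::Range;

/// Assumption: spans are byte ranges into the source text.
type Span = Range<usize>;

/// Chooses a span for each body term: the CST node span at the same index when
/// one exists, otherwise the coarse span of the whole rule.
fn term_spans(term_count: usize, body_node_spans: &[Span], rule_span: &Span) -> Vec<Span> {
    (0..term_count)
        .map(|literal_index| {
            body_node_spans
                .get(literal_index)
                .map_or_else(|| rule_span.clone(), Span::clone)
        })
        .collect()
}
```

A stricter variant could drop positional pairing entirely whenever the two lengths differ, so that a missing node in the middle cannot shift later nodes onto the wrong terms; whether that is preferable depends on how the parser recovers from errors.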

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 84d19628-12b3-43dd-8f55-8b70780b10c3

📥 Commits

Reviewing files that changed from the base of the PR and between 1ade137 and ffa7d44.

📒 Files selected for processing (11)
  • docs/parser-implementation-notes.md
  • src/parser/ast/function.rs
  • src/parser/ast/mod.rs
  • src/parser/ast/relation.rs
  • src/parser/ast/rule_head.rs
  • src/parser/ast/rule_head/tests.rs
  • src/parser/ast/tests/identifier_span.rs
  • src/parser/ast/type_def.rs
  • src/sema/builder.rs
  • src/sema/tests/name_span.rs
  • src/sema/traverse.rs

Comment thread docs/parser-implementation-notes.md Outdated
Comment thread src/parser/ast/function.rs Outdated
Comment thread src/sema/builder.rs
Replaced duplicated local span_text functions in multiple AST test modules with a single shared implementation from src/test_util/mod.rs. This reduces code duplication and improves test utility usage.

Also adjusted semantic builder logic to safely handle rule body nodes, ensuring debug assertions remain valid during semantic collection.

Minor docs update for identifier span sourcing explanatory text.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
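
A shared `span_text` helper of the kind this commit consolidates could look like the sketch below. The signature is an assumption inferred from how the tests quoted earlier call it (`span_text(source, name_span)`), with the span modelled as a byte range; the real helper in `src/test_util/mod.rs` may differ.

```rust
use std::ops::Range;

/// Returns the slice of `source` covered by `span`.
///
/// Panics with a descriptive message when the span is out of bounds or does
/// not fall on character boundaries, which is acceptable for a test utility.
pub fn span_text<'a>(source: &'a str, span: &Range<usize>) -> &'a str {
    source
        .get(span.clone())
        .unwrap_or_else(|| panic!("span {:?} does not fit source of length {}", span, source.len()))
}
```

For the source `Output(Output) :- Output(_).` used in the test quoted above, the range `7..13` selects the second `Output`, matching the asserted `name_span.start` of 7.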
@leynos leynos changed the title from "Extend Symbol with precise identifier span" to "Extend Symbol with precise identifier span and two-provenance model" Apr 11, 2026
@leynos leynos merged commit 02e67a4 into main Apr 11, 2026
4 checks passed
@leynos leynos deleted the add-symbol-name-span-5p0bo1 branch April 11, 2026 14:50
Development

Successfully merging this pull request may close these issues.

Add exact identifier-token spans to semantic records (deferred precision milestone)

1 participant