
Refactor large parser modules into smaller components under 400 lines #259

Draft
leynos wants to merge 1 commit into main from
refactor-parser-modules-d2nyz5

Conversation

@leynos
Owner

@leynos leynos commented Apr 23, 2026

Summary

  • Refactors large parser modules into smaller, focused components to improve maintainability and testability. No user-facing behavior changes.

Changes

Core Refactor

  • Introduced dedicated S-expression formatting for Expr via a new module: src/parser/ast/expr/sexpr.rs and wired it into the Expr API (Expr::to_sexpr moved from the main file).
  • Moved rule-body expression classification logic into a dedicated module: src/parser/ast/rule/classification.rs and updated rule.rs to delegate classification to it.
  • Extracted Pratt postfix handling into separate modules:
    • src/parser/expression/pratt/delay.rs
    • src/parser/expression/pratt/diff.rs
    • src/parser/expression/pratt/postfix.rs
    • Updated src/parser/expression/pratt.rs to wire in the new modules (mod delay; mod diff; mod postfix).
  • Adjusted module layout and imports to reflect the new structure, reducing file sizes and coupling in core parser files.
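The Pratt split works because Rust lets one type carry several `impl` blocks spread across child modules. The following is a minimal sketch of that layout, not the PR's code: only the module names `delay`, `diff`, and `postfix` come from the description above. The toy `Pratt` fields, the `'` and `@` marker tokens, and the method bodies are assumptions, and the modules are written inline so the example fits in one file.

```rust
// Toy stand-in for the parser type; the fields are hypothetical.
pub struct Pratt {
    tokens: Vec<&'static str>,
    pos: usize,
}

impl Pratt {
    pub fn new(tokens: Vec<&'static str>) -> Self {
        Self { tokens, pos: 0 }
    }

    fn peek(&self) -> Option<&'static str> {
        self.tokens.get(self.pos).copied()
    }

    /// Parses one operand followed by any number of postfix markers.
    pub fn parse(&mut self) -> String {
        let mut lhs = self.peek().unwrap_or("<eof>").to_string();
        self.pos += 1;
        // Delegates to the method defined in the `postfix` child module.
        while let Some(next) = self.parse_postfix(&lhs) {
            lhs = next;
        }
        lhs
    }
}

// In the PR these are files under `pratt/`; inline modules keep the sketch short.
mod postfix {
    use super::Pratt;

    impl Pratt {
        // `pub(super)` exposes the helper to the parent module and its children only.
        pub(super) fn parse_postfix(&mut self, lhs: &str) -> Option<String> {
            match self.peek()? {
                "'" => self.parse_diff(lhs),  // hypothetical diff marker token
                "@" => self.parse_delay(lhs), // hypothetical delay marker token
                _ => None,
            }
        }
    }
}

mod diff {
    use super::Pratt;

    impl Pratt {
        pub(super) fn parse_diff(&mut self, lhs: &str) -> Option<String> {
            self.pos += 1; // consume the marker
            Some(format!("(diff {lhs})"))
        }
    }
}

mod delay {
    use super::Pratt;

    impl Pratt {
        pub(super) fn parse_delay(&mut self, lhs: &str) -> Option<String> {
            self.pos += 1; // consume the marker
            Some(format!("(delay {lhs})"))
        }
    }
}
```

Child modules can read the parent's private fields and methods, so the split does not force any field to become public.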

Tests and Fixtures

  • Reworked tests to rely on a structured fixtures system for expression parsing:
    • Added new test fixtures module: src/parser/tests/expression/fixtures.rs
    • Split valid expression fixtures into submodules (basic, collections, control_flow, postfix, structured) and wired them into a single expression_cases() entry point
    • Added error fixtures at: src/parser/tests/expression/fixtures/errors.rs
    • Added valid postfix fixtures at: src/parser/tests/expression/fixtures/valid/postfix.rs; postfix error cases sit with the other error fixtures in errors.rs
  • Introduced a dedicated sexpr rendering helper for tests (via the new sexpr.rs) and updated tests to use to_sexpr() for compact comparison of parsed structures.
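As a rough illustration of the fixtures layout described above, the sketch below shows submodules that each contribute cases and a single `expression_cases()` entry point that concatenates them. The `ExpressionCase` and `expression_cases` names appear in this PR; the field names, the sample inputs, and the expected S-expression strings are assumptions made for the example.

```rust
/// A source snippet paired with its expected S-expression rendering.
/// The field names are assumptions, not taken from the PR.
#[derive(Debug, Clone, PartialEq)]
pub struct ExpressionCase {
    pub src: &'static str,
    pub expected: &'static str,
}

mod basic {
    use super::ExpressionCase;

    pub fn cases() -> Vec<ExpressionCase> {
        vec![ExpressionCase { src: "1 + 2", expected: "(+ 1 2)" }]
    }
}

mod collections {
    use super::ExpressionCase;

    pub fn cases() -> Vec<ExpressionCase> {
        vec![ExpressionCase { src: "[1, 2]", expected: "(vec 1 2)" }]
    }
}

mod postfix {
    use super::ExpressionCase;

    pub fn cases() -> Vec<ExpressionCase> {
        vec![ExpressionCase { src: "f(x)", expected: "(call f x)" }]
    }
}

/// Single entry point that tests iterate over.
pub fn expression_cases() -> Vec<ExpressionCase> {
    let mut all = Vec::new();
    all.extend(basic::cases());
    all.extend(collections::cases());
    all.extend(postfix::cases());
    all
}
```

Keeping one entry point means a new fixture submodule only needs one extra `extend` call to be picked up by every test that consumes the cases.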

Miscellaneous

  • Minor cleanup to remove inline to_sexpr implementation from Expr and rely on the new sexpr module.
  • Code organization improvements align with the 400-line module guideline mentioned in the task, while preserving functionality.

How to test

  • Run: cargo test
  • Specifically validate that expression parsing tests pass with the new fixtures layout and that S-expression representations render as expected via Expr::to_sexpr().
  • Confirm there are no public API or user-facing behavior changes; the changes are limited to internal restructuring and test shims.

Notes

  • This refactor improves maintainability and testability of parser components by modularizing: S-expression formatting, rule classification, and Pratt postfix handling.
  • If any downstream tooling relies on file locations, please update accordingly; public behavior remains unchanged.

◳ Generated by DevBoxer


ℹ️ Tag @devboxerhub to ask questions and address PR feedback

📎 Task: https://www.devboxer.com/task/4e7f979f-30bc-45b5-844c-4e890802856c

📝 Closes #223

Summary by Sourcery

Refactor parser internals and tests into smaller, focused modules while preserving existing parsing behavior.

Enhancements:

  • Extract rule body classification logic into a dedicated module and delegate from the main rule AST type.
  • Split Pratt parser postfix, diff, and delay handling into separate helper modules to reduce pratt.rs size and coupling.
  • Move S-expression formatting for expressions into its own module to decouple test-oriented rendering from the core AST definition.
  • Introduce a shared expression test fixtures module with structured valid and error cases to simplify and consolidate parser tests.

Tests:

  • Rewrite expression parser tests to consume structured fixtures for literals, control flow, postfix operations, and error conditions, improving reuse and coverage organization.
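One trade-off of looping over fixtures in a plain `#[test]` is that a panic on the first bad case hides the rest. A possible mitigation, sketched below under stated assumptions, is to collect every mismatch and assert once at the end. The `render` function is a toy stand-in for the real parser plus `to_sexpr()`, and the `ExpressionCase` fields and `collect_failures` helper are hypothetical.

```rust
pub struct ExpressionCase {
    pub src: &'static str,
    pub expected: &'static str,
}

pub fn expression_cases() -> Vec<ExpressionCase> {
    vec![
        ExpressionCase { src: "x", expected: "x" },
        ExpressionCase { src: "1 + 2", expected: "(+ 1 2)" },
    ]
}

// Toy stand-in for "parse, then render with to_sexpr()".
fn render(src: &str) -> Result<String, String> {
    match src.split_whitespace().collect::<Vec<_>>().as_slice() {
        [atom] => Ok((*atom).to_string()),
        [lhs, "+", rhs] => Ok(format!("(+ {lhs} {rhs})")),
        _ => Err(format!("cannot parse {src:?}")),
    }
}

/// Returns one message per failing case instead of stopping at the first.
pub fn collect_failures(cases: &[ExpressionCase]) -> Vec<String> {
    cases
        .iter()
        .filter_map(|case| match render(case.src) {
            Ok(actual) if actual == case.expected => None,
            Ok(actual) => Some(format!(
                "{:?}: expected {}, got {}",
                case.src, case.expected, actual
            )),
            Err(err) => Some(format!("{:?}: {}", case.src, err)),
        })
        .collect()
}

#[test]
fn expression_fixtures_render_as_expected() {
    let failures = collect_failures(&expression_cases());
    assert!(failures.is_empty(), "{failures:#?}");
}
```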

- Extract rule-body term classification into a dedicated module `rule/classification.rs`.
- Move S-expression formatting for Expr into a separate `expr/sexpr.rs` module.
- Split Pratt parser postfix parsing, delay-postfix, and diff-marker handling into separate files under `expression/pratt/`.
- Remove inline implementations and validations, organizing parser logic into coherent units.
- Revamp expression parser tests to use fixture modules, improving maintainability.

This refactor improves code organization, readability, and test structure without changing external behavior.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai Bot commented Apr 23, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6dece2d3-c3d0-4690-8d85-b225e9c9669a

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.


@codescene-delta-analysis codescene-delta-analysis Bot left a comment


Gates Failed
Enforce advisory code health rules (2 files with Code Duplication)

Gates Passed
5 Quality Gates Passed

See analysis details in CodeScene

Reason for failure
| Enforce advisory code health rules | Violations | Code Health Impact |
| --- | --- | --- |
| pratt.rs | 1 advisory rule | 10.00 → 9.39 |
| errors.rs | 1 advisory rule | 9.39 |

Quality Gate Profile: Pay Down Tech Debt

use chumsky::error::Simple;

use crate::parser::ast::{Expr, StringLiteral};
use crate::parser::span_utils::parse_u32_decimal;

❌ New issue: Code Duplication
The module contains 2 functions with similar structure: Pratt.with_struct_literal_activation,Pratt.with_struct_literals_suspended
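One way such a pair is often deduplicated is a single private helper that sets a flag, runs a closure, and restores the previous value. The sketch below assumes the two methods toggle a boolean on the parser. The `struct_literals_allowed` field, the closure-based signatures, and the `with_struct_literals` helper are all hypothetical, and the real methods may be shaped differently.

```rust
pub struct Pratt {
    // Hypothetical flag; the real parser may track this state differently.
    struct_literals_allowed: bool,
}

impl Pratt {
    pub fn new() -> Self {
        Self { struct_literals_allowed: true }
    }

    pub fn struct_literals_allowed(&self) -> bool {
        self.struct_literals_allowed
    }

    /// Shared helper: set the flag, run `f`, then restore the previous value.
    fn with_struct_literals<T>(
        &mut self,
        allowed: bool,
        f: impl FnOnce(&mut Self) -> T,
    ) -> T {
        let previous = std::mem::replace(&mut self.struct_literals_allowed, allowed);
        let result = f(self);
        self.struct_literals_allowed = previous;
        result
    }

    /// Thin wrapper: differs from its sibling only in the flag value.
    pub fn with_struct_literal_activation<T>(
        &mut self,
        f: impl FnOnce(&mut Self) -> T,
    ) -> T {
        self.with_struct_literals(true, f)
    }

    /// Thin wrapper: differs from its sibling only in the flag value.
    pub fn with_struct_literals_suspended<T>(
        &mut self,
        f: impl FnOnce(&mut Self) -> T,
    ) -> T {
        self.with_struct_literals(false, f)
    }
}
```

Restoring the saved value rather than hard-coding `true` or `false` keeps nested calls correct.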


Comment on lines +217 to +232
fn collection_error_cases() -> Vec<CountedErrorCase> {
    vec![
        CountedErrorCase {
            src: "[1, 2",
            min_errs: 1,
        },
        CountedErrorCase {
            src: "{a: 1",
            min_errs: 1,
        },
        CountedErrorCase {
            src: "{a, b}",
            min_errs: 1,
        },
    ]
}

❌ New issue: Code Duplication
The module contains 4 functions with similar structure: collection_error_cases,control_flow_error_cases,operator_error_cases,postfix_counted_error_cases
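A possible way to shrink the repeated structure is a small builder that turns `(src, min_errs)` pairs into cases, leaving each function as a one-line table. This is a sketch only: `CountedErrorCase` and its two fields come from the snippet above, while the `usize` type for `min_errs`, the `counted` helper, and the `operator_error_cases` input are assumptions. Whether this satisfies the CodeScene rule is not something the sketch can confirm.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct CountedErrorCase {
    pub src: &'static str,
    pub min_errs: usize, // type assumed; the PR snippet only shows the literal `1`
}

/// Hypothetical helper: builds cases from `(src, min_errs)` pairs.
fn counted(pairs: &[(&'static str, usize)]) -> Vec<CountedErrorCase> {
    pairs
        .iter()
        .map(|&(src, min_errs)| CountedErrorCase { src, min_errs })
        .collect()
}

pub fn collection_error_cases() -> Vec<CountedErrorCase> {
    // Same three inputs as the function under review.
    counted(&[("[1, 2", 1), ("{a: 1", 1), ("{a, b}", 1)])
}

pub fn operator_error_cases() -> Vec<CountedErrorCase> {
    // Illustrative input only; not copied from the PR.
    counted(&[("1 +", 1)])
}
```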


@sourcery-ai
Contributor

sourcery-ai Bot commented Apr 23, 2026

Reviewer's Guide

Refactors parser internals by extracting S-expression formatting, rule-body classification, and Pratt postfix parsing into dedicated modules, and reworks expression tests around a reusable fixtures system without changing observable parsing behavior.

Sequence diagram for the new rule body term parsing pipeline

sequenceDiagram
    participant RuleBodyExpression
    participant RuleClassificationModule as classification
    participant PatternParser as parse_pattern
    participant ExpressionParser as parse_expression
    participant AggregationLogic as aggregation_tracking
    participant Errors as errors_vec

    RuleBodyExpression->>RuleClassificationModule: parse_rule_body_term(raw, literal_span, first_aggregation_span, errors)
    alt raw contains_assignment
        RuleClassificationModule->>PatternParser: parse_pattern(parts.pattern)
        PatternParser-->>RuleClassificationModule: Result_Pattern_or_errors
        alt pattern_ok
            RuleClassificationModule->>ExpressionParser: parse_expression(parts.value)
            ExpressionParser-->>RuleClassificationModule: Result_Expr_or_errors
            alt value_ok
                RuleClassificationModule-->>RuleBodyExpression: Some_RuleBodyTerm_Assignment
            else value_errors
                RuleClassificationModule->>Errors: append(shifted_errors)
                RuleClassificationModule-->>RuleBodyExpression: None
            end
        else pattern_errors
            RuleClassificationModule->>Errors: append(shifted_errors)
            RuleClassificationModule-->>RuleBodyExpression: None
        end
    else raw_is_expression
        RuleClassificationModule->>ExpressionParser: parse_expression(trimmed_raw)
        ExpressionParser-->>RuleClassificationModule: Result_Expr_or_errors
        alt expr_ok
            RuleClassificationModule->>AggregationLogic: classify_expression(expr, ctx)
            AggregationLogic-->>RuleClassificationModule: Option_RuleBodyTerm
            RuleClassificationModule-->>RuleBodyExpression: Option_RuleBodyTerm
        else expr_errors
            RuleClassificationModule->>Errors: append(errors)
            RuleClassificationModule-->>RuleBodyExpression: None
        end
    end
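The branching in the sequence diagram can be approximated in a few lines. The sketch below is a simplification under loud assumptions: the parsers are stubs that only reject empty input, errors are plain strings, assignment detection is a naive split on `=`, and the `literal_span` and `first_aggregation_span` parameters plus the aggregation classification step are omitted. Only the name `parse_rule_body_term` and the overall flow come from the diagram.

```rust
#[derive(Debug, PartialEq)]
pub enum RuleBodyTerm {
    Assignment { pattern: String, value: String },
    Expression(String),
}

// Stub standing in for the real pattern parser: empty input is the only error.
fn parse_pattern(src: &str) -> Result<String, Vec<String>> {
    let trimmed = src.trim();
    if trimmed.is_empty() {
        Err(vec!["empty pattern".to_string()])
    } else {
        Ok(trimmed.to_string())
    }
}

// Stub standing in for the real expression parser.
fn parse_expression(src: &str) -> Result<String, Vec<String>> {
    let trimmed = src.trim();
    if trimmed.is_empty() {
        Err(vec!["empty expression".to_string()])
    } else {
        Ok(trimmed.to_string())
    }
}

pub fn parse_rule_body_term(raw: &str, errors: &mut Vec<String>) -> Option<RuleBodyTerm> {
    // Assignment branch: parse the pattern first, then the value.
    // Splitting on '=' is a simplification of the real assignment detection.
    if let Some((pattern_src, value_src)) = raw.split_once('=') {
        let parsed = parse_pattern(pattern_src).and_then(|pattern| {
            parse_expression(value_src)
                .map(|value| RuleBodyTerm::Assignment { pattern, value })
        });
        return match parsed {
            Ok(term) => Some(term),
            Err(mut errs) => {
                errors.append(&mut errs);
                None
            }
        };
    }

    // Expression branch: parse the trimmed source as a single expression.
    match parse_expression(raw) {
        Ok(expr) => Some(RuleBodyTerm::Expression(expr)),
        Err(mut errs) => {
            errors.append(&mut errs);
            None
        }
    }
}
```

In every failure path the function appends to the shared error list and returns `None`, which mirrors the `append` then `None` arrows in the diagram.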

Class diagram for rule body classification and Expr S-expression formatting

classDiagram
    class Expr {
        +to_sexpr() String
    }

    class Literal {
    }

    class Pattern {
        +to_source() String
    }

    class MatchArm {
        pattern Pattern
        body Expr
    }

    class RuleBodyTerm {
    }

    class RuleAssignment {
        pattern Pattern
        value Expr
    }

    class RuleAggregation {
        project Expr
        key Expr
        source AggregationSource
    }

    class RuleForLoop {
        pattern Pattern
        iterable Expr
        guard Expr
        body_terms Vec_RuleBodyTerm_
    }

    class AggregationSource {
        <<enum>>
        GroupBy
        LegacyAggregate
    }

    class ClassificationContext {
        -literal_span Span
        -first_aggregation_span Option_Span_
        -errors Vec_SimpleSyntaxKind__
    }

    class ForLoopComponents {
        pattern Pattern
        iterable Expr
        guard Option_Expr_
        body Expr
    }

    class RuleClassifier {
        +parse_rule_body_term(raw String, literal_span Span, first_aggregation_span Option_Span_, errors Vec_SimpleSyntaxKind__) Option_RuleBodyTerm_
        -parse_assignment(parts AssignmentParts, literal_span Span) Result_Option_RuleBodyTerm__Vec_SimpleSyntaxKind___
        -classify_expression(expr Expr, ctx ClassificationContext) Option_RuleBodyTerm_
        -classify_aggregation_with_tracking(args Vec_Expr_, source AggregationSource, ctx ClassificationContext) Option_RuleBodyTerm_
        -classify_for_loop(components ForLoopComponents, ctx ClassificationContext) RuleBodyTerm
        -classify_for_body_with_aggregation_tracking(body Expr, ctx ClassificationContext) Vec_RuleBodyTerm_
        -aggregation_source_for(name String) Option_AggregationSource_
        -invocation_aggregation_source(callee Expr) Option_AggregationSource_
        -validate_aggregation(term RuleBodyTerm, literal_span Span, first_aggregation_span Option_Span_, errors Vec_SimpleSyntaxKind__) bool
        -aggregation_arity_error(literal_span Span, source AggregationSource) Vec_SimpleSyntaxKind__
        -multiple_aggregations_error(first_span Span, second_span Span) SimpleSyntaxKind
    }

    Expr --> Literal
    Expr --> Pattern
    Expr --> MatchArm

    RuleBodyTerm <.. RuleAssignment
    RuleBodyTerm <.. RuleAggregation
    RuleBodyTerm <.. RuleForLoop

    RuleForLoop "*" o-- "body_terms" RuleBodyTerm
    RuleAggregation --> AggregationSource

    RuleClassifier ..> ClassificationContext
    RuleClassifier ..> ForLoopComponents
    RuleClassifier ..> RuleBodyTerm
    RuleClassifier ..> RuleAggregation
    RuleClassifier ..> RuleAssignment

    ClassificationContext --> Span
    ClassificationContext --> AggregationSource

    ForLoopComponents --> Pattern
    ForLoopComponents --> Expr
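The aggregation tracking implied by `first_aggregation_span` and `multiple_aggregations_error` could look roughly like the sketch below: the first aggregation records its span, and any later one produces an error. The three `ClassificationContext` fields follow the class diagram. The `Span` alias, the `SyntaxError` type, the `track_aggregation` method, and the choice to attach the error to the second span are assumptions.

```rust
use std::ops::Range;

type Span = Range<usize>; // assumed span representation

#[derive(Debug, PartialEq)]
pub struct SyntaxError {
    pub span: Span,
    pub message: String,
}

pub struct ClassificationContext<'a> {
    literal_span: Span,
    first_aggregation_span: &'a mut Option<Span>,
    errors: &'a mut Vec<SyntaxError>,
}

// Assumption: the error is reported at the second aggregation's span.
fn multiple_aggregations_error(first: &Span, second: &Span) -> SyntaxError {
    SyntaxError {
        span: second.clone(),
        message: format!(
            "second aggregation; first one is at {}..{}",
            first.start, first.end
        ),
    }
}

impl<'a> ClassificationContext<'a> {
    pub fn new(
        literal_span: Span,
        first_aggregation_span: &'a mut Option<Span>,
        errors: &'a mut Vec<SyntaxError>,
    ) -> Self {
        Self { literal_span, first_aggregation_span, errors }
    }

    /// Records the first aggregation; reports any later one.
    /// Returns `true` when the aggregation is accepted.
    pub fn track_aggregation(&mut self) -> bool {
        if let Some(first) = self.first_aggregation_span.as_ref() {
            let err = multiple_aggregations_error(first, &self.literal_span);
            self.errors.push(err);
            return false;
        }
        *self.first_aggregation_span = Some(self.literal_span.clone());
        true
    }
}
```

Holding the span and error list by mutable reference lets several terms of one rule share the same tracking state.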

File-Level Changes

Change Details Files
Extracted Expr S-expression rendering into a dedicated module and updated call sites.
  • Moved Expr::to_sexpr implementation and helpers from expr.rs into a new expr/sexpr.rs module.
  • Reimplemented to_sexpr and helper functions (format_nary, format_match, etc.) with largely identical behavior but localized to expr/sexpr.rs.
  • Adjusted imports and module wiring so tests and other code now use the new sexpr module rather than inline methods in the main Expr definition.
src/parser/ast/expr.rs
src/parser/ast/expr/sexpr.rs
Isolated rule-body term classification and assignment parsing into a dedicated classification module.
  • Introduced src/parser/ast/rule/classification.rs containing parse_rule_body_term, assignment parsing, aggregation classification, for-loop body classification, and aggregation validation helpers.
  • Replaced inlined classification logic in RuleBodyExpression::classify in rule.rs with a call to classification::parse_rule_body_term, passing through spans and error tracking.
  • Relocated helper structs and functions (ClassificationContext, ForLoopComponents, aggregation_* helpers, multiple_aggregations_error, validate_aggregation) from rule.rs into the new module while preserving behavior.
src/parser/ast/rule.rs
src/parser/ast/rule/classification.rs
Split Pratt parser postfix handling into specialized submodules and rewired the main Pratt implementation.
  • Created new modules delay.rs, diff.rs, and postfix.rs under src/parser/expression/pratt/ to house delay, diff marker, and general postfix parsing logic.
  • Removed inline postfix-related methods (parse_postfix, delay/diff helpers, function call/bit-slice/dot-access parsing, parse_args) from pratt.rs and delegated to the new modules instead.
  • Updated pratt.rs module declarations to include the new submodules and adjusted imports (e.g., parse_u32_decimal, qualified callee helpers) into the appropriate new files.
src/parser/expression/pratt.rs
src/parser/expression/pratt/delay.rs
src/parser/expression/pratt/diff.rs
src/parser/expression/pratt/postfix.rs
Reworked expression parser tests to use a structured fixtures system rather than inline rstest cases.
  • Introduced src/parser/tests/expression/fixtures.rs plus submodules for valid and error fixtures (valid/basic.rs, valid/collections.rs, valid/control_flow.rs, valid/postfix.rs, valid/structured.rs, errors.rs).
  • Defined small case structs (ExpressionCase, CountedErrorCase, PostfixErrorCase, SpannedErrorCase) and helper functions (expression_cases, literal_cases, invalid_for_loop_sources, error_cases, postfix_error_cases, match_pattern_error_cases) to centralize test data.
  • Simplified src/parser/tests/expression.rs to iterate over fixture providers in plain #[test] functions instead of many #[rstest] parameterized cases, while preserving the same coverage and assertions.
src/parser/tests/expression.rs
src/parser/tests/expression/fixtures.rs
src/parser/tests/expression/fixtures/errors.rs
src/parser/tests/expression/fixtures/valid.rs
src/parser/tests/expression/fixtures/valid/basic.rs
src/parser/tests/expression/fixtures/valid/collections.rs
src/parser/tests/expression/fixtures/valid/control_flow.rs
src/parser/tests/expression/fixtures/valid/postfix.rs
src/parser/tests/expression/fixtures/valid/structured.rs
Performed minor parser and test infrastructure cleanups to align with the new module structure.
  • Adjusted imports in expression tests to use fixture functions and removed direct dependencies on low-level test_util AST constructors where not needed.
  • Ensured aggregation tracking and error spans continue to use literal_span consistently after refactor by threading spans into the classification helpers.
  • Documented new modules with brief comments clarifying their responsibility (rule classification, Expr S-expressions, test fixtures).
src/parser/tests/expression.rs
src/parser/ast/rule/classification.rs
src/parser/ast/expr/sexpr.rs
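To make the `expr/sexpr.rs` extraction concrete, here is a minimal sketch of an `impl Expr` block living in a child module next to a private `format_nary` helper. The toy `Expr` variants and the exact output format are assumptions. Only the names `to_sexpr`, `format_nary`, and the `sexpr` module come from the PR, and the module is inline here so the example fits in one file.

```rust
// Toy AST; the real `Expr` has many more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(i64),
    Var(String),
    Binary { op: &'static str, lhs: Box<Expr>, rhs: Box<Expr> },
    Call { callee: Box<Expr>, args: Vec<Expr> },
}

// In the PR this is a separate file; an inline module keeps the sketch short.
mod sexpr {
    use super::Expr;

    /// Renders `(head item item ...)`.
    fn format_nary(head: &str, items: &[Expr]) -> String {
        let mut out = format!("({head}");
        for item in items {
            out.push(' ');
            out.push_str(&item.to_sexpr());
        }
        out.push(')');
        out
    }

    impl Expr {
        /// Compact rendering intended for test comparisons.
        pub fn to_sexpr(&self) -> String {
            match self {
                Expr::Number(n) => n.to_string(),
                Expr::Var(name) => name.clone(),
                Expr::Binary { op, lhs, rhs } => {
                    format!("({op} {} {})", lhs.to_sexpr(), rhs.to_sexpr())
                }
                Expr::Call { callee, args } => {
                    format_nary(&format!("call {}", callee.to_sexpr()), args)
                }
            }
        }
    }
}
```

Callers still write `expr.to_sexpr()`; only the location of the implementation changes, so the helper functions no longer add to the file that defines `Expr`.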

Assessment against linked issues

Issue Objective Addressed Explanation
#223 Refactor src/parser/ast/rule.rs by moving rule-body term classification and aggregation tracking logic into a separate dedicated module so that rule.rs is smaller and focused on the public AST surface.
#223 Refactor src/parser/ast/expr.rs by extracting Expr S-expression formatting (to_sexpr and helpers) into its own module, keeping expr.rs focused on the core Expr definition.
#223 Refactor parser Pratt expression and tests by (a) splitting postfix/diff/delay handling code out of src/parser/expression/pratt.rs into focused submodules, and (b) moving the large inline test case tables from src/parser/tests/expression.rs into reusable fixtures modules.


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Development

Successfully merging this pull request may close these issues.

[High] Oversized modules violate local maintainability guideline (400-line cap)
