Implement Pratt parser for DDlog expressions#98
Conversation
Reviewer's GuideThis PR adds a bespoke Pratt parser for DDlog expressions, defines a new Expr AST with centralized precedence, integrates expression span collection into the scanning and CST‐building pipeline, and provides unit and integration tests along with updated documentation and roadmap completion markers. File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Warning Rate limit exceeded@leynos has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 11 minutes and 31 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (4)
Summary by CodeRabbitDocumentation
New Features
Bug Fixes
Tests
WalkthroughIntroduce a Pratt parser for DDlog expressions, including a new AST representation, operator precedence management, and integration with the CST builder. Extend span tracking to expression nodes, update the span scanner to extract expression spans, and add comprehensive documentation and tests for parsing and error handling. Refactor documentation for clarity. Changes
Sequence Diagram(s)sequenceDiagram
participant Source as Source Text
participant Lexer as Lexer
participant Parser as Pratt Parser
participant AST as AST Builder
Source->>Lexer: Tokenise input
Lexer->>Parser: Provide token stream
Parser->>Parser: Parse prefix expression
loop While infix operator with sufficient binding power
Parser->>Parser: Parse infix operator and right-hand side
end
Parser->>AST: Build expression tree (Expr)
AST-->>Parser: Return AST node
Estimated code review effort🎯 4 (Complex) | ⏱️ ~35 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes - here's some feedback:
- Centralize the prefix_binding_power and infix_binding_power definitions into a single table or module to avoid duplication and prevent inconsistencies when adjusting operator precedences.
- Consider changing the parse_expression API to return a Result<Expr, Vec<Simple>> so that parse failures are explicitly handled rather than silently mapping None to an Option.
- Add an end-to-end integration test that feeds a full DDlog statement through the span_scanner to verify the Pratt parser is correctly invoked and the CST’s N_EXPR_NODE contains the expected tokens.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Centralize the prefix_binding_power and infix_binding_power definitions into a single table or module to avoid duplication and prevent inconsistencies when adjusting operator precedences.
- Consider changing the parse_expression API to return a Result<Expr, Vec<Simple<SyntaxKind>>> so that parse failures are explicitly handled rather than silently mapping None to an Option.
- Add an end-to-end integration test that feeds a full DDlog statement through the span_scanner to verify the Pratt parser is correctly invoked and the CST’s N_EXPR_NODE contains the expected tokens.
## Individual Comments
### Comment 1
<location> `src/parser/tests/expression.rs:11` </location>
<code_context>
+#[rstest]
</code_context>
<issue_to_address>
Missing tests for error handling and invalid expressions.
Please add tests for invalid or malformed input, such as missing operands, unmatched parentheses, unsupported tokens, and empty input, to ensure the parser handles errors correctly.
</issue_to_address>
### Comment 2
<location> `src/parser/expression.rs:41` </location>
<code_context>
+ }
+ }
+
+ fn next(&mut self) -> Option<(SyntaxKind, Span)> {
+ while let Some((kind, _)) = self.tokens.get(self.pos) {
+ if matches!(kind, SyntaxKind::T_WHITESPACE | SyntaxKind::T_COMMENT) {
</code_context>
<issue_to_address>
Consider refactoring the parser by using a filtered Peekable iterator, splitting parse_expr into parse_prefix and parse_infix, and centralizing error creation to reduce nesting and duplication.
```markdown
You can simplify a lot of the nesting/duplication by:
1. Pushing whitespace‐/comment‐skipping into a filtered, `Peekable` iterator
2. Splitting `parse_expr` into a small `parse_prefix` + `parse_infix` pair
3. Centralizing all error‐span creation in one helper
---
### 1. Wrap your tokens in a filtered `Peekable` iterator
```rust
use std::iter::Peekable;
struct Pratt<I>
where
I: Iterator<Item = (SyntaxKind, Span)>,
{
tokens: Peekable<I>,
src: String,
errors: Vec<Simple<SyntaxKind>>,
}
impl<I> Pratt<I>
where
I: Iterator<Item = (SyntaxKind, Span)>,
{
fn new(tokens: I, src: &str) -> Self {
let tokens = tokens
.filter(|(k, _)| !matches!(k, SyntaxKind::T_WHITESPACE | SyntaxKind::T_COMMENT))
.peekable();
Self { tokens, src: src.to_owned(), errors: Vec::new() }
}
fn next(&mut self) -> Option<(SyntaxKind, Span)> {
self.tokens.next()
}
fn peek(&mut self) -> Option<SyntaxKind> {
self.tokens.peek().map(|(k, _)| *k)
}
}
```
That removes all the manual `while matches!(..)` in your old `next`/`peek`.
---
### 2. Split `parse_expr` into `parse_prefix` + `parse_infix`
```rust
impl<I> Pratt<I>
where
I: Iterator<Item = (SyntaxKind, Span)>,
{
fn parse_expr(&mut self, min_bp: u8) -> Option<Expr> {
let mut lhs = self.parse_prefix()?;
loop {
let op_kind = self.peek()?;
let (l_bp, r_bp, op) = infix_binding_power(op_kind)?;
if l_bp < min_bp { break; }
self.next(); // consume operator
let rhs = self.parse_expr(r_bp)?;
lhs = Expr::Binary {
op,
lhs: Box::new(lhs),
rhs: Box::new(rhs),
};
}
Some(lhs)
}
fn parse_prefix(&mut self) -> Option<Expr> {
let (kind, span) = self.next()?;
match kind {
SyntaxKind::T_NUMBER => Some(Expr::Literal(Literal::Number(self.slice(span)))),
SyntaxKind::T_STRING => Some(Expr::Literal(Literal::String(self.slice(span)))),
SyntaxKind::K_TRUE => Some(Expr::Literal(Literal::Bool(true))),
SyntaxKind::K_FALSE => Some(Expr::Literal(Literal::Bool(false))),
SyntaxKind::T_IDENT => Some(Expr::Variable(self.slice(span))),
SyntaxKind::T_LPAREN => {
let expr = self.parse_expr(0);
if self.expect(SyntaxKind::T_RPAREN) {
expr.map(|e| Expr::Group(Box::new(e)))
} else {
None
}
}
k if prefix_binding_power(k).is_some() => {
let (_, op) = prefix_binding_power(k).unwrap();
let rhs = self.parse_expr(60)?;
Some(Expr::Unary { op, expr: Box::new(rhs) })
}
_ => {
let span = span.clone();
self.emit_error(span, "unexpected token");
None
}
}
}
}
```
---
### 3. Centralize error creation
```rust
impl<I> Pratt<I>
where
I: Iterator<Item = (SyntaxKind, Span)>,
{
fn emit_error(&mut self, span: Span, msg: &str) {
self.errors.push(Simple::custom(span, msg.to_string()));
}
}
```
Then replace every direct `self.errors.push(Simple::custom(…))` with `self.emit_error(…)`.
These changes preserve *exactly* the same Pratt-parser behavior, but remove the boilerplate/special-case loops, reduce nesting in `parse_expr`, and DRY up your error reporting.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Gates Failed
Prevent hotspot decline
(1 hotspot with Complex Method)
Enforce advisory code health rules
(1 file with Complex Method)
Gates Passed
4 Quality Gates Passed
See analysis details in CodeScene
Reason for failure
| Prevent hotspot decline | Violations | Code Health Impact | |
|---|---|---|---|
| span_scanner.rs | 1 rule in this hotspot | 10.00 → 9.69 | Suppress |
| Enforce advisory code health rules | Violations | Code Health Impact | |
|---|---|---|---|
| span_scanner.rs | 1 advisory rule | 10.00 → 9.69 | Suppress |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
There was a problem hiding this comment.
Gates Failed
Prevent hotspot decline
(1 hotspot with Complex Method, Bumpy Road Ahead)
Enforce critical code health rules
(1 file with Bumpy Road Ahead)
Enforce advisory code health rules
(1 file with Complex Method)
Gates Passed
3 Quality Gates Passed
See analysis details in CodeScene
Reason for failure
| Prevent hotspot decline | Violations | Code Health Impact | |
|---|---|---|---|
| span_scanner.rs | 2 rules in this hotspot | 10.00 → 9.51 | Suppress |
| Enforce critical code health rules | Violations | Code Health Impact | |
|---|---|---|---|
| span_scanner.rs | 1 critical rule | 10.00 → 9.51 | Suppress |
| Enforce advisory code health rules | Violations | Code Health Impact | |
|---|---|---|---|
| span_scanner.rs | 1 advisory rule | 10.00 → 9.51 | Suppress |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
There was a problem hiding this comment.
Gates Failed
Enforce advisory code health rules
(1 file with Complex Conditional)
Gates Passed
5 Quality Gates Passed
See analysis details in CodeScene
Reason for failure
| Enforce advisory code health rules | Violations | Code Health Impact | |
|---|---|---|---|
| expression.rs | 1 advisory rule | 10.00 → 9.69 | Suppress |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
There was a problem hiding this comment.
Actionable comments posted: 6
🔭 Outside diff range comments (1)
src/parser/ast/expr.rs (1)
1-93: Runcargo fmt --allto fix formatting issues.The pipeline indicates formatting issues in this file. Fix them before merging.
♻️ Duplicate comments (2)
src/parser/tests/expression.rs (1)
21-31: LGTM: Thorough error handling coverage.The error tests validate parser behaviour for incomplete expressions, unmatched parentheses, unknown operators, and empty input. This addresses the concerns raised in the past review comment about missing error handling tests.
src/parser/span_scanner.rs (1)
568-586: Address the remaining cyclomatic complexity issue.Static analysis still flags this function as having high cyclomatic complexity. The refactoring suggested in the previous review has been partially implemented, but the complexity metrics remain above threshold.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (12)
docs/pratt-parser-for-ddlog-expressions.md(4 hunks)docs/roadmap.md(2 hunks)src/parser/ast/expr.rs(1 hunks)src/parser/ast/mod.rs(5 hunks)src/parser/ast/precedence.rs(1 hunks)src/parser/cst_builder/spans.rs(7 hunks)src/parser/cst_builder/tree.rs(2 hunks)src/parser/expression.rs(1 hunks)src/parser/mod.rs(2 hunks)src/parser/span_scanner.rs(4 hunks)src/parser/tests/expression.rs(1 hunks)src/parser/tests/expression_integration.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
docs/**/*.md
📄 CodeRabbit Inference Engine (AGENTS.md)
Documentation must use en-GB-oxendict spelling and grammar (with the exception of "license" which is to be left unchanged for community consistency).
docs/**/*.md: Use British English based on the Oxford English Dictionary (en-oxendict) for documentation.
The word "outwith" is acceptable in documentation.
Keep US spelling when used in an API, for examplecolor.
Use the Oxford comma in documentation.
Treat company names as collective nouns in documentation (e.g., "Lille Industries are expanding").
Write headings in sentence case in documentation.
Use Markdown headings (#,##,###, etc.) in order without skipping levels.
Follow markdownlint recommendations for Markdown files.
Provide code blocks and lists using standard Markdown syntax.
Always provide a language identifier for fenced code blocks; useplaintextfor non-code text.
Use-as the first level bullet and renumber lists when items change.
Prefer inline links using[text](url)or angle brackets around the URL in documentation.
Ensure blank lines before and after bulleted lists and fenced code blocks in documentation.
Ensure tables have a delimiter line below the header row in documentation.
Expand any uncommon acronym on first use in documentation (e.g., Continuous Integration (CI)).
Wrap paragraphs at 80 columns in documentation.
Wrap code at 120 columns in documentation.
Do not wrap tables in documentation.
Use footnotes referenced with[^label]in documentation.
Include Mermaid diagrams in documentation where it adds clarity.
When embedding figures in documentation, useand provide concise alt text describing the content.
Add a short description before each Mermaid diagram in documentation for screen readers.
Files:
docs/roadmap.mddocs/pratt-parser-for-ddlog-expressions.md
**/*.md
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.md: Validate Markdown files using make markdownlint.
Run make fmt after any documentation changes to format all Markdown files and fix table markup.
Validate Markdown Mermaid diagrams using the make nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks must be wrapped at 120 columns.
Tables and headings must not be wrapped.
Files:
docs/roadmap.mddocs/pratt-parser-for-ddlog-expressions.md
⚙️ CodeRabbit Configuration File
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
- Code blocks should be wrapped to 120 columns.
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/roadmap.mddocs/pratt-parser-for-ddlog-expressions.md
docs/**/*
📄 CodeRabbit Inference Engine (docs/complexity-antipatterns-and-refactoring-strategies.md)
Use clear and descriptive names for classes, methods, and variables, especially when dealing with many small components.
Files:
docs/roadmap.mddocs/pratt-parser-for-ddlog-expressions.md
**/*.rs
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.rs: Comment why, not what. Explain assumptions, edge cases, trade-offs, or complexity. Don't echo the obvious.
Comments must use en-GB-oxendict spelling and grammar.
Function documentation must include clear examples.
Name things precisely. Use clear, descriptive variable and function names. For booleans, prefer names with is, has, or should.
Each file should encapsulate a coherent module. Group related code (e.g., models + utilities + fixtures) close together.
Every module must begin with a module level (//! ) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Place function attributes after doc comments.
Do not use return in single-line functions.
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Clippy warnings MUST be disallowed.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Use predicate functions for conditional criteria with more than two branches.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Files:
src/parser/cst_builder/tree.rssrc/parser/mod.rssrc/parser/tests/expression_integration.rssrc/parser/expression.rssrc/parser/cst_builder/spans.rssrc/parser/ast/mod.rssrc/parser/tests/expression.rssrc/parser/ast/precedence.rssrc/parser/ast/expr.rssrc/parser/span_scanner.rs
⚙️ CodeRabbit Configuration File
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Files:
src/parser/cst_builder/tree.rssrc/parser/mod.rssrc/parser/tests/expression_integration.rssrc/parser/expression.rssrc/parser/cst_builder/spans.rssrc/parser/ast/mod.rssrc/parser/tests/expression.rssrc/parser/ast/precedence.rssrc/parser/ast/expr.rssrc/parser/span_scanner.rs
🧬 Code Graph Analysis (2)
src/parser/ast/precedence.rs (1)
src/parser/ast/root.rs (1)
kind(45-47)
src/parser/span_scanner.rs (4)
src/parser/expression.rs (2)
parse_expression(41-64)new(92-99)src/parser/token_stream.rs (2)
tokens(105-107)src(118-120)src/parser/lexer_helpers.rs (2)
inline_ws(98-101)atom(125-142)src/parser/cst_builder/tree.rs (2)
new(19-24)new(51-64)
🪛 LanguageTool
docs/roadmap.md
[style] ~127-~127: Future tense may not be necessary in this context.
Context: ...replacing the simple atom parser. This will allow for detailed analysis of rule bodies. ...
(WILL_ALLOW)
[style] ~242-~242: This phrase is redundant. Consider using “outside”.
Context: ...MBER) that appear in rule bodies outside of a const` definition. Allow a configura...
(OUTSIDE_OF)
[style] ~256-~256: Consider using the typographical ellipsis character here instead.
Context: ... Implement the default linting command: ddlint <FILES...>. - [ ] Implement the `ddlint rules...
(ELLIPSIS)
docs/pratt-parser-for-ddlog-expressions.md
[style] ~349-~349: Would you like to use the Oxford spelling “centralized”? The spelling ‘centralised’ is also correct.
Context: ...ilding the CST. Operator precedence is centralised in src/parser/ast/precedence.rs. Both...
(OXFORD_SPELLING_Z_NOT_S)
🪛 GitHub Check: build-test
src/parser/ast/expr.rs
[warning] 4-4:
Diff in /home/runner/work/ddlint/ddlint/src/parser/ast/expr.rs
🪛 GitHub Actions: CI
src/parser/ast/expr.rs
[warning] 1-1: Prettier formatting check failed. Diff detected in file. Run 'cargo fmt --all' to fix code style issues.
🔇 Additional comments (16)
docs/roadmap.md (2)
3-5: LGTM: Formatting improvements enhance readability.The line wrapping and structural changes improve document flow without altering the semantic content.
256-256: Use typographical ellipsis character.Replace the three-dot sequence with the proper ellipsis character.
- - [ ] Implement the default linting command: `ddlint <FILES...>`. + - [ ] Implement the default linting command: `ddlint <FILES…>`.Likely an incorrect or invalid review comment.
src/parser/mod.rs (2)
22-22: LGTM: Proper module integration.The new expression module is correctly exposed in the public API, following the established pattern of the existing parser modules.
55-56: LGTM: Comprehensive test organisation.The addition of both unit tests (
expression) and integration tests (expression_integration) provides thorough coverage of the new expression parsing functionality.src/parser/cst_builder/tree.rs (2)
11-11: LGTM: Correct cursor count update.The increment to accommodate the new expression cursor maintains consistency with the cursors array size.
61-61: LGTM: Proper expression cursor integration.The new SpanCursor for expressions follows the established pattern and correctly uses the N_EXPR_NODE syntax kind.
src/parser/tests/expression_integration.rs (2)
3-3: LGTM: Proper lint suppression.The lint suppression includes a clear reason explaining the use of
expectfor better test failure messages.
5-20: LGTM: Comprehensive integration test.The test correctly validates the end-to-end expression parsing pipeline, accounting for current parser limitations whilst confirming that expression nodes are properly created in the CST.
src/parser/tests/expression.rs (2)
7-9: LGTM: Clean helper function.The
lit_numhelper reduces test boilerplate and improves readability of the test cases.
11-19: LGTM: Comprehensive expression parsing tests.The parameterised tests cover crucial parsing scenarios including operator precedence (multiplication before addition), left associativity (subtraction), unary operations, and grouping. The test cases validate the core Pratt parser functionality effectively.
src/parser/cst_builder/spans.rs (1)
32-34: Expression span tracking implementation looks good.The addition of expression span tracking follows the established patterns consistently throughout the module. The implementation maintains proper encapsulation and validation.
Also applies to: 46-47, 99-105, 117-118, 128-129, 140-141, 196-201
src/parser/ast/precedence.rs (2)
24-131: Operator precedence implementation is well-structured.The precedence values correctly follow standard mathematical operator precedence, and the lookup functions are efficiently implemented.
11-23: MakePrefixEntryandInfixEntrystructs private.These structs are only used within this module for the static tables. Change their visibility from
pub(super)to private for better encapsulation.#[derive(Debug, Clone, Copy)] -pub(super) struct PrefixEntry { +struct PrefixEntry { pub bp: u8, pub op: UnaryOp, } #[derive(Debug, Clone, Copy)] -pub(super) struct InfixEntry { +struct InfixEntry { pub l_bp: u8, pub r_bp: u8, pub op: BinaryOp, }Likely an incorrect or invalid review comment.
src/parser/ast/mod.rs (1)
32-98: Documentation improvements and new utility function enhance the module.The enhanced documentation for existing functions provides clear parameter and return value descriptions. The new
skip_whitespace_and_commentsutility function is well-implemented and useful for token processing.Also applies to: 115-129, 133-154
src/parser/expression.rs (1)
1-228: Pratt parser implementation successfully addresses past review feedback.The implementation has incorporated all the suggestions from the previous review:
- Uses a filtered
Peekableiterator to skip whitespace/comments- Splits parsing into
parse_prefixandparse_infixmethods- Centralises error creation with
push_errorThe code is well-structured and maintainable.
docs/pratt-parser-for-ddlog-expressions.md (1)
1-352: Well-structured documentation with clear explanations.The documentation effectively describes the Pratt parser design with appropriate formatting, clear section breaks, and comprehensive coverage of the implementation approach. The addition of implementation notes provides valuable context about the current state.
There was a problem hiding this comment.
Gates Failed
Enforce advisory code health rules
(1 file with Complex Conditional)
Gates Passed
5 Quality Gates Passed
See analysis details in CodeScene
Reason for failure
| Enforce advisory code health rules | Violations | Code Health Impact | |
|---|---|---|---|
| expression.rs | 1 advisory rule | 10.00 → 9.69 | Suppress |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/parser/ast/expr.rs (1)
27-39: Add documentation comments toBinaryOpvariants for consistency.The
LiteralandUnaryOpenums have documentation for their variants, butBinaryOplacks them. Add brief descriptions for each operator variant to maintain consistency.#[derive(Debug, Clone, Copy, PartialEq, Eq)] pub enum BinaryOp { + /// Addition operator. Add, + /// Subtraction operator. Sub, + /// Multiplication operator. Mul, + /// Division operator. Div, + /// Modulo operator. Mod, + /// Equality operator. Eq, + /// Inequality operator. Neq, + /// Logical AND operator. And, + /// Logical OR operator. Or, }src/parser/expression.rs (1)
172-176: Use the actual binding power fromprefix_binding_powerinstead of hardcoding 60.The function already retrieves the binding power but discards it. Use the actual value for consistency with the precedence table.
- let Some((_, op)) = prefix_binding_power(k) else { + let Some((bp, op)) = prefix_binding_power(k) else { self.push_error(span, "unexpected token"); return None; }; - let rhs = self.parse_expr(60)?; + let rhs = self.parse_expr(bp)?;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/parser/ast/expr.rs(1 hunks)src/parser/expression.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
⚙️ CodeRabbit Configuration File
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Files:
src/parser/ast/expr.rssrc/parser/expression.rs
🔇 Additional comments (2)
src/parser/ast/expr.rs (1)
58-91: Well-implemented S-expression conversion method.The
to_sexprmethod correctly handles all expression variants and provides a clean string representation for testing purposes. The recursive structure preserves the AST hierarchy appropriately.src/parser/expression.rs (1)
204-228: Helper methods are well-implemented.The
expectmethod provides clear error reporting with proper span tracking, and theslicemethod safely extracts substrings. Both contribute to clean and maintainable parsing logic.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes - here's some feedback:
- The custom Pratt parser implementation in expression.rs is quite large—consider refactoring parse_prefix and parse_infix into smaller helpers or leveraging the upcoming chumsky::pratt combinator to reduce boilerplate once it’s available.
- The logic for extracting and handling expression spans inside collect_rule_spans has grown complex—pull out the expression‐span extraction and expression parsing into a dedicated helper module/function to improve readability and separation of concerns.
- The parser tests for simple_rule and multi_literal_rule now assert that errors exist (assert!(!parsed.errors().is_empty())), which seems unintended—verify the intended error‐recovery behavior and update these assertions to expect no errors when the core parsing succeeds.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The custom Pratt parser implementation in expression.rs is quite large—consider refactoring parse_prefix and parse_infix into smaller helpers or leveraging the upcoming chumsky::pratt combinator to reduce boilerplate once it’s available.
- The logic for extracting and handling expression spans inside collect_rule_spans has grown complex—pull out the expression‐span extraction and expression parsing into a dedicated helper module/function to improve readability and separation of concerns.
- The parser tests for simple_rule and multi_literal_rule now assert that errors exist (assert!(!parsed.errors().is_empty())), which seems unintended—verify the intended error‐recovery behavior and update these assertions to expect no errors when the core parsing succeeds.
## Individual Comments
### Comment 1
<location> `src/parser/tests/parser.rs:584` </location>
<code_context>
#[rstest]
fn simple_rule_parsed(simple_rule: &str) {
let parsed = parse(simple_rule);
- assert!(parsed.errors().is_empty());
+ assert!(!parsed.errors().is_empty());
let rules = parsed.root().rules();
assert_eq!(rules.len(), 1);
</code_context>
<issue_to_address>
Test assertions for error presence may be misleading.
Verify that the assertion aligns with the parser's expected behavior for these cases. If errors are not expected, assert that errors are empty; if errors are expected, add comments explaining why.
</issue_to_address>
### Comment 2
<location> `src/parser/span_scanner.rs:484` </location>
<code_context>
collect_extern_declarations(tokens, src, SyntaxKind::K_TRANSFORMER, transformer_decl)
}
-fn collect_rule_spans(
- tokens: &[(SyntaxKind, Span)],
- src: &str,
</code_context>
<issue_to_address>
Consider collapsing the multiple small helper functions into a single focused loop within collect_rule_spans for improved clarity and reduced top-level bloat.
Here’s one way to collapse all of those tiny free‐functions back into a single focused loop (you can still keep your `rule_decl()` parser and error‐handling exactly as is). The key ideas are:
1. Inline the “at line start?” and “advance or parse” logic in one place.
2. Only two helper functions remain: `extract_expr_span(...)` (small, focused) and your `rule_decl()` parser.
3. Remove all of the top‐level `handle_*` / `parse_and_handle_rule` / `is_at_line_start` / etc.
```rust
fn collect_rule_spans(
tokens: &[(SyntaxKind, Span)],
src: &str,
) -> (Vec<Span>, Vec<Span>, Vec<Simple<SyntaxKind>>) {
let mut st = State::new(tokens, src, Vec::new());
let mut expr_spans = Vec::new();
let mut rule_parser = rule_decl(); // your existing parser
while let Some(&(kind, span)) = st.stream.peek() {
let span = span.clone();
// only IDENT or IMPLIES can start a rule
if !matches!(kind, SyntaxKind::T_IDENT | SyntaxKind::T_IMPLIES) {
st.stream.advance();
continue;
}
// check “new‐line” in one place
let at_line_start = if st.stream.cursor() == 0 {
true
} else {
let prev = &st.stream.tokens()[st.stream.cursor() - 1].1;
st.stream.src()[prev.end..span.start].contains('\n')
};
if !at_line_start {
st.stream.advance();
continue;
}
// remember where we began
let start_idx = st.stream.cursor();
// parse the rule (push its span or record errors)
let (maybe_span, errs) = st.parse_span(rule_parser.clone(), span.start);
if let Some(rspan) = maybe_span {
let end = rspan.end;
st.spans.push(rspan);
// small helper to pull out the body‐expr
if let Some(espan) = extract_expression_span(&st, start_idx, end) {
// optional re‐parse for expression errors
if let Some(text) = st.stream.src().get(espan.clone()) {
if let Err(errs) = parse_expression(text) {
st.extra.extend(errs);
}
}
expr_spans.push(espan);
}
st.stream.skip_until(end);
} else {
st.extra.extend(errs);
let end = st.stream.line_end(st.stream.cursor());
st.stream.skip_until(end);
}
}
let (spans, errors) = st.into_parts();
(spans, expr_spans, errors)
}
```
With this you:
* collapse all of your `handle_*` / `parse_and_handle_rule` / `is_at_line_start` free functions
* keep your small `extract_expression_span` (it’s still < 30 LOC)
* never revert any feature, but reduce nesting and top-level bloat.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@coderabbitai Please create new GitHub issues for the following:
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain@leynos I'll create three GitHub issues for the improvements you've identified. Let me create them now: 🏁 Script executed: Length of output: 3347 @leynos I've successfully created three GitHub issues to track the improvements you identified:
All issues have been assigned to you and include backlinks to this PR and comment for context. |
Summary
DDlogexpressionsExprAST definitionsrstestTesting
make fmtmake lintmake testmake nixie(fails: too many arguments)https://chatgpt.com/codex/tasks/task_e_6885513dbcb08322998f9bd67c73902a
Summary by Sourcery
Implement a hand-rolled Pratt parser for DDlog expressions, integrate it into the existing parsing pipeline, update CST construction to include expression nodes, and add tests and documentation.
New Features:
Enhancements:
Documentation:
Tests: