Reviewer's Guide

This PR replaces ad-hoc scanning of relation clauses with dedicated chumsky parsers for column lists and primary key clauses, refactors span collection to leverage these parsers with structured error handling, adds targeted tests for empty-column and invalid-primary-key cases, and removes an obsolete export.

Class diagram for updated relation clause parsing

```mermaid
classDiagram
    class SpanCollector {
        +stream: Stream
        +extra: Extras
        +spans: Vec<Span>
        +parse_span(parser, start)
        +skip_line()
    }
    class relation_columns {
        +parse(tokens: Stream) Span | Error
    }
    class primary_key_clause {
        +parse(tokens: Stream) Span | Error
    }
    SpanCollector --> relation_columns : uses
    SpanCollector --> primary_key_clause : uses
```
Flow diagram for relation clause parsing and error handling

```mermaid
flowchart TD
    A[Start relation clause parsing] --> B{Parse relation columns}
    B -- Success --> C{Check for primary key clause}
    B -- Error --> F[Record error, skip line]
    C -- Found --> D{Parse primary key clause}
    D -- Success --> E[Record relation span]
    D -- Error --> F
    C -- Not found --> E
    E --> G[End]
    F --> G
```
File-Level Changes
Note: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Note: Reviews paused. Use the following commands to manage reviews:
Summary by CodeRabbit
Walkthrough

Replace parser-combinator parenthesised extraction with a streaming extractor.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Caller as Parser code
    participant Iter as Peekable<Iterator<SyntaxElement>>
    participant Error as UnclosedDelimiterError
    Caller->>Iter: call extract_delimited(open, close)
    Note right of Iter: iterate elements\npeek / next
    Iter-->>Iter: on open => depth += 1
    loop scan elements
        Iter->>Iter: consume element
        alt element == open
            Iter-->>Iter: depth += 1
        else element == close
            Iter-->>Iter: depth -= 1
            alt depth == 0
                Iter-->>Caller: return Ok(collected_text)
            end
        else
            Iter-->>Iter: append element text to buffer
        end
    end
    %% EOF reached without matching close
    Iter-->>Error: build UnclosedDelimiterError { collected, expected }
    Error-->>Caller: return Err(UnclosedDelimiterError)
```
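The extraction loop in the diagram above can be sketched in plain Rust. The snippet below is an illustrative stand-in, not the crate's real code: it walks `char`s rather than `SyntaxElement`s, and its simplified `UnclosedDelimiterError` carries only the collected text.

```rust
// Simplified sketch of depth-counting delimiter extraction over an iterator.
// All names here are illustrative stand-ins for the crate's actual types.

#[derive(Debug, PartialEq)]
struct UnclosedDelimiterError {
    collected: String,
}

/// Skips to the opening delimiter, then collects text until the matching
/// closer, tracking nesting depth. On EOF without a matching closer,
/// returns the partial text collected so far.
fn extract_delimited<I: Iterator<Item = char>>(
    iter: &mut std::iter::Peekable<I>,
    open: char,
    close: char,
) -> Result<String, UnclosedDelimiterError> {
    // Skip ahead to the opening delimiter.
    for c in iter.by_ref() {
        if c == open {
            break;
        }
    }
    let mut depth = 1usize;
    let mut buf = String::new();
    for c in iter.by_ref() {
        if c == open {
            depth += 1;
            buf.push(c);
        } else if c == close {
            depth -= 1;
            if depth == 0 {
                return Ok(buf); // balanced: return inner text
            }
            buf.push(c);
        } else {
            buf.push(c);
        }
    }
    Err(UnclosedDelimiterError { collected: buf })
}

fn main() {
    let mut it = "rel(a, (b, c), d)".chars().peekable();
    let inner = extract_delimited(&mut it, '(', ')').expect("balanced input");
    assert_eq!(inner, "a, (b, c), d");
}
```

Nested pairs are preserved in the output because only the outermost closer (depth reaching zero) terminates the loop.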
```mermaid
sequenceDiagram
    autonumber
    participant Scanner as collect_relation_spans
    participant Extras as Extras<'a>
    participant Parser as relation_columns / primary_key_clause
    Scanner->>Extras: create extras (src, errors)
    Scanner->>Parser: call relation_columns()
    alt columns parsed successfully
        Parser-->>Scanner: span_of_columns
        Scanner->>Parser: if next ident == "primary" then primary_key_clause()
        alt primary parsed successfully or absent
            Parser-->>Scanner: optional_pk_span or none
            Scanner->>Scanner: push relation span (start..end)
        else primary parse failed
            Parser-->>Extras: push pk error
            Scanner->>Scanner: skip line
        end
    else columns parse failed
        Parser-->>Extras: push column parse error
        Scanner->>Scanner: skip line
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/parser/span_scanner.rs (1)
291-323: Simplify `record_relation` by composing parsers instead of manual peeking/flags

Compose `relation_columns` with an optional `primary_key_clause` and parse them in one shot. This removes the mutable `ok` flag and duplicated whitespace handling, and reduces cognitive load. This has been highlighted previously; address it now.

Apply this diff:

```diff
-    fn record_relation(st: &mut State<'_>, start: usize) {
-        let cols = relation_columns();
-
-        let (cols_res, cols_err) = st.parse_span(cols, start);
-        if let Some(sp) = cols_res {
-            st.stream.skip_until(sp.end);
-            st.stream.skip_ws_inline();
-            let mut ok = true;
-            if let Some((SyntaxKind::T_IDENT, pk_span)) = st.stream.peek().cloned()
-                && st.extra.src.get(pk_span.clone()) == Some("primary")
-            {
-                let (pk_res, pk_err) =
-                    st.parse_span(primary_key_clause(st.extra.src), pk_span.start);
-                if let Some(pk_sp) = pk_res {
-                    st.stream.skip_until(pk_sp.end);
-                } else {
-                    st.extra.errors.extend(pk_err);
-                    ok = false;
-                }
-            }
-
-            if ok {
-                let end = st.stream.line_end(st.stream.cursor());
-                st.stream.skip_until(end);
-                st.spans.push(start..end);
-            } else {
-                st.skip_line();
-            }
-        } else {
-            st.extra.errors.extend(cols_err);
-            st.skip_line();
-        }
-    }
+    fn record_relation(st: &mut State<'_>, start: usize) {
+        let parser = relation_columns()
+            .then(primary_key_clause(st.extra.src).or_not())
+            .map_with_span(|_, sp: Span| sp);
+
+        let (res, err) = st.parse_span(parser, start);
+        if let Some(sp) = res {
+            st.stream.skip_until(sp.end);
+            st.stream.skip_ws_inline();
+            let end = st.stream.line_end(st.stream.cursor());
+            st.stream.skip_until(end);
+            st.spans.push(start..end);
+        } else {
+            st.extra.extend(err);
+            st.skip_line();
+        }
+    }
```
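For intuition, the "columns followed by an optional primary key clause" shape that the composed parser expresses can be mimicked with plain string functions. Everything below (`parse_paren_list`, `parse_relation`, and the toy grammar) is a hypothetical sketch, not the project's chumsky parsers:

```rust
// Toy sketch of `(cols...) [primary key (cols...)]` parsing; the optional
// suffix mirrors the `or_not()` composition discussed above. Names and
// grammar are simplified assumptions.

/// Parses a non-empty parenthesised, comma-separated list; returns the
/// items and the remaining input.
fn parse_paren_list(s: &str) -> Option<(Vec<&str>, &str)> {
    let rest = s.trim_start().strip_prefix('(')?;
    let end = rest.find(')')?;
    let cols: Vec<&str> = rest[..end]
        .split(',')
        .map(str::trim)
        .filter(|c| !c.is_empty())
        .collect();
    if cols.is_empty() {
        return None; // non-empty list required, as in relation_columns()
    }
    Some((cols, &rest[end + 1..]))
}

/// Parses the column list plus an optional `primary key (...)` clause.
fn parse_relation(s: &str) -> Option<(Vec<String>, Option<Vec<String>>)> {
    let (cols, rest) = parse_paren_list(s)?;
    let pk = if let Some(rest) = rest.trim_start().strip_prefix("primary key") {
        // `primary` is present, so a well-formed key list is now mandatory.
        let (pk_cols, _) = parse_paren_list(rest)?;
        Some(pk_cols.into_iter().map(String::from).collect())
    } else {
        None // clause absent: succeed with no primary key
    };
    Some((cols.into_iter().map(String::from).collect(), pk))
}

fn main() {
    assert!(parse_relation("(id, name) primary key (id)").is_some());
}
```

The key property, matching the review suggestion, is that an absent clause succeeds with `None`, while a present-but-malformed clause fails the whole parse.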
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- Jira integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
- src/parser/ast/parse_utils/delimiter.rs (1 hunks)
- src/parser/ast/parse_utils/mod.rs (0 hunks)
- src/parser/span_scanner.rs (3 hunks)
- src/parser/tests/parser.rs (2 hunks)
💤 Files with no reviewable changes (1)
- src/parser/ast/parse_utils/mod.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.rs: Keep any single Rust source file under 400 lines; split long switches/dispatch tables and move large test data to external files
Disallow Clippy warnings; do not silence lints except as a last resort
Any lint suppressions must be tightly scoped and include a clear reason; prefer expect over allow
Fix warnings emitted during tests in code instead of silencing them
Extract helper functions when functions are too long; group parameters into structs when they are too many
If returning a large error, consider using Arc to reduce returned data
Write unit and behavioural tests for new functionality and run before/after changes
Prefer immutable data; avoid unnecessary mut bindings
Handle errors with Result instead of panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate functions for conditional criteria with more than two branches
Prefer .expect() over .unwrap()
Use concat!() for long string literals rather than escaping newlines with a backslash
Files:
- src/parser/tests/parser.rs
- src/parser/ast/parse_utils/delimiter.rs
- src/parser/span_scanner.rs
⚙️ CodeRabbit Configuration File
**/*.rs:

- Seek to keep the cyclomatic complexity of functions no more than 12.
- Adhere to single responsibility and CQRS.
- Place function attributes after doc comments.
- Do not use `return` in single-line functions.
- Move conditionals with >2 branches into a predicate function.
- Avoid `unsafe` unless absolutely necessary.
- Every module must begin with a `//!` doc comment that explains the module's purpose and utility.
- Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar.
- Lints must not be silenced except as a last resort.
  - `#[allow]` is forbidden.
  - Only narrowly scoped `#[expect(lint, reason = "...")]` is allowed.
  - No lint groups, no blanket or file-wide suppression.
  - Include `FIXME:` with link if a fix is expected.
- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use `rstest` fixtures for shared setup and to avoid repetition between tests.
- Replace duplicated tests with `#[rstest(...)]` parameterised cases.
- Prefer `mockall` for mocks/stubs.
- Prefer `.expect()` over `.unwrap()`.
- Ensure that any API or behavioural changes are reflected in the documentation in docs/.
- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/.
- Files must not exceed 400 lines in length.
  - Large modules must be decomposed.
  - Long match statements or dispatch tables should be decomposed by domain and collocated with targets.
  - Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Files:
- src/parser/tests/parser.rs
- src/parser/ast/parse_utils/delimiter.rs
- src/parser/span_scanner.rs
src/**/*.rs
📄 CodeRabbit Inference Engine (AGENTS.md)
src/**/*.rs: Every module must begin with a module-level //! comment explaining purpose and utility
Document public APIs with /// Rustdoc comments so cargo doc can generate docs
Files:
src/parser/tests/parser.rssrc/parser/ast/parse_utils/delimiter.rssrc/parser/span_scanner.rs
🧬 Code Graph Analysis (2)
src/parser/tests/parser.rs (4)
- src/parser/mod.rs (1): parse (41-49)
- src/parser/cst_builder/mod.rs (2): errors (51-53), root (45-47)
- src/parser/ast/parse_utils/errors.rs (1): is_empty (144-146)
- src/parser/cst_builder/spans.rs (2): relations (66-69), relations (168-170)
src/parser/span_scanner.rs (2)
- src/parser/lexer_helpers.rs (3): inline_ws (98-101), ident (111-115), balanced_block_nonempty (195-200)
- src/parser/token_stream.rs (1): src (118-120)
🔇 Additional comments (4)
src/parser/tests/parser.rs (1)
116-130: Good fixtures for invalid relation forms

Add fixtures for empty columns, whitespace-only columns, and an invalid primary key clause. This matches earlier review feedback and improves coverage.
src/parser/span_scanner.rs (3)
1-1: Add required module-level docs

Document the module's purpose at the top. This satisfies the repo's documentation guideline for `//!` headers.
227-245: Keyword parser with contextual errors looks good

Match identifiers by source text and include the found token text and span in the error. This improves diagnostics without complicating the parser combinators.
269-277: Primary key clause parser is clear and precise

Enforce `primary key` followed by a non-empty parenthesised list. The composition is readable and integrates with the shared `balanced_block_nonempty` helper.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
src/parser/ast/parse_utils/delimiter.rs (3)
128-145: Eliminate per-node heap allocations; push text directly into the buffer

Avoid allocating `String` for node text on every element. Push token/node text directly into `buf`. This reduces allocations and improves throughput on large trees.

Apply:

```diff
-fn process_element(
-    e: &SyntaxElement<DdlogLanguage>,
-    ctx: &mut DelimiterParseContext<'_>,
-) -> ElementResult {
-    let text = element_text(e);
-    match e.kind() {
-        k if k == ctx.open_kind => {
-            *ctx.depth += 1;
-            ctx.buf.push_str(text.as_ref());
-            ElementResult::Continue
-        }
-        k if k == ctx.close_kind => handle_close_delimiter(ctx, text.as_ref()),
-        _ => {
-            ctx.buf.push_str(text.as_ref());
-            ElementResult::Continue
-        }
-    }
-}
+fn process_element(
+    e: &SyntaxElement<DdlogLanguage>,
+    ctx: &mut DelimiterParseContext<'_>,
+) -> ElementResult {
+    match e.kind() {
+        k if k == ctx.open_kind => {
+            *ctx.depth += 1;
+            push_element_text(ctx.buf, e);
+            ElementResult::Continue
+        }
+        k if k == ctx.close_kind => {
+            let s = match e {
+                SyntaxElement::Token(t) => t.text(),
+                SyntaxElement::Node(n) => n.text().as_str(), // Node will not be a closer token
+            };
+            handle_close_delimiter(ctx, s)
+        }
+        _ => {
+            push_element_text(ctx.buf, e);
+            ElementResult::Continue
+        }
+    }
+}

-use std::borrow::Cow;
-
-fn element_text(e: &SyntaxElement<DdlogLanguage>) -> Cow<'_, str> {
-    match e {
-        SyntaxElement::Token(t) => Cow::Borrowed(t.text()),
-        SyntaxElement::Node(n) => Cow::Owned(n.text().to_string()),
-    }
-}
+fn push_element_text(buf: &mut String, e: &SyntaxElement<DdlogLanguage>) {
+    match e {
+        SyntaxElement::Token(t) => buf.push_str(t.text()),
+        SyntaxElement::Node(n) => buf.push_str(n.text().as_str()),
+    }
+}
```

Also applies to: 157-163
1-6: Align module docs with British English and current API surface

Keep "parenthesised" (British) and mention that the module handles arbitrary delimiter pairs, not only parentheses.

Apply:

```diff
-//! This module provides functions to parse parenthesised blocks and extract
-//! text content from within balanced delimiters, handling nested structures
-//! correctly.
+//! This module provides functions to extract text from within balanced, nested
+//! delimiters (e.g., parentheses), handling arbitrary opening/closing kinds.
```
62-87: Distinguish missing opening delimiter from unclosed

Return a distinct error when no opening delimiter is found instead of conflating it with an unclosed delimiter. This prevents callers that treat any `Err` as "absent" from silently swallowing real parse errors.

Points of attention:

- src/parser/ast/parse_utils/delimiter.rs: update `skip_to_opening_delimiter` and `extract_delimited`
- src/parser/ast/relation.rs: in `fn extract_key_list`, handle missing-open vs unclosed cases instead of `.ok()?` for both

Proposed changes:

```diff
 fn skip_to_opening_delimiter<I>(iter: &mut Peekable<I>, open_kind: SyntaxKind) -> bool
 where
     I: Iterator<Item = SyntaxElement<DdlogLanguage>>,
 {
-    for e in iter.by_ref() {
-        if e.kind() == open_kind {
-            break;
-        }
-    }
-}
+    for e in iter.by_ref() {
+        if e.kind() == open_kind {
+            return true;
+        }
+    }
+    false
+}
@@
 pub fn extract_delimited<I>(
-    skip_to_opening_delimiter(iter, open_kind);
+    if !skip_to_opening_delimiter(iter, open_kind) {
+        return Err(UnclosedDelimiterError {
+            collected: String::new(),
+            expected: close_kind,
+        });
+    }
```

Then, in src/parser/ast/relation.rs (inside `extract_key_list`), update:

```diff
-    let content = super::parse_utils::extract_delimited(iter, T_LPAREN, T_RPAREN).ok()?;
+    match super::parse_utils::extract_delimited(iter, T_LPAREN, T_RPAREN) {
+        Ok(text) => text,
+        Err(UnclosedDelimiterError { collected, expected }) => {
+            // handle unclosed delimiter as a real error
+            panic!("unclosed delimiter {}; got {:?}", expected, collected);
+        }
+        // missing opening delimiter yields None:
+        Err(_) => return None,
+    };
```

src/parser/ast/relation.rs (1)
98-119: Remove bespoke parenthesis skipping; reuse the shared extractor

Reduce duplicate delimiter logic by reusing `extract_delimited` to advance past the columns list. This centralises balancing behaviour and keeps semantics consistent with other consumers.

Example refactor:

```diff
-    fn skip_to_end_of_columns<I>(iter: &mut std::iter::Peekable<I>) -> Option<()>
+    fn skip_to_end_of_columns<I>(iter: &mut std::iter::Peekable<I>) -> Option<()>
     where
         I: Iterator<Item = rowan::SyntaxElement<DdlogLanguage>>,
     {
-        let mut depth = 0usize;
-        for e in iter.by_ref() {
-            match e.kind() {
-                SyntaxKind::T_LPAREN => depth += 1,
-                SyntaxKind::T_RPAREN => {
-                    if depth == 0 {
-                        return None;
-                    }
-                    depth -= 1;
-                    if depth == 0 {
-                        return Some(());
-                    }
-                }
-                _ => {}
-            }
-        }
-        None
+        // Consume the first parenthesised list and discard content.
+        super::skip_whitespace_and_comments(iter);
+        let _ = super::parse_utils::extract_delimited(iter, SyntaxKind::T_LPAREN, SyntaxKind::T_RPAREN).ok()?;
+        Some(())
     }
```

src/parser/span_scanner.rs (1)
1-6: Split this module; it exceeds the 400-line limit

Enforce the repository guideline ("Files must not exceed 400 lines"). Extract per-declaration collectors into dedicated modules (e.g., import.rs, typedef.rs, relation.rs, index.rs, function.rs, transformer.rs, rules.rs) and re-export from a small facade.
♻️ Duplicate comments (2)
src/parser/tests/parser.rs (1)
563-571: Parametrise invalid-relation assertions; ensure no duplicate blocks remain

Keep this parameterised test. Verify that earlier duplicate test blocks were removed to avoid redundant execution and longer CI times.
Run:
```bash
#!/bin/bash
# Fail if duplicated fixtures or tests exist.
set -euo pipefail
echo "Searching for duplicate invalid-relation fixtures/tests..."
rg -nP 'fn\s+relation_empty_columns\s*\(\)|fn\s+relation_whitespace_columns\s*\(\)|fn\s+relation_invalid_pk\s*\(\)' -C1
rg -nP 'fn\s+relation_invalid_is_error\s*\(' -C1
```

src/parser/span_scanner.rs (1)
291-321: Collapse manual control flow in `record_relation` into a single composed parser

Remove ad-hoc peeking, flags and duplicated whitespace management. Compose `relation_columns()` with optional `primary_key_clause(..).or_not()` and parse in one pass. This preserves behaviour (optional PK) while simplifying control flow and centralising error collection.

Apply this diff:

```diff
-    fn record_relation(st: &mut State<'_>, start: usize) {
-        let cols = relation_columns();
-
-        let (cols_res, cols_err) = st.parse_span(cols, start);
-        if let Some(sp) = cols_res {
-            st.stream.skip_until(sp.end);
-            st.stream.skip_ws_inline();
-            let mut ok = true;
-            if let Some((SyntaxKind::T_IDENT, pk_span)) = st.stream.peek().cloned()
-                && st.extra.src.get(pk_span.clone()) == Some("primary")
-            {
-                let (pk_res, pk_err) =
-                    st.parse_span(primary_key_clause(st.extra.src), pk_span.start);
-                if let Some(pk_sp) = pk_res {
-                    st.stream.skip_until(pk_sp.end);
-                } else {
-                    st.extra.errors.extend(pk_err);
-                    ok = false;
-                }
-            }
-
-            if ok {
-                let end = st.stream.line_end(st.stream.cursor());
-                st.stream.skip_until(end);
-                st.spans.push(start..end);
-            } else {
-                st.skip_line();
-            }
-        } else {
-            st.extra.errors.extend(cols_err);
-            st.skip_line();
-        }
-    }
+    fn record_relation(st: &mut State<'_>, start: usize) {
+        // Parse columns followed by an optional `primary key (...)` clause.
+        // `or_not()` yields `None` without error if `primary` is not present; if `primary` is present
+        // but malformed, the inner parser emits errors, matching the current behaviour.
+        let parser = relation_columns()
+            .then(primary_key_clause(st.extra.src).or_not())
+            .map_with_span(|_, sp: Span| sp);
+
+        let (res, errs) = st.parse_span(parser, start);
+        st.extra.errors.extend(errs);
+
+        if let Some(sp) = res {
+            let end = st.stream.line_end(sp.end);
+            st.stream.skip_until(end);
+            st.spans.push(start..end);
+        } else {
+            st.skip_line();
+        }
+    }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- Jira integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (7)
- docs/function-parsing-design.md (1 hunks)
- src/parser/ast/parse_utils/delimiter.rs (6 hunks)
- src/parser/ast/parse_utils/mod.rs (1 hunks)
- src/parser/ast/relation.rs (1 hunks)
- src/parser/span_scanner.rs (3 hunks)
- src/parser/tests/parser.rs (2 hunks)
- src/syntax_utils.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
docs/**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
docs/**/*.md: Use docs/ as the reference source of truth; proactively update docs when requirements, dependencies, or architecture change
Documentation in docs/ must use en-GB-oxendict spelling and grammar
docs/**/*.md: The word “outwith” is acceptable in documentation
Keep US spelling when it appears in API names (e.g., color)
Use the Oxford comma
Treat company names as collective nouns (e.g., “Lille Industries are…”)
Write headings in sentence case
Use Markdown headings (#, ##, ###, …) in order without skipping levels
Follow markdownlint recommendations
Use standard Markdown syntax for code blocks and lists
Always provide a language identifier for fenced code blocks; use plaintext for non-code text
Use '-' as the first-level bullet and renumber ordered lists when items change
Prefer inline links text or angle brackets
Ensure blank lines before and after bulleted lists and fenced code blocks
Ensure tables have a delimiter line below the header row
Expand uncommon acronyms on first use (e.g., Continuous Integration (CI))
Wrap paragraphs at 80 columns
Wrap code (in docs) at 120 columns
Do not wrap tables
Use footnotes referenced with [^label]
Include Mermaid diagrams where they add clarity
Embed figures using Markdown image syntax and provide brief, descriptive alt text
Add a short descriptive sentence before each Mermaid diagram for screen readers
Files:
docs/function-parsing-design.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
**/*.md: Validate Markdown with make markdownlint and run make fmt to format Markdown (including tables) after changes
Validate Mermaid diagrams in Markdown by running make nixie
Wrap Markdown paragraphs and bullet points at 80 columns; code blocks at 120; do not wrap tables/headings; use dashes for list bullets; use GFM footnotes
Files:
docs/function-parsing-design.md
⚙️ CodeRabbit configuration file
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/function-parsing-design.md
🧬 Code graph analysis (1)
src/parser/tests/parser.rs (4)
- src/parser/mod.rs (1): parse (41-49)
- src/parser/cst_builder/mod.rs (2): errors (51-53), root (45-47)
- src/parser/ast/parse_utils/errors.rs (1): is_empty (144-146)
- src/parser/cst_builder/spans.rs (2): relations (66-69), relations (168-170)
🔍 Remote MCP
Here are a few concrete details pulled from the Pull Request’s own summary to aid your review:
• In src/parser/ast/parse_utils/delimiter.rs, a new public function

```rust
#[must_use = "discarding the extracted text loses delimiter content"]
pub fn extract_delimited<I>(
    iter: &mut Peekable<I>,
    open_kind: SyntaxKind,
    close_kind: SyntaxKind,
) -> Result<String, UnclosedDelimiterError>
where
    I: Iterator<Item = SyntaxElement<DdlogLanguage>>,
```

replaces the old paren_block_span-based approach. It balances nested delimiters via an explicit depth counter, accumulates inner text (using a new element_text helper returning Cow<'_, str>), and returns Err(UnclosedDelimiterError { collected, expected }) if the closer is missing.
• In src/parser/ast/parse_utils/mod.rs, the crate-private paren_block_span export is removed and replaced by public re-exports of:

- `UnclosedDelimiterError`
- `extract_delimited`
- an alias `extract_parenthesized = extract_delimited`

ensuring backwards-compatible access while cutting out the old API.
• In src/parser/span_scanner.rs:
- A new `relation_columns()` parser (built with Chumsky combinators) parses a non-empty `(…)`-delimited identifier list.
- A new `primary_key_clause()` parser enforces the exact `"primary key"` keyword sequence followed by a column list.
- `collect_relation_spans` is refactored to call `SpanCollector::parse_span` with these two parsers (via a new `record_relation` helper), pushing a span only if columns and, if present, the PK clause both succeed. On failure, errors are accumulated in a new `Extras<'a>` struct and the line is skipped.
• In src/parser/tests/parser.rs, three new fixtures—relation_empty_columns(), relation_whitespace_columns(), and relation_invalid_pk()—are added alongside a parameterized test asserting that each invalid input produces errors and yields no relations. Note that the new tests were inserted twice in the file, resulting in duplicate test blocks.
🔇 Additional comments (6)
src/parser/ast/parse_utils/delimiter.rs (1)
20-31: Good: error type ergonomics and trait impls are in place

Keep `Display`, `Error`, and the explicit message; this improves diagnostics and integrates cleanly with error stacks.

src/syntax_utils.rs (1)

38-42: LGTM: align with the new delimiter extractor

Import `extract_delimited` via an alias and preserve the previous error-salvage behaviour with `unwrap_or_else`. Keep as-is.

src/parser/ast/relation.rs (1)

150-153: LGTM: switch to `extract_delimited` with explicit delimiters

Use `.ok()?` to drop invalid PK clauses, delegating error reporting to the parser layer as intended in this PR.

src/parser/tests/parser.rs (1)

116-130: LGTM: add fixtures for invalid relation inputs

Keep these fixtures; they pair cleanly with the parameterised test below.

src/parser/span_scanner.rs (2)

192-225: Relation columns parser is correct and span semantics are tight

The parser enforces a non-empty column list and captures the span without trailing inline whitespace while still consuming it from the stream via `then_ignore(ws)`. This fixes prior span drift and redundant whitespace skipping.

227-245: Confirm keyword tokenisation correctness

Confirm that no dedicated `K_PRIMARY` or `K_KEY` tokens exist in the `SyntaxKind` enum; keywords are lexed as `T_IDENT`. Confirm that `keyword(..)` matching on `T_IDENT` with the expected text correctly handles both `primary` and `key`.
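The error-salvage behaviour noted for src/syntax_utils.rs (falling back to the partially collected text via `unwrap_or_else`) can be illustrated with a toy extractor. The types and grammar below are simplified assumptions, not the crate's API:

```rust
// Toy single-level extractor used only to demonstrate the salvage pattern.
// UnclosedDelimiterError here is a stand-in, not the crate's error type.

#[derive(Debug)]
struct UnclosedDelimiterError {
    collected: String, // text gathered before EOF was hit
}

/// Returns the text inside the first `(...)`, or an error carrying
/// whatever was collected when the closer was missing.
fn extract(input: &str) -> Result<String, UnclosedDelimiterError> {
    let inner = input.strip_prefix('(').unwrap_or(input);
    match inner.split_once(')') {
        Some((body, _)) => Ok(body.to_string()),
        None => Err(UnclosedDelimiterError {
            collected: inner.to_string(),
        }),
    }
}

fn main() {
    // Salvage the partial content instead of propagating the error.
    let text = extract("(a, b").unwrap_or_else(|e| e.collected);
    assert_eq!(text, "a, b");
}
```

Because the error carries the partial text, callers that previously tolerated unterminated input can keep doing so with a one-line `unwrap_or_else`.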
@coderabbitai please create a new GitHub issue for the following: src/parser/span_scanner.rs (1)
Tip: For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@leynos I'll examine the current state of the span_scanner.rs file and create a GitHub issue for the refactoring. Now I'll create the GitHub issue:

🏁 Script executed (length of output: 1524)

🏁 Script executed (length of output: 1643)

@leynos I've successfully created GitHub issue #125 for refactoring the

Issue Details:
Key findings from the file analysis:
The issue includes a comprehensive breakdown of the proposed solution with dedicated modules for each declaration type, implementation steps, and acceptance criteria to ensure the refactoring maintains functionality while enforcing the repository guidelines.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
docs/function-parsing-design.md (1)
95-106: Fix type/field mismatch for UnclosedDelimiterError in the diagram

Align the Mermaid diagram with the actual exported error type. The code defines `UnclosedDelimiterError { collected: String, expected: SyntaxKind }`, not `{ delimiter: char, span: TextRange }`. Update the diagram and its relationships accordingly.

Apply:

```diff
 class UnclosedDelimiterError {
-    +delimiter: char
-    +span: TextRange
+    +collected: String
+    +expected: SyntaxKind
 }
@@
 class ParseError {
     <<enum>>
     Delimiter(DelimiterError)
-    UnclosedDelimiter(UnclosedDelimiterError)
+    UnclosedDelimiter(UnclosedDelimiterError) %% document this variant maps to the extractor error
     MissingColon
     MissingName
     MissingType
 }
```

Also add a brief note below the diagram clarifying that parser-level delimiter issues surface as `ParseError::Delimiter` / `ParseError::UnclosedDelimiter`, whereas stream-extraction failures surface as `parse_utils::UnclosedDelimiterError`. This avoids ambiguity where two error paths share similar names.

src/parser/ast/parse_utils/delimiter.rs (2)
42-56: Prefer expect() over unwrap() in docs

Replace `.unwrap()` with `.expect("balanced parentheses")` in the example to conform to the repository guideline and provide clearer failure context in doctests.

```diff
-).unwrap();
+).expect("balanced parentheses");
```
117-126: Document behaviour when the opening delimiter is absent

`skip_to_opening_delimiter` consumes the entire iterator if `open_kind` never appears, causing `extract_delimited` to return `Err(UnclosedDelimiterError { collected: "", expected: close_kind })`. Make this explicit in the `extract_delimited` Rustdoc to prevent misinterpretation of the error as "missing closer after an opener".

Add to the Errors section:

- "If the opening delimiter is not encountered before EOF, the function also returns `UnclosedDelimiterError` with `collected` empty."

src/parser/span_scanner.rs (1)
1-6: Split this module; file length exceeds repository limits

This file is ~665 lines. The guidelines cap any single Rust source file at 400 lines. Extract the relation/index/function/rule scanners into submodules (e.g., `span_scanner::{imports, typedefs, relations, indexes, functions, transformers, rules}`) and re-export the entrypoints here.
♻️ Duplicate comments (2)
docs/function-parsing-design.md (1)
62-65: Acknowledge alias mention; alignment with migration guidance is correct

Reference to the transitional alias `extract_parenthesized` is present and keeps upgrade paths clear.

src/parser/span_scanner.rs (1)
292-324: Inline relation parsing via composed parsers; drop manual peek/flags

Eliminate the manual `peek` and `ok` flag by composing `relation_columns()` with an optional `primary_key_clause(src)` into a single parser. This reduces control flow, centralises error propagation, and keeps `SpanCollector::parse_span` usage consistent with other handlers.

Apply:

```diff
-    fn record_relation(st: &mut State<'_>, start: usize) {
-        let cols = relation_columns();
-
-        let (cols_res, cols_err) = st.parse_span(cols, start);
-        if let Some(sp) = cols_res {
-            st.stream.skip_until(sp.end);
-            st.stream.skip_ws_inline();
-            let mut ok = true;
-            if let Some((SyntaxKind::T_IDENT, pk_span)) = st.stream.peek().cloned()
-                && st.extra.src.get(pk_span.clone()) == Some("primary")
-            {
-                let (pk_res, pk_err) =
-                    st.parse_span(primary_key_clause(st.extra.src), pk_span.start);
-                if let Some(pk_sp) = pk_res {
-                    st.stream.skip_until(pk_sp.end);
-                } else {
-                    st.extra.errors.extend(pk_err);
-                    ok = false;
-                }
-            }
-
-            if ok {
-                let end = st.stream.line_end(st.stream.cursor());
-                st.stream.skip_until(end);
-                st.spans.push(start..end);
-            } else {
-                st.skip_line();
-            }
-        } else {
-            st.extra.errors.extend(cols_err);
-            st.skip_line();
-        }
-    }
+    fn record_relation(st: &mut State<'_>, start: usize) {
+        let parser = relation_columns()
+            .then(primary_key_clause(st.extra.src).or_not())
+            .map_with_span(|_, sp: Span| sp);
+
+        let (res, errs) = st.parse_span(parser, start);
+        if let Some(sp) = res {
+            let end = st.stream.line_end(sp.end);
+            st.stream.skip_until(end);
+            st.spans.push(start..end);
+        } else {
+            st.extra.errors.extend(errs);
+            st.skip_line();
+        }
+    }
```
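The sequencing-plus-optional-clause shape suggested here can be illustrated with a minimal hand-rolled combinator. This is plain closures, not chumsky's actual API; `literal`, `then`, and `or_not` below are illustrative stand-ins:

```rust
// Minimal combinator sketch: a parser is a function from input to
// Option<(value, rest)>. `then` sequences two parsers; `or_not` makes
// one optional, mirroring the composed relation parser above.
type P<'a, T> = Box<dyn Fn(&'a str) -> Option<(T, &'a str)> + 'a>;

fn literal<'a>(s: &'a str) -> P<'a, &'a str> {
    Box::new(move |input| input.strip_prefix(s).map(|rest| (s, rest)))
}

fn then<'a, A: 'a, B: 'a>(a: P<'a, A>, b: P<'a, B>) -> P<'a, (A, B)> {
    Box::new(move |input| {
        let (va, rest) = a(input)?;
        let (vb, rest) = b(rest)?;
        Some(((va, vb), rest))
    })
}

fn or_not<'a, A: 'a>(p: P<'a, A>) -> P<'a, Option<A>> {
    Box::new(move |input| match p(input) {
        Some((v, rest)) => Some((Some(v), rest)),
        None => Some((None, input)),
    })
}

fn main() {
    // A column list followed by an optional primary-key clause, as in
    // the composed record_relation parser.
    let relation = then(literal("(a, b)"), or_not(literal(" primary key (a)")));
    assert!(relation("(a, b)").is_some());
    assert!(relation("(a, b) primary key (a)").is_some());
    assert!(relation("[a, b]").is_none());
    println!("composed parser accepts both forms");
}
```

Making the clause optional inside the composition is what subsumes the manual `ok` flag: an absent clause simply yields `None` rather than a separate failure path.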
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- Jira integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)

- docs/function-parsing-design.md (1 hunks)
- src/parser/ast/parse_utils/delimiter.rs (6 hunks)
- src/parser/ast/parse_utils/mod.rs (1 hunks)
- src/parser/span_scanner.rs (3 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Keep any single Rust source file under 400 lines; split long switches/dispatch tables and move large test data to external files
Disallow Clippy warnings; do not silence lints except as a last resort
Any lint suppressions must be tightly scoped and include a clear reason; prefer expect over allow
Fix warnings emitted during tests in code instead of silencing them
Extract helper functions when functions are too long; group parameters into structs when they are too many
If returning a large error, consider using Arc to reduce returned data
Write unit and behavioural tests for new functionality and run before/after changes
Prefer immutable data; avoid unnecessary mut bindings
Handle errors with Result instead of panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate functions for conditional criteria with more than two branches
Prefer .expect() over .unwrap()
Use concat!() for long string literals rather than escaping newlines with a backslash
Files:
- src/parser/span_scanner.rs
- src/parser/ast/parse_utils/mod.rs
- src/parser/ast/parse_utils/delimiter.rs
⚙️ CodeRabbit configuration file
**/*.rs:

- Seek to keep the cyclomatic complexity of functions no more than 12.
- Adhere to single responsibility and CQRS.
- Place function attributes after doc comments.
- Do not use `return` in single-line functions.
- Move conditionals with >2 branches into a predicate function.
- Avoid `unsafe` unless absolutely necessary.
- Every module must begin with a `//!` doc comment that explains the module's purpose and utility.
- Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar.
- Lints must not be silenced except as a last resort.
  - `#[allow]` is forbidden.
  - Only narrowly scoped `#[expect(lint, reason = "...")]` is allowed.
  - No lint groups, no blanket or file-wide suppression.
  - Include `FIXME:` with link if a fix is expected.
- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use `rstest` fixtures for shared setup and to avoid repetition between tests.
- Replace duplicated tests with `#[rstest(...)]` parameterised cases.
- Prefer `mockall` for mocks/stubs.
- Prefer `.expect()` over `.unwrap()`.
- Ensure that any API or behavioural changes are reflected in the documentation in docs/.
- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/.
- Files must not exceed 400 lines in length.
  - Large modules must be decomposed.
  - Long match statements or dispatch tables should be decomposed by domain and collocated with targets.
  - Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.

Files:

- src/parser/span_scanner.rs
- src/parser/ast/parse_utils/mod.rs
- src/parser/ast/parse_utils/delimiter.rs
src/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.rs: Every module must begin with a module-level //! comment explaining purpose and utility
Document public APIs with /// Rustdoc comments so cargo doc can generate docs
Files:
- src/parser/span_scanner.rs
- src/parser/ast/parse_utils/mod.rs
- src/parser/ast/parse_utils/delimiter.rs
docs/**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
docs/**/*.md: Use docs/ as the reference source of truth; proactively update docs when requirements, dependencies, or architecture change
Documentation in docs/ must use en-GB-oxendict spelling and grammar
docs/**/*.md: The word “outwith” is acceptable in documentation
Keep US spelling when it appears in API names (e.g., color)
Use the Oxford comma
Treat company names as collective nouns (e.g., “Lille Industries are…”)
Write headings in sentence case
Use Markdown headings (#, ##, ###, …) in order without skipping levels
Follow markdownlint recommendations
Use standard Markdown syntax for code blocks and lists
Always provide a language identifier for fenced code blocks; use plaintext for non-code text
Use '-' as the first-level bullet and renumber ordered lists when items change
Prefer inline links text or angle brackets
Ensure blank lines before and after bulleted lists and fenced code blocks
Ensure tables have a delimiter line below the header row
Expand uncommon acronyms on first use (e.g., Continuous Integration (CI))
Wrap paragraphs at 80 columns
Wrap code (in docs) at 120 columns
Do not wrap tables
Use footnotes referenced with [^label]
Include Mermaid diagrams where they add clarity
Embed figures using Markdown image syntax and provide brief, descriptive alt text
Add a short descriptive sentence before each Mermaid diagram for screen readers
Files:
docs/function-parsing-design.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
**/*.md: Validate Markdown with make markdownlint and run make fmt to format Markdown (including tables) after changes
Validate Mermaid diagrams in Markdown by running make nixie
Wrap Markdown paragraphs and bullet points at 80 columns; code blocks at 120; do not wrap tables/headings; use dashes for list bullets; use GFM footnotes
Files:
docs/function-parsing-design.md
⚙️ CodeRabbit configuration file
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/function-parsing-design.md
🧬 Code graph analysis (1)
src/parser/ast/parse_utils/delimiter.rs (1)
src/parser/ast/parse_utils/errors.rs (2)
fmt (61-71), fmt (77-98)
🔍 Remote MCP
Here are a few concrete details pulled from the Pull Request’s own summary to aid your review:
• In src/parser/ast/parse_utils/delimiter.rs, a new public function
#[must_use = "discarding the extracted text loses delimiter content"]
pub fn extract_delimited<I>(
iter: &mut Peekable<I>,
open_kind: SyntaxKind,
close_kind: SyntaxKind,
) -> Result<String, UnclosedDelimiterError>
where
    I: Iterator<Item = SyntaxElement<DdlogLanguage>>

replaces the old `paren_block_span`-based approach. It balances nested delimiters via an explicit depth counter, accumulates inner text (using a new `element_text` helper returning `Cow<'_, str>`), and returns `Err(UnclosedDelimiterError { collected, expected })` if the closer is missing.
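That depth-counter loop can be sketched at the character level. This is illustrative only: the real `extract_delimited` walks `SyntaxElement` tokens rather than `char`s, and its error type also carries the `expected` kind:

```rust
// Character-level sketch of balanced-delimiter extraction. The error
// value stands in for UnclosedDelimiterError: it carries only the text
// collected before input ran out (empty when no opener was found).
fn extract_delimited(src: &str, open: char, close: char) -> Result<String, String> {
    let mut chars = src.chars();
    // Skip ahead to the opening delimiter; consuming the whole input
    // without finding one yields an error with empty collected text.
    if !chars.by_ref().any(|c| c == open) {
        return Err(String::new());
    }
    let mut depth = 1usize;
    let mut collected = String::new();
    for c in chars {
        if c == open {
            depth += 1;
        } else if c == close {
            depth -= 1;
            if depth == 0 {
                return Ok(collected); // balanced: return inner text only
            }
        }
        collected.push(c);
    }
    Err(collected) // opener seen but closer missing before EOF
}

fn main() {
    let inner = extract_delimited("R(a, (b, c))", '(', ')')
        .expect("balanced parentheses");
    assert_eq!(inner, "a, (b, c)");
    assert_eq!(extract_delimited("R(a, b", '(', ')'), Err("a, b".to_string()));
    assert_eq!(extract_delimited("no parens", '(', ')'), Err(String::new()));
    println!("depth-counter extraction behaves as described");
}
```

The final assertion shows the edge case discussed in the review: a missing opener and a missing closer share the error type, distinguished only by whether `collected` is empty.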
• In src/parser/ast/parse_utils/mod.rs, the crate-private `paren_block_span` export is removed and replaced by public re-exports of:

- `UnclosedDelimiterError`
- `extract_delimited`
- an alias `extract_parenthesized = extract_delimited`

ensuring backwards-compatible access while cutting out the old API.
• In src/parser/span_scanner.rs:
- A new `relation_columns()` parser (built with Chumsky combinators) parses a non-empty `(…)`-delimited identifier list.
- A new `primary_key_clause()` parser enforces the exact "primary key" keyword sequence followed by a column list.
- `collect_relation_spans` is refactored to call `SpanCollector::parse_span` with these two parsers (via a new `record_relation` helper), pushing a span only if columns and, if present, the PK clause both succeed. On failure, errors are accumulated in a new `Extras<'a>` struct and the line is skipped.
• In src/parser/tests/parser.rs, three new fixtures—relation_empty_columns(), relation_whitespace_columns(), and relation_invalid_pk()—are added alongside a parameterized test asserting that each invalid input produces errors and yields no relations. Note that the new tests were inserted twice in the file, resulting in duplicate test blocks.
🔇 Additional comments (4)
src/parser/ast/parse_utils/delimiter.rs (1)
61-67: Good: public, must_use extractor with precise signature

The `#[must_use]` annotation and `Result<String, UnclosedDelimiterError>` return type are appropriate and safe for callers. No issues.

src/parser/ast/parse_utils/mod.rs (1)
17-25: Good: curated re-exports with deprecation on legacy alias

Exporting `UnclosedDelimiterError` and `extract_delimited` with a deprecated `extract_parenthesized` alias is the right balance between clarity and compatibility. No further changes.

src/parser/span_scanner.rs (2)
215-225: Non-empty relation columns parser looks correct and whitespace-safe

The parser enforces a non-empty `(...)` and returns a span that excludes trailing inline whitespace via `then_ignore(ws)`. Good attention to span semantics.
269-278: Primary key clause parser is precise and whitespace-tolerant

Sequential keyword enforcement plus `balanced_block_nonempty` is clear and robust; trailing inline trivia is consumed. Good.
Actionable comments posted: 6
♻️ Duplicate comments (1)
src/parser/ast/parse_utils/delimiter.rs (1)
169-173: Adopt chunked text appends (LGTM).

Use of `for_each_chunk` avoids intermediate allocations and addresses the prior micro-optimisation request.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- Jira integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)

- src/parser/ast/parse_utils/delimiter.rs (6 hunks)
- src/parser/ast/relation.rs (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/parser/ast/relation.rs (2)
src/parser/ast/mod.rs (1)
skip_whitespace_and_comments (116-123)

src/parser/ast/parse_utils/delimiter.rs (1)

extract_delimited (63-93)
src/parser/ast/parse_utils/delimiter.rs (1)
src/parser/ast/parse_utils/errors.rs (2)
fmt (61-71), fmt (77-98)
🔇 Additional comments (2)
src/parser/ast/parse_utils/delimiter.rs (2)
1-6: Document the module entrypoint well (LGTM).

Retain the succinct, purpose-led module docs. The en-GB spelling ("parenthesised") aligns with the repo guidelines.
20-31: Keep the error surface idiomatic (LGTM).

Implement `Display` and `Error` for `UnclosedDelimiterError` as done. The message is actionable and composes cleanly with higher-level errors.
@coderabbitai pause
✅ Actions performed

Reviews paused.
Adjust to assert the unclosed error over the statement span (0..len) rather than the opening index.

Rationale: the span scanner's balanced-block combinator currently reports unclosed delimiters at the substream start; this preserves current branch behaviour while integrating main's test module refactor.

No runtime changes; tests and lints pass.
16485a3 to 9d17e14 (compare)
```rust
fn import_statement_standard_case() {
    let src = "import standard_library";
    let parsed = parse(src);
    assert!(parsed.errors().is_empty());
    let imports = parsed.root().imports();
    assert_eq!(imports.len(), 1);
    let Some(imp) = imports.first() else {
        panic!("expected import");
    };
    assert_eq!(imp.path(), "standard_library");
    assert!(imp.alias().is_none());
}
```
🧩 Analysis chain
@leynos, I've identified the code duplication issue in src/parser/tests/parser.rs. The main problems are:
Issues Found:
- Duplicate test blocks: the same `relation_invalid_is_error` parameterised test appears twice in the file
- Similar fixture patterns: many fixtures follow identical structure (string literals with minor variations)
- Repetitive test assertions: multiple tests follow the same parse-and-assert pattern
Suggested Fix:
Replace duplicated tests with consolidated parameterised test cases using #[rstest] and group related fixtures. Here's the approach:
- Remove duplicate test blocks
- Consolidate similar fixtures into parameterised test cases
- Create shared test helper functions for common assertion patterns
- Group related test cases using `#[rstest(...)]` with multiple parameters
AI Coding Agent Prompt:
You are refactoring a Rust test file to eliminate code duplication. The file `src/parser/tests/parser.rs` contains duplicate test blocks and repetitive test patterns.
TASKS:
1. Remove the duplicate `relation_invalid_is_error` test block (keep only one instance)
2. Identify groups of similar test functions that follow the same pattern
3. Replace repetitive individual tests with parameterised `#[rstest]` test cases
4. Create shared helper functions for common assertion patterns like:
- Parse and verify no errors
- Parse and verify specific error conditions
- Parse and verify AST node properties
GUIDELINES:
- Use `#[rstest(...)]` with multiple test cases instead of individual test functions
- Keep fixture functions but group related ones logically
- Create helper functions like `assert_parse_success()`, `assert_parse_errors()`, `assert_relation_properties()`
- Follow the existing rstest patterns already in the file
- Ensure all test coverage is preserved
- Use descriptive parameter names in rstest cases
EXAMPLE PATTERN:
Instead of:
```rust
#[rstest]
fn test_case_1(fixture1: &str) { /* parse and assert */ }
#[rstest]
fn test_case_2(fixture2: &str) { /* parse and assert */ }
```
Use:
```rust
#[rstest]
#[case::case1(fixture1(), "expected_result1")]
#[case::case2(fixture2(), "expected_result2")]
fn parameterised_test(#[case] input: &str, #[case] expected: &str) {
// shared test logic
}
```
Focus on maintaining the same test coverage while reducing code duplication through parameterisation and helper functions.
The refactoring should reduce the file from 25+ similar functions to a more manageable set of parameterised test cases, aligning with the repository's coding guidelines that specify using rstest for avoiding test repetition.
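The consolidation can also be previewed with a plain table-driven loop before committing to rstest attributes; the fixture strings and the `parse_error_count` stand-in below are hypothetical, not the crate's real parser API:

```rust
// Hypothetical stand-in for parse(src).errors().len(); the real tests
// call the crate's parser. This version flags unbalanced parentheses
// and empty column lists, the conditions the new fixtures exercise.
fn parse_error_count(src: &str) -> usize {
    let opens = src.matches('(').count();
    let closes = src.matches(')').count();
    usize::from(opens != closes) + usize::from(src.contains("()"))
}

fn main() {
    // One table replaces several near-identical #[rstest] blocks; each
    // row is the analogue of a #[case::name(fixture())] attribute.
    let cases = [
        ("relation_empty_columns", "input relation R()"),
        ("relation_invalid_pk", "input relation R(a, b primary key"),
    ];
    for (name, src) in cases {
        assert!(parse_error_count(src) > 0, "case {name} should report errors");
    }
    println!("all invalid fixtures reported errors");
}
```

With rstest, each table row becomes a named `#[case::…]` attribute instead, so every case is reported as its own test rather than failing the shared loop.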
🏁 Script executed:
#!/bin/bash
# Examine the complete structure of the parser test file
echo "=== File structure and line count ==="
wc -l src/parser/tests/parser.rs
echo -e "\n=== Function definitions ==="
rg -n "^fn " src/parser/tests/parser.rs
echo -e "\n=== Fixture definitions ==="
rg -n "#\[fixture\]" src/parser/tests/parser.rs
echo -e "\n=== Test function patterns ==="
rg -n -A5 "#\[rstest\]" src/parser/tests/parser.rs
echo -e "\n=== Sample of similar function structures ==="
rg -n -A10 "fn (complex_typedef|extern_function_parsed|extern_type|fact_rule_parsed)" src/parser/tests/parser.rs | head -50
Length of output: 22478
🏁 Script executed:
#!/bin/bash
# Look for duplicate test blocks and similar patterns
echo "=== Looking for duplicate test implementations ==="
rg -n -A15 "relation_invalid_is_error" src/parser/tests/parser.rs
echo -e "\n=== Checking for repeated fixture patterns ==="
rg -n -B2 -A5 "relation_(empty|whitespace|invalid)" src/parser/tests/parser.rs
echo -e "\n=== Looking for similar test assertion patterns ==="
rg -n -C3 "assert.*parsed.*errors" src/parser/tests/parser.rs
echo -e "\n=== Examining parse and assert patterns ==="
rg -n -C2 "let parsed = parse\(" src/parser/tests/parser.rs | head -30
Length of output: 18297
@leynos, Perfect! I've identified the extensive code duplication patterns in the 1080-line test file. The issues are clear:
Key Duplication Patterns Found:
-
Repetitive parse-and-assert structure: Nearly every test follows the same pattern:
```rust
let parsed = parse(input);
assert!(parsed.errors().is_empty()); // or !is_empty()
let items = parsed.root().some_collection();
assert_eq!(items.len(), 1);
```
-
Similar function parsing tests: multiple `*_parsed` functions (`extern_function_parsed`, `function_with_body_parsed`, etc.) follow identical structures
Error-checking tests: Multiple
*_is_errorfunctions with nearly identical assertions -
Fixture proliferation: Many fixtures return simple string literals
Comprehensive Fix Strategy:
Create helper functions, consolidate similar tests into parameterised cases, and group related functionality.
Enhanced AI Coding Agent Prompt:
You are refactoring the 1080-line Rust test file `src/parser/tests/parser.rs` to eliminate code duplication by consolidating similar test patterns into parameterised test cases and helper functions.
SPECIFIC TASKS:
1. **Create Test Helper Functions**:
```rust
fn assert_parse_success(src: &str) -> ParsedProgram { /* shared logic */ }
fn assert_parse_errors(src: &str) { /* shared error assertion */ }
fn assert_single_relation(parsed: &ParsedProgram, name: &str, is_input: bool, is_output: bool) { /* ... */ }
fn assert_single_function(parsed: &ParsedProgram, name: &str, is_extern: bool, params: Vec<(&str, &str)>, return_type: Option<&str>) { /* ... */ }
```
2. **Consolidate Function Tests**: Replace these individual tests:
- `extern_function_parsed`, `function_with_body_parsed`, `function_no_return_parsed`, etc.
With a single parameterised test:
```rust
#[rstest]
#[case::extern_function(extern_function(), "hash", true, vec![("data", "string")], Some("u64"))]
#[case::with_body(function_with_body(), "compute", false, vec![("x", "u32")], Some("u32"))]
// ... more cases
fn function_parsing_success(
#[case] input: &str,
#[case] expected_name: &str,
#[case] is_extern: bool,
#[case] expected_params: Vec<(&str, &str)>,
#[case] expected_return: Option<&str>
) {
let parsed = assert_parse_success(input);
assert_single_function(&parsed, expected_name, is_extern, expected_params, expected_return);
}
```
3. **Consolidate Relation Tests**: Group `input_relation_parsed`, `output_relation_parsed`, `internal_relation_parsed` into one parameterised test.
4. **Consolidate Error Tests**: Group all `*_is_error` functions into categorised parameterised tests:
```rust
#[rstest]
#[case::function_unclosed_params(function_unclosed_params())]
#[case::extern_missing_colon(extern_function_missing_colon())]
#[case::unterminated_body(function_unterminated_body())]
fn function_error_cases(#[case] input: &str) {
assert_parse_errors(input);
// Add function-specific assertions
}
```
5. **Consolidate Transformer Tests**: Similar pattern for transformer tests.
6. **Consolidate Import/TypeDef Tests**: Group similar import and typedef tests.
7. **Remove Duplicate Fixtures**: Keep only unique fixtures, remove redundant string literals.
CONSOLIDATION TARGETS (25+ functions to reduce):
- Function tests: 8+ individual functions → 1-2 parameterised tests
- Relation tests: 4+ individual functions → 1 parameterised test
- Error tests: 6+ individual functions → 2-3 parameterised tests by category
- Transformer tests: 6+ individual functions → 1-2 parameterised tests
- Import/TypeDef tests: Multiple functions → 1-2 parameterised tests
REQUIREMENTS:
- Preserve all test coverage and assertions
- Use descriptive `#[case::name]` attributes
- Keep existing fixtures but eliminate unused ones
- Follow existing rstest patterns in the file
- Reduce file from 1080 lines to under 600 lines
- Create logical groupings by functionality (functions, relations, imports, etc.)
EXAMPLE TRANSFORMATION:
Replace this pattern (repeated 8+ times):
```rust
#[rstest]
fn some_test_parsed(fixture: &str) {
let parsed = parse(fixture);
assert!(parsed.errors().is_empty());
// specific assertions...
}
```
With:
```rust
#[rstest]
#[case::case1(fixture1(), expected1)]
#[case::case2(fixture2(), expected2)]
fn consolidated_test(#[case] input: &str, #[case] expected: SomeType) {
let parsed = assert_parse_success(input);
// parameterised assertions using expected
}
```
Focus on eliminating the repetitive `parse() + assert!()` boilerplate through helper functions and parameterised test consolidation.
This refactoring should reduce the file size significantly while maintaining full test coverage and addressing the code duplication issue flagged by CodeScene.
```rust
fn complex_program_round_trip(complex_prog: &str) {
    let parsed = parse(complex_prog);
    assert!(parsed.errors().is_empty());
    let text = pretty_print(parsed.root().syntax());
    assert_eq!(text, complex_prog);
    let relations = parsed.root().relations();
    assert_eq!(relations.len(), 2);
    let [first, second] = relations.as_slice() else {
        panic!("expected two relations");
    };
    assert!(first.is_input());
    assert!(second.is_output());
    assert_eq!(first.name(), Some("R".into()));
    assert_eq!(second.name(), Some("S".into()));
}
```
🧩 Analysis chain
@leynos Here's my analysis and suggested fix for the Large Assertion Blocks issue:
Problem Analysis
The test file contains 9 assertion blocks with 4+ assertions each. Looking at the code, these large assertion blocks typically combine multiple testing concerns:
- Parse validation (checking for errors)
- Round-trip verification (input matches output)
- Structural validation (node counts, types)
- Content validation (names, properties)
Suggested Fix
Break down large test functions into smaller, focused tests using rstest parameterization and helper functions:
1. Extract Common Assertion Helpers
```rust
fn assert_parse_success(parsed: &ParseResult, expected_text: &str) {
    assert!(parsed.errors().is_empty());
    let text = pretty_print(parsed.root().syntax());
    assert_eq!(text, expected_text);
    assert_eq!(parsed.root().kind(), SyntaxKind::N_DATALOG_PROGRAM);
}

fn assert_relation_properties(relation: &Relation, name: &str, is_input: bool, is_output: bool) {
    assert_eq!(relation.name(), Some(name.into()));
    assert_eq!(relation.is_input(), is_input);
    assert_eq!(relation.is_output(), is_output);
}
```

2. Use Parameterized Tests
Replace multiple similar tests with rstest parameterized cases.
3. Separate Testing Concerns
Split large tests into focused units testing one aspect each.
AI Coding Agent Prompt
You are tasked with refactoring src/parser/tests/parser.rs to resolve a "Large Assertion Blocks" code quality issue. The file currently has 9 assertion blocks with 4+ assertions each.
**Objectives:**
1. Break down large test functions with 4+ assertions into smaller, focused tests
2. Extract common assertion patterns into reusable helper functions
3. Use rstest parameterized tests to reduce duplication
4. Separate testing concerns: parsing validation, round-trip verification, structural validation, and content validation
**Specific Requirements:**
1. Create helper functions like:
- `assert_parse_success(parsed, expected_text)` for common parse + round-trip validation
- `assert_relation_properties(relation, name, is_input, is_output)` for relation validation
- `assert_relation_count(parsed, expected_count)` for structural validation
2. Convert large test functions (like `complex_program_round_trip`) into multiple smaller tests:
- One for parse success and round-trip
- One for relation count validation
- One for individual relation properties
3. Use `#[rstest(...)]` parameterized tests where multiple tests follow the same pattern with different fixtures
4. Ensure each test function has a maximum of 3 assertions
5. Maintain all existing test coverage - do not remove any validation logic
6. Follow the repository's Rust coding guidelines:
- Keep functions focused and under 12 cyclomatic complexity
- Use rstest fixtures for shared setup
- Place function attributes after doc comments
**Files to modify:**
- `src/parser/tests/parser.rs`
**Testing:**
Run `make test` to ensure all refactored tests pass and coverage is preserved.
This refactoring will make the tests more maintainable, easier to debug when they fail, and align with the repository's coding guidelines for using rstest effectively.
🏁 Script executed:
#!/bin/bash
# Examine the full test file to understand the assertion blocks
cat src/parser/tests/parser.rs
Length of output: 34342
@leynos Here's my comprehensive analysis and solution for the Large Assertion Blocks issue:
Identified Large Assertion Blocks
I found these functions with 4+ assertions:
- complex_program_round_trip: 8 assertions
- empty_program_has_no_items: 8 assertions
- input_relation_parsed: 7 assertions
- output_relation_parsed: 6 assertions
- internal_relation_parsed: 7 assertions
- multiline_relation_parsed: 6 assertions
- index_single_column_parsed: 4 assertions
- index_multi_column_parsed: 4 assertions
- index_nested_function_parsed: 4 assertions
Suggested Refactoring Strategy
1. Extract Common Assertion Helpers
```rust
// Parse validation helpers
fn assert_parse_success(parsed: &ParseResult, expected_text: &str) {
    assert!(parsed.errors().is_empty());
    let text = pretty_print(parsed.root().syntax());
    assert_eq!(text, expected_text);
}

fn assert_root_kind(parsed: &ParseResult) {
    assert_eq!(parsed.root().kind(), SyntaxKind::N_DATALOG_PROGRAM);
}

// Collection validation helpers
fn assert_relation_count(parsed: &ParseResult, expected: usize) {
    assert_eq!(parsed.root().relations().len(), expected);
}

fn assert_empty_collections(parsed: &ParseResult) {
    assert!(parsed.root().imports().is_empty());
    assert!(parsed.root().type_defs().is_empty());
    assert!(parsed.root().relations().is_empty());
    assert!(parsed.root().functions().is_empty());
    assert!(parsed.root().indexes().is_empty());
    assert!(parsed.root().rules().is_empty());
}

// Relation validation helpers
fn assert_relation_properties(
    relation: &Relation,
    name: &str,
    is_input: bool,
    is_output: bool,
) {
    assert_eq!(relation.name(), Some(name.into()));
    assert_eq!(relation.is_input(), is_input);
    assert_eq!(relation.is_output(), is_output);
}
```

2. Split Large Tests
For example, `complex_program_round_trip` becomes:

```rust
#[rstest]
fn complex_program_parse_success(complex_prog: &str) {
    let parsed = parse(complex_prog);
    assert_parse_success(&parsed, complex_prog);
}

#[rstest]
fn complex_program_has_two_relations(complex_prog: &str) {
    let parsed = parse(complex_prog);
    assert_relation_count(&parsed, 2);
}

#[rstest]
fn complex_program_relation_properties(complex_prog: &str) {
    let parsed = parse(complex_prog);
    let relations = parsed.root().relations();
    let [first, second] = relations.as_slice() else {
        panic!("expected two relations");
    };
    assert_relation_properties(first, "R", true, false);
    assert_relation_properties(second, "S", false, true);
}
```

AI Coding Agent Prompt
You are tasked with refactoring `src/parser/tests/parser.rs` to resolve a "Large Assertion Blocks" code quality issue. The file currently has 9 test functions with 4+ assertions each, which makes tests harder to read, understand, and debug.
**Primary Objective:**
Reduce all test functions to have a maximum of 3 assertions each by extracting common patterns into helper functions and splitting large tests into focused units.
**Specific Functions to Refactor:**
1. `complex_program_round_trip` (8 assertions)
2. `empty_program_has_no_items` (8 assertions)
3. `input_relation_parsed` (7 assertions)
4. `output_relation_parsed` (6 assertions)
5. `internal_relation_parsed` (7 assertions)
6. `multiline_relation_parsed` (6 assertions)
7. `index_single_column_parsed` (4 assertions)
8. `index_multi_column_parsed` (4 assertions)
9. `index_nested_function_parsed` (4 assertions)
**Required Helper Functions to Create:**
1. **Parse Validation Helpers:**
```rust
fn assert_parse_success(parsed: &ParseResult, expected_text: &str) {
assert!(parsed.errors().is_empty());
let text = pretty_print(parsed.root().syntax());
assert_eq!(text, expected_text);
}
fn assert_root_kind(parsed: &ParseResult) {
assert_eq!(parsed.root().kind(), SyntaxKind::N_DATALOG_PROGRAM);
}
fn assert_parse_has_errors(parsed: &ParseResult) {
assert!(!parsed.errors().is_empty());
}
```
2. **Collection Validation Helpers:**
```rust
fn assert_relation_count(parsed: &ParseResult, expected: usize) {
assert_eq!(parsed.root().relations().len(), expected);
}
fn assert_index_count(parsed: &ParseResult, expected: usize) {
assert_eq!(parsed.root().indexes().len(), expected);
}
fn assert_empty_collections(parsed: &ParseResult) {
assert!(parsed.root().imports().is_empty());
assert!(parsed.root().type_defs().is_empty());
assert!(parsed.root().relations().is_empty());
assert!(parsed.root().functions().is_empty());
assert!(parsed.root().indexes().is_empty());
assert!(parsed.root().rules().is_empty());
}
```
3. **Entity-Specific Helpers:**
```rust
fn assert_relation_properties(relation: &Relation, name: &str, is_input: bool, is_output: bool) {
assert_eq!(relation.name(), Some(name.into()));
assert_eq!(relation.is_input(), is_input);
assert_eq!(relation.is_output(), is_output);
}
fn assert_relation_columns(relation: &Relation, expected: Vec<(String, String)>) {
assert_eq!(relation.columns(), expected);
}
fn assert_relation_primary_key(relation: &Relation, expected: Option<Vec<String>>) {
assert_eq!(relation.primary_key(), expected);
}
fn assert_index_properties(index: &Index, name: &str, relation: &str, columns: Vec<String>) {
assert_eq!(index.name(), Some(name.into()));
assert_eq!(index.relation(), Some(relation.into()));
assert_eq!(index.columns(), columns);
}
```
**Refactoring Examples:**
1. **Split `complex_program_round_trip` into:**
- `complex_program_parse_success` - parse + round-trip validation
- `complex_program_has_two_relations` - count validation
- `complex_program_relation_properties` - individual relation properties
2. **Split `empty_program_has_no_items` into:**
- `empty_program_parse_success` - parse validation
- `empty_program_collections_empty` - collection emptiness
- `empty_program_structure_valid` - structure validation
3. **Split relation parsing tests into:**
- `{fixture}_parse_success` - parse + round-trip
- `{fixture}_properties` - name, input/output flags
- `{fixture}_structure` - columns, primary key
**Requirements:**
1. Maintain 100% test coverage - no validation logic should be lost
2. Each test function must have ≤3 assertions
3. Use descriptive test names that clearly indicate what's being tested
4. Place helper functions at the top of the file after existing utilities
5. Add doc comments to helper functions explaining their purpose
6. Follow repository Rust guidelines (function attributes after doc comments, etc.)
7. Ensure all tests continue to pass after refactoring
**Files to Modify:**
- `src/parser/tests/parser.rs` only
**Validation:**
Run `make test` to ensure all refactored tests pass and maintain existing coverage.
This refactoring will significantly improve test maintainability, make failures easier to debug, and align with the repository's testing guidelines.
```rust
fn import_statement_standard_case() {
    let src = "import standard_library";
    let parsed = parse(src);
    assert!(parsed.errors().is_empty());
    let imports = parsed.root().imports();
    assert_eq!(imports.len(), 1);
    let Some(imp) = imports.first() else {
        panic!("expected import");
    };
    assert_eq!(imp.path(), "standard_library");
    assert!(imp.alias().is_none());
}
```
@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix.
@@ -0,0 +1,1147 @@
//! Integration tests for the parser skeleton.
@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix.
```rust
fn assert_empty_collections(parsed: &crate::Parsed) {
    assert!(parsed.root().imports().is_empty());
    assert!(parsed.root().type_defs().is_empty());
    assert!(parsed.root().relations().is_empty());
    assert!(parsed.root().functions().is_empty());
    assert!(parsed.root().indexes().is_empty());
    assert!(parsed.root().rules().is_empty());
}
```
@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix.
* Refactor parser tests to use typed programs
* Refactor parser tests with domain enums and shared specs
Summary
- `SpanCollector` error handling
- `make fmt`
- `make lint`
- `make test`

https://chatgpt.com/codex/tasks/task_e_68a4f8042954832293acb004414be087
Summary by Sourcery
Use chumsky-based parsers for relation column lists and primary key clauses in span collection, streamline error handling, and add tests for related parsing error cases.
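The PR's actual combinators are written with chumsky; as a rough, stdlib-only sketch of the grammar they appear to accept (the error messages, function signatures, and `(name, type)` representation here are assumptions for illustration, not the crate's or the repository's API):

```rust
/// Parse a non-empty, parenthesised column list such as `(x: u32, y: string)`.
/// `()` and a missing `name: type` separator are rejected, matching the new
/// empty-column test case.
fn relation_columns(input: &str) -> Result<Vec<(String, String)>, String> {
    let inner = input
        .trim()
        .strip_prefix('(')
        .and_then(|s| s.strip_suffix(')'))
        .ok_or_else(|| "expected parenthesised column list".to_string())?;
    if inner.trim().is_empty() {
        return Err("relation requires at least one column".to_string());
    }
    inner
        .split(',')
        .map(|col| {
            let (name, ty) = col
                .split_once(':')
                .ok_or_else(|| "expected `name: type` column".to_string())?;
            Ok((name.trim().to_string(), ty.trim().to_string()))
        })
        .collect()
}

/// Parse a `primary key (x, y)` clause into its key column names.
fn primary_key_clause(input: &str) -> Result<Vec<String>, String> {
    let rest = input
        .trim()
        .strip_prefix("primary key")
        .ok_or_else(|| "expected `primary key` keyword".to_string())?;
    let inner = rest
        .trim()
        .strip_prefix('(')
        .and_then(|s| s.strip_suffix(')'))
        .ok_or_else(|| "expected parenthesised key columns".to_string())?;
    if inner.trim().is_empty() {
        return Err("primary key requires at least one column".to_string());
    }
    Ok(inner.split(',').map(|c| c.trim().to_string()).collect())
}

fn main() {
    assert_eq!(
        relation_columns("(x: u32, y: string)"),
        Ok(vec![
            ("x".to_string(), "u32".to_string()),
            ("y".to_string(), "string".to_string()),
        ])
    );
    // The new tests exercise exactly these failure modes:
    assert!(relation_columns("()").is_err()); // empty column list
    assert!(primary_key_clause("primary key ()").is_err()); // empty key
    assert_eq!(
        primary_key_clause("primary key (x, y)"),
        Ok(vec!["x".to_string(), "y".to_string()])
    );
    println!("ok");
}
```

Returning `Result` from both functions mirrors the PR's structured error handling: `collect_relation_spans` can record the error and skip the line instead of panicking mid-scan.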
New Features:
- `relation_columns` parser for non-empty column lists in relation definitions
- `primary_key_clause` parser for parsing `primary key` clauses

Enhancements:
- `collect_relation_spans` reworked to use the new parsers and centralized error handling
- `paren_block_span` export removed from parse_utils

Tests: