Conversation
Reviewer's GuideThis PR refactors the concrete syntax tree (CST) construction by extracting parsing types and tree-building logic into a dedicated cst_builder module, introduces a builder-based API for managing statement spans, centralizes span iteration with reusable abstractions, simplifies the top-level parse function to delegate to the new module, and adds targeted unit tests for span validation and round-trip tree construction. Sequence diagram for CST construction using the new cst_builder modulesequenceDiagram
participant Parser
participant cst_builder as CSTBuilder
participant ParsedSpansBuilder
participant ParsedSpans
participant build_green_tree
Parser->>ParsedSpansBuilder: builder().imports(...).typedefs(...).relations(...).indexes(...).functions(...).transformers(...).rules(...)
ParsedSpansBuilder->>ParsedSpans: build()
Parser->>build_green_tree: build_green_tree(tokens, src, &spans)
build_green_tree-->>Parser: GreenNode
Parser->>CSTBuilder: Parsed::new(green, root, errors)
CSTBuilder-->>Parser: Parsed
Class diagram for refactored CST construction (Parsed, ParsedSpans, ParsedSpansBuilder)classDiagram
class Parsed {
+green: GreenNode
+root: ast::Root
+errors: Vec<Simple<SyntaxKind>>
+green() GreenNode
+root() ast::Root
+errors() Vec<Simple<SyntaxKind>>
+new(green, root, errors)
}
class ParsedSpans {
+imports: Vec<Span>
+typedefs: Vec<Span>
+relations: Vec<Span>
+indexes: Vec<Span>
+functions: Vec<Span>
+transformers: Vec<Span>
+rules: Vec<Span>
+imports() &[Span]
+typedefs() &[Span]
+relations() &[Span]
+indexes() &[Span]
+functions() &[Span]
+transformers() &[Span]
+rules() &[Span]
+builder() ParsedSpansBuilder
}
class ParsedSpansBuilder {
+imports(Vec<Span>) ParsedSpansBuilder
+typedefs(Vec<Span>) ParsedSpansBuilder
+relations(Vec<Span>) ParsedSpansBuilder
+indexes(Vec<Span>) ParsedSpansBuilder
+functions(Vec<Span>) ParsedSpansBuilder
+transformers(Vec<Span>) ParsedSpansBuilder
+rules(Vec<Span>) ParsedSpansBuilder
+build() ParsedSpans
}
ParsedSpansBuilder --|> ParsedSpans : builds
Parsed --* ParsedSpans : uses
Class diagram for CST builder module structureclassDiagram
class cst_builder {
+Parsed
+ParsedSpans
+build_green_tree()
}
class spans {
+ParsedSpans
+ParsedSpansBuilder
}
class tree {
+build_green_tree()
}
cst_builder o-- spans
cst_builder o-- tree
cst_builder ..> ParsedSpans : re-exports
cst_builder ..> build_green_tree : re-exports
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Summary by CodeRabbit
WalkthroughRefactor the parser module by extracting all concrete syntax tree (CST) construction and parsed span management logic into a new Changes
Sequence Diagram(s)sequenceDiagram
participant Parser
participant SpanScanner
participant CSTBuilder
Parser->>SpanScanner: parse_tokens(tokens)
SpanScanner-->>Parser: ParsedSpans (via builder)
Parser->>CSTBuilder: build_green_tree(tokens, src, ParsedSpans)
CSTBuilder-->>Parser: GreenNode
Parser->>CSTBuilder: Parsed::new(GreenNode, AST root, errors)
CSTBuilder-->>Parser: Parsed
Possibly related PRs
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (1)**/*.rsInstructions used from: Sources: 🔇 Additional comments (6)
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Code Health Improved
(1 files improve in Code Health)
Gates Failed
Enforce advisory code health rules
(1 file with Code Duplication, Excess Number of Function Arguments)
Gates Passed
5 Quality Gates Passed
See analysis details in CodeScene
Reason for failure
| Enforce advisory code health rules | Violations | Code Health Impact | |
|---|---|---|---|
| cst_builder.rs | 3 advisory rules | 10.00 → 9.10 | Suppress |
View Improvements
| File | Code Health Impact | Categories Improved |
|---|---|---|
| mod.rs | 5.59 → 7.30 | Low Cohesion, Code Duplication, Excess Number of Function Arguments, Code Duplication |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes - here's some feedback:
- Consider replacing the cfg!(debug_assertions) + ensure_span_lists_sorted/assert pattern with a single debug_assert! in ParsedSpans::new to simplify the debug‐only span validation.
- The start_nodes and finish_nodes helpers share very similar iteration logic—consider unifying them or extracting a small iterator struct to reduce duplication in build_green_tree.
- The new cst_builder module is quite large; splitting span validation, token pushing, and CST construction into smaller submodules or files could improve readability and maintainability.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider replacing the cfg!(debug_assertions) + ensure_span_lists_sorted/assert pattern with a single debug_assert! in ParsedSpans::new to simplify the debug‐only span validation.
- The start_nodes and finish_nodes helpers share very similar iteration logic—consider unifying them or extracting a small iterator struct to reduce duplication in build_green_tree.
- The new cst_builder module is quite large; splitting span validation, token pushing, and CST construction into smaller submodules or files could improve readability and maintainability.
## Individual Comments
### Comment 1
<location> `src/parser/cst_builder.rs:314` </location>
<code_context>
+ assert!(errors.is_empty(), "{}", errors.join("\n"));
+}
+
+fn push_token(builder: &mut GreenNodeBuilder, kind: SyntaxKind, span: &Span, src: &str) {
+ let text = src.get(span.clone()).map_or_else(
+ || {
</code_context>
<issue_to_address>
The `push_token` function clones the span for `str::get`, which may be unnecessary if `Span` is already `Copy`.
If `Span` implements `Copy`, pass it by value instead of cloning to avoid unnecessary allocation. If not, consider whether `Span` should be `Copy`.
Suggested implementation:
```rust
fn push_token(builder: &mut GreenNodeBuilder, kind: SyntaxKind, span: Span, src: &str) {
let text = src.get(span).map_or_else(
|| {
warn!(
"token span {:?} out of bounds for source of length {}",
span,
src.len()
);
""
},
|t| t,
);
```
- Ensure that `Span` implements `Copy`. If it does not, derive or implement `Copy` and `Clone` for `Span` if appropriate.
- Update all call sites of `push_token` in your codebase to pass `span` by value instead of by reference.
</issue_to_address>
### Comment 2
<location> `src/parser/cst_builder.rs:163` </location>
<code_context>
+/// Span lists must be sorted and non-overlapping so that tokens are wrapped
+/// into well-formed nodes. Validation occurs in debug builds when
+/// [`ParsedSpans`] is created.
+pub(super) fn build_green_tree(
+ tokens: &[(SyntaxKind, Span)],
+ src: &str,
</code_context>
<issue_to_address>
Consider refactoring the span iterator logic into a single SpanCursor type and two compact loops to eliminate duplicated code and simplify build_green_tree.
Here’s one way to collapse all seven iterators into a single, homogeneous “cursor” type and turn your three almost-identical loops into just two small `for` loops. Once you have that, you can delete `advance_span_iter`, `maybe_start`, `maybe_finish`, `start_nodes`, and `finish_nodes` entirely.
```rust
use std::iter::Peekable;
use std::slice::Iter;
// 1) A single SpanCursor that knows its SyntaxKind
struct SpanCursor<'a> {
iter: Peekable<Iter<'a, Span>>,
kind: SyntaxKind,
}
impl<'a> SpanCursor<'a> {
fn new(spans: &'a [Span], kind: SyntaxKind) -> Self {
Self {
iter: spans.iter().peekable(),
kind,
}
}
// advance past any spans that end <= pos
fn advance_to(&mut self, pos: usize) {
while matches!(self.iter.peek(), Some(s) if pos >= s.end) {
self.iter.next();
}
}
// if the next span starts exactly at pos, open a node
fn start_if(&mut self, builder: &mut GreenNodeBuilder, pos: usize) {
if matches!(self.iter.peek(), Some(s) if pos == s.start) {
builder.start_node(DdlogLanguage::kind_to_raw(self.kind));
}
}
// if the next span ends at or before pos, close the node
fn finish_if(&mut self, builder: &mut GreenNodeBuilder, pos: usize) {
if matches!(self.iter.peek(), Some(s) if pos >= s.end) {
builder.finish_node();
self.iter.next();
}
}
}
```
Then your `build_green_tree` collapses down to:
```rust
pub(super) fn build_green_tree(
tokens: &[(SyntaxKind, Span)],
src: &str,
spans: &ParsedSpans,
) -> GreenNode {
let mut builder = GreenNodeBuilder::new();
builder.start_node(DdlogLanguage::kind_to_raw(SyntaxKind::N_DATALOG_PROGRAM));
// 2) build an array of cursors instead of seven separate iters
let mut cursors = [
SpanCursor::new(spans.imports(), SyntaxKind::N_IMPORT_STMT),
SpanCursor::new(spans.typedefs(), SyntaxKind::N_TYPE_DEF),
SpanCursor::new(spans.relations(), SyntaxKind::N_RELATION_DECL),
SpanCursor::new(spans.indexes(), SyntaxKind::N_INDEX),
SpanCursor::new(spans.functions(), SyntaxKind::N_FUNCTION),
SpanCursor::new(spans.transformers(), SyntaxKind::N_TRANSFORMER),
SpanCursor::new(spans.rules(), SyntaxKind::N_RULE),
];
for &(kind, ref span) in tokens {
// advance & possibly start all spanning nodes
for cur in &mut cursors {
cur.advance_to(span.start);
cur.start_if(&mut builder, span.start);
}
push_token(&mut builder, kind, span, src);
// possibly finish all spanning nodes
for cur in &mut cursors {
cur.finish_if(&mut builder, span.end);
}
}
builder.finish_node();
builder.finish()
}
```
This:
- Eliminates all the near-duplication in advance/start/finish helpers
- Reduces seven mutable locals down to one compact array
- Keeps exactly the same order & semantics but with a single inner loop per phase
- Deletes the now-unused `start_nodes`/`finish_nodes`/`advance_span_iter`/etc.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| assert!(errors.is_empty(), "{}", errors.join("\n")); | ||
| } | ||
|
|
||
| fn push_token(builder: &mut GreenNodeBuilder, kind: SyntaxKind, span: &Span, src: &str) { |
There was a problem hiding this comment.
suggestion: The push_token function clones the span for str::get, which may be unnecessary if Span is already Copy.
If Span implements Copy, pass it by value instead of cloning to avoid unnecessary allocation. If not, consider whether Span should be Copy.
Suggested implementation:
fn push_token(builder: &mut GreenNodeBuilder, kind: SyntaxKind, span: Span, src: &str) {
let text = src.get(span).map_or_else(
|| {
warn!(
"token span {:?} out of bounds for source of length {}",
span,
src.len()
);
""
},
|t| t,
);- Ensure that
SpanimplementsCopy. If it does not, derive or implementCopyandCloneforSpanif appropriate. - Update all call sites of
push_tokenin your codebase to passspanby value instead of by reference.
| /// Span lists must be sorted and non-overlapping so that tokens are wrapped | ||
| /// into well-formed nodes. Validation occurs in debug builds when | ||
| /// [`ParsedSpans`] is created. | ||
| pub(super) fn build_green_tree( |
There was a problem hiding this comment.
issue (complexity): Consider refactoring the span iterator logic into a single SpanCursor type and two compact loops to eliminate duplicated code and simplify build_green_tree.
Here’s one way to collapse all seven iterators into a single, homogeneous “cursor” type and turn your three almost-identical loops into just two small for loops. Once you have that, you can delete advance_span_iter, maybe_start, maybe_finish, start_nodes, and finish_nodes entirely.
use std::iter::Peekable;
use std::slice::Iter;
// 1) A single SpanCursor that knows its SyntaxKind
struct SpanCursor<'a> {
iter: Peekable<Iter<'a, Span>>,
kind: SyntaxKind,
}
impl<'a> SpanCursor<'a> {
fn new(spans: &'a [Span], kind: SyntaxKind) -> Self {
Self {
iter: spans.iter().peekable(),
kind,
}
}
// advance past any spans that end <= pos
fn advance_to(&mut self, pos: usize) {
while matches!(self.iter.peek(), Some(s) if pos >= s.end) {
self.iter.next();
}
}
// if the next span starts exactly at pos, open a node
fn start_if(&mut self, builder: &mut GreenNodeBuilder, pos: usize) {
if matches!(self.iter.peek(), Some(s) if pos == s.start) {
builder.start_node(DdlogLanguage::kind_to_raw(self.kind));
}
}
// if the next span ends at or before pos, close the node
fn finish_if(&mut self, builder: &mut GreenNodeBuilder, pos: usize) {
if matches!(self.iter.peek(), Some(s) if pos >= s.end) {
builder.finish_node();
self.iter.next();
}
}
}Then your build_green_tree collapses down to:
pub(super) fn build_green_tree(
tokens: &[(SyntaxKind, Span)],
src: &str,
spans: &ParsedSpans,
) -> GreenNode {
let mut builder = GreenNodeBuilder::new();
builder.start_node(DdlogLanguage::kind_to_raw(SyntaxKind::N_DATALOG_PROGRAM));
// 2) build an array of cursors instead of seven separate iters
let mut cursors = [
SpanCursor::new(spans.imports(), SyntaxKind::N_IMPORT_STMT),
SpanCursor::new(spans.typedefs(), SyntaxKind::N_TYPE_DEF),
SpanCursor::new(spans.relations(), SyntaxKind::N_RELATION_DECL),
SpanCursor::new(spans.indexes(), SyntaxKind::N_INDEX),
SpanCursor::new(spans.functions(), SyntaxKind::N_FUNCTION),
SpanCursor::new(spans.transformers(), SyntaxKind::N_TRANSFORMER),
SpanCursor::new(spans.rules(), SyntaxKind::N_RULE),
];
for &(kind, ref span) in tokens {
// advance & possibly start all spanning nodes
for cur in &mut cursors {
cur.advance_to(span.start);
cur.start_if(&mut builder, span.start);
}
push_token(&mut builder, kind, span, src);
// possibly finish all spanning nodes
for cur in &mut cursors {
cur.finish_if(&mut builder, span.end);
}
}
builder.finish_node();
builder.finish()
}This:
- Eliminates all the near-duplication in advance/start/finish helpers
- Reduces seven mutable locals down to one compact array
- Keeps exactly the same order & semantics but with a single inner loop per phase
- Deletes the now-unused
start_nodes/finish_nodes/advance_span_iter/etc.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/parser/cst_builder.rs(1 hunks)src/parser/mod.rs(4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
🧬 Code Graph Analysis (2)
src/parser/mod.rs (4)
src/tokenizer.rs (1)
tokenize(201-257)src/parser/cst_builder.rs (6)
build_green_tree(163-221)new(22-32)new(83-113)green(36-38)root(42-44)errors(48-50)src/parser/token_stream.rs (1)
new(41-47)src/parser/span_collector.rs (1)
new(44-50)
src/parser/cst_builder.rs (4)
src/parser/mod.rs (9)
imports(148-154)relations(186-192)indexes(205-211)functions(224-230)transformers(243-249)rules(262-268)kind(121-123)text(133-135)from_green(113-117)src/language.rs (1)
kind_to_raw(185-188)src/tokenizer.rs (1)
tokenize(201-257)src/parser/span_scanner.rs (1)
parse_tokens(19-50)
🔇 Additional comments (11)
src/parser/mod.rs (4)
9-9: Clean import list after refactoring.The simplified import list appropriately reflects the module's reduced responsibilities after extracting CST construction logic.
20-22: Well-structured module organisation.The CST construction logic is properly extracted into a dedicated submodule, with clear separation of concerns. The public re-exports maintain the existing API surface.
47-47: Appropriate use of constructor pattern.Using
Parsed::newconstructor improves encapsulation compared to direct struct literal construction.
1022-1022: Test imports correctly updated.The test imports properly reflect the module's dependencies after the refactoring.
src/parser/cst_builder.rs (7)
1-5: Module documentation meets requirements.The module-level documentation clearly explains the module's purpose and utility as required.
13-51: Well-designedParsedstruct with proper encapsulation.The struct provides a clean API with appropriate constructor and accessor methods. Good use of
#[must_use]attributes.
223-271: Well-structured helper functions for span management.The helper functions provide clear separation of concerns and handle edge cases appropriately.
273-312: Robust span validation with comprehensive error reporting.The validation logic properly checks for overlaps and sorting issues, collecting all errors before reporting. Good implementation of the custom error type.
314-335: Robust token insertion with proper error handling.Good defensive programming with bounds checking and appropriate warning for invalid spans.
337-442: Comprehensive test coverage with good edge case handling.The tests thoroughly cover validation logic, error reporting, and round-trip CST construction. Good use of
rstestfor the round-trip test.
76-113: Remove thread-safety concernPanic in
ensure_span_lists_sortedoccurs only on the calling thread. All inspected calls toParsedSpans::new(insrc/parser/span_scanner.rsandsrc/parser/cst_builder.rs) execute in a single-threaded parsing context.
| pub(super) fn build_green_tree( | ||
| tokens: &[(SyntaxKind, Span)], | ||
| src: &str, | ||
| spans: &ParsedSpans, | ||
| ) -> GreenNode { | ||
| let mut builder = GreenNodeBuilder::new(); | ||
| builder.start_node(DdlogLanguage::kind_to_raw(SyntaxKind::N_DATALOG_PROGRAM)); | ||
|
|
||
| let mut import_iter = spans.imports().iter().peekable(); | ||
| let mut typedef_iter = spans.typedefs().iter().peekable(); | ||
| let mut relation_iter = spans.relations().iter().peekable(); | ||
| let mut index_iter = spans.indexes().iter().peekable(); | ||
| let mut function_iter = spans.functions().iter().peekable(); | ||
| let mut transformer_iter = spans.transformers().iter().peekable(); | ||
| let mut rule_iter = spans.rules().iter().peekable(); | ||
|
|
||
| for &(kind, ref span) in tokens { | ||
| advance_span_iter(&mut import_iter, span.start); | ||
| advance_span_iter(&mut typedef_iter, span.start); | ||
| advance_span_iter(&mut relation_iter, span.start); | ||
| advance_span_iter(&mut index_iter, span.start); | ||
| advance_span_iter(&mut function_iter, span.start); | ||
| advance_span_iter(&mut transformer_iter, span.start); | ||
| advance_span_iter(&mut rule_iter, span.start); | ||
|
|
||
| start_nodes( | ||
| &mut builder, | ||
| &mut [ | ||
| (&mut import_iter, SyntaxKind::N_IMPORT_STMT), | ||
| (&mut typedef_iter, SyntaxKind::N_TYPE_DEF), | ||
| (&mut relation_iter, SyntaxKind::N_RELATION_DECL), | ||
| (&mut index_iter, SyntaxKind::N_INDEX), | ||
| (&mut function_iter, SyntaxKind::N_FUNCTION), | ||
| (&mut transformer_iter, SyntaxKind::N_TRANSFORMER), | ||
| (&mut rule_iter, SyntaxKind::N_RULE), | ||
| ], | ||
| span.start, | ||
| ); | ||
|
|
||
| push_token(&mut builder, kind, span, src); | ||
|
|
||
| finish_nodes( | ||
| &mut builder, | ||
| &mut [ | ||
| &mut import_iter, | ||
| &mut typedef_iter, | ||
| &mut relation_iter, | ||
| &mut index_iter, | ||
| &mut function_iter, | ||
| &mut transformer_iter, | ||
| &mut rule_iter, | ||
| ], | ||
| span.end, | ||
| ); | ||
| } | ||
|
|
||
| builder.finish_node(); | ||
| builder.finish() | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Extract span iterator management to reduce complexity.
The function manages 7 different span iterators with repeated patterns. Extract the iterator creation and advancement into helper methods to improve maintainability.
Consider creating a struct to encapsulate all span iterators:
+struct SpanIterators<'a> {
+ import: SpanIter<'a>,
+ typedef: SpanIter<'a>,
+ relation: SpanIter<'a>,
+ index: SpanIter<'a>,
+ function: SpanIter<'a>,
+ transformer: SpanIter<'a>,
+ rule: SpanIter<'a>,
+}
+
+impl<'a> SpanIterators<'a> {
+ fn new(spans: &'a ParsedSpans) -> Self {
+ Self {
+ import: spans.imports().iter().peekable(),
+ typedef: spans.typedefs().iter().peekable(),
+ relation: spans.relations().iter().peekable(),
+ index: spans.indexes().iter().peekable(),
+ function: spans.functions().iter().peekable(),
+ transformer: spans.transformers().iter().peekable(),
+ rule: spans.rules().iter().peekable(),
+ }
+ }
+
+ fn advance_all(&mut self, pos: usize) {
+ advance_span_iter(&mut self.import, pos);
+ advance_span_iter(&mut self.typedef, pos);
+ advance_span_iter(&mut self.relation, pos);
+ advance_span_iter(&mut self.index, pos);
+ advance_span_iter(&mut self.function, pos);
+ advance_span_iter(&mut self.transformer, pos);
+ advance_span_iter(&mut self.rule, pos);
+ }
+}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/parser/cst_builder.rs between lines 163 and 221, the build_green_tree
function manages seven different span iterators with repetitive creation and
advancement code. To reduce complexity and improve maintainability, create a
helper struct that encapsulates all these span iterators and provides methods to
advance them collectively. Refactor the function to initialize this struct once
and use its methods to advance and manage the iterators instead of handling each
iterator separately.
There was a problem hiding this comment.
Code Health Improved
(1 files improve in Code Health)
Gates Failed
Enforce advisory code health rules
(1 file with Excess Number of Function Arguments)
Gates Passed
5 Quality Gates Passed
See analysis details in CodeScene
Reason for failure
| Enforce advisory code health rules | Violations | Code Health Impact | |
|---|---|---|---|
| cst_builder.rs | 1 advisory rule | 10.00 → 9.69 | Suppress |
View Improvements
| File | Code Health Impact | Categories Improved |
|---|---|---|
| mod.rs | 5.59 → 7.30 | Low Cohesion, Code Duplication, Excess Number of Function Arguments, Code Duplication |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
| fn new( | ||
| imports: Vec<Span>, | ||
| typedefs: Vec<Span>, | ||
| relations: Vec<Span>, | ||
| indexes: Vec<Span>, | ||
| functions: Vec<Span>, | ||
| transformers: Vec<Span>, | ||
| rules: Vec<Span>, | ||
| ) -> Self { | ||
| if cfg!(debug_assertions) { | ||
| ensure_span_lists_sorted(&[ | ||
| ("imports", &imports), | ||
| ("typedefs", &typedefs), | ||
| ("relations", &relations), | ||
| ("indexes", &indexes), | ||
| ("functions", &functions), | ||
| ("transformers", &transformers), | ||
| ("rules", &rules), | ||
| ]); | ||
| } | ||
|
|
||
| Self { | ||
| imports, | ||
| typedefs, | ||
| relations, | ||
| indexes, | ||
| functions, | ||
| transformers, | ||
| rules, | ||
| } | ||
| } |
There was a problem hiding this comment.
@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix.
There was a problem hiding this comment.
@leynos, the issue is valid - 7 arguments is excessive for any function. Here's the fix:
Suggested Fix:
Refactor ParsedSpans::new() to accept the ParsedSpansBuilder instead of individual parameters, reducing arguments from 7 to 1.
AI Coding Agent Prompt:
Refactor the ParsedSpans::new() function in src/parser/cst_builder.rs to reduce the number of parameters from 7 to 1.
Current issue: ParsedSpans::new() takes 7 individual Vec<Span> parameters which exceeds the 4-parameter threshold.
Required changes:
1. Modify ParsedSpans::new() to accept a single ParsedSpansBuilder parameter instead of 7 individual Vec<Span> parameters
2. Update the function signature from:
fn new(imports: Vec<Span>, typedefs: Vec<Span>, relations: Vec<Span>, indexes: Vec<Span>, functions: Vec<Span>, transformers: Vec<Span>, rules: Vec<Span>) -> Self
to:
fn new(builder: ParsedSpansBuilder) -> Self
3. Update the function body to extract the individual vectors from the builder parameter
4. Update the call site in ParsedSpansBuilder::build() to pass self instead of individual fields
5. Ensure the span validation logic remains unchanged
6. Maintain all existing functionality and public API
The builder pattern is already the public interface, so this change only affects internal implementation without breaking existing code.
This fix reduces complexity while maintaining the existing public API and all validation functionality.
| fn new( | ||
| imports: Vec<Span>, | ||
| typedefs: Vec<Span>, | ||
| relations: Vec<Span>, | ||
| indexes: Vec<Span>, | ||
| functions: Vec<Span>, | ||
| transformers: Vec<Span>, | ||
| rules: Vec<Span>, | ||
| ) -> Self { | ||
| if cfg!(debug_assertions) { | ||
| ensure_span_lists_sorted(&[ | ||
| ("imports", &imports), | ||
| ("typedefs", &typedefs), | ||
| ("relations", &relations), | ||
| ("indexes", &indexes), | ||
| ("functions", &functions), | ||
| ("transformers", &transformers), | ||
| ("rules", &rules), | ||
| ]); | ||
| } | ||
|
|
||
| Self { | ||
| imports, | ||
| typedefs, | ||
| relations, | ||
| indexes, | ||
| functions, | ||
| transformers, | ||
| rules, | ||
| } | ||
| } |
There was a problem hiding this comment.
@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix.
There was a problem hiding this comment.
Code Health Improved
(1 files improve in Code Health)
Gates Failed
Enforce advisory code health rules
(1 file with Excess Number of Function Arguments)
Gates Passed
5 Quality Gates Passed
See analysis details in CodeScene
Reason for failure
| Enforce advisory code health rules | Violations | Code Health Impact | |
|---|---|---|---|
| spans.rs | 1 advisory rule | 10.00 → 9.69 | Suppress |
View Improvements
| File | Code Health Impact | Categories Improved |
|---|---|---|
| mod.rs | 5.59 → 7.30 | Low Cohesion, Code Duplication, Excess Number of Function Arguments, Code Duplication |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
There was a problem hiding this comment.
Code Health Improved
(1 files improve in Code Health)
Gates Passed
6 Quality Gates Passed
See analysis details in CodeScene
View Improvements
| File | Code Health Impact | Categories Improved |
|---|---|---|
| mod.rs | 5.59 → 7.30 | Low Cohesion, Code Duplication, Excess Number of Function Arguments, Code Duplication |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/parser/cst_builder/spans.rs (1)
104-135: Eliminate function with excessive arguments.The
ParsedSpans::newfunction accepts 7 parameters via the builder destructuring, exceeding the threshold of 4. Move the validation logic directly intoParsedSpansBuilder::buildand remove this function entirely.impl ParsedSpansBuilder { /// Build the [`ParsedSpans`]. #[must_use] pub fn build(self) -> ParsedSpans { - ParsedSpans::new(self) + let result = validate_span_lists_sorted(&[ + ("imports", &self.imports), + ("typedefs", &self.typedefs), + ("relations", &self.relations), + ("indexes", &self.indexes), + ("functions", &self.functions), + ("transformers", &self.transformers), + ("rules", &self.rules), + ]); + debug_assert!(result.is_ok(), "{}", result.err().unwrap_or_default()); + + ParsedSpans { + imports: self.imports, + typedefs: self.typedefs, + relations: self.relations, + indexes: self.indexes, + functions: self.functions, + transformers: self.transformers, + rules: self.rules, + } } } -impl ParsedSpans { - pub(super) fn new(builder: ParsedSpansBuilder) -> Self { - // Remove this entire function - }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/parser/cst_builder/spans.rs(1 hunks)src/parser/cst_builder/tree.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
🔇 Additional comments (9)
src/parser/cst_builder/tree.rs (3)
1-8: Module documentation and imports look good.The module doc comment clearly explains the purpose, and imports are well-organised.
10-41: SpanCursor implementation is well-structured.The struct encapsulates span iteration logic cleanly with appropriate methods for advancing and managing syntax node boundaries.
122-138: Test coverage is adequate.The round-trip test effectively validates the core functionality.
src/parser/cst_builder/spans.rs (6)
1-3: Module documentation is clear and concise.The purpose is well-explained in the opening comment.
5-26: Struct design follows best practices.The use of
#[non_exhaustive]and private fields with builder pattern is appropriate for API evolution.
28-95: Builder pattern implementation is well-executed.All methods are properly marked with
#[must_use]and follow the fluent interface pattern correctly.
137-178: Accessor methods are properly implemented.All getters return slices correctly and are marked with
#[must_use].
180-223: Validation logic is comprehensive and well-structured.The error types implement appropriate traits, and the validation functions handle edge cases correctly.
225-295: Test coverage is thorough and well-designed.Tests cover all validation scenarios including edge cases, error conditions, and panic behaviour. The
assert_panic_with_messagehelper is particularly well-implemented.
There was a problem hiding this comment.
Code Health Improved
(1 files improve in Code Health)
Gates Passed
6 Quality Gates Passed
See analysis details in CodeScene
View Improvements
| File | Code Health Impact | Categories Improved |
|---|---|---|
| mod.rs | 5.59 → 7.30 | Low Cohesion, Code Duplication, Excess Number of Function Arguments, Code Duplication |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
src/parser/cst_builder/spans.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
Instructions used from:
Sources:
⚙️ CodeRabbit Configuration File
🔇 Additional comments (5)
src/parser/cst_builder/spans.rs (5)
5-26: Well-designed struct with appropriate attributes.The struct design is excellent with proper use of
#[non_exhaustive]for API stability and clear documentation directing users to the builder pattern.
40-124: Excellent builder pattern implementation.The builder follows consistent patterns with proper use of
#[must_use]attributes and clean destructuring in the build method. The debug-only validation strikes the right balance between correctness and performance.
126-176: Clean accessor implementation with appropriate constructor removal.The accessor methods are well-designed with consistent patterns and proper documentation. The comment noting constructor removal aligns with the previous review feedback about eliminating excessive parameters.
178-221: Robust validation logic with clear error reporting.The validation functions correctly detect overlapping and unsorted spans. The use of
windows(2)with array destructuring is elegant, and the error collection provides comprehensive feedback for debugging.
223-293: Comprehensive test coverage with excellent edge case handling.The test suite thoroughly covers all validation scenarios including edge cases, panic behaviour, and multiple error reporting. The
assert_panic_with_messagehelper is well-crafted for testing debug assertions.
There was a problem hiding this comment.
Code Health Improved
(1 files improve in Code Health)
Gates Passed
6 Quality Gates Passed
See analysis details in CodeScene
View Improvements
| File | Code Health Impact | Categories Improved |
|---|---|---|
| mod.rs | 5.59 → 7.30 | Low Cohesion, Code Duplication, Excess Number of Function Arguments, Code Duplication |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
There was a problem hiding this comment.
Code Health Improved
(1 files improve in Code Health)
Gates Passed
6 Quality Gates Passed
See analysis details in CodeScene
View Improvements
| File | Code Health Impact | Categories Improved |
|---|---|---|
| mod.rs | 5.59 → 7.30 | Low Cohesion, Code Duplication, Excess Number of Function Arguments, Code Duplication |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Code Health Improved
(1 files improve in Code Health)
Gates Passed
6 Quality Gates Passed
See analysis details in CodeScene
View Improvements
| File | Code Health Impact | Categories Improved |
|---|---|---|
| mod.rs | 5.59 → 7.30 | Low Cohesion, Code Duplication, Excess Number of Function Arguments, Code Duplication |
Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `src/parser/cst_builder/spans.rs:98` </location>
<code_context>
+
+ /// Build the [`ParsedSpans`].
+ #[must_use]
+ pub fn build(self) -> ParsedSpans {
+ let Self {
+ imports,
</code_context>
<issue_to_address>
Debug assertion for span validation may be insufficient for release builds.
Since validation is only performed in debug builds, invalid spans could go undetected in release builds, potentially causing subtle CST bugs. Consider enforcing these checks in all builds if correctness is critical.
</issue_to_address>
### Comment 2
<location> `src/parser/cst_builder/tree.rs:38` </location>
<code_context>
+ }
+ }
+
+ fn finish_if(&mut self, builder: &mut GreenNodeBuilder, pos: usize) {
+ if matches!(self.iter.peek(), Some(s) if pos >= s.end) {
+ builder.finish_node();
</code_context>
<issue_to_address>
finish_if may call finish_node multiple times for the same position.
Verify that handling multiple spans ending at the same position does not cause panics or corrupt the tree structure.
</issue_to_address>
### Comment 3
<location> `src/parser/cst_builder/spans.rs:36` </location>
<code_context>
+
+/// Builder for [`ParsedSpans`].
+#[derive(Default)]
+pub struct ParsedSpansBuilder {
+ imports: Vec<Span>,
+ typedefs: Vec<Span>,
</code_context>
<issue_to_address>
Consider replacing the builder and its repetitive setters with a single try_new constructor that handles sorting, validation, and construction in one place.
You can collapse those seven nearly-identical setters + a builder into one `try_new` constructor that inlines sorting & validation—and delete the whole boilerplate builder. For example:
```rust
impl ParsedSpans {
pub fn try_new(
mut imports: Vec<Span>,
mut typedefs: Vec<Span>,
mut relations: Vec<Span>,
mut indexes: Vec<Span>,
mut functions: Vec<Span>,
mut transformers: Vec<Span>,
mut rules: Vec<Span>,
) -> Result<Self, String> {
let mut errors = Vec::new();
for (name, spans) in &mut [
("imports", &mut imports),
("typedefs", &mut typedefs),
("relations", &mut relations),
("indexes", &mut indexes),
("functions", &mut functions),
("transformers", &mut transformers),
("rules", &mut rules),
] {
spans.sort_unstable_by_key(|s| s.start);
if let Err(e) = validate_spans_sorted(spans) {
errors.push(format!("{name} not sorted: {e}"));
}
}
if !errors.is_empty() {
return Err(errors.join("\n"));
}
Ok(ParsedSpans { imports, typedefs, relations, indexes, functions, transformers, rules })
}
}
```
Then:
- Remove `ParsedSpansBuilder` and all its setters.
- Change call sites from
`ParsedSpans::builder().imports(...).typedefs(...).build()`
to
`ParsedSpans::try_new(imports, typedefs, relations, indexes, functions, transformers, rules)?`.
This keeps the same debug‐build assertions, drops ~200 lines of boilerplate, and still preserves all functionality.
</issue_to_address>
### Comment 4
<location> `src/parser/cst_builder/tree.rs:13` </location>
<code_context>
+/// Number of cursor categories managed during CST construction.
+const SPAN_CURSOR_COUNT: usize = 7;
+
+struct SpanCursor<'a> {
+ iter: std::iter::Peekable<std::slice::Iter<'a, Span>>,
+ kind: SyntaxKind,
</code_context>
<issue_to_address>
Consider replacing the multiple SpanCursor iterators with a single sorted list of start/end events to streamline span handling.
Here’s an approach that keeps exactly the same behavior but collapses your seven‐iterator machinery into one sorted list of “start”/“end” events. This cuts out the fixed‐size array, the two nested loops per token, and the per‐kind boilerplate:
```rust
// 1) Define an event enum
enum SpanEvent {
Start { pos: usize, kind: SyntaxKind },
End { pos: usize, kind: SyntaxKind },
}
// 2) Build and sort a flat events list once at the top of build_green_tree
pub(crate) fn build_green_tree(
tokens: &[(SyntaxKind, Span)],
src: &str,
spans: &ParsedSpans,
) -> GreenNode {
// collect all span‐start and span‐end events
let mut events = Vec::with_capacity(spans.total_count() * 2);
for &(slice, kind) in &[
(spans.imports(), SyntaxKind::N_IMPORT_STMT),
(spans.typedefs(), SyntaxKind::N_TYPE_DEF),
(spans.relations(), SyntaxKind::N_RELATION_DECL),
(spans.indexes(), SyntaxKind::N_INDEX),
(spans.functions(), SyntaxKind::N_FUNCTION),
(spans.transformers(), SyntaxKind::N_TRANSFORMER),
(spans.rules(), SyntaxKind::N_RULE),
] {
for span in slice {
events.push( SpanEvent::Start { pos: span.start, kind } );
events.push( SpanEvent::End { pos: span.end, kind } );
}
}
events.sort_by_key(|evt| match evt {
SpanEvent::Start { pos, .. } => (*pos, 0),
SpanEvent::End { pos, .. } => (*pos, 1),
});
// Initialize builder and an event‐index
let mut builder = GreenNodeBuilder::new();
builder.start_node(DdlogLanguage::kind_to_raw(SyntaxKind::N_DATALOG_PROGRAM));
let mut ev_i = 0;
// 3) Single pass over tokens, advancing the sorted events
for &(tok_kind, ref span) in tokens {
// start any spans at this position
while ev_i < events.len() {
match &events[ev_i] {
SpanEvent::Start { pos, kind } if *pos == span.start => {
builder.start_node(DdlogLanguage::kind_to_raw(*kind));
ev_i += 1;
}
_ => break,
}
}
// push the token itself
push_token(&mut builder, tok_kind, span.clone(), src);
// close any spans ending at this position
while ev_i < events.len() {
match &events[ev_i] {
SpanEvent::End { pos, .. } if *pos == span.end => {
builder.finish_node();
ev_i += 1;
}
_ => break,
}
}
}
// finish the top‐level node
builder.finish_node();
builder.finish()
}
```
Steps:
1. Drop `SpanCursor`/`SpanCursors`.
2. Build a single `Vec<SpanEvent>` of all start/end markers up front.
3. Sort it once by position (ending events after starting at the same position).
4. In your main token loop, just advance the single `events` index to start/finish nodes.
This reduces per‐token work to two small `while` loops instead of seven, and collapses all the repetitive `advance_to`/`start_if`/`finish_if` logic into a single, easy‐to-follow pass.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai Please create new GitHub issues for the following: Comment 1
</code_context> <issue_to_address> Since validation is only performed in debug builds, invalid spans could go undetected in release builds, potentially causing subtle CST bugs. Consider enforcing these checks in all builds if correctness is critical. Comment 2
</code_context> <issue_to_address> Verify that handling multiple spans ending at the same position does not cause panics or corrupt the tree structure. Comment 3
<issue_to_address> You can collapse those seven nearly-identical setters + a builder into one impl ParsedSpans {
pub fn try_new(
mut imports: Vec<Span>,
mut typedefs: Vec<Span>,
mut relations: Vec<Span>,
mut indexes: Vec<Span>,
mut functions: Vec<Span>,
mut transformers: Vec<Span>,
mut rules: Vec<Span>,
) -> Result<Self, String> {
let mut errors = Vec::new();
for (name, spans) in &mut [
("imports", &mut imports),
("typedefs", &mut typedefs),
("relations", &mut relations),
("indexes", &mut indexes),
("functions", &mut functions),
("transformers", &mut transformers),
("rules", &mut rules),
] {
spans.sort_unstable_by_key(|s| s.start);
if let Err(e) = validate_spans_sorted(spans) {
errors.push(format!("{name} not sorted: {e}"));
}
}
if !errors.is_empty() {
return Err(errors.join("\n"));
}
Ok(ParsedSpans { imports, typedefs, relations, indexes, functions, transformers, rules })
}
}Then:
This keeps the same debug‐build assertions, drops ~200 lines of boilerplate, and still preserves all functionality. Comment 4
<issue_to_address> Here’s an approach that keeps exactly the same behavior but collapses your seven‐iterator machinery into one sorted list of “start”/“end” events. This cuts out the fixed‐size array, the two nested loops per token, and the per‐kind boilerplate: // 1) Define an event enum
enum SpanEvent {
Start { pos: usize, kind: SyntaxKind },
End { pos: usize, kind: SyntaxKind },
}
// 2) Build and sort a flat events list once at the top of build_green_tree
pub(crate) fn build_green_tree(
tokens: &[(SyntaxKind, Span)],
src: &str,
spans: &ParsedSpans,
) -> GreenNode {
// collect all span‐start and span‐end events
let mut events = Vec::with_capacity(spans.total_count() * 2);
for &(slice, kind) in &[
(spans.imports(), SyntaxKind::N_IMPORT_STMT),
(spans.typedefs(), SyntaxKind::N_TYPE_DEF),
(spans.relations(), SyntaxKind::N_RELATION_DECL),
(spans.indexes(), SyntaxKind::N_INDEX),
(spans.functions(), SyntaxKind::N_FUNCTION),
(spans.transformers(), SyntaxKind::N_TRANSFORMER),
(spans.rules(), SyntaxKind::N_RULE),
] {
for span in slice {
events.push( SpanEvent::Start { pos: span.start, kind } );
events.push( SpanEvent::End { pos: span.end, kind } );
}
}
events.sort_by_key(|evt| match evt {
SpanEvent::Start { pos, .. } => (*pos, 0),
SpanEvent::End { pos, .. } => (*pos, 1),
});
// Initialize builder and an event‐index
let mut builder = GreenNodeBuilder::new();
builder.start_node(DdlogLanguage::kind_to_raw(SyntaxKind::N_DATALOG_PROGRAM));
let mut ev_i = 0;
// 3) Single pass over tokens, advancing the sorted events
for &(tok_kind, ref span) in tokens {
// start any spans at this position
while ev_i < events.len() {
match &events[ev_i] {
SpanEvent::Start { pos, kind } if *pos == span.start => {
builder.start_node(DdlogLanguage::kind_to_raw(*kind));
ev_i += 1;
}
_ => break,
}
}
// push the token itself
push_token(&mut builder, tok_kind, span.clone(), src);
// close any spans ending at this position
while ev_i < events.len() {
match &events[ev_i] {
SpanEvent::End { pos, .. } if *pos == span.end => {
builder.finish_node();
ev_i += 1;
}
_ => break,
}
}
}
// finish the top‐level node
builder.finish_node();
builder.finish()
}Steps:
This reduces per‐token work to two small |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainI'll create GitHub issues for each of the suggestions you've provided. Let me create these as follow-up items for the CST construction improvements. @leynos I've created 4 GitHub issues to track the suggested improvements:
Each issue includes the specific location, code context, detailed description, and references back to this PR. They're all assigned to you for future consideration and implementation. 🏁 Script executed: Length of output: 5637 Perfect! I've successfully created 4 GitHub issues to track the suggested improvements:
Each issue includes:
These are excellent suggestions that can be addressed in future PRs to further improve the CST construction implementation! |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `src/parser/cst_builder/spans.rs:98` </location>
<code_context>
+
+ /// Build the [`ParsedSpans`].
+ #[must_use]
+ pub fn build(self) -> ParsedSpans {
+ let Self {
+ imports,
</code_context>
<issue_to_address>
Validation of span order only occurs in debug builds.
Since validation only happens in debug builds, invalid spans may go undetected in release builds, potentially causing bugs. Consider enabling validation in release builds, either always or via a feature flag.
Suggested implementation:
```rust
let result = validate_span_lists_sorted(&[
("imports", &imports),
```
```rust
if let Err(e) = result {
panic!("Span order validation failed: {e}");
}
```
</issue_to_address>
### Comment 2
<location> `src/parser/cst_builder/tree.rs:105` </location>
<code_context>
+
+ let mut cursors = SpanCursors::new(spans);
+
+ for &(kind, ref span) in tokens {
+ cursors.advance_and_start(&mut builder, span.start);
+
</code_context>
<issue_to_address>
Potential for panics if token spans are out of bounds.
Proceeding with an empty string after a span is out of bounds may hide underlying issues. Consider returning an error or panicking in debug builds to catch these problems early.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
+ let mut cursors = SpanCursors::new(spans);
+
+ for &(kind, ref span) in tokens {
+ cursors.advance_and_start(&mut builder, span.start);
+
=======
+ let mut cursors = SpanCursors::new(spans);
+
+ for &(kind, ref span) in tokens {
+ // Check for out-of-bounds spans
+ if span.start > span.end || span.end > src.len() {
+ #[cfg(debug_assertions)]
+ panic!(
+ "Token span out of bounds: span=({:?}, {:?}), src.len()={}",
+ span.start, span.end, src.len()
+ );
+ #[cfg(not(debug_assertions))]
+ {
+ warn!(
+ "Token span out of bounds: span=({:?}, {:?}), src.len()={}",
+ span.start, span.end, src.len()
+ );
+ continue;
+ }
+ }
+ cursors.advance_and_start(&mut builder, span.start);
+
>>>>>>> REPLACE
</suggested_fix>
### Comment 3
<location> `src/parser/cst_builder/tree.rs:117` </location>
<code_context>
+ builder.finish()
+}
+
+fn push_token(builder: &mut GreenNodeBuilder, kind: SyntaxKind, span: Span, src: &str) {
+ let text = src.get(span.clone()).map_or_else(
+ || {
</code_context>
<issue_to_address>
Cloning the span for src.get may be unnecessary.
Consider using a reference or passing Span by value if it implements Copy, as this may improve performance and code clarity.
</issue_to_address>
### Comment 4
<location> `src/parser/cst_builder/spans.rs:17` </location>
<code_context>
+/// are sorted and non-overlapping in debug builds.
+#[non_exhaustive]
+#[derive(Debug, Default, Clone, PartialEq)]
+pub struct ParsedSpans {
+ /// `import` statement spans.
+ imports: Vec<Span>,
</code_context>
<issue_to_address>
Consider using a macro to generate the repetitive fields, setters, getters, and builder logic for ParsedSpans and its builder.
Consider collapsing all of the repetitive fields, setters, getters and the build/validation logic into one `macro_rules!` instead of hand‐writing each. For example, you could do something like:
```rust
macro_rules! define_parsed_spans {
( $( $field:ident ),* $(,)? ) => {
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ParsedSpans {
$( pub(crate) $field: Vec<Span> ),*
}
#[derive(Default)]
pub struct ParsedSpansBuilder {
$( $field: Vec<Span> ),*
}
impl ParsedSpansBuilder {
$(
#[must_use]
pub fn $field(mut self, spans: Vec<Span>) -> Self {
self.$field = spans;
self
}
)*
#[must_use]
pub fn build(self) -> ParsedSpans {
let Self { $( $field ),* } = self;
let result = validate_span_lists_sorted(&[
$( (stringify!($field), &$field) ),*
]);
debug_assert!(result.is_ok(), "{}", result.err().unwrap_or_default());
ParsedSpans { $( $field ),* }
}
}
impl ParsedSpans {
#[must_use]
pub fn builder() -> ParsedSpansBuilder {
ParsedSpansBuilder::default()
}
$(
#[must_use]
pub fn $field(&self) -> &[Span] {
&self.$field
}
)*
}
}
}
define_parsed_spans!(
imports,
typedefs,
relations,
indexes,
functions,
transformers,
rules,
);
```
This preserves your build/validation logic verbatim but removes all of the near-duplicate setter/getter boilerplate.
</issue_to_address>
### Comment 5
<location> `src/parser/cst_builder/tree.rs:46` </location>
<code_context>
+ }
+}
+
+struct SpanCursors<'a> {
+ cursors: [SpanCursor<'a>; SPAN_CURSOR_COUNT],
+}
</code_context>
<issue_to_address>
Consider replacing the fixed-size SpanCursors struct and constant with a Vec<SpanCursor> for a more flexible and data-driven approach.
Here’s a way to collapse the fixed‐size array, the `SPAN_CURSOR_COUNT` constant, and the hand‐rolled `SpanCursors` struct into a simple `Vec<SpanCursor>`. You still get the same “advance / start / finish” logic, but without hard‐coding the count or the shape of each statement category.
```rust
pub(crate) fn build_green_tree(
tokens: &[(SyntaxKind, Span)],
src: &str,
spans: &ParsedSpans,
) -> GreenNode {
let mut builder = GreenNodeBuilder::new();
builder.start_node(DdlogLanguage::kind_to_raw(SyntaxKind::N_DATALOG_PROGRAM));
// collect each (slice, kind) into a SpanCursor in a Vec
let mut cursors: Vec<SpanCursor> = vec![
(spans.imports(), SyntaxKind::N_IMPORT_STMT),
(spans.typedefs(), SyntaxKind::N_TYPE_DEF),
(spans.relations(), SyntaxKind::N_RELATION_DECL),
(spans.indexes(), SyntaxKind::N_INDEX),
(spans.functions(), SyntaxKind::N_FUNCTION),
(spans.transformers(), SyntaxKind::N_TRANSFORMER),
(spans.rules(), SyntaxKind::N_RULE),
]
.into_iter()
.map(|(slice, kind)| SpanCursor::new(slice, kind))
.collect();
for &(_, ref span) in tokens {
let pos = span.start;
// one loop to advance & start
for cur in &mut cursors {
cur.advance_to(pos);
cur.start_if(&mut builder, pos);
}
push_token(&mut builder, kind, span.clone(), src);
// one loop to finish
let pos = span.end;
for cur in &mut cursors {
cur.finish_if(&mut builder, pos);
}
}
builder.finish_node();
builder.finish()
}
```
Key changes:
* Removed `SPAN_CURSOR_COUNT` and the fixed-size `[SpanCursor; N]`.
* Build a `Vec<SpanCursor>` from a `Vec<(slice, kind)>` – adding new categories is now “just” another tuple.
* Kept your `SpanCursor` impl completely unchanged.
This preserves all the existing behavior but is fully data-driven and unblocks adding/removing categories without touching a constant or a struct definition.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
|
||
| /// Build the [`ParsedSpans`]. | ||
| #[must_use] | ||
| pub fn build(self) -> ParsedSpans { |
There was a problem hiding this comment.
suggestion (bug_risk): Validation of span order only occurs in debug builds.
Since validation only happens in debug builds, invalid spans may go undetected in release builds, potentially causing bugs. Consider enabling validation in release builds, either always or via a feature flag.
Suggested implementation:
let result = validate_span_lists_sorted(&[
("imports", &imports), if let Err(e) = result {
panic!("Span order validation failed: {e}");
}| let mut cursors = SpanCursors::new(spans); | ||
|
|
||
| for &(kind, ref span) in tokens { | ||
| cursors.advance_and_start(&mut builder, span.start); | ||
|
|
There was a problem hiding this comment.
suggestion (bug_risk): Potential for panics if token spans are out of bounds.
Proceeding with an empty string after a span is out of bounds may hide underlying issues. Consider returning an error or panicking in debug builds to catch these problems early.
| let mut cursors = SpanCursors::new(spans); | |
| for &(kind, ref span) in tokens { | |
| cursors.advance_and_start(&mut builder, span.start); | |
| + let mut cursors = SpanCursors::new(spans); | |
| + | |
| + for &(kind, ref span) in tokens { | |
| + // Check for out-of-bounds spans | |
| + if span.start > span.end || span.end > src.len() { | |
| + #[cfg(debug_assertions)] | |
| + panic!( | |
| + "Token span out of bounds: span=({:?}, {:?}), src.len()={}", | |
| + span.start, span.end, src.len() | |
| + ); | |
| + #[cfg(not(debug_assertions))] | |
| + { | |
| + warn!( | |
| + "Token span out of bounds: span=({:?}, {:?}), src.len()={}", | |
| + span.start, span.end, src.len() | |
| + ); | |
| + continue; | |
| + } | |
| + } | |
| + cursors.advance_and_start(&mut builder, span.start); | |
| + |
| builder.finish() | ||
| } | ||
|
|
||
| fn push_token(builder: &mut GreenNodeBuilder, kind: SyntaxKind, span: Span, src: &str) { |
There was a problem hiding this comment.
nitpick: Cloning the span for src.get may be unnecessary.
Consider using a reference or passing Span by value if it implements Copy, as this may improve performance and code clarity.
| /// are sorted and non-overlapping in debug builds. | ||
| #[non_exhaustive] | ||
| #[derive(Debug, Default, Clone, PartialEq)] | ||
| pub struct ParsedSpans { |
There was a problem hiding this comment.
issue (complexity): Consider using a macro to generate the repetitive fields, setters, getters, and builder logic for ParsedSpans and its builder.
Consider collapsing all of the repetitive fields, setters, getters and the build/validation logic into one macro_rules! instead of hand‐writing each. For example, you could do something like:
macro_rules! define_parsed_spans {
( $( $field:ident ),* $(,)? ) => {
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ParsedSpans {
$( pub(crate) $field: Vec<Span> ),*
}
#[derive(Default)]
pub struct ParsedSpansBuilder {
$( $field: Vec<Span> ),*
}
impl ParsedSpansBuilder {
$(
#[must_use]
pub fn $field(mut self, spans: Vec<Span>) -> Self {
self.$field = spans;
self
}
)*
#[must_use]
pub fn build(self) -> ParsedSpans {
let Self { $( $field ),* } = self;
let result = validate_span_lists_sorted(&[
$( (stringify!($field), &$field) ),*
]);
debug_assert!(result.is_ok(), "{}", result.err().unwrap_or_default());
ParsedSpans { $( $field ),* }
}
}
impl ParsedSpans {
#[must_use]
pub fn builder() -> ParsedSpansBuilder {
ParsedSpansBuilder::default()
}
$(
#[must_use]
pub fn $field(&self) -> &[Span] {
&self.$field
}
)*
}
}
}
define_parsed_spans!(
imports,
typedefs,
relations,
indexes,
functions,
transformers,
rules,
);This preserves your build/validation logic verbatim but removes all of the near-duplicate setter/getter boilerplate.
| } | ||
| } | ||
|
|
||
| struct SpanCursors<'a> { |
There was a problem hiding this comment.
issue (complexity): Consider replacing the fixed-size SpanCursors struct and constant with a Vec for a more flexible and data-driven approach.
Here’s a way to collapse the fixed‐size array, the SPAN_CURSOR_COUNT constant, and the hand‐rolled SpanCursors struct into a simple Vec<SpanCursor>. You still get the same “advance / start / finish” logic, but without hard‐coding the count or the shape of each statement category.
pub(crate) fn build_green_tree(
tokens: &[(SyntaxKind, Span)],
src: &str,
spans: &ParsedSpans,
) -> GreenNode {
let mut builder = GreenNodeBuilder::new();
builder.start_node(DdlogLanguage::kind_to_raw(SyntaxKind::N_DATALOG_PROGRAM));
// collect each (slice, kind) into a SpanCursor in a Vec
let mut cursors: Vec<SpanCursor> = vec![
(spans.imports(), SyntaxKind::N_IMPORT_STMT),
(spans.typedefs(), SyntaxKind::N_TYPE_DEF),
(spans.relations(), SyntaxKind::N_RELATION_DECL),
(spans.indexes(), SyntaxKind::N_INDEX),
(spans.functions(), SyntaxKind::N_FUNCTION),
(spans.transformers(), SyntaxKind::N_TRANSFORMER),
(spans.rules(), SyntaxKind::N_RULE),
]
.into_iter()
.map(|(slice, kind)| SpanCursor::new(slice, kind))
.collect();
for &(_, ref span) in tokens {
let pos = span.start;
// one loop to advance & start
for cur in &mut cursors {
cur.advance_to(pos);
cur.start_if(&mut builder, pos);
}
push_token(&mut builder, kind, span.clone(), src);
// one loop to finish
let pos = span.end;
for cur in &mut cursors {
cur.finish_if(&mut builder, pos);
}
}
builder.finish_node();
builder.finish()
}Key changes:
- Removed
SPAN_CURSOR_COUNTand the fixed-size[SpanCursor; N]. - Build a
Vec<SpanCursor>from aVec<(slice, kind)>– adding new categories is now “just” another tuple. - Kept your
SpanCursorimpl completely unchanged.
This preserves all the existing behavior but is fully data-driven and unblocks adding/removing categories without touching a constant or a struct definition.
|
@coderabbitai Please create new GitHub issues for the following: Comment 1
</code_context> <issue_to_address> Since validation only happens in debug builds, invalid spans may go undetected in release builds, potentially causing bugs. Consider enabling validation in release builds, either always or via a feature flag. Suggested implementation: let result = validate_span_lists_sorted(&[
("imports", &imports), if let Err(e) = result {
panic!("Span order validation failed: {e}");
}</issue_to_address> Comment 2
</code_context> <issue_to_address> Proceeding with an empty string after a span is out of bounds may hide underlying issues. Consider returning an error or panicking in debug builds to catch these problems early. <suggested_fix>
=======
</suggested_fix> Comment 3
+fn push_token(builder: &mut GreenNodeBuilder, kind: SyntaxKind, span: Span, src: &str) {
</code_context> <issue_to_address> Consider using a reference or passing Span by value if it implements Copy, as this may improve performance and code clarity. Comment 4
<issue_to_address> Consider collapsing all of the repetitive fields, setters, getters and the build/validation logic into one macro_rules! define_parsed_spans {
( $( $field:ident ),* $(,)? ) => {
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ParsedSpans {
$( pub(crate) $field: Vec<Span> ),*
}
#[derive(Default)]
pub struct ParsedSpansBuilder {
$( $field: Vec<Span> ),*
}
impl ParsedSpansBuilder {
$(
#[must_use]
pub fn $field(mut self, spans: Vec<Span>) -> Self {
self.$field = spans;
self
}
)*
#[must_use]
pub fn build(self) -> ParsedSpans {
let Self { $( $field ),* } = self;
let result = validate_span_lists_sorted(&[
$( (stringify!($field), &$field) ),*
]);
debug_assert!(result.is_ok(), "{}", result.err().unwrap_or_default());
ParsedSpans { $( $field ),* }
}
}
impl ParsedSpans {
#[must_use]
pub fn builder() -> ParsedSpansBuilder {
ParsedSpansBuilder::default()
}
$(
#[must_use]
pub fn $field(&self) -> &[Span] {
&self.$field
}
)*
}
}
}
define_parsed_spans!(
imports,
typedefs,
relations,
indexes,
functions,
transformers,
rules,
);This preserves your build/validation logic verbatim but removes all of the near-duplicate setter/getter boilerplate. Comment 5
+struct SpanCursors<'a> {
<issue_to_address> Here’s a way to collapse the fixed‐size array, the pub(crate) fn build_green_tree(
tokens: &[(SyntaxKind, Span)],
src: &str,
spans: &ParsedSpans,
) -> GreenNode {
let mut builder = GreenNodeBuilder::new();
builder.start_node(DdlogLanguage::kind_to_raw(SyntaxKind::N_DATALOG_PROGRAM));
// collect each (slice, kind) into a SpanCursor in a Vec
let mut cursors: Vec<SpanCursor> = vec![
(spans.imports(), SyntaxKind::N_IMPORT_STMT),
(spans.typedefs(), SyntaxKind::N_TYPE_DEF),
(spans.relations(), SyntaxKind::N_RELATION_DECL),
(spans.indexes(), SyntaxKind::N_INDEX),
(spans.functions(), SyntaxKind::N_FUNCTION),
(spans.transformers(), SyntaxKind::N_TRANSFORMER),
(spans.rules(), SyntaxKind::N_RULE),
]
.into_iter()
.map(|(slice, kind)| SpanCursor::new(slice, kind))
.collect();
for &(_, ref span) in tokens {
let pos = span.start;
// one loop to advance & start
for cur in &mut cursors {
cur.advance_to(pos);
cur.start_if(&mut builder, pos);
}
push_token(&mut builder, kind, span.clone(), src);
// one loop to finish
let pos = span.end;
for cur in &mut cursors {
cur.finish_if(&mut builder, pos);
}
}
builder.finish_node();
builder.finish()
}Key changes:
This preserves all the existing behavior but is fully data-driven and unblocks adding/removing categories without touching a constant or a struct definition. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainI'll create GitHub issues for each of the 5 suggestions you've provided. Let me create these as follow-up improvements for the CST construction implementation. @leynos I've successfully created 5 GitHub issues to track the suggested improvements:
Each issue includes:
These improvements focus on robustness, performance, and maintainability of the CST construction implementation! 🏁 Script executed: Length of output: 8853 Perfect! I've successfully created 5 GitHub issues to track all the suggested improvements:
Each issue includes:
These represent excellent follow-up improvements that can enhance the robustness, performance, and maintainability of the CST construction implementation while preserving all existing functionality! |
Summary
ParsedandParsedSpansinto a newcst_buildermoduleParsed::newconstructorparseto use the new modulecst_builderTesting
make lintmake testhttps://chatgpt.com/codex/tasks/task_e_687a941c05d08322bc54073e5cc31937
Summary by Sourcery
Refactor the concrete syntax tree (CST) construction by moving span validation and green tree building into a new cst_builder module, expose a Parsed::new constructor and a builder API for ParsedSpans, update the parser to use these abstractions, and add related unit tests.
New Features:
Enhancements:
Tests: