Parse typedef and extern type declarations, build N_TYPE_DEF nodes, and add typed AST accessors. Extend tests for type definitions.
Reviewer's Guide
This PR adds support for
Class diagram for new and updated AST node types (TypeDef and Root)
classDiagram
class Root {
+Vec<Import> imports()
+Vec<TypeDef> type_defs()
}
class Import {
+syntax: SyntaxNode
+syntax() SyntaxNode
+path() Option<String>
+alias() Option<String>
}
class TypeDef {
+syntax: SyntaxNode
+syntax() SyntaxNode
+name() Option<String>
+is_extern() bool
+definition() Option<String>
}
Root --> Import : contains
Root --> TypeDef : contains
Summary by CodeRabbit
Walkthrough
The parser was extended to distinguish and process both import and typedef declarations at the top level. Internal data structures and functions were updated to track and handle spans for imports and typedefs separately. The typed AST gained a new
Changes
Sequence Diagram(s)
sequenceDiagram
participant Source as Source Code
participant Parser as Parser
participant CST as CST Builder
participant AST as Typed AST
Source->>Parser: Provide tokens
Parser->>Parser: Identify imports and typedefs, collect spans
Parser->>CST: Pass tokens and ItemSpans
CST->>CST: Build CST nodes for imports and typedefs
CST->>AST: Wrap typedef nodes in TypeDef structs
AST->>AST: Provide methods for introspection (name, definition, is_extern)
Hey @leynos - I've reviewed your changes - here's some feedback:
- Refactor the duplicated span-advancing and node-start/finish logic in build_green_tree into a common helper to reduce boilerplate for imports and typedefs.
- Simplify TypeDef.definition() by extracting the child type syntax node and using its .text() instead of manually iterating and concatenating tokens.
- In parse_tokens(), consider dropping the temporary Item enum and partitioning spans directly into ItemSpans for clearer, more concise span collection.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Refactor the duplicated span-advancing and node-start/finish logic in build_green_tree into a common helper to reduce boilerplate for imports and typedefs.
- Simplify TypeDef.definition() by extracting the child type syntax node and using its .text() instead of manually iterating and concatenating tokens.
- In parse_tokens(), consider dropping the temporary Item enum and partitioning spans directly into ItemSpans for clearer, more concise span collection.
## Individual Comments
### Comment 1
<location> `src/parser/mod.rs:186` </location>
<code_context>
break;
}
}
+ while let Some(next) = type_iter.peek() {
+ if span.start >= next.end {
+ type_iter.next();
+ } else {
+ break;
+ }
+ }
// Begin an `N_IMPORT_STMT` node when this token marks the start of an
</code_context>
<issue_to_address>
Duplicated logic for advancing import_iter and type_iter could be refactored.
Consider extracting the shared logic into a helper function to minimize duplication and reduce the risk of inconsistencies.
Suggested implementation:
```rust
let mut import_iter = items.imports.iter().peekable();
let mut type_iter = items.typedefs.iter().peekable();

// Skip every span that ends at or before `span_start`.
// Assumes the item spans are stored as `Span` values exposing `start` and `end`.
fn advance_iter_to_span<'a, I>(iter: &mut std::iter::Peekable<I>, span_start: usize)
where
    I: Iterator<Item = &'a Span>,
{
    while let Some(next) = iter.peek() {
        if span_start >= next.end {
            iter.next();
        } else {
            break;
        }
    }
}

for (kind, span) in tokens {
    // Advance to the next import span if this token lies after the end of
    // the current one. Multiple tokens can share the same span, so we need
    advance_iter_to_span(&mut import_iter, span.start);
    advance_iter_to_span(&mut type_iter, span.start);
```
- You may need to adjust the type bounds and dereferencing in the `advance_iter_to_span` function depending on the actual type of your span objects (e.g., if they are not directly accessible via `.end`).
- Remove the old duplicated iterator-advancing code for both `import_iter` and `type_iter`.
</issue_to_address>
### Comment 2
<location> `src/parser/mod.rs:235` </location>
<code_context>
builder.finish_node();
import_iter.next();
}
+ if type_iter
+ .peek()
+ .is_some_and(|current| span.end >= current.end)
+ {
+ builder.finish_node();
+ type_iter.next();
+ }
}
</code_context>
<issue_to_address>
Finishing nodes for import and typedef may overlap if spans are not disjoint.
If overlapping spans are possible, both finish_node calls could be executed for the same token. If this can't happen, please add an assertion or comment to clarify this invariant.
</issue_to_address>
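One way to make the disjointness invariant from Comment 2 explicit is a small check run before the tree is built. The sketch below is only an illustration: it assumes spans are half-open `Range<usize>` values, and `spans_are_disjoint` is a hypothetical helper name, not something taken from the PR.

```rust
use std::ops::Range;

// Assumption: spans are half-open byte ranges, as with `Range<usize>`.
type Span = Range<usize>;

/// Returns `true` when no two spans across both lists overlap.
/// Spans that merely touch (one ends where the next begins) count as disjoint.
fn spans_are_disjoint(imports: &[Span], typedefs: &[Span]) -> bool {
    // Merge both lists and order them by start offset.
    let mut merged: Vec<&Span> = imports.iter().chain(typedefs).collect();
    merged.sort_by_key(|s| s.start);
    // After sorting, an overlap can only occur between neighbours.
    merged.windows(2).all(|w| w[0].end <= w[1].start)
}
```

A call such as `debug_assert!(spans_are_disjoint(&items.imports, &items.typedefs))` at the top of the tree builder would document the invariant without costing anything in release builds.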
Actionable comments posted: 2
🔭 Outside diff range comments (1)
src/parser/mod.rs (1)
186-241: Consider refactoring to reduce code duplication.
The logic for advancing iterators and starting/finishing nodes is duplicated between imports and typedefs. Consider extracting common logic.
You could extract the common pattern into a helper closure:
let mut handle_span = |iter: &mut std::iter::Peekable<_>, node_kind| {
    // Advance iterator logic
    while let Some(next) = iter.peek() {
        if span.start >= next.end {
            iter.next();
        } else {
            break;
        }
    }
    // Start node if at span start
    if iter.peek().is_some_and(|current| span.start == current.start) {
        builder.start_node(DdlogLanguage::kind_to_raw(node_kind));
        return true;
    }
    false
};
// Then use it for both imports and typedefs
let started_import = handle_span(&mut import_iter, SyntaxKind::N_IMPORT_STMT);
let started_typedef = handle_span(&mut type_iter, SyntaxKind::N_TYPE_DEF);
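The same idea can also be expressed as a small cursor type that covers the advance, start, and finish steps for any item kind. This is a self-contained sketch under stated assumptions: `ItemCursor`, `Event`, and `build_events` are hypothetical names, the `Event` enum stands in for the real builder calls, and spans are assumed to be `Range<usize>`.

```rust
use std::iter::Peekable;
use std::ops::Range;
use std::slice::Iter;

// Assumption: spans are half-open byte ranges, as with `Range<usize>`.
type Span = Range<usize>;

/// Simplified stand-in for the builder calls (start node, token, finish node).
#[derive(Debug, PartialEq)]
enum Event {
    Start(&'static str),
    Token(usize),
    Finish(&'static str),
}

/// Tracks progress through one list of item spans.
struct ItemCursor<'a> {
    label: &'static str,
    spans: Peekable<Iter<'a, Span>>,
}

impl<'a> ItemCursor<'a> {
    fn new(label: &'static str, spans: &'a [Span]) -> Self {
        Self { label, spans: spans.iter().peekable() }
    }

    /// Drop spans that end at or before `start`.
    fn advance_to(&mut self, start: usize) {
        while self.spans.peek().is_some_and(|s| start >= s.end) {
            self.spans.next();
        }
    }

    /// True when the current span begins exactly at `start`.
    fn starts_at(&mut self, start: usize) -> bool {
        self.spans.peek().is_some_and(|s| s.start == start)
    }

    /// Consume the current span once a token reaches its end.
    fn finishes_at(&mut self, end: usize) -> bool {
        let done = self.spans.peek().is_some_and(|s| end >= s.end);
        if done {
            self.spans.next();
        }
        done
    }
}

/// Walk the token spans once, emitting start/finish events for both item kinds.
fn build_events(tokens: &[Span], imports: &[Span], typedefs: &[Span]) -> Vec<Event> {
    let mut cursors = [
        ItemCursor::new("import", imports),
        ItemCursor::new("typedef", typedefs),
    ];
    let mut events = Vec::new();
    for (index, token) in tokens.iter().enumerate() {
        for cursor in cursors.iter_mut() {
            cursor.advance_to(token.start);
            if cursor.starts_at(token.start) {
                events.push(Event::Start(cursor.label));
            }
        }
        events.push(Event::Token(index));
        for cursor in cursors.iter_mut() {
            if cursor.finishes_at(token.end) {
                events.push(Event::Finish(cursor.label));
            }
        }
    }
    events
}
```

Keeping the cursors in an array means a further item kind would only need one more entry, rather than another copy of the three loops.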
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/parser/mod.rs (9 hunks)
tests/parser.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not use `return` in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer `expect` over `allow`.
Prefer `.expect()` over `.unwrap()`.
Prefer immutable data and avoid unnecessary `mut` bindings.
Handle errors with the `Result` type instead of panicking where feasible.
Avoid `unsafe` code unless absolutely necessary and document any usage clearly.
Use explicit version ranges in `Cargo.toml` and keep dependencies up-to-date.
Use `rstest` fixtures for shared setup.
Replace duplicated tests with `#[rstest(...)]` parameterised cases.
Prefer `mockall` for mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using `Arc` to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
📄 Source: CodeRabbit Inference Engine (AGENTS.md)
List of files the instruction was applied to:
tests/parser.rs
src/parser/mod.rs
🧬 Code Graph Analysis (1)
src/parser/mod.rs (1)
src/language.rs (1)
kind_to_raw (182-187)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build-test
🔇 Additional comments (5)
src/parser/mod.rs (3)
70-165: Well-structured parser implementation for typedef support.
The parser correctly handles both typedef aliases and extern type declarations using appropriate chumsky combinators. The recursive type parser for tuples is particularly well-implemented.
310-318: Clean implementation consistent with existing patterns.
The `type_defs()` method correctly mirrors the `imports()` method implementation.
24-28: Add documentation for the `ItemSpans` struct.
The struct lacks documentation comments required by the coding guidelines.
+/// Holds spans for different types of top-level items parsed from the source.
 #[derive(Default)]
 struct ItemSpans {
+    /// Spans of import statements.
     imports: Vec<Span>,
+    /// Spans of typedef declarations.
     typedefs: Vec<Span>,
 }
Likely an incorrect or invalid review comment.
tests/parser.rs (2)
7-7: Appropriate lint configuration for test code.
The `expect_used` lint expectation is correctly scoped and justified for test assertions.
161-197: Comprehensive test coverage for typedef parsing.
The three test cases effectively cover the main typedef scenarios: standard aliases, complex tuple types, and extern declarations. The assertions properly validate the TypeDef API methods.
pub fn definition(&self) -> Option<String> {
    if self.is_extern() {
        return None;
    }
    let mut found_eq = false;
    let mut out = String::new();
    for e in self.syntax.children_with_tokens() {
        match e {
            rowan::NodeOrToken::Token(t) => {
                if found_eq {
                    out.push_str(t.text());
                } else if t.kind() == SyntaxKind::T_EQ {
                    found_eq = true;
                }
            }
            rowan::NodeOrToken::Node(n) => {
                if found_eq {
                    out.push_str(&n.text().to_string());
                }
            }
        }
    }
    if found_eq {
        Some(out.trim().to_string())
    } else {
        None
    }
}
🧹 Nitpick (assertive)
Consider using iterator-based approach for better performance.
The current implementation with string concatenation in a loop could be inefficient for large type definitions.
pub fn definition(&self) -> Option<String> {
if self.is_extern() {
return None;
}
- let mut found_eq = false;
- let mut out = String::new();
- for e in self.syntax.children_with_tokens() {
- match e {
- rowan::NodeOrToken::Token(t) => {
- if found_eq {
- out.push_str(t.text());
- } else if t.kind() == SyntaxKind::T_EQ {
- found_eq = true;
- }
- }
- rowan::NodeOrToken::Node(n) => {
- if found_eq {
- out.push_str(&n.text().to_string());
- }
- }
- }
- }
- if found_eq {
- Some(out.trim().to_string())
- } else {
- None
- }
+ let eq_pos = self.syntax
+ .children_with_tokens()
+ .position(|e| matches!(e.kind(), SyntaxKind::T_EQ))?;
+
+ let definition = self.syntax
+ .children_with_tokens()
+ .skip(eq_pos + 1)
+ .map(|e| match e {
+ rowan::NodeOrToken::Token(t) => t.text().to_string(),
+ rowan::NodeOrToken::Node(n) => n.text().to_string(),
+ })
+ .collect::<String>();
+
+ Some(definition.trim().to_string())
}
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/parser/mod.rs around lines 420 to 447, the definition method uses string
concatenation inside a loop, which can be inefficient for large type
definitions. Refactor this method to use an iterator-based approach by
collecting the relevant text pieces into an iterator and then joining them once
at the end, instead of appending strings repeatedly. This will improve
performance by minimizing intermediate string allocations.
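To illustrate the iterator-based shape without depending on rowan, here is a minimal sketch over a flat token list. It is a simplified stand-in, not the PR's code: the `Kind` enum, the `Token` alias, and `definition_text` are all hypothetical, and real CST elements would be `NodeOrToken` values rather than `(Kind, &str)` pairs.

```rust
// Hypothetical token kinds standing in for the crate's `SyntaxKind`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Typedef,
    Extern,
    Type,
    Ident,
    Equals,
    Space,
    Punct,
}

// A token is modelled as its kind plus the source text it covers.
type Token<'a> = (Kind, &'a str);

/// Text following the first `=`, or `None` for extern declarations
/// and for token lists without an `=`.
fn definition_text(tokens: &[Token<'_>]) -> Option<String> {
    if tokens.iter().any(|(kind, _)| *kind == Kind::Extern) {
        return None;
    }
    let eq_pos = tokens.iter().position(|(kind, _)| *kind == Kind::Equals)?;
    // Collect borrowed slices directly so only one `String` is built.
    let text: String = tokens[eq_pos + 1..].iter().map(|(_, text)| *text).collect();
    Some(text.trim().to_string())
}
```

Collecting `&str` slices avoids the per-element `to_string()` calls; whether that matters in practice depends on how large type definitions get, which the review does not quantify.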
pub fn name(&self) -> Option<String> {
    let mut iter = self.syntax.children_with_tokens();
    for el in iter.by_ref() {
        match el.kind() {
            SyntaxKind::K_TYPEDEF => break,
            SyntaxKind::K_EXTERN => {
                // skip 'extern type'
                for tok in iter.by_ref() {
                    if tok.kind() == SyntaxKind::K_TYPE {
                        break;
                    }
                }
                break;
            }
            _ => {}
        }
    }
    iter.find_map(|e| match e {
        rowan::NodeOrToken::Token(t) if t.kind() == SyntaxKind::T_IDENT => {
            Some(t.text().to_string())
        }
        _ => None,
    })
}
🛠️ Refactor suggestion
Simplify the name() method implementation.
The current implementation with nested loops and manual iteration is hard to follow. Consider using iterator combinators for clarity.
pub fn name(&self) -> Option<String> {
- let mut iter = self.syntax.children_with_tokens();
- for el in iter.by_ref() {
- match el.kind() {
- SyntaxKind::K_TYPEDEF => break,
- SyntaxKind::K_EXTERN => {
- // skip 'extern type'
- for tok in iter.by_ref() {
- if tok.kind() == SyntaxKind::K_TYPE {
- break;
- }
- }
- break;
- }
- _ => {}
- }
- }
- iter.find_map(|e| match e {
- rowan::NodeOrToken::Token(t) if t.kind() == SyntaxKind::T_IDENT => {
- Some(t.text().to_string())
- }
- _ => None,
- })
+ self.syntax
+ .children_with_tokens()
+ .skip_while(|e| !matches!(e.kind(), SyntaxKind::K_TYPEDEF | SyntaxKind::K_EXTERN))
+ .skip_while(|e| e.kind() != SyntaxKind::T_IDENT)
+ .find_map(|e| match e {
+ rowan::NodeOrToken::Token(t) if t.kind() == SyntaxKind::T_IDENT => {
+ Some(t.text().to_string())
+ }
+ _ => None,
+ })
}
🤖 Prompt for AI Agents
In src/parser/mod.rs around lines 385 to 408, the name() method uses nested
loops and manual iteration, making it hard to read. Refactor this method to use
iterator combinators like filter, skip_while, or find_map to simplify the logic
and improve clarity while preserving the existing behavior.
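Whether the combinator form preserves behaviour can be checked on a toy model. The sketch below compares a loop-based and a combinator-based lookup over a flat token list; `Kind`, `Token`, `name_with_loops`, and `name_with_combinators` are hypothetical stand-ins for the rowan-based code, and the second `skip_while` from the suggestion is omitted because `find` already skips non-identifier tokens.

```rust
// Hypothetical token kinds standing in for the crate's `SyntaxKind`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Typedef,
    Extern,
    Type,
    Ident,
    Space,
}

// A token is modelled as its kind plus the source text it covers.
type Token<'a> = (Kind, &'a str);

/// Loop-based lookup, mirroring the control flow of the original method.
fn name_with_loops<'a>(tokens: &[Token<'a>]) -> Option<&'a str> {
    let mut iter = tokens.iter();
    while let Some((kind, _)) = iter.next() {
        match kind {
            Kind::Typedef => break,
            Kind::Extern => {
                // Skip ahead past the `type` keyword.
                while let Some((k, _)) = iter.next() {
                    if *k == Kind::Type {
                        break;
                    }
                }
                break;
            }
            _ => {}
        }
    }
    iter.find(|(k, _)| *k == Kind::Ident).map(|(_, text)| *text)
}

/// Combinator-based lookup in the spirit of the suggested refactor.
fn name_with_combinators<'a>(tokens: &[Token<'a>]) -> Option<&'a str> {
    tokens
        .iter()
        // Ignore everything before the leading keyword.
        .skip_while(|(k, _)| !matches!(k, Kind::Typedef | Kind::Extern))
        // The first identifier after it is taken to be the name.
        .find(|(k, _)| *k == Kind::Ident)
        .map(|(_, text)| *text)
}
```

The two agree on well-formed input. They can differ on malformed input where an identifier sits between `extern` and `type`: the loop version skips it, while the combinator version returns it.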
Summary
- `N_TYPE_DEF` CST nodes and expose typed accessors
Testing
- `cargo clippy -- -D warnings`
- `cargo test`
- `make lint`
- `make test`
https://chatgpt.com/codex/tasks/task_e_685da4e26f4c8322b67fc17671698459
Summary by Sourcery
Implement typedef and extern type parsing by tracking typedef spans, building corresponding CST nodes, and providing typed AST wrappers with accessors for name, definition, and extern status, accompanied by comprehensive tests.