Conversation
Reviewer's GuideThis PR introduces full support for ER diagram for index declaration CST node integrationerDiagram
DATALOG_PROGRAM ||--o{ N_INDEX : contains
N_INDEX {
name string
relation string
columns string
}
Class diagram for new Index AST wrapper and Root::indexes methodclassDiagram
class Root {
+Vec<Index> indexes()
}
class Index {
+syntax: SyntaxNode
+syntax() SyntaxNode
+name() Option<String>
+relation() Option<String>
+columns() Vec<String>
}
Root --> "*" Index
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Summary by CodeRabbit
Summary by CodeRabbit
Summary by CodeRabbit
WalkthroughThe changes introduce support for Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Tokenizer
participant Parser
participant CSTBuilder
participant AST
User->>Tokenizer: Provide source with 'index' declaration
Tokenizer->>Parser: Token stream (incl. K_INDEX, K_ON)
Parser->>Parser: Collect spans for imports, typedefs, relations, indexes
Parser->>CSTBuilder: Pass tokens and spans (incl. indexes)
CSTBuilder->>AST: Build syntax tree with index nodes
AST->>User: Expose index nodes and fields via API
Possibly related PRs
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (3)`docs/**/*.md`: Use the markdown files within the `docs/` directory as a knowled...
📄 Source: CodeRabbit Inference Engine (AGENTS.md) List of files the instruction was applied to:
`docs/**/*.md`: Use British English spelling based on the Oxford English Diction...
📄 Source: CodeRabbit Inference Engine (docs/documentation-style-guide.md) List of files the instruction was applied to:
`**/*.md`: * Avoid 2nd person or 1st person pronouns ("I", "you", "we") * Use en...
⚙️ Source: CodeRabbit Configuration File List of files the instruction was applied to:
✨ Finishing Touches🧪 Generate Unit Tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes - here's some feedback:
- Refactor the repeated
filter(...).ignored()whitespace/comment parser into a shared combinator to reduce duplication inindex_columnsandindex_decl. - The manual buffer logic in
Index::columnsis brittle; consider iterating over child identifier tokens directly or using rowan queries to collect column names more robustly.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Refactor the repeated `filter(...).ignored()` whitespace/comment parser into a shared combinator to reduce duplication in `index_columns` and `index_decl`.
- The manual buffer logic in `Index::columns` is brittle; consider iterating over child identifier tokens directly or using rowan queries to collect column names more robustly.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
src/language.rs(2 hunks)src/parser/mod.rs(7 hunks)src/tokenizer.rs(2 hunks)tests/parser.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments must use en-GB-oxendict spelling and grammar.
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
src/language.rssrc/tokenizer.rstests/parser.rssrc/parser/mod.rs
🔇 Additional comments (7)
src/tokenizer.rs (1)
138-138: LGTM!The new keyword mappings are correctly added and maintain alphabetical ordering.
Also applies to: 149-149
src/language.rs (1)
95-95: LGTM!The new syntax kind variants are properly integrated and maintain alphabetical ordering.
Also applies to: 106-106
tests/parser.rs (1)
75-88: LGTM!Comprehensive test coverage for index parsing including valid single/multi-column cases and error handling for invalid syntax.
Also applies to: 411-454
src/parser/mod.rs (4)
95-113: LGTM!The integration of index span collection follows the established pattern for other statement types.
Also applies to: 116-154
524-585: LGTM!The CST construction correctly integrates index nodes following the established pattern.
740-748: LGTM!The method correctly follows the established pattern for collecting typed AST nodes.
1066-1150: LGTM!The Index AST wrapper correctly extracts all required information from the syntax tree.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/parser/mod.rs (1)
457-548: Robust index declaration parser with proper error recovery.The implementation correctly handles the
index <name> on <relation>(<columns>)pattern with proper support for nested parentheses in column expressions. The past review comment about handling nested parentheses has been addressed through the depth tracking mechanism.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
src/parser/mod.rs(8 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation ...
**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not usereturnin single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallow.
Prefer.expect()over.unwrap().
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Use explicit version ranges inCargo.tomland keep dependencies up-to-date.
Userstestfixtures for shared setup.
Replace duplicated tests with#[rstest(...)]parameterised cases.
Prefermockallfor mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
📄 Source: CodeRabbit Inference Engine (AGENTS.md)
List of files the instruction was applied to:
src/parser/mod.rs
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments must use en-GB-oxendict spelling and grammar.
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
src/parser/mod.rs
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build-test
🔇 Additional comments (5)
src/parser/mod.rs (5)
59-74: Well-structured whitespace parser.The
inline_wsfunction is properly documented and follows the established patterns in the codebase. The implementation correctly matches whitespace and comment tokens.
112-122: Proper integration of index parsing.The changes correctly integrate index span collection and CST building into the main parse flow, following the established pattern for other statement types.
132-171: Correct extension of parse_tokens for index support.The function properly extends the parsing pipeline to include index declarations. Good use of
#[expect]with clear reasoning, and proper error merging from multiple parsing stages.
561-561: Consistent CST construction for index nodes.The modifications to
build_green_treeproperly integrate index span handling, maintaining consistency with the existing pattern for other statement types.Also applies to: 566-566, 573-573, 579-579, 599-604, 611-611
771-779: Consistent AST accessor for index declarations.The
indexes()method follows the established pattern for accessing parsed statement types from the root AST node.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
docs/haskell-parser-analysis.md(1 hunks)src/parser/mod.rs(8 hunks)tests/parser.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
`docs/**/*.md`: Use the markdown files within the `docs/` directory as a knowled...
docs/**/*.md: Use the markdown files within thedocs/directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Proactively update the relevant file(s) in thedocs/directory to reflect the latest state when new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve.
Documentation indocs/must use en-GB-oxendict spelling and grammar, except for the word 'license'.
Validate Markdown files usingmake markdownlint.
Runmake fmtafter any documentation changes to format all Markdown files and fix table markup.
Validate Markdown Mermaid diagrams using themake nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.
📄 Source: CodeRabbit Inference Engine (AGENTS.md)
List of files the instruction was applied to:
docs/haskell-parser-analysis.md
`docs/**/*.md`: Use British English spelling based on the Oxford English Diction...
docs/**/*.md: Use British English spelling based on the Oxford English Dictionary, except retain US spelling for API terms (e.g., 'color').
Use the Oxford comma in lists.
Write headings in sentence case and use Markdown heading levels in order without skipping.
Follow markdownlint recommendations for Markdown formatting.
Always use fenced code blocks with a language identifier; use 'plaintext' for non-code text.
Use '-' as the first level bullet and renumber lists when items change.
Prefer inline links using text or angle brackets around the URL.
Expand any uncommon acronym on first use, e.g., Continuous Integration (CI).
Wrap paragraphs at 80 columns and code at 120 columns; do not wrap tables.
Use footnotes referenced with [^label].
When embedding figures, use '' and provide concise alt text describing the content.
Add a short description before each Mermaid diagram so screen readers can understand it.
📄 Source: CodeRabbit Inference Engine (docs/documentation-style-guide.md)
List of files the instruction was applied to:
docs/haskell-parser-analysis.md
`**/*.md`: * Avoid 2nd person or 1st person pronouns ("I", "you", "we") * Use en...
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-oxendic spelling and grammar.
- Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
- Code blocks should be wrapped to 120 columns.
- Headings must not be wrapped.
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
docs/haskell-parser-analysis.md
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation ...
**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not usereturnin single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallow.
Prefer.expect()over.unwrap().
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Use explicit version ranges inCargo.tomland keep dependencies up-to-date.
Userstestfixtures for shared setup.
Replace duplicated tests with#[rstest(...)]parameterised cases.
Prefermockallfor mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
📄 Source: CodeRabbit Inference Engine (AGENTS.md)
List of files the instruction was applied to:
tests/parser.rssrc/parser/mod.rs
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments must use en-GB-oxendict spelling and grammar.
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
tests/parser.rssrc/parser/mod.rs
🧬 Code Graph Analysis (1)
tests/parser.rs (1)
src/parser/mod.rs (10)
parse(110-130)errors(99-101)indexes(773-779)name(846-852)name(923-934)name(1112-1123)relation(1127-1138)columns(954-1009)columns(1146-1194)root(93-95)
🪛 LanguageTool
docs/haskell-parser-analysis.md
[typographical] ~176-~176: Consider using an em dash in dialogues and enumerations.
Context: ...turning a Transformer node. - index – defines an index on a relation. The Rus...
(DASH_RULE)
🔇 Additional comments (7)
tests/parser.rs (1)
75-485: Comprehensive test coverage for index parsing.The test additions provide excellent coverage for the new index parsing functionality:
- Valid cases: single-column and multi-column indexes, nested function calls
- Error cases: missing 'on' keyword, unbalanced parentheses
- Proper assertions for parsed structure and error detection
The tests follow the existing patterns and conventions in the file.
src/parser/mod.rs (6)
59-74: Well-designed whitespace parser combinator.The
inline_ws()function provides a clean, reusable way to handle whitespace and comments in parsers. Good documentation and follows established patterns.
110-171: Correct integration of index parsing into the pipeline.The updates to
parseandparse_tokensproperly integrate index span collection:
- Index spans are collected alongside imports, typedefs, and relations
- Errors from index parsing are correctly aggregated with other parse errors
- The function signature uses
#[expect]attribute appropriately for the complex return type
457-548: Robust index parsing with proper nested parenthesis handling.The
collect_index_spansimplementation correctly parses index declarations:
index_columnsparser properly tracks parenthesis depth to handle nested expressions likelower(name)- Error recovery skips to line end on parse failure
- Follows the established pattern for span collection
This addresses previous concerns about handling nested parentheses in column expressions.
561-611: Consistent CST building for index nodes.The updates to
build_green_treeproperly integrate index span handling:
- Assertion for sorted spans maintains invariants
- Iterator management follows the same pattern as other node types
N_INDEXnodes are created at appropriate positions
1097-1195: Well-implemented Index AST with proper column expression handling.The
Indexstruct and its methods provide clean access to parsed index information:
name()andrelation()correctly extract identifiers from appropriate positionscolumns()properly handles comma-separated column expressions with nested parentheses- The implementation preserves complex expressions like
lower(username)as complete stringsThis addresses previous review comments about proper column extraction.
771-779: Consistent API addition for accessing index nodes.The
indexes()method follows the established pattern for collecting typed nodes from the root, maintaining API consistency withimports(),type_defs(), andrelations().
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `tests/parser.rs` </location>
<code_context>
+#[fixture]
+fn index_single_column() -> &'static str {
</code_context>
<issue_to_address>
Consider adding a test for whitespace variations in index declarations.
Adding such a test will help ensure the parser handles formatting variations like extra spaces, tabs, or newlines between tokens.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
|
||
| #[fixture] | ||
| fn index_single_column() -> &'static str { | ||
| "index Idx_User_username on User(username)" |
There was a problem hiding this comment.
suggestion (testing): Consider adding a test for whitespace variations in index declarations.
Adding such a test will help ensure the parser handles formatting variations like extra spaces, tabs, or newlines between tokens.
Summary
indexkeyword andonkeyword in the tokenizerN_INDEXnodesIndexAST wrappercolumnsmethodTesting
make lintmake testhttps://chatgpt.com/codex/tasks/task_e_6861737ff3f08322a0b0a42d8dfb9f39
Summary by Sourcery
Add parsing support for index declarations, integrate them into the CST and AST, and expose them via a new Index wrapper.
New Features:
indexandonkeywordsN_INDEXnodesIndexwrapper andRoot.indexes()methodEnhancements:
inline_wscombinator for handling whitespace and commentsDocumentation:
Tests: