Encapsulate token utilities in TokenStream#30
Conversation
Reviewer's Guide

This PR introduces a new TokenStream abstraction to centralize cursor-based token navigation, refactors the parser to use its methods instead of manual index arithmetic, removes legacy helper functions, updates tests to use the new API, and documents TokenStream in the design doc.

Class diagram for the TokenStream abstraction and parser integration:

```mermaid
classDiagram
    class TokenStream {
        - tokens: &[(SyntaxKind, Span)]
        - src: &str
        - cursor: usize
        + new(tokens: &[(SyntaxKind, Span)], src: &str)
        + advance()
        + skip_until(end: usize)
        + line_end(start: usize) usize
        + skip_ws_inline()
        + peek() Option~(SyntaxKind, Span)~
        + cursor() usize
        + tokens() &[(SyntaxKind, Span)]
        + src() &str
    }
    class SpanCollector {
        - stream: TokenStream
        - spans: Vec~Span~
        - extra: Extra
        + new(tokens: &[(SyntaxKind, Span)], src: &str, extra: Extra)
    }
    TokenStream <.. SpanCollector : used by
    class State {
        - stream: TokenStream
        - spans: Vec~Span~
        - extra: Extra
        + new(tokens: &[(SyntaxKind, Span)], src: &str, extra: Extra)
    }
    TokenStream <.. State : used by
```
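The interface listed in the diagram can be sketched in Rust as follows. This is a minimal illustration consistent with the signatures shown, not the crate's actual implementation; `SyntaxKind` is reduced to a few hypothetical variants and `Span` to a plain `Range<usize>`.

```rust
// Minimal sketch of the TokenStream interface from the class diagram.
// SyntaxKind and Span are simplified stand-ins, not ddlint's own types.
type Span = std::ops::Range<usize>;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SyntaxKind {
    KImport,
    Ident,
    Whitespace,
    Newline,
}

struct TokenStream<'a> {
    tokens: &'a [(SyntaxKind, Span)],
    src: &'a str,
    cursor: usize,
}

impl<'a> TokenStream<'a> {
    fn new(tokens: &'a [(SyntaxKind, Span)], src: &'a str) -> Self {
        Self { tokens, src, cursor: 0 }
    }

    /// Step the cursor one token forward, saturating at the end.
    fn advance(&mut self) {
        if self.cursor < self.tokens.len() {
            self.cursor += 1;
        }
    }

    /// Jump the cursor to `end` without stepping past the token slice.
    fn skip_until(&mut self, end: usize) {
        self.cursor = end.min(self.tokens.len());
    }

    /// Index one past the newline terminating the line containing `start`.
    fn line_end(&self, start: usize) -> usize {
        self.tokens[start..]
            .iter()
            .position(|(kind, _)| *kind == SyntaxKind::Newline)
            .map_or(self.tokens.len(), |offset| start + offset + 1)
    }

    /// Skip consecutive inline whitespace tokens at the cursor.
    fn skip_ws_inline(&mut self) {
        while matches!(self.peek(), Some((SyntaxKind::Whitespace, _))) {
            self.advance();
        }
    }

    fn peek(&self) -> Option<&(SyntaxKind, Span)> {
        self.tokens.get(self.cursor)
    }

    fn cursor(&self) -> usize {
        self.cursor
    }

    fn src(&self) -> &str {
        self.src
    }
}

fn main() {
    // "import foo\nbar" tokenised by hand for the sketch.
    let tokens = vec![
        (SyntaxKind::KImport, 0..6),
        (SyntaxKind::Whitespace, 6..7),
        (SyntaxKind::Ident, 7..10),
        (SyntaxKind::Newline, 10..11),
        (SyntaxKind::Ident, 11..14),
    ];
    let mut stream = TokenStream::new(&tokens, "import foo\nbar");
    stream.advance();        // past `import`
    stream.skip_ws_inline(); // past the space
    println!("cursor = {}", stream.cursor()); // cursor = 2
    println!("line ends at token {}", stream.line_end(0)); // line ends at token 4
    println!("ident text = {}", &stream.src()[7..10]); // ident text = foo
}
```

Encapsulating `cursor` behind `advance`/`skip_until` is what removes the manual index arithmetic the PR description mentions: bounds checks live in one place instead of at every call site.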
Summary by CodeRabbit
Walkthrough

The parsing logic and documentation were refactored to introduce a `TokenStream` abstraction.
Sequence diagram of the parser interacting with TokenStream:

```mermaid
sequenceDiagram
    participant Parser
    participant TokenStream
    participant Tokens
    participant Source
    Parser->>TokenStream: TokenStream::new(Tokens, Source)
    loop Parsing
        Parser->>TokenStream: peek()
        TokenStream-->>Parser: Option<(SyntaxKind, Span)>
        alt Need to advance
            Parser->>TokenStream: advance()
        end
        alt Skip whitespace/comments
            Parser->>TokenStream: skip_ws_inline()
        end
        alt Find line end
            Parser->>TokenStream: line_end(start)
            TokenStream-->>Parser: usize (position)
        end
    end
```
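The loop in the diagram amounts to a peek/advance pattern. The sketch below is schematic: the stand-alone `TokenStream` carries only the methods the diagram exercises, with token kinds reduced to strings rather than the crate's `(SyntaxKind, Span)` pairs.

```rust
// Schematic of the diagram's parsing loop: peek, branch, advance.
struct TokenStream {
    kinds: Vec<&'static str>, // stand-in for (SyntaxKind, Span) pairs
    cursor: usize,
}

impl TokenStream {
    fn new(kinds: Vec<&'static str>) -> Self {
        Self { kinds, cursor: 0 }
    }

    fn peek(&self) -> Option<&'static str> {
        self.kinds.get(self.cursor).copied()
    }

    fn advance(&mut self) {
        if self.cursor < self.kinds.len() {
            self.cursor += 1;
        }
    }

    /// The diagram's "skip whitespace/comments" branch.
    fn skip_ws_inline(&mut self) {
        while self.peek() == Some("ws") {
            self.advance();
        }
    }
}

// Drive the loop: collect every non-whitespace token kind in order.
fn parse_kinds(mut stream: TokenStream) -> Vec<&'static str> {
    let mut kinds = Vec::new();
    while let Some(kind) = stream.peek() {
        if kind == "ws" {
            stream.skip_ws_inline();
        } else {
            kinds.push(kind);
            stream.advance();
        }
    }
    kinds
}

fn main() {
    let stream = TokenStream::new(vec!["keyword", "ws", "ident", "ws", "ident"]);
    println!("{:?}", parse_kinds(stream)); // ["keyword", "ident", "ident"]
}
```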
Docstrings generation was requested by @leynos.

- #30 (comment)

The following files were modified:

- `src/parser/mod.rs`
- `src/parser/span_collector.rs`
- `src/parser/token_stream.rs`

Note: Generated docstrings for this pull request at #31.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
- docs/ddlint-design-and-road-map.md (1 hunks)
- src/parser/mod.rs (5 hunks)
- src/parser/span_collector.rs (1 hunks)
- src/parser/token_stream.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
`docs/**/*.md`: Use the markdown files within the `docs/` directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Proactively update the relevant file(s) in the `docs/` directory to reflect the latest state when new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve.
Documentation in `docs/` must use en-GB-oxendict spelling and grammar, except for the word 'license'.
Validate Markdown files using `make markdownlint`.
Run `make fmt` after any documentation changes to format all Markdown files and fix table markup.
Validate Markdown Mermaid diagrams using `make nixie`.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

- docs/ddlint-design-and-road-map.md
docs/**/*.md: Use British English spelling based on the Oxford English Dictionary, except retain US spelling for API terms (e.g., 'color').
Use the Oxford comma in lists.
Write headings in sentence case and use Markdown heading levels in order without skipping.
Follow markdownlint recommendations for Markdown formatting.
Always use fenced code blocks with a language identifier; use 'plaintext' for non-code text.
Use '-' as the first level bullet and renumber lists when items change.
Prefer inline links using text or angle brackets around the URL.
Expand any uncommon acronym on first use, e.g., Continuous Integration (CI).
Wrap paragraphs at 80 columns and code at 120 columns; do not wrap tables.
Use footnotes referenced with [^label].
When embedding figures, use '' and provide concise alt text describing the content.
Add a short description before each Mermaid diagram so screen readers can understand it.
📄 Source: CodeRabbit Inference Engine (docs/documentation-style-guide.md)
List of files the instruction was applied to:
docs/ddlint-design-and-road-map.md
`docs/ddlint-design-and-road-map.md`: Follow the guidance in the design document, especially sections L71-L122 and L124-L139, for SyntaxKind enum design and error recovery.
📄 Source: CodeRabbit Inference Engine (docs/parser-plan.md)
List of files the instruction was applied to:
docs/ddlint-design-and-road-map.md
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-oxendic spelling and grammar.
- Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
- Code blocks should be wrapped to 120 columns.
- Headings must not be wrapped.
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
docs/ddlint-design-and-road-map.md
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation can be generated with `cargo doc`.
Every module must begin with a module level (`//!`) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not use `return` in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer `expect` over `allow`.
Prefer `.expect()` over `.unwrap()`.
Prefer immutable data and avoid unnecessary `mut` bindings.
Handle errors with the `Result` type instead of panicking where feasible.
Avoid `unsafe` code unless absolutely necessary and document any usage clearly.
Use explicit version ranges in `Cargo.toml` and keep dependencies up-to-date.
Use `rstest` fixtures for shared setup.
Replace duplicated tests with `#[rstest(...)]` parameterised cases.
Prefer `mockall` for mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error, consider using `Arc` to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

- src/parser/span_collector.rs
- src/parser/mod.rs
- src/parser/token_stream.rs
`**/*.rs`: Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS.
Place function attributes after doc comments.
Do not use `return` in single-line functions.
Move conditionals with >2 branches into a predicate function.
Avoid `unsafe` unless absolutely necessary.
Every module must begin with a `//!` doc comment that explains the module's purpose and utility.
Comments must use en-GB-oxendict spelling and grammar.
Lints must not be silenced except as a last resort.
`#[allow]` is forbidden.
Only narrowly scoped `#[expect(lint, reason = "...")]` is allowed.
No lint groups, no blanket or file-wide suppression.
Include `FIXME:` with link if a fix is expected.
Use `rstest` fixtures for shared setup and to avoid repetition between tests.
Replace duplicated tests with `#[rstest(...)]` parameterised cases.
Prefer `mockall` for mocks/stubs.
Prefer `.expect()` over `.unwrap()`.
Ensure that any API or behavioural changes are reflected in the documentation in `docs/`.
Ensure that any completed roadmap steps are recorded in the appropriate roadmap in `docs/`.

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

- src/parser/span_collector.rs
- src/parser/mod.rs
- src/parser/token_stream.rs
🧬 Code Graph Analysis (2)
src/parser/span_collector.rs (1)
src/parser/token_stream.rs (3)
new(32-38)tokens(61-63)src(67-69)
src/parser/token_stream.rs (1)
src/parser/span_collector.rs (1)
new(23-29)
🪛 LanguageTool
docs/ddlint-design-and-road-map.md
[style] ~198-~198: Would you like to use the Oxford spelling “tokenization”? The spelling ‘tokenisation’ is also correct.
Context: ...User: Parsed { green, root } ``` After tokenisation the parser wraps the vector of tokens i...
(OXFORD_SPELLING_Z_NOT_S)
🪛 GitHub Actions: CI
src/parser/span_collector.rs
[error] 50-50: Rust compiler error E0609: no field cursor on type parser::span_collector::SpanCollector<'_, std::vec::Vec<{integer}>>
🔇 Additional comments (14)
src/parser/token_stream.rs (3)
1-19: Excellent module documentation with clear examples. The module-level documentation clearly explains the purpose and provides a practical example. This follows the coding guidelines requirement for module-level comments.

22-27: Well-designed struct with appropriate encapsulation. The struct design encapsulates token navigation concerns effectively with proper visibility modifiers.

97-106: Method implementation follows expected patterns. The `skip_ws_inline` method correctly handles whitespace and comment skipping with proper bounds checking.

docs/ddlint-design-and-road-map.md (1)

198-201: Fix spelling to follow British English guidelines. The documentation correctly describes the TokenStream abstraction, but should use British spelling as specified in the coding guidelines.

Apply this change to follow the en-GB-oxendict spelling requirement:

```diff
-After tokenisation the parser wraps the vector of tokens in a lightweight
+After tokenization the parser wraps the vector of tokens in a lightweight
```

Likely an incorrect or invalid review comment.
src/parser/span_collector.rs (4)

10-10: Appropriate import for new abstraction. The import of TokenStream is correctly placed and follows module organisation patterns.

15-15: Clean refactoring to encapsulate token navigation. Replacing separate cursor, tokens, and src fields with a single stream field improves encapsulation and reduces the chance of inconsistent state.

25-25: Constructor properly initialises TokenStream. The updated constructor correctly uses `TokenStream::new(tokens, src)` to encapsulate the token slice and source text.

14-18: No remaining `cursor` references on SpanCollector. A search of the entire Rust codebase for `.cursor` shows all matches reference the `TokenStream` type via `stream.cursor()`. There are no direct accesses to a non-existent `cursor` field on `SpanCollector`. The refactoring to replace `cursor` with the `stream` field is complete; please investigate other causes for the CI failure.

Likely an incorrect or invalid review comment.
src/parser/mod.rs (6)

16-16: Appropriate module declaration for new abstraction. The `mod token_stream;` declaration correctly exposes the new module within the parser.

21-26: Updated macro documentation reflects new interface. The documentation correctly describes the new `stream` field requirement and usage patterns.

45-56: Macro refactoring properly implements new abstraction. The `token_dispatch!` macro has been correctly updated to use `stream.peek()` and `stream.advance()` instead of manual indexing. The pattern matching and control flow remain sound.

154-166: Parsing logic correctly adapted to use TokenStream. The `handle_import` function properly uses the new TokenStream methods:

- `stream.tokens()` and `stream.cursor()` for creating sub-streams
- `stream.skip_until()` for advancing past processed tokens
- `stream.line_end()` for error recovery

This maintains the same functionality while using the new abstraction.

184-207: Typedef parsing functions updated consistently. Both `handle_typedef` and `handle_extern` functions have been properly updated to use TokenStream methods:

- `stream.advance()` instead of manual cursor increment
- `stream.skip_ws_inline()` instead of the removed helper
- `stream.line_end()` for determining span boundaries

The logic flow and error handling remain intact.

526-572: Test functions properly updated to use new API. All test functions have been correctly updated to use the TokenStream interface:

- Creating streams with `TokenStream::new(&tokens, src)`
- Using `stream.advance()`, `stream.skip_until()`, etc.
- Assertions updated to use `stream.cursor()` and `stream.peek()`

The test coverage remains comprehensive.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
- Cargo.toml (1 hunks)
- docs/rust-testing-with-rstest-fixtures.md (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)

`Cargo.toml`: Use explicit version ranges in `Cargo.toml` and keep dependencies up-to-date.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

- Cargo.toml
🧠 Learnings (1)
docs/rust-testing-with-rstest-fixtures.md (1)
Learnt from: CR
PR: leynos/lille#0
File: docs/rust-testing-with-rstest-fixtures.md:0-0
Timestamp: 2025-06-24T18:32:32.029Z
Learning: The use of dev-dependencies in Cargo.toml for testing libraries like rstest is a standard Rust practice to avoid including test-only code in production builds.
🔇 Additional comments (2)
Cargo.toml (2)

16-16: `rstest` bump complies with explicit-range policy. The new constraint `>=0.25.0, <0.26.0` follows the project rule of having an upper bound while tracking the current minor. ✅
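The constraint described above would appear in the manifest roughly as follows. This is a sketch: the exact dependency table is not shown in this thread, and `rstest` is assumed to sit under `dev-dependencies` as is conventional for test-only crates.

```toml
[dev-dependencies]
# Explicit range: track 0.25.x while excluding the next minor.
rstest = ">=0.25.0, <0.26.0"
```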
4-4: Confirm 2024 edition support in CI/toolchains. `edition = "2024"` is still only available on nightly at the time of writing. Please verify that:

- CI uses a nightly channel, or
- contributors are intentionally gated to nightly, or
- the edition bump is premature.

Failing to do so will break `cargo check` on stable.
Docstrings generation was requested by @leynos.

- #30 (comment)

The following files were modified:

- `src/parser/mod.rs`
- `src/parser/span_collector.rs`
- `src/parser/token_stream.rs`

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Build failure under investigation
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
- docs/rust-testing-with-rstest-fixtures.md (1 hunks)
- src/parser/mod.rs (5 hunks)
- src/parser/span_collector.rs (2 hunks)
- src/parser/token_stream.rs (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)

src/parser/span_collector.rs (2)
- src/parser/token_stream.rs (4): new (41-47), tokens (105-107), src (118-120), cursor (59-61)
- src/tokenizer.rs (1): tokenize (189-245)

src/parser/token_stream.rs (2)
- src/parser/span_collector.rs (1): new (44-50)
- src/parser/mod.rs (1): text (458-460)
🔇 Additional comments (14)

docs/rust-testing-with-rstest-fixtures.md (1)

108-113: Version range now matches the manifest; inconsistency resolved. The dependency snippet is consistent with `Cargo.toml`, keeps the code-block width well under 120 chars, and complies with the documented guidelines.

src/parser/token_stream.rs (3)

1-19: Excellent module documentation and design rationale. The module-level documentation clearly explains the purpose, provides usage examples, and describes the benefits of the abstraction. This aligns with the coding guidelines requiring comprehensive module documentation.

75-78: Good resolution of previous performance concern. The `peek` method now returns `Option<&(SyntaxKind, Span)>` instead of cloning, which addresses the previous review comment about performance implications. This is a more efficient approach for frequent peek operations.

162-169: Improved error handling addresses previous feedback. The use of `is_some_and(|text| text.contains('\n'))` is much cleaner than the previous `unwrap_or("")` approach. This makes the intent explicit and avoids masking potential span boundary errors.

src/parser/span_collector.rs (4)

10-10: Good integration of the TokenStream abstraction. The import statement correctly brings in the TokenStream from the sibling module, enabling the refactoring of SpanCollector's internal structure.

21-42: Excellent documentation enhancement. The expanded documentation includes parameter descriptions, return information, and usage examples as required by the coding guidelines. This makes the API much clearer for users.

45-49: Clean integration with TokenStream constructor. The replacement of separate field assignments with `TokenStream::new(tokens, src)` simplifies the constructor whilst maintaining the same functionality through encapsulation.

69-78: Comprehensive test coverage using rstest. The test properly validates that the TokenStream is correctly initialised and accessible through the public interface. Good use of an rstest fixture as recommended in the guidelines.

src/parser/mod.rs (6)

16-16: Proper module structure for the new abstraction. Adding the `token_stream` module enables the encapsulation of token navigation logic whilst maintaining clear separation of concerns.

21-56: Well-updated macro documentation and implementation. The `token_dispatch!` macro documentation has been enhanced to reflect the new `stream` field requirement, and the implementation correctly uses `stream.peek()` and `stream.advance()` methods. The logic flow remains identical whilst benefiting from encapsulated token navigation.

107-117: Enhanced function documentation with examples. The documentation for `parse_tokens` now includes parameter descriptions and usage examples, following the Rustdoc conventions specified in the coding guidelines.

188-199: Correct integration of TokenStream methods. The parsing logic correctly uses `stream.tokens()`, `stream.cursor()`, `stream.skip_until()`, and `stream.line_end()` methods, maintaining the same logical flow whilst benefiting from encapsulated bounds checking and cleaner error recovery.

264-280: Clean replacement of manual cursor arithmetic. The `handle_extern` function now uses `stream.advance()`, `stream.skip_ws_inline()`, `stream.peek()`, and `stream.line_end()` methods instead of manual index management. This makes the code more readable and less error-prone.

617-625: Good test coverage for TokenStream functionality. The tests properly validate the `skip_until` and `line_end` methods using realistic scenarios. The use of rstest follows the guidelines, and the assertions verify both boundary conditions and normal operation.
Force-pushed 5b98773 to c26cc06 (Compare)
Actionable comments posted: 1
🔭 Outside diff range comments (1)
src/parser/span_collector.rs (1)
80-116: Fix duplicate test function definitions causing compilation failure. There are two functions named `new_initialises_state`, defined at lines 70-78 and 91-101, which causes a compilation error. The pipeline failure confirms this issue. The second definition appears to be leftover code from the refactoring and should be removed.

Remove the duplicate function definition:

```diff
+    }
+
     #[rstest]
-    fn new_initialises_state() {
-        let tokens = &[(SyntaxKind::K_IMPORT, 0..6)];
-        let src = "import";
-        let extra = vec![1, 2, 3];
-        let collector = SpanCollector::new(tokens, src, extra.clone());
-
-        assert_eq!(collector.cursor, 0);
-        assert!(collector.spans.is_empty());
-        assert_eq!(collector.extra, extra);
-        assert_eq!(collector.tokens, tokens);
-        assert_eq!(collector.src, src);
-    }
-
-    #[rstest]
     fn into_parts_returns_components() {
```
♻️ Duplicate comments (1)
src/parser/token_stream.rs (1)

194-207: Consider simplifying the control flow. The `continue` statement in the loop can be eliminated for cleaner code by inverting the conditional logic, as suggested in previous reviews.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (6)
- `Cargo.toml` (1 hunks)
- `docs/ddlint-design-and-road-map.md` (1 hunks)
- `docs/rust-testing-with-rstest-fixtures.md` (1 hunks)
- `src/parser/mod.rs` (5 hunks)
- `src/parser/span_collector.rs` (2 hunks)
- `src/parser/token_stream.rs` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
`Cargo.toml`: Use explicit version ranges in `Cargo.toml` and keep dependencies up-to-date.
📄 Source: CodeRabbit Inference Engine (AGENTS.md)
List of files the instruction was applied to:
Cargo.toml
`docs/**/*.md`: Use the markdown files within the `docs/` directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Proactively update the relevant file(s) in the `docs/` directory to reflect the latest state when new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve.
Documentation in `docs/` must use en-GB-oxendict spelling and grammar, except for the word 'license'.
Validate Markdown files using `make markdownlint`.
Run `make fmt` after any documentation changes to format all Markdown files and fix table markup.
Validate Markdown Mermaid diagrams using `make nixie`.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.
📄 Source: CodeRabbit Inference Engine (AGENTS.md)
List of files the instruction was applied to:
- `docs/ddlint-design-and-road-map.md`
- `docs/rust-testing-with-rstest-fixtures.md`
`docs/**/*.md`: Use British English spelling based on the Oxford English Dictionary, except retain US spelling for API terms (e.g., 'color').
Use the Oxford comma in lists.
Write headings in sentence case and use Markdown heading levels in order without skipping.
Follow markdownlint recommendations for Markdown formatting.
Always use fenced code blocks with a language identifier; use 'plaintext' for non-code text.
Use '-' as the first level bullet and renumber lists when items change.
Prefer inline links using text or angle brackets around the URL.
Expand any uncommon acronym on first use, e.g., Continuous Integration (CI).
Wrap paragraphs at 80 columns and code at 120 columns; do not wrap tables.
Use footnotes referenced with [^label].
When embedding figures, use '' and provide concise alt text describing the content.
Add a short description before each Mermaid diagram so screen readers can understand it.
📄 Source: CodeRabbit Inference Engine (docs/documentation-style-guide.md)
List of files the instruction was applied to:
- `docs/ddlint-design-and-road-map.md`
- `docs/rust-testing-with-rstest-fixtures.md`
`docs/ddlint-design-and-road-map.md`: Follow the guidance in the design document, especially sections L71-L122 and L124-L139, for SyntaxKind enum design and error recovery.
📄 Source: CodeRabbit Inference Engine (docs/parser-plan.md)
List of files the instruction was applied to:
docs/ddlint-design-and-road-map.md
`**/*.md`:
- Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-oxendic spelling and grammar.
- Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
- Code blocks should be wrapped to 120 columns.
- Headings must not be wrapped.
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
- `docs/ddlint-design-and-road-map.md`
- `docs/rust-testing-with-rstest-fixtures.md`
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation can be generated with `cargo doc`.
Every module must begin with a module level (`//!`) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not use `return` in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer `expect` over `allow`.
Prefer `.expect()` over `.unwrap()`.
Prefer immutable data and avoid unnecessary `mut` bindings.
Handle errors with the `Result` type instead of panicking where feasible.
Avoid `unsafe` code unless absolutely necessary and document any usage clearly.
Use explicit version ranges in `Cargo.toml` and keep dependencies up-to-date.
Use `rstest` fixtures for shared setup.
Replace duplicated tests with `#[rstest(...)]` parameterised cases.
Prefer `mockall` for mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error, consider using `Arc` to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
📄 Source: CodeRabbit Inference Engine (AGENTS.md)
List of files the instruction was applied to:
- `src/parser/token_stream.rs`
- `src/parser/span_collector.rs`
- `src/parser/mod.rs`
`**/*.rs`:
- Seek to keep the cyclomatic complexity of functions no more than 12.
- Adhere to single responsibility and CQRS.
- Place function attributes after doc comments.
- Do not use `return` in single-line functions.
- Move conditionals with >2 branches into a predicate function.
- Avoid `unsafe` unless absolutely necessary.
- Every module must begin with a `//!` doc comment that explains the module's purpose and utility.
- Comments must use en-GB-oxendict spelling and grammar.
- Lints must not be silenced except as a last resort.
  - `#[allow]` is forbidden.
  - Only narrowly scoped `#[expect(lint, reason = "...")]` is allowed.
  - No lint groups, no blanket or file-wide suppression.
  - Include `FIXME:` with link if a fix is expected.
- Use `rstest` fixtures for shared setup and to avoid repetition between tests.
- Replace duplicated tests with `#[rstest(...)]` parameterised cases.
- Prefer `mockall` for mocks/stubs.
- Prefer `.expect()` over `.unwrap()`.
- Ensure that any API or behavioural changes are reflected in the documentation in `docs/`.
- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in `docs/`.
⚙️ Source: CodeRabbit Configuration File
List of files the instruction was applied to:
- `src/parser/token_stream.rs`
- `src/parser/span_collector.rs`
- `src/parser/mod.rs`
🧠 Learnings (1)
docs/rust-testing-with-rstest-fixtures.md (1)
Learnt from: CR
PR: leynos/lille#0
File: docs/rust-testing-with-rstest-fixtures.md:0-0
Timestamp: 2025-06-24T18:32:32.029Z
Learning: The use of dev-dependencies in Cargo.toml for testing libraries like rstest is a standard Rust practice to avoid including test-only code in production builds.
🧬 Code Graph Analysis (2)
src/parser/token_stream.rs (1)
src/parser/span_collector.rs (1)
`new` (44-50)
src/parser/span_collector.rs (2)
src/parser/token_stream.rs (4)
`new` (41-47), `tokens` (105-107), `src` (118-120), `cursor` (59-61)

src/tokenizer.rs (1)

`tokenize` (189-245)
🪛 LanguageTool
docs/ddlint-design-and-road-map.md
[style] ~198-~198: Would you like to use the Oxford spelling “tokenization”? The spelling ‘tokenisation’ is also correct.
Context: ...User: Parsed { green, root } ``` After tokenisation the parser wraps the vector of tokens i...
(OXFORD_SPELLING_Z_NOT_S)
🪛 GitHub Actions: CI
src/parser/span_collector.rs
[error] 116-116: cargo fmt check failed: this file contains an unclosed delimiter
🔇 Additional comments (15)
Cargo.toml (1)
16-16: rstest 0.25 upgrade – findings
- No usage of `rstest_macros` in code (only referenced in documentation), so you don’t need to add it to `[dev-dependencies]` or refactor imports.
- The project uses a `rust-toolchain.toml` pinned to `nightly-2025-06-10`, which exceeds the Rust 1.76 MSRV required by `rstest` 0.25.
- Please run `cargo test` to confirm that all tests compile and pass as expected.

docs/rust-testing-with-rstest-fixtures.md (1)
**110-112**: Documentation snippet now consistent with `Cargo.toml`. The updated version range matches the manifest; no further action needed.
src/parser/token_stream.rs (5)
**1-21**: Excellent module-level documentation. The documentation clearly explains the module's purpose, provides a practical usage example, and follows Rust documentation conventions. This adheres well to the coding guidelines requiring comprehensive module documentation.
**22-27**: Well-designed struct with appropriate lifetime management. The struct correctly uses lifetime parameters to tie together the token slice and source text references, with appropriate visibility and debug support.
**29-78**: Constructor and accessors follow Rust best practices. The methods are well-documented with clear examples, use appropriate attributes like `#[must_use]`, and the `peek()` method correctly returns references rather than cloning values for better performance.
**80-120**: Safe and well-documented accessor methods. The `advance()` method includes proper bounds checking to prevent cursor overflow, and the accessor methods provide clean access to underlying data with comprehensive documentation.
**122-171**: Well-implemented utility methods with robust error handling. The `skip_until` and `line_end` methods provide essential token navigation functionality with comprehensive documentation and appropriate error handling using `is_some_and`.

src/parser/span_collector.rs (3)
**10-18**: Good refactoring to use the TokenStream abstraction. Replacing the separate `cursor`, `tokens`, and `src` fields with a single `TokenStream` field centralises token management and follows the new abstraction pattern introduced in the codebase.
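The field consolidation can be sketched as follows; the concrete field types and the `extra` payload below are assumptions for illustration, not the crate's real definitions.

```rust
// Sketch of consolidating three parallel fields into one TokenStream.
// All types here are illustrative assumptions.
use std::ops::Range;

struct TokenStream<'a> {
    tokens: &'a [(u16, Range<usize>)],
    src: &'a str,
    cursor: usize,
}

struct SpanCollector<'a> {
    stream: TokenStream<'a>,
    spans: Vec<Range<usize>>,
    extra: Vec<u32>,
}

impl<'a> SpanCollector<'a> {
    fn new(tokens: &'a [(u16, Range<usize>)], src: &'a str, extra: Vec<u32>) -> Self {
        Self {
            // One field now owns the cursor plus its token/source context,
            // so they can never drift out of sync.
            stream: TokenStream { tokens, src, cursor: 0 },
            spans: Vec::new(),
            extra,
        }
    }
}

fn main() {
    let tokens = [(0u16, 0..6)];
    let collector = SpanCollector::new(&tokens, "import", vec![1, 2, 3]);
    assert_eq!(collector.stream.cursor, 0);
    assert_eq!(collector.stream.tokens.len(), 1);
    assert_eq!(collector.stream.src, "import");
    assert!(collector.spans.is_empty());
    assert_eq!(collector.extra, vec![1, 2, 3]);
}
```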
**21-50**: Excellent enhancement to constructor documentation. The expanded documentation with detailed parameter descriptions, return value explanation, and usage example significantly improves the API documentation quality, following the coding guidelines for comprehensive documentation.
**61-78**: Well-written test with comprehensive verification. The new test function properly verifies the TokenStream initialisation and provides good coverage of the refactored constructor functionality, following testing best practices with rstest.
src/parser/mod.rs (5)
**16-57**: Excellent refactoring of the `token_dispatch` macro to use TokenStream. The macro has been properly updated to work with the new `TokenStream` abstraction, with clear documentation and examples showing the new stream-based approach. The peek-and-advance pattern is more intuitive than manual cursor management.
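A toy peek-and-advance dispatch loop, in the spirit of (but not copied from) the `token_dispatch` macro; the kinds and handlers are hypothetical.

```rust
// Hypothetical dispatch loop: peek at the current token, branch on its
// kind, then advance exactly once per iteration so cursor arithmetic
// lives in one place rather than in every handler.
enum Kind {
    KImport,
    KTypedef,
    Other,
}

fn dispatch(tokens: &[Kind]) -> (usize, usize) {
    let (mut imports, mut typedefs) = (0, 0);
    let mut cursor = 0;
    while let Some(kind) = tokens.get(cursor) {
        match kind {
            Kind::KImport => imports += 1,
            Kind::KTypedef => typedefs += 1,
            Kind::Other => {}
        }
        cursor += 1;
    }
    (imports, typedefs)
}

fn main() {
    let toks = [Kind::KImport, Kind::Other, Kind::KTypedef, Kind::KImport];
    assert_eq!(dispatch(&toks), (2, 1));
}
```

Because advancing is unconditional and bounds-checked by `get`, a handler can never skip or double-count a token by mismanaging the index.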
**107-126**: Enhanced function documentation improves API clarity. The comprehensive documentation with examples and clear parameter descriptions significantly improves the function's usability whilst maintaining API compatibility.
**128-209**: Successful refactoring to use TokenStream methods. The function has been well refactored to use `TokenStream` methods for cursor management and error recovery. The logic remains functionally equivalent whilst being more readable and maintainable through the use of `stream.skip_until()` and `stream.line_end()`.
**211-291**: Clean refactoring maintaining parsing logic with improved readability. The `handle_typedef` and `handle_extern` functions have been successfully refactored to use `TokenStream` methods. The logic is preserved whilst being more readable, particularly the use of `stream.peek()` for token type checking and `stream.skip_ws_inline()` for whitespace handling.
**600-697**: Comprehensive test updates with enhanced documentation. The tests have been successfully updated to use `TokenStream` methods instead of the old helper functions. The enhanced documentation with examples significantly improves test clarity and demonstrates proper usage of the new abstraction.
- Add SpanCollector tests
- Rename test and document TokenStream usage
Force-pushed from c26cc06 to e339ed9
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Summary
Testing
- `make fmt`
- `make lint`
- `make test`

https://chatgpt.com/codex/tasks/task_e_685fd0b6f4d88322bb8ddcd9f514ddf7
Summary by Sourcery
Introduce a TokenStream abstraction for safe and boundary-checked token navigation, refactor parser code and SpanCollector to use this new helper instead of manual indexing, remove outdated utility functions, update tests, and document the change.
New Features:
Enhancements:
Documentation:
Tests: