
Encapsulate token utilities in TokenStream #30

Merged
leynos merged 7 commits into main from codex/encapsulate-token-utility-functions
Jun 28, 2025

Conversation

@leynos
Owner

@leynos leynos commented Jun 28, 2025

Summary

  • add TokenStream for safe token navigation
  • update parser to use the new abstraction
  • fix parser tests to use TokenStream helpers
  • document TokenStream in the design doc

Testing

  • make fmt
  • make lint
  • make test

https://chatgpt.com/codex/tasks/task_e_685fd0b6f4d88322bb8ddcd9f514ddf7

Summary by Sourcery

Introduce a TokenStream abstraction for safe and boundary-checked token navigation, refactor parser code and SpanCollector to use this new helper instead of manual indexing, remove outdated utility functions, update tests, and document the change.

New Features:

  • Introduce TokenStream struct to encapsulate token navigation logic.

Enhancements:

  • Refactor parser modules and SpanCollector to use TokenStream methods instead of manual index arithmetic.
  • Remove legacy token utility functions (skip_tokens_until, line_end, skip_ws_no_newline).
  • Update unit tests to replace manual token index handling with TokenStream usage.

Documentation:

  • Document TokenStream abstraction in the design doc.

Tests:

  • Adjust parser tests to exercise TokenStream helpers.

@sourcery-ai
Contributor

sourcery-ai Bot commented Jun 28, 2025

Reviewer's Guide

This PR introduces a new TokenStream abstraction to centralize cursor-based token navigation, refactors the parser to leverage its methods instead of manual index arithmetic, removes legacy helper functions, updates tests to use the new API, and documents TokenStream in the design doc.

Class diagram for TokenStream abstraction and parser integration

classDiagram
    class TokenStream {
        - tokens: &[(SyntaxKind, Span)]
        - src: &str
        - cursor: usize
        + new(tokens: &[(SyntaxKind, Span)], src: &str)
        + advance()
        + skip_until(end: usize)
        + line_end(start: usize) usize
        + skip_ws_inline()
        + peek() Option<(SyntaxKind, Span)>
        + cursor() usize
        + tokens() &[(SyntaxKind, Span)]
        + src() &str
    }
    class SpanCollector {
        - stream: TokenStream
        - spans: Vec<Span>
        - extra: Extra
        + new(tokens: &[(SyntaxKind, Span)], src: &str, extra: Extra)
    }
    TokenStream <.. SpanCollector : used by
    class State {
        - stream: TokenStream
        - spans: Vec<Span>
        - extra: Extra
        + new(tokens: &[(SyntaxKind, Span)], src: &str, extra: Extra)
    }
    TokenStream <.. State : used by
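The class diagram above can be sketched as runnable Rust. This is a minimal illustration rather than the crate's actual code: `SyntaxKind` and `Span` here are simplified stand-ins for the real lexer types, and only the method names come from the diagram.

```rust
// Minimal sketch of a TokenStream-style cursor wrapper. SyntaxKind and
// Span are simplified stand-ins for the crate's lexer types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    Ident,
    Whitespace,
}

type Span = std::ops::Range<usize>;

struct TokenStream<'a> {
    tokens: &'a [(SyntaxKind, Span)],
    src: &'a str,
    cursor: usize,
}

impl<'a> TokenStream<'a> {
    fn new(tokens: &'a [(SyntaxKind, Span)], src: &'a str) -> Self {
        Self { tokens, src, cursor: 0 }
    }

    /// Borrow the current token without consuming it.
    fn peek(&self) -> Option<&(SyntaxKind, Span)> {
        self.tokens.get(self.cursor)
    }

    /// Move past the current token, saturating at the end of input.
    fn advance(&mut self) {
        if self.cursor < self.tokens.len() {
            self.cursor += 1;
        }
    }

    fn cursor(&self) -> usize {
        self.cursor
    }

    fn src(&self) -> &'a str {
        self.src
    }
}

fn main() {
    let src = "ab cd";
    let tokens = vec![
        (SyntaxKind::Ident, 0..2),
        (SyntaxKind::Whitespace, 2..3),
        (SyntaxKind::Ident, 3..5),
    ];
    let mut ts = TokenStream::new(&tokens, src);
    assert_eq!(ts.peek().map(|t| t.0), Some(SyntaxKind::Ident));
    ts.advance();
    assert_eq!(ts.cursor(), 1);
    // Spans index into the shared source text.
    assert_eq!(&ts.src()[ts.peek().unwrap().1.clone()], " ");
}
```

Centralising the cursor behind `peek` and `advance` is what removes the off-by-one risks of manual index arithmetic: callers can no longer move the cursor out of bounds.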

File-Level Changes

Change: Introduce TokenStream abstraction for token navigation
Files: src/parser/token_stream.rs, src/parser/span_collector.rs, src/parser/mod.rs

  • Add src/parser/token_stream.rs with the TokenStream struct and its methods (new, cursor, tokens, advance, peek, skip_until, line_end, skip_ws_inline)
  • Update State initialization in span_collector.rs to use TokenStream
  • Modify sub-stream creation in parser/mod.rs to use the TokenStream APIs

Change: Refactor parser functions to use TokenStream methods
Files: src/parser/mod.rs

  • Replace skip_tokens_until, line_end, and skip_ws_no_newline calls with TokenStream.skip_until, line_end, and skip_ws_inline
  • Switch manual cursor increments to TokenStream.advance and peek
  • Adjust iterator sources to st.stream.tokens() and st.stream.src()

Change: Remove deprecated token helper functions
Files: src/parser/mod.rs

  • Delete the skip_tokens_until, line_end, and skip_ws_no_newline definitions from parser/mod.rs

Change: Update parser tests to use the TokenStream API
Files: src/parser/mod.rs

  • Import TokenStream in tests and instantiate streams instead of index variables
  • Rename and adjust the skip_until, line_end, and skip_ws_inline test functions to use stream.cursor, stream.advance, and stream.peek
  • Assert on stream.cursor() and stream.peek() results

Change: Document TokenStream in design documentation
Files: docs/ddlint-design-and-road-map.md

  • Add a TokenStream overview and usage notes in docs/ddlint-design-and-road-map.md
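The two helpers most affected by the refactor, `skip_until` and `line_end`, can be sketched as follows. This is a hedged illustration: the token layout and `SyntaxKind` variants are invented, and only the method names come from the change list above.

```rust
// Sketch of skip_until and line_end as TokenStream methods. Token
// kinds and layout are invented for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    Ident,
    Whitespace,
    Newline,
}

type Span = std::ops::Range<usize>;

struct TokenStream<'a> {
    tokens: &'a [(SyntaxKind, Span)],
    cursor: usize,
}

impl<'a> TokenStream<'a> {
    fn new(tokens: &'a [(SyntaxKind, Span)]) -> Self {
        Self { tokens, cursor: 0 }
    }

    /// Advance the cursor up to `end`, never moving backwards and
    /// never past the end of the stream.
    fn skip_until(&mut self, end: usize) {
        self.cursor = self.cursor.max(end.min(self.tokens.len()));
    }

    /// Index of the first Newline token at or after `start`, or the
    /// stream length when the line runs to end of input.
    fn line_end(&self, start: usize) -> usize {
        self.tokens[start..]
            .iter()
            .position(|(kind, _)| *kind == SyntaxKind::Newline)
            .map_or(self.tokens.len(), |offset| start + offset)
    }
}

fn main() {
    let tokens = vec![
        (SyntaxKind::Ident, 0..1),      // 0
        (SyntaxKind::Whitespace, 1..2), // 1
        (SyntaxKind::Ident, 2..3),      // 2
        (SyntaxKind::Newline, 3..4),    // 3
        (SyntaxKind::Ident, 4..5),      // 4
    ];
    let mut ts = TokenStream::new(&tokens);
    let end = ts.line_end(0);
    assert_eq!(end, 3);
    // Error recovery: skip the rest of the faulty line and resume.
    ts.skip_until(end + 1);
    assert_eq!(ts.cursor, 4);
}
```

Keeping both operations behind the stream means a caller cannot skip backwards or past the end, which the old free functions had to re-check at every call site.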

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.


@coderabbitai
Contributor

coderabbitai Bot commented Jun 28, 2025

Summary by CodeRabbit

  • New Features

    • Introduced a new token stream abstraction to simplify and improve token navigation during parsing.
  • Refactor

    • Updated parsing logic and related components to use the new token stream, enhancing maintainability and reducing manual index handling.
    • Revised documentation to clarify the parsing pipeline and token stream usage.
  • Documentation

    • Improved and expanded documentation for both the parsing process and testing instructions.
  • Chores

    • Updated test dependencies to use a newer version of the testing framework.

Summary by CodeRabbit

  • Documentation

    • Expanded the parsing pipeline documentation to explain the new token navigation abstraction.
  • Refactor

    • Simplified token navigation in the parsing logic by introducing a new abstraction for managing tokens.
    • Updated related parsing components to use the new token management approach, improving code clarity and reducing manual index handling.

Walkthrough

The parsing logic and documentation were refactored to introduce a TokenStream abstraction, replacing manual token slice indexing and cursor management. Parsing functions, macros, and the SpanCollector struct were updated to use this new abstraction, encapsulating token navigation and utility methods, while maintaining overall control flow and error handling.

Changes

  • docs/ddlint-design-and-road-map.md: Added documentation describing the new TokenStream abstraction and its role in simplifying token navigation.
  • src/parser/mod.rs: Refactored parsing logic and macros to use TokenStream instead of manual indexing; removed obsolete helper functions.
  • src/parser/span_collector.rs: Updated SpanCollector to use a TokenStream field, adjusting the constructor and removing the separate cursor, tokens, and src fields.
  • src/parser/token_stream.rs: Introduced the new TokenStream struct with methods for token navigation, whitespace skipping, and line-end detection.
  • Cargo.toml, docs/rust-testing-with-rstest-fixtures.md: Updated the rstest development dependency from 0.18 to 0.25 and refreshed the matching documentation snippet.

Sequence Diagram(s)

sequenceDiagram
    participant Parser
    participant TokenStream
    participant Tokens
    participant Source

    Parser->>TokenStream: TokenStream::new(Tokens, Source)
    loop Parsing
        Parser->>TokenStream: peek()
        TokenStream-->>Parser: Option<(SyntaxKind, Span)>
        alt Need to advance
            Parser->>TokenStream: advance()
        end
        alt Skip whitespace/comments
            Parser->>TokenStream: skip_ws_inline()
        end
        alt Find line end
            Parser->>TokenStream: line_end(start)
            TokenStream-->>Parser: usize (position)
        end
    end
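The peek/advance loop from the sequence diagram can be written as a runnable sketch. The parser below merely counts non-whitespace tokens so the loop shape is visible; the real parser builds a syntax tree instead, and the token kinds are invented for illustration.

```rust
// The peek/advance loop from the sequence diagram, reduced to a
// token counter so the control flow is easy to follow.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    KwImport,
    Ident,
    Whitespace,
}

type Span = std::ops::Range<usize>;

struct TokenStream<'a> {
    tokens: &'a [(SyntaxKind, Span)],
    cursor: usize,
}

impl<'a> TokenStream<'a> {
    fn new(tokens: &'a [(SyntaxKind, Span)]) -> Self {
        Self { tokens, cursor: 0 }
    }

    fn peek(&self) -> Option<&(SyntaxKind, Span)> {
        self.tokens.get(self.cursor)
    }

    fn advance(&mut self) {
        if self.cursor < self.tokens.len() {
            self.cursor += 1;
        }
    }
}

/// Drive the stream to exhaustion, counting significant tokens.
fn count_significant(ts: &mut TokenStream<'_>) -> usize {
    let mut n = 0;
    while let Some(&(kind, _)) = ts.peek() {
        if kind != SyntaxKind::Whitespace {
            n += 1;
        }
        ts.advance();
    }
    n
}

fn main() {
    let tokens = vec![
        (SyntaxKind::KwImport, 0..6),
        (SyntaxKind::Whitespace, 6..7),
        (SyntaxKind::Ident, 7..10),
    ];
    let mut ts = TokenStream::new(&tokens);
    assert_eq!(count_significant(&mut ts), 2);
}
```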

Poem

A rabbit hopped through fields of code,
Where tokens once in chaos strode.
With TokenStream now by its side,
It skips and peeks with gentle pride.
No more counting, no more dread—
Just parsing joy and bugs unfed!
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e339ed9 and 4a1684d.

📒 Files selected for processing (2)
  • docs/ddlint-design-and-road-map.md (1 hunks)
  • src/parser/token_stream.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
`docs/**/*.md`: Use the markdown files within the `docs/` directory as a knowled...

docs/**/*.md: Use the markdown files within the docs/ directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Proactively update the relevant file(s) in the docs/ directory to reflect the latest state when new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve.
Documentation in docs/ must use en-GB-oxendict spelling and grammar, except for the word 'license'.
Validate Markdown files using make markdownlint.
Run make fmt after any documentation changes to format all Markdown files and fix table markup.
Validate Markdown Mermaid diagrams using the make nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
`docs/**/*.md`: Use British English spelling based on the Oxford English Diction...

docs/**/*.md: Use British English spelling based on the Oxford English Dictionary, except retain US spelling for API terms (e.g., 'color').
Use the Oxford comma in lists.
Write headings in sentence case and use Markdown heading levels in order without skipping.
Follow markdownlint recommendations for Markdown formatting.
Always use fenced code blocks with a language identifier; use 'plaintext' for non-code text.
Use '-' as the first level bullet and renumber lists when items change.
Prefer inline links using text or angle brackets around the URL.
Expand any uncommon acronym on first use, e.g., Continuous Integration (CI).
Wrap paragraphs at 80 columns and code at 120 columns; do not wrap tables.
Use footnotes referenced with [^label].
When embedding figures, use 'alt text' and provide concise alt text describing the content.
Add a short description before each Mermaid diagram so screen readers can understand it.

📄 Source: CodeRabbit Inference Engine (docs/documentation-style-guide.md)

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
`docs/ddlint-design-and-road-map.md`: Follow the guidance in the design document, especially sections L71-L122 and L124-L139, for SyntaxKind enum design and error recovery.

docs/ddlint-design-and-road-map.md: Follow the guidance in the design document, especially sections L71-L122 and L124-L139, for SyntaxKind enum design and error recovery.

📄 Source: CodeRabbit Inference Engine (docs/parser-plan.md)

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
`**/*.md`: * Avoid 2nd person or 1st person pronouns ("I", "you", "we") * Use en...

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-oxendic spelling and grammar.
  • Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
  • Code blocks should be wrapped to 120 columns.
  • Headings must not be wrapped.

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation ...

**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Use explicit version ranges in Cargo.toml and keep dependencies up-to-date.
Use rstest fixtures for shared setup.
Replace duplicated tests with #[rstest(...)] parameterised cases.
Prefer mockall for mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • src/parser/token_stream.rs
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments must use en-GB-oxendict spelling and grammar.

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • src/parser/token_stream.rs
🧬 Code Graph Analysis (1)
src/parser/token_stream.rs (2)
src/parser/span_collector.rs (1)
  • new (44-50)
src/parser/mod.rs (1)
  • text (458-460)
🪛 LanguageTool
docs/ddlint-design-and-road-map.md

[uncategorized] ~198-~198: Possible missing comma found.
Context: ...User: Parsed { green, root } ``` After tokenization the parser wraps the vector of tokens i...

(AI_HYDRA_LEO_MISSING_COMMA)

🔇 Additional comments (5)
src/parser/token_stream.rs (4)

1-27: Well-structured module with excellent documentation.

The module documentation clearly explains the purpose and includes a practical example. The struct definition follows Rust best practices with appropriate visibility and field types.


29-120: Core API methods are well-designed and documented.

The methods provide a clean interface for token stream navigation. The peek method correctly returns references rather than cloning, which addresses previous performance concerns. Proper bounds checking in advance prevents cursor overflow.


122-143: Correct implementation of token skipping for parser recovery.

The logic properly advances the cursor past tokens until the specified position, with appropriate bounds checking and clear termination conditions.


145-206: Utility methods correctly implement line navigation and whitespace skipping.

The line_end method safely handles text access using is_some_and, and skip_ws_inline has clean control flow. Both methods address concerns raised in previous reviews whilst maintaining correct functionality.
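A sketch consistent with the behaviour described above, assuming the lexer emits whitespace, comments, and newlines as distinct token kinds (an assumption; the real lexer may group them differently): `skip_ws_inline` consumes inline trivia but stops at a newline so line boundaries stay visible to the caller.

```rust
// Sketch of skip_ws_inline: consume whitespace and comments, stop at
// a newline. Token kinds are simplified stand-ins.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    Ident,
    Whitespace,
    Comment,
    Newline,
}

type Span = std::ops::Range<usize>;

struct TokenStream<'a> {
    tokens: &'a [(SyntaxKind, Span)],
    cursor: usize,
}

impl<'a> TokenStream<'a> {
    fn new(tokens: &'a [(SyntaxKind, Span)]) -> Self {
        Self { tokens, cursor: 0 }
    }

    /// Skip inline trivia without crossing a line boundary.
    fn skip_ws_inline(&mut self) {
        while let Some(&(kind, _)) = self.tokens.get(self.cursor) {
            match kind {
                SyntaxKind::Whitespace | SyntaxKind::Comment => self.cursor += 1,
                _ => break,
            }
        }
    }
}

fn main() {
    let tokens = vec![
        (SyntaxKind::Whitespace, 0..1),
        (SyntaxKind::Comment, 1..5),
        (SyntaxKind::Newline, 5..6),
        (SyntaxKind::Ident, 6..7),
    ];
    let mut ts = TokenStream::new(&tokens);
    ts.skip_ws_inline();
    // The cursor halts on the newline rather than consuming it.
    assert_eq!(ts.cursor, 2);
}
```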

docs/ddlint-design-and-road-map.md (1)

198-201: Documentation accurately describes the TokenStream abstraction.

The added paragraph clearly explains the purpose and benefits of the TokenStream abstraction, fitting well within the parsing pipeline documentation. The technical details accurately reflect the implementation.


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai auto-generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Contributor

@sourcery-ai sourcery-ai Bot left a comment


Hey @leynos - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

coderabbitai Bot added a commit that referenced this pull request Jun 28, 2025
Docstrings generation was requested by @leynos.

* #30 (comment)

The following files were modified:

* `src/parser/mod.rs`
* `src/parser/span_collector.rs`
* `src/parser/token_stream.rs`
@coderabbitai
Contributor

coderabbitai Bot commented Jun 28, 2025

Note

Generated docstrings for this pull request at #31

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6797851 and 649b186.

📒 Files selected for processing (4)
  • docs/ddlint-design-and-road-map.md (1 hunks)
  • src/parser/mod.rs (5 hunks)
  • src/parser/span_collector.rs (1 hunks)
  • src/parser/token_stream.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
`docs/**/*.md`: Use the markdown files within the `docs/` directory as a knowled...

docs/**/*.md: Use the markdown files within the docs/ directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Proactively update the relevant file(s) in the docs/ directory to reflect the latest state when new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve.
Documentation in docs/ must use en-GB-oxendict spelling and grammar, except for the word 'license'.
Validate Markdown files using make markdownlint.
Run make fmt after any documentation changes to format all Markdown files and fix table markup.
Validate Markdown Mermaid diagrams using the make nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
`docs/**/*.md`: Use British English spelling based on the Oxford English Diction...

docs/**/*.md: Use British English spelling based on the Oxford English Dictionary, except retain US spelling for API terms (e.g., 'color').
Use the Oxford comma in lists.
Write headings in sentence case and use Markdown heading levels in order without skipping.
Follow markdownlint recommendations for Markdown formatting.
Always use fenced code blocks with a language identifier; use 'plaintext' for non-code text.
Use '-' as the first level bullet and renumber lists when items change.
Prefer inline links using text or angle brackets around the URL.
Expand any uncommon acronym on first use, e.g., Continuous Integration (CI).
Wrap paragraphs at 80 columns and code at 120 columns; do not wrap tables.
Use footnotes referenced with [^label].
When embedding figures, use 'alt text' and provide concise alt text describing the content.
Add a short description before each Mermaid diagram so screen readers can understand it.

📄 Source: CodeRabbit Inference Engine (docs/documentation-style-guide.md)

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
`docs/ddlint-design-and-road-map.md`: Follow the guidance in the design document, especially sections L71-L122 and L124-L139, for SyntaxKind enum design and error recovery.

docs/ddlint-design-and-road-map.md: Follow the guidance in the design document, especially sections L71-L122 and L124-L139, for SyntaxKind enum design and error recovery.

📄 Source: CodeRabbit Inference Engine (docs/parser-plan.md)

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
`**/*.md`: * Avoid 2nd person or 1st person pronouns ("I", "you", "we") * Use en...

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-oxendic spelling and grammar.
  • Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
  • Code blocks should be wrapped to 120 columns.
  • Headings must not be wrapped.

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation ...

**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Use explicit version ranges in Cargo.toml and keep dependencies up-to-date.
Use rstest fixtures for shared setup.
Replace duplicated tests with #[rstest(...)] parameterised cases.
Prefer mockall for mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • src/parser/span_collector.rs
  • src/parser/mod.rs
  • src/parser/token_stream.rs
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments must use en-GB-oxendict spelling and grammar.

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • src/parser/span_collector.rs
  • src/parser/mod.rs
  • src/parser/token_stream.rs
🧬 Code Graph Analysis (2)
src/parser/span_collector.rs (1)
src/parser/token_stream.rs (3)
  • new (32-38)
  • tokens (61-63)
  • src (67-69)
src/parser/token_stream.rs (1)
src/parser/span_collector.rs (1)
  • new (23-29)
🪛 LanguageTool
docs/ddlint-design-and-road-map.md

[style] ~198-~198: Would you like to use the Oxford spelling “tokenization”? The spelling ‘tokenisation’ is also correct.
Context: ...User: Parsed { green, root } ``` After tokenisation the parser wraps the vector of tokens i...

(OXFORD_SPELLING_Z_NOT_S)

🪛 GitHub Actions: CI
src/parser/span_collector.rs

[error] 50-50: Rust compiler error E0609: no field cursor on type parser::span_collector::SpanCollector<'_, std::vec::Vec<{integer}>>

🔇 Additional comments (14)
src/parser/token_stream.rs (3)

1-19: Excellent module documentation with clear examples.

The module-level documentation clearly explains the purpose and provides a practical example. This follows the coding guidelines requirement for module-level comments.


22-27: Well-designed struct with appropriate encapsulation.

The struct design encapsulates token navigation concerns effectively with proper visibility modifiers.


97-106: Method implementation follows expected patterns.

The skip_ws_inline method correctly handles whitespace and comment skipping with proper bounds checking.

docs/ddlint-design-and-road-map.md (1)

198-201: Fix spelling to follow British English guidelines.

The documentation correctly describes the TokenStream abstraction, but should use British spelling as specified in the coding guidelines.

Apply this change to follow the en-GB-oxendict spelling requirement:

-After tokenisation the parser wraps the vector of tokens in a lightweight
+After tokenization the parser wraps the vector of tokens in a lightweight

Likely an incorrect or invalid review comment.

src/parser/span_collector.rs (4)

10-10: Appropriate import for new abstraction.

The import of TokenStream is correctly placed and follows module organisation patterns.


15-15: Clean refactoring to encapsulate token navigation.

Replacing separate cursor, tokens, and src fields with a single stream field improves encapsulation and reduces the chance of inconsistent state.


25-25: Constructor properly initialises TokenStream.

The updated constructor correctly uses TokenStream::new(tokens, src) to encapsulate the token slice and source text.


14-18: No remaining cursor references on SpanCollector

A search of the entire Rust codebase for .cursor shows every match referencing the TokenStream type via stream.cursor(). There are no direct accesses to a non-existent cursor field on SpanCollector. The refactoring that replaced cursor with the stream field is complete; other causes for the CI failure should be investigated.

Likely an incorrect or invalid review comment.

src/parser/mod.rs (6)

16-16: Appropriate module declaration for new abstraction.

The mod token_stream; declaration correctly exposes the new module within the parser.


21-26: Updated macro documentation reflects new interface.

The documentation correctly describes the new stream field requirement and usage patterns.


45-56: Macro refactoring properly implements new abstraction.

The token_dispatch! macro has been correctly updated to use stream.peek() and stream.advance() instead of manual indexing. The pattern matching and control flow remain sound.


154-166: Parsing logic correctly adapted to use TokenStream.

The handle_import function properly uses the new TokenStream methods:

  • stream.tokens() and stream.cursor() for creating sub-streams
  • stream.skip_until() for advancing past processed tokens
  • stream.line_end() for error recovery

This maintains the same functionality while using the new abstraction.
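The recovery pattern the review describes can be sketched as follows. The method names `skip_until` and `line_end` follow the review; their implementations here are assumptions, not the project's code:

```rust
// Illustrative sketch of line-based error recovery: on a parse error,
// skip ahead to the end of the current line rather than aborting.
type Span = std::ops::Range<usize>;

struct TokenStream<'a> {
    tokens: &'a [(u8, Span)],
    src: &'a str,
    cursor: usize,
}

impl<'a> TokenStream<'a> {
    fn new(tokens: &'a [(u8, Span)], src: &'a str) -> Self {
        Self { tokens, src, cursor: 0 }
    }

    fn cursor(&self) -> usize {
        self.cursor
    }

    /// Move the cursor forward to `target`, never past the end.
    fn skip_until(&mut self, target: usize) {
        self.cursor = target.min(self.tokens.len());
    }

    /// Index of the first token at or after the cursor whose text
    /// contains a newline, or the token count if none does.
    fn line_end(&self) -> usize {
        self.tokens[self.cursor..]
            .iter()
            .position(|(_, span)| self.src[span.clone()].contains('\n'))
            .map_or(self.tokens.len(), |offset| self.cursor + offset)
    }
}

fn main() {
    let src = "import x\nok";
    let tokens = [(0u8, 0..6), (0, 6..7), (0, 7..8), (0, 8..9), (0, 9..11)];
    let mut stream = TokenStream::new(&tokens, src);
    // A failed parse recovers by jumping to the end of the line.
    let end = stream.line_end();
    stream.skip_until(end);
    assert_eq!(stream.cursor(), 3);
    println!("recovered at token {}", stream.cursor());
}
```

Clamping in `skip_until` means a caller can pass any target, including one past the end, without risking an out-of-bounds cursor.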


184-207: Typedef parsing functions updated consistently.

Both handle_typedef and handle_extern functions have been properly updated to use TokenStream methods:

  • stream.advance() instead of manual cursor increment
  • stream.skip_ws_inline() instead of removed helper
  • stream.line_end() for determining span boundaries

The logic flow and error handling remain intact.


526-572: Test functions properly updated to use new API.

All test functions have been correctly updated to use the TokenStream interface:

  • Creating streams with TokenStream::new(&tokens, src)
  • Using stream.advance(), stream.skip_until(), etc.
  • Assertions updated to use stream.cursor() and stream.peek()

The test coverage remains comprehensive.

Comment thread src/parser/token_stream.rs Outdated
Comment thread src/parser/token_stream.rs Outdated


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 649b186 and e0e49e1.

📒 Files selected for processing (2)
  • Cargo.toml (1 hunks)
  • docs/rust-testing-with-rstest-fixtures.md (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
`Cargo.toml`: Use explicit version ranges in `Cargo.toml` and keep dependencies up-to-date.

Cargo.toml: Use explicit version ranges in Cargo.toml and keep dependencies up-to-date.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • Cargo.toml
`docs/**/*.md`: Use the markdown files within the `docs/` directory as a knowled...

docs/**/*.md: Use the markdown files within the docs/ directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Proactively update the relevant file(s) in the docs/ directory to reflect the latest state when new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve.
Documentation in docs/ must use en-GB-oxendict spelling and grammar, except for the word 'license'.
Validate Markdown files using make markdownlint.
Run make fmt after any documentation changes to format all Markdown files and fix table markup.
Validate Markdown Mermaid diagrams using the make nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • docs/rust-testing-with-rstest-fixtures.md
`docs/**/*.md`: Use British English spelling based on the Oxford English Diction...

docs/**/*.md: Use British English spelling based on the Oxford English Dictionary, except retain US spelling for API terms (e.g., 'color').
Use the Oxford comma in lists.
Write headings in sentence case and use Markdown heading levels in order without skipping.
Follow markdownlint recommendations for Markdown formatting.
Always use fenced code blocks with a language identifier; use 'plaintext' for non-code text.
Use '-' as the first level bullet and renumber lists when items change.
Prefer inline links using text or angle brackets around the URL.
Expand any uncommon acronym on first use, e.g., Continuous Integration (CI).
Wrap paragraphs at 80 columns and code at 120 columns; do not wrap tables.
Use footnotes referenced with [^label].
When embedding figures, use 'alt text' and provide concise alt text describing the content.
Add a short description before each Mermaid diagram so screen readers can understand it.

📄 Source: CodeRabbit Inference Engine (docs/documentation-style-guide.md)

List of files the instruction was applied to:

  • docs/rust-testing-with-rstest-fixtures.md
`**/*.md`: * Avoid 2nd person or 1st person pronouns ("I", "you", "we") * Use en...

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-oxendic spelling and grammar.
  • Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
  • Code blocks should be wrapped to 120 columns.
  • Headings must not be wrapped.

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • docs/rust-testing-with-rstest-fixtures.md
🧠 Learnings (1)
docs/rust-testing-with-rstest-fixtures.md (1)
Learnt from: CR
PR: leynos/lille#0
File: docs/rust-testing-with-rstest-fixtures.md:0-0
Timestamp: 2025-06-24T18:32:32.029Z
Learning: The use of dev-dependencies in Cargo.toml for testing libraries like rstest is a standard Rust practice to avoid including test-only code in production builds.
🔇 Additional comments (2)
Cargo.toml (2)

16-16: rstest bump complies with explicit-range policy

The new constraint >=0.25.0, <0.26.0 follows the project rule of having an upper bound while tracking the current minor. ✅


4-4: Confirm 2024 edition support in CI/toolchains

edition = "2024" is still only available on nightly at the time of writing. Please verify that:

  • CI uses a nightly channel, or
  • you intend to gate contributors to nightly, or
  • the edition bump is premature.

Failing to do so will break cargo check on stable.

Comment thread docs/rust-testing-with-rstest-fixtures.md
leynos pushed a commit that referenced this pull request Jun 28, 2025
Docstrings generation was requested by @leynos.

* #30 (comment)

The following files were modified:

* `src/parser/mod.rs`
* `src/parser/span_collector.rs`
* `src/parser/token_stream.rs`

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

leynos commented Jun 28, 2025

Build failure under investigation


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e0e49e1 and 5b98773.

📒 Files selected for processing (4)
  • docs/rust-testing-with-rstest-fixtures.md (1 hunks)
  • src/parser/mod.rs (5 hunks)
  • src/parser/span_collector.rs (2 hunks)
  • src/parser/token_stream.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
`docs/**/*.md`: Use the markdown files within the `docs/` directory as a knowled...

docs/**/*.md: Use the markdown files within the docs/ directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Proactively update the relevant file(s) in the docs/ directory to reflect the latest state when new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve.
Documentation in docs/ must use en-GB-oxendict spelling and grammar, except for the word 'license'.
Validate Markdown files using make markdownlint.
Run make fmt after any documentation changes to format all Markdown files and fix table markup.
Validate Markdown Mermaid diagrams using the make nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • docs/rust-testing-with-rstest-fixtures.md
`docs/**/*.md`: Use British English spelling based on the Oxford English Diction...

docs/**/*.md: Use British English spelling based on the Oxford English Dictionary, except retain US spelling for API terms (e.g., 'color').
Use the Oxford comma in lists.
Write headings in sentence case and use Markdown heading levels in order without skipping.
Follow markdownlint recommendations for Markdown formatting.
Always use fenced code blocks with a language identifier; use 'plaintext' for non-code text.
Use '-' as the first level bullet and renumber lists when items change.
Prefer inline links using text or angle brackets around the URL.
Expand any uncommon acronym on first use, e.g., Continuous Integration (CI).
Wrap paragraphs at 80 columns and code at 120 columns; do not wrap tables.
Use footnotes referenced with [^label].
When embedding figures, use 'alt text' and provide concise alt text describing the content.
Add a short description before each Mermaid diagram so screen readers can understand it.

📄 Source: CodeRabbit Inference Engine (docs/documentation-style-guide.md)

List of files the instruction was applied to:

  • docs/rust-testing-with-rstest-fixtures.md
`**/*.md`: * Avoid 2nd person or 1st person pronouns ("I", "you", "we") * Use en...

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-oxendic spelling and grammar.
  • Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
  • Code blocks should be wrapped to 120 columns.
  • Headings must not be wrapped.

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • docs/rust-testing-with-rstest-fixtures.md
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation ...

**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Use explicit version ranges in Cargo.toml and keep dependencies up-to-date.
Use rstest fixtures for shared setup.
Replace duplicated tests with #[rstest(...)] parameterised cases.
Prefer mockall for mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • src/parser/span_collector.rs
  • src/parser/token_stream.rs
  • src/parser/mod.rs
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments must use en-GB-oxendict spelling and grammar.

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • src/parser/span_collector.rs
  • src/parser/token_stream.rs
  • src/parser/mod.rs
🧠 Learnings (1)
docs/rust-testing-with-rstest-fixtures.md (1)
Learnt from: CR
PR: leynos/lille#0
File: docs/rust-testing-with-rstest-fixtures.md:0-0
Timestamp: 2025-06-24T18:32:32.029Z
Learning: The use of dev-dependencies in Cargo.toml for testing libraries like rstest is a standard Rust practice to avoid including test-only code in production builds.
🧬 Code Graph Analysis (2)
src/parser/span_collector.rs (2)
src/parser/token_stream.rs (4)
  • new (41-47)
  • tokens (105-107)
  • src (118-120)
  • cursor (59-61)
src/tokenizer.rs (1)
  • tokenize (189-245)
src/parser/token_stream.rs (2)
src/parser/span_collector.rs (1)
  • new (44-50)
src/parser/mod.rs (1)
  • text (458-460)
🔇 Additional comments (14)
docs/rust-testing-with-rstest-fixtures.md (1)

108-113: Version range now matches the manifest – inconsistency resolved

The dependency snippet is consistent with Cargo.toml, keeps the code-block width well under 120 chars, and complies with the documented guidelines.

src/parser/token_stream.rs (3)

1-19: Excellent module documentation and design rationale.

The module-level documentation clearly explains the purpose, provides usage examples, and describes the benefits of the abstraction. This aligns perfectly with the coding guidelines requiring comprehensive module documentation.


75-78: Good resolution of previous performance concern.

The peek method now returns Option<&(SyntaxKind, Span)> instead of cloning, which addresses the previous review comment about performance implications. This is a more efficient approach for frequent peek operations.
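The borrowing signature can be illustrated with a minimal stand-in (`SyntaxKind` is stubbed as `u8`; the real type lives in the project):

```rust
// Sketch of a borrowing peek: returning Option<&(Kind, Span)> lets the
// caller inspect the next token without cloning the span each time.
type Span = std::ops::Range<usize>;

struct TokenStream<'a> {
    tokens: &'a [(u8, Span)],
    cursor: usize,
}

impl<'a> TokenStream<'a> {
    fn peek(&self) -> Option<&(u8, Span)> {
        self.tokens.get(self.cursor)
    }
}

fn main() {
    let tokens = [(1u8, 0..3), (2, 3..4)];
    let stream = TokenStream { tokens: &tokens, cursor: 1 };
    // Borrow, don't clone: kind and span are read in place.
    assert_eq!(stream.peek(), Some(&(2u8, 3..4)));
    println!("ok");
}
```

Because `Range<usize>` is not `Copy`, a by-value `peek` would clone the span on every look-ahead; returning a reference avoids that in the hot path.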


162-169: Improved error handling addresses previous feedback.

The use of is_some_and(|text| text.contains('\n')) is much cleaner than the previous unwrap_or("") approach. This makes the intent explicit and avoids masking potential span boundary errors.
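The difference is easy to see in isolation. The helper below is a hypothetical stand-in for the span-to-text lookup:

```rust
// Why is_some_and reads better than unwrap_or("") here: a missing slice
// is treated explicitly as "no newline" instead of silently becoming an
// empty string that happens to contain no newline.
fn token_text(src: &str, span: std::ops::Range<usize>) -> Option<&str> {
    src.get(span)
}

fn crosses_newline(src: &str, span: std::ops::Range<usize>) -> bool {
    // Explicit: an out-of-bounds span is simply not a newline boundary.
    token_text(src, span).is_some_and(|text| text.contains('\n'))
}

fn main() {
    let src = "let x\ny";
    assert!(crosses_newline(src, 5..6)); // "\n"
    assert!(!crosses_newline(src, 0..3)); // "let"
    assert!(!crosses_newline(src, 0..99)); // out of bounds, handled explicitly
    println!("ok");
}
```

Both versions return `false` for a bad span, but `is_some_and` states the intent directly, so a future reader will not mistake the fallback for a masked bug.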

src/parser/span_collector.rs (4)

10-10: Good integration of the TokenStream abstraction.

The import statement correctly brings in the TokenStream from the sibling module, enabling the refactoring of SpanCollector's internal structure.


21-42: Excellent documentation enhancement.

The expanded documentation includes parameter descriptions, return information, and usage examples as required by the coding guidelines. This makes the API much clearer for users.


45-49: Clean integration with TokenStream constructor.

The replacement of separate field assignments with TokenStream::new(tokens, src) simplifies the constructor whilst maintaining the same functionality through encapsulation.


69-78: Comprehensive test coverage using rstest.

The test properly validates that the TokenStream is correctly initialised and accessible through the public interface. Good use of rstest fixture as recommended in the guidelines.

src/parser/mod.rs (6)

16-16: Proper module structure for the new abstraction.

Adding the token_stream module enables the encapsulation of token navigation logic whilst maintaining clear separation of concerns.


21-56: Well-updated macro documentation and implementation.

The token_dispatch! macro documentation has been enhanced to reflect the new stream field requirement, and the implementation correctly uses stream.peek() and stream.advance() methods. The logic flow remains identical whilst benefiting from encapsulated token navigation.


107-117: Enhanced function documentation with examples.

The documentation for parse_tokens now includes parameter descriptions and usage examples, following the Rustdoc conventions specified in the coding guidelines.


188-199: Correct integration of TokenStream methods.

The parsing logic correctly uses stream.tokens(), stream.cursor(), stream.skip_until(), and stream.line_end() methods, maintaining the same logical flow whilst benefiting from encapsulated bounds checking and cleaner error recovery.


264-280: Clean replacement of manual cursor arithmetic.

The handle_extern function now uses stream.advance(), stream.skip_ws_inline(), stream.peek(), and stream.line_end() methods instead of manual index management. This makes the code more readable and less error-prone.


617-625: Good test coverage for TokenStream functionality.

The tests properly validate the skip_until and line_end methods using realistic scenarios. The use of rstest follows the guidelines, and the assertions verify both boundary conditions and normal operation.
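Those boundary checks can be exercised with a small table-driven loop. The project itself uses `#[rstest]` parameterised cases; this crate-free sketch shows the same assertions with simplified, illustrative implementations:

```rust
// Crate-free sketch of the boundary tests described above: skip_until
// clamps to the token count, and line_end falls back to the token count
// when no newline remains.
fn skip_until(target: usize, len: usize) -> usize {
    target.min(len)
}

fn line_end(texts: &[&str], cursor: usize) -> usize {
    texts[cursor..]
        .iter()
        .position(|text| text.contains('\n'))
        .map_or(texts.len(), |offset| cursor + offset)
}

fn main() {
    // (target, token count, expected clamped cursor)
    for (target, len, expected) in [(2, 5, 2), (5, 5, 5), (9, 5, 5), (0, 0, 0)] {
        assert_eq!(skip_until(target, len), expected);
    }
    let texts = ["import", " ", "x", "\n", "ok"];
    assert_eq!(line_end(&texts, 0), 3); // stops at the newline token
    assert_eq!(line_end(&texts, 4), 5); // no newline left: token count
    println!("all boundary cases passed");
}
```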

Comment thread src/parser/token_stream.rs
leynos and others added 4 commits June 28, 2025 15:58
Docstrings generation was requested by @leynos.

* #30 (comment)

The following files were modified:

* `src/parser/mod.rs`
* `src/parser/span_collector.rs`
* `src/parser/token_stream.rs`

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@leynos leynos force-pushed the codex/encapsulate-token-utility-functions branch from 5b98773 to c26cc06 Compare June 28, 2025 14:59

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🔭 Outside diff range comments (1)
src/parser/span_collector.rs (1)

80-116: Fix duplicate test function definitions causing compilation failure.

There are two functions named new_initialises_state defined at lines 70-78 and 91-101, which causes a compilation error. The pipeline failure confirms this issue. The second definition appears to be leftover code from the refactoring that should be removed.

Remove the duplicate function definition:

+    }
+
     #[rstest]
-    fn new_initialises_state() {
-        let tokens = &[(SyntaxKind::K_IMPORT, 0..6)];
-        let src = "import";
-        let extra = vec![1, 2, 3];
-        let collector = SpanCollector::new(tokens, src, extra.clone());
-
-        assert_eq!(collector.cursor, 0);
-        assert!(collector.spans.is_empty());
-        assert_eq!(collector.extra, extra);
-        assert_eq!(collector.tokens, tokens);
-        assert_eq!(collector.src, src);
-    }
-
-    #[rstest]
     fn into_parts_returns_components() {
♻️ Duplicate comments (1)
src/parser/token_stream.rs (1)

194-207: Consider simplifying the control flow.

The continue statement in the loop can be eliminated for cleaner code by inverting the conditional logic, as suggested in previous reviews.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b98773 and c26cc06.

📒 Files selected for processing (6)
  • Cargo.toml (1 hunks)
  • docs/ddlint-design-and-road-map.md (1 hunks)
  • docs/rust-testing-with-rstest-fixtures.md (1 hunks)
  • src/parser/mod.rs (5 hunks)
  • src/parser/span_collector.rs (2 hunks)
  • src/parser/token_stream.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
`Cargo.toml`: Use explicit version ranges in `Cargo.toml` and keep dependencies up-to-date.

Cargo.toml: Use explicit version ranges in Cargo.toml and keep dependencies up-to-date.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • Cargo.toml
`docs/**/*.md`: Use the markdown files within the `docs/` directory as a knowled...

docs/**/*.md: Use the markdown files within the docs/ directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Proactively update the relevant file(s) in the docs/ directory to reflect the latest state when new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve.
Documentation in docs/ must use en-GB-oxendict spelling and grammar, except for the word 'license'.
Validate Markdown files using make markdownlint.
Run make fmt after any documentation changes to format all Markdown files and fix table markup.
Validate Markdown Mermaid diagrams using the make nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
  • docs/rust-testing-with-rstest-fixtures.md
`docs/**/*.md`: Use British English spelling based on the Oxford English Diction...

docs/**/*.md: Use British English spelling based on the Oxford English Dictionary, except retain US spelling for API terms (e.g., 'color').
Use the Oxford comma in lists.
Write headings in sentence case and use Markdown heading levels in order without skipping.
Follow markdownlint recommendations for Markdown formatting.
Always use fenced code blocks with a language identifier; use 'plaintext' for non-code text.
Use '-' as the first level bullet and renumber lists when items change.
Prefer inline links using text or angle brackets around the URL.
Expand any uncommon acronym on first use, e.g., Continuous Integration (CI).
Wrap paragraphs at 80 columns and code at 120 columns; do not wrap tables.
Use footnotes referenced with [^label].
When embedding figures, use 'alt text' and provide concise alt text describing the content.
Add a short description before each Mermaid diagram so screen readers can understand it.

📄 Source: CodeRabbit Inference Engine (docs/documentation-style-guide.md)

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
  • docs/rust-testing-with-rstest-fixtures.md
`docs/ddlint-design-and-road-map.md`: Follow the guidance in the design document, especially sections L71-L122 and L124-L139, for SyntaxKind enum design and error recovery.

docs/ddlint-design-and-road-map.md: Follow the guidance in the design document, especially sections L71-L122 and L124-L139, for SyntaxKind enum design and error recovery.

📄 Source: CodeRabbit Inference Engine (docs/parser-plan.md)

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
`**/*.md`: * Avoid 2nd person or 1st person pronouns ("I", "you", "we") * Use en...

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-oxendic spelling and grammar.
  • Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
  • Code blocks should be wrapped to 120 columns.
  • Headings must not be wrapped.

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • docs/ddlint-design-and-road-map.md
  • docs/rust-testing-with-rstest-fixtures.md
`**/*.rs`: Document public APIs using Rustdoc comments (`///`) so documentation ...

**/*.rs: Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Use explicit version ranges in Cargo.toml and keep dependencies up-to-date.
Use rstest fixtures for shared setup.
Replace duplicated tests with #[rstest(...)] parameterised cases.
Prefer mockall for mocks/stubs.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.

📄 Source: CodeRabbit Inference Engine (AGENTS.md)

List of files the instruction was applied to:

  • src/parser/token_stream.rs
  • src/parser/span_collector.rs
  • src/parser/mod.rs
`**/*.rs`: * Seek to keep the cyclomatic complexity of functions no more than 12...

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments must use en-GB-oxendict spelling and grammar.

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

⚙️ Source: CodeRabbit Configuration File

List of files the instruction was applied to:

  • src/parser/token_stream.rs
  • src/parser/span_collector.rs
  • src/parser/mod.rs
🧠 Learnings (1)
docs/rust-testing-with-rstest-fixtures.md (1)
Learnt from: CR
PR: leynos/lille#0
File: docs/rust-testing-with-rstest-fixtures.md:0-0
Timestamp: 2025-06-24T18:32:32.029Z
Learning: The use of dev-dependencies in Cargo.toml for testing libraries like rstest is a standard Rust practice to avoid including test-only code in production builds.
🧬 Code Graph Analysis (2)
src/parser/token_stream.rs (1)
src/parser/span_collector.rs (1)
  • new (44-50)
src/parser/span_collector.rs (2)
src/parser/token_stream.rs (4)
  • new (41-47)
  • tokens (105-107)
  • src (118-120)
  • cursor (59-61)
src/tokenizer.rs (1)
  • tokenize (189-245)
🪛 LanguageTool
docs/ddlint-design-and-road-map.md

[style] ~198-~198: Would you like to use the Oxford spelling “tokenization”? The spelling ‘tokenisation’ is also correct.
Context: ...User: Parsed { green, root } ``` After tokenisation the parser wraps the vector of tokens i...

(OXFORD_SPELLING_Z_NOT_S)

🪛 GitHub Actions: CI
src/parser/span_collector.rs

[error] 116-116: cargo fmt check failed: this file contains an unclosed delimiter

🔇 Additional comments (15)
Cargo.toml (1)

16-16: rstest 0.25 upgrade – findings

  • No usage of rstest_macros in code (only referenced in documentation), so you don’t need to add it to [dev-dependencies] or refactor imports.
  • The project uses a rust-toolchain.toml pinned to nightly-2025-06-10, which exceeds the Rust 1.76 MSRV required by rstest 0.25.
  • Please run cargo test to confirm that all tests compile and pass as expected.
docs/rust-testing-with-rstest-fixtures.md (1)

110-112: Documentation snippet now consistent with Cargo.toml

The updated version range matches the manifest; no further action needed.

src/parser/token_stream.rs (5)

1-21: Excellent module-level documentation.

The documentation clearly explains the module's purpose, provides a practical usage example, and follows Rust documentation conventions. This adheres well to the coding guidelines requiring comprehensive module documentation.


22-27: Well-designed struct with appropriate lifetime management.

The struct correctly uses lifetime parameters to tie together the token slice and source text references, with appropriate visibility and debug support.


29-78: Constructor and accessors follow Rust best practices.

The methods are well-documented with clear examples, use appropriate attributes like #[must_use], and the peek() method correctly returns references rather than cloning values for better performance.


80-120: Safe and well-documented accessor methods.

The advance() method includes proper bounds checking to prevent cursor overflow, and the accessor methods provide clean access to underlying data with comprehensive documentation.


122-171: Well-implemented utility methods with robust error handling.

The skip_until and line_end methods provide essential token navigation functionality with comprehensive documentation and appropriate error handling using is_some_and.

src/parser/span_collector.rs (3)

10-18: Good refactoring to use TokenStream abstraction.

Replacing the separate cursor, tokens, and src fields with a single TokenStream field centralises token management and follows the new abstraction pattern introduced in the codebase.


21-50: Excellent enhancement to constructor documentation.

The expanded documentation with detailed parameter descriptions, return value explanation, and usage example significantly improves the API documentation quality, following the coding guidelines for comprehensive documentation.


61-78: Well-written test with comprehensive verification.

The new test function properly verifies the TokenStream initialisation and provides good coverage of the refactored constructor functionality, following testing best practices with rstest.

src/parser/mod.rs (5)

16-57: Excellent refactoring of token_dispatch macro to use TokenStream.

The macro has been properly updated to work with the new TokenStream abstraction, with clear documentation and examples showing the new stream-based approach. The peek-and-advance pattern is more intuitive than manual cursor management.
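The peek-and-advance pattern the macro adopts can be illustrated with the standard library's `Peekable`; the token variants and handler names here are hypothetical, standing in for the macro's generated dispatch arms.

```rust
#[derive(Debug, PartialEq)]
enum Tok {
    Typedef,
    Extern,
    Other,
}

/// Sketch of a peek-then-advance dispatch loop: inspect the current token,
/// dispatch on its kind, then consume it exactly once.
fn dispatch(tokens: &[Tok]) -> Vec<&'static str> {
    let mut out = Vec::new();
    let mut iter = tokens.iter().peekable();
    while let Some(tok) = iter.peek() {
        match tok {
            Tok::Typedef => out.push("handle_typedef"),
            Tok::Extern => out.push("handle_extern"),
            Tok::Other => {} // unrecognised tokens are skipped
        }
        iter.next(); // advance past the dispatched token
    }
    out
}

fn main() {
    let toks = [Tok::Typedef, Tok::Other, Tok::Extern];
    assert_eq!(dispatch(&toks), vec!["handle_typedef", "handle_extern"]);
}
```

Because each iteration peeks before it advances, a handler can decide what to do with the token before it is consumed, which is the property that makes this pattern easier to reason about than manual cursor arithmetic.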


107-126: Enhanced function documentation improves API clarity.

The comprehensive documentation with examples and clear parameter descriptions significantly improves the function's usability whilst maintaining API compatibility.


128-209: Successful refactoring to use TokenStream methods.

The function has been well-refactored to use TokenStream methods for cursor management and error recovery. The logic remains functionally equivalent whilst being more readable and maintainable through the use of stream.skip_until() and stream.line_end().


211-291: Clean refactoring maintaining parsing logic with improved readability.

The handle_typedef and handle_extern functions have been successfully refactored to use TokenStream methods. The logic is preserved whilst being more readable, particularly the use of stream.peek() for token type checking and stream.skip_ws_inline() for whitespace handling.
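The inline-whitespace skipping mentioned above can be sketched as a free function; `Tok` and the position-based signature are hypothetical simplifications of the stream method.

```rust
#[derive(Debug, PartialEq)]
enum Tok {
    Ws,
    Newline,
    Ident,
}

/// Skip whitespace tokens starting at `pos`, stopping before any newline,
/// so the caller can still detect the end of the current line.
fn skip_ws_inline(tokens: &[Tok], mut pos: usize) -> usize {
    while matches!(tokens.get(pos), Some(Tok::Ws)) {
        pos += 1;
    }
    pos
}

fn main() {
    let toks = [Tok::Ws, Tok::Ws, Tok::Ident, Tok::Newline];
    assert_eq!(skip_ws_inline(&toks, 0), 2);
    // A newline is never consumed.
    let toks2 = [Tok::Ws, Tok::Newline];
    assert_eq!(skip_ws_inline(&toks2, 0), 1);
}
```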


600-697: Comprehensive test updates with enhanced documentation.

The tests have been successfully updated to use TokenStream methods instead of the old helper functions. The enhanced documentation with examples significantly improves test clarity and demonstrates proper usage of the new abstraction.

* Add SpanCollector tests

* Rename test and document TokenStream usage
@leynos leynos force-pushed the codex/encapsulate-token-utility-functions branch from c26cc06 to e339ed9 on June 28, 2025 at 15:04
leynos and others added 2 commits June 28, 2025 16:30
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@leynos leynos merged commit eedce81 into main Jun 28, 2025
0 of 2 checks passed
@leynos leynos deleted the codex/encapsulate-token-utility-functions branch June 28, 2025 15:31
@coderabbitai coderabbitai Bot mentioned this pull request Jul 3, 2025