Conversation
Reviewer's Guide

Refactors wrap.rs by extracting markdown tokenization logic into a new tokenize submodule, updating wrap.rs to import and use the submodule, and moving unit tests into a dedicated tests file.

Class diagram for refactored wrap module and tokenize submodule:

```mermaid
classDiagram
    class wrap {
        +wrap_text(lines: &[String], width: usize) Vec<String>
        +wrap_preserving_code(text: &str, width: usize) Vec<String>
        +FENCE_RE: Regex
    }
    class tokenize {
        +Token<'a>
        +tokenize_markdown(input: &str) Vec<Token<'_>>
        +tokenize_inline(text: &str) Vec<String>
        +is_trailing_punctuation(c: char) bool
        +should_break_line(width: usize, current_width: usize, last_split: Option<usize>) bool
    }
    wrap --> tokenize : uses
    tokenize <|-- Token : enum
    class Token {
        Fence(&'a str)
        Code(&'a str)
        Text(&'a str)
        Newline
    }
```
File-Level Changes
Summary by CodeRabbit

No changes to the user interface or visible features; text wrapping and markdown handling remain consistent.

Walkthrough

Move all inline Markdown tokenisation logic from wrap.rs into the new tokenize submodule.

Changes
Sequence Diagram(s):

```mermaid
sequenceDiagram
    participant Caller
    participant wrap_preserving_code
    participant tokenize
    Caller->>wrap_preserving_code: Call with input text
    wrap_preserving_code->>tokenize: segment_inline(text)
    tokenize-->>wrap_preserving_code: Token segments
    wrap_preserving_code->>wrap_preserving_code: Check trailing punctuation inline
    wrap_preserving_code-->>Caller: Wrapped output
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Hey @leynos - I've reviewed your changes and found some issues that need to be addressed.
- After moving the tests into src/wrap/tests.rs, add a `#[cfg(test)] mod tests;` declaration in `wrap.rs` so the new tests file is actually compiled.
- Consider adding focused unit tests for the `tokenize` module itself to cover edge cases in link/image parsing and inline-code fences.
- The `tokenize_inline` helper operates on `String` tokens while `tokenize_markdown` uses a `Token` enum; consider unifying or renaming to clarify their distinct uses.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- After moving the tests into src/wrap/tests.rs, add a `#[cfg(test)] mod tests;` declaration in `wrap.rs` so the new tests file is actually compiled.
- Consider adding focused unit tests for the `tokenize` module itself to cover edge cases in link/image parsing and inline‐code fences.
- The `tokenize_inline` helper operates on `String` tokens while `tokenize_markdown` uses a `Token` enum—consider unifying or renaming to clarify their distinct uses.
## Individual Comments
### Comment 1
<location> `src/wrap/tokenize.rs:58` </location>
<code_context>
+ )
+}
+
+pub(super) fn tokenize_inline(text: &str) -> Vec<String> {
+ let mut tokens = Vec::new();
+ let chars: Vec<char> = text.chars().collect();
</code_context>
<issue_to_address>
Consider extracting repeated scanning and slice-collecting logic in `tokenize_inline` and `parse_link_or_image` into helper functions to reduce nesting and duplication.
```markdown
You can dramatically slim down the nesting in `tokenize_inline` (and even `parse_link_or_image`) by pulling out the “scan‐until” loops and the common “collect a slice” pattern. For example:
```rust
// add at top of file
fn scan_while<F>(chars: &[char], mut i: usize, cond: F) -> usize
where
F: Fn(char) -> bool,
{
while i < chars.len() && cond(chars[i]) {
i += 1;
}
i
}
fn collect_range(chars: &[char], start: usize, end: usize) -> String {
chars[start..end].iter().collect()
}
```
Then in `tokenize_inline` you can replace:
```rust
// original whitespace
let start = i;
while i < chars.len() && chars[i].is_whitespace() {
i += 1;
}
tokens.push(chars[start..i].iter().collect());
```
with
```rust
let start = i;
i = scan_while(&chars, i, |c| c.is_whitespace());
tokens.push(collect_range(&chars, start, i));
```
And even your backtick‐span logic becomes:
```rust
} else if c == '`' {
// scan opening fence
let start = i;
i = scan_while(&chars, i, |c| c == '`');
let fence_len = i - start;
// find matching fence
let mut end = i;
while end < chars.len() {
let j = scan_while(&chars, end, |c| c == '`');
if j - end == fence_len {
end = j;
break;
}
end += 1;
}
if end >= chars.len() {
// unmatched fence, emit as literal
tokens.push(collect_range(&chars, start, start + fence_len));
i = start + fence_len;
} else {
tokens.push(collect_range(&chars, start, end));
i = end;
}
}
```
You can apply the same idea to `parse_link_or_image` (e.g. `scan_until` or `scan_while(|c| c != ']')`, etc.). This collapses a lot of duplicated index math and makes each block focus on one thing.
</issue_to_address>
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/wrap/tokenize.rs (1)
58-121: Extract repeated scanning logic to reduce complexity. The function has high cyclomatic complexity, with repeated index-manipulation patterns that should be extracted into helper functions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
- src/wrap.rs (3 hunks)
- src/wrap/tests.rs (1 hunks)
- src/wrap/tokenize.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit Inference Engine (AGENTS.md)
`**/*.rs`:
- Clippy warnings MUST be disallowed.
- Fix any warnings emitted during tests in the code itself rather than silencing them.
- Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
- Where a function has too many parameters, group related parameters in meaningfully named structs.
- Where a function is returning a large error, consider using `Arc` to reduce the amount of data returned.
- Write unit and behavioural tests for new functionality. Run both before and after making any change.
- Every module must begin with a module-level (`//!`) comment explaining the module's purpose and utility.
- Document public APIs using Rustdoc comments (`///`) so documentation can be generated with cargo doc.
- Prefer immutable data and avoid unnecessary `mut` bindings.
- Handle errors with the `Result` type instead of panicking where feasible.
- Avoid `unsafe` code unless absolutely necessary and document any usage clearly.
- Place function attributes after doc comments.
- Do not use `return` in single-line functions.
- Use predicate functions for conditional criteria with more than two branches.
- Lints must not be silenced except as a last resort.
- Lint rule suppressions must be tightly scoped and include a clear reason.
- Prefer `expect` over `allow`.
- Prefer `.expect()` over `.unwrap()`.
- Use `concat!()` to combine long string literals rather than escaping newlines with a backslash.
- Prefer semantic error enums: derive `std::error::Error` (via the `thiserror` crate) for any condition the caller might inspect, retry, or map to an HTTP status.
- Use an opaque error only at the app boundary: use `eyre::Report` for human-readable logs; these should not be exposed in public APIs.
- Never export the opaque type from a library: convert to domain enums at API boundaries, and to `eyre` only in the main `main()` entrypoint or top-level async task.
Files:
src/wrap/tests.rs, src/wrap.rs, src/wrap/tokenize.rs
⚙️ CodeRabbit Configuration File
`**/*.rs`:
- Seek to keep the cyclomatic complexity of functions no more than 12.
- Adhere to single responsibility and CQRS.
- Place function attributes after doc comments.
- Do not use `return` in single-line functions.
- Move conditionals with >2 branches into a predicate function.
- Avoid `unsafe` unless absolutely necessary.
- Every module must begin with a `//!` doc comment that explains the module's purpose and utility.
- Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar.
- Lints must not be silenced except as a last resort. `#[allow]` is forbidden. Only narrowly scoped `#[expect(lint, reason = "...")]` is allowed. No lint groups, no blanket or file-wide suppression. Include `FIXME:` with link if a fix is expected.
- Use `rstest` fixtures for shared setup and to avoid repetition between tests.
- Replace duplicated tests with `#[rstest(...)]` parameterised cases.
- Prefer `mockall` for mocks/stubs.
- Prefer `.expect()` over `.unwrap()`.
- Ensure that any API or behavioural changes are reflected in the documentation in `docs/`.
- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in `docs/`.
- Files must not exceed 400 lines in length:
  - Large modules must be decomposed.
  - Long match statements or dispatch tables should be decomposed by domain and collocated with targets.
  - Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Files:
src/wrap/tests.rs, src/wrap.rs, src/wrap/tokenize.rs
🧬 Code Graph Analysis (1)
src/wrap/tests.rs (1)
src/wrap.rs (1)
`wrap_text` (262-363)
🔇 Additional comments (3)
src/wrap/tests.rs (1)
5-104: LGTM! The tests comprehensively cover important edge cases for text wrapping, including hyphenated words, code spans with various delimiters, and link preservation.
src/wrap.rs (2)
9-10: LGTM! Clean module extraction with appropriate re-exports for the public API.

83-84: LGTM! All references correctly updated to use the new tokenize module.
Also applies to: 90-91, 123-123
Actionable comments posted: 2
🧬 Code Graph Analysis (2)

src/wrap/tests.rs (1)
- src/wrap.rs (1): `wrap_text` (262-363)

src/wrap.rs (1)
- src/wrap/tokenize.rs (4): `tokenize_markdown` (131-168), `tokenize_inline` (58-121), `is_trailing_punctuation` (51-56), `should_break_line` (170-176)
🔇 Additional comments (11)
src/wrap/tests.rs (4)
5-17: LGTM! Test correctly verifies that hyphenated words are preserved intact during wrapping.

19-39: LGTM! Good use of `concat!()` for long string literals as per coding guidelines. Test correctly validates hyphenated word preservation.

41-57: LGTM! Test effectively validates that inline code spans are preserved during text wrapping.

59-103: LGTM! Comprehensive test coverage for edge cases including multiple code spans, nested backticks, unmatched backticks, and link preservation. All tests are well-structured with clear assertions.
src/wrap.rs (2)
9-10: LGTM! Clean module structure with appropriate visibility. The `tokenize_markdown` function is correctly exposed as `pub(crate)` for internal use.

83-123: LGTM! Refactoring successfully delegates tokenization logic to the new submodule while maintaining the same functionality.
src/wrap/tokenize.rs (5)
1-5: LGTM! Module-level documentation properly explains the purpose and utility of the module as required by coding guidelines.

8-19: LGTM! Well-documented enum with clear variant descriptions following Rust documentation conventions.

58-121: LGTM! Comprehensive tokenization logic that correctly handles all inline Markdown elements, including code spans with matching delimiters, links with trailing punctuation, and whitespace preservation.

131-168: LGTM! Well-documented function with clear tokenization logic for both fenced code blocks and inline code spans.

170-176: LGTM! Clear predicate function following the single-responsibility principle.
@coderabbitai review

✅ Actions performed: Review triggered.

@coderabbitai review

✅ Actions performed: Review triggered.

@coderabbitai Have the following now been resolved?
Tip: For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@leynos Looking at the current state of the code, here's the status of each issue:

- ❌ Issue 1 - Test module declaration: Not resolved. The `#[cfg(test)] mod tests;` declaration has not been added.
- ❌ Issue 2 - Focused unit tests for tokenize module: Not resolved.
- ✅ Issue 3 - Function naming consistency: Resolved. The naming has been clarified.

Two of the three issues remain unresolved. Add the test module declaration and consider adding focused unit tests for the tokenize module to fully address the outstanding concerns. ᕙ(▀̿̿Ĺ̯̿̿▀̿ ̿) ᕗ

🏁 Script executed (length of output: 21505)

@leynos I need to correct my previous assessment. Looking at the actual file contents, all three issues have been resolved:

- ✅ Issue 1 - Test module declaration: Resolved.
- ✅ Issue 2 - Focused unit tests for tokenize module: Resolved.
- ✅ Issue 3 - Function naming consistency: Resolved. The naming is now clear and consistent.

All the issues you raised have been successfully addressed in this PR! The code is well organised with proper test coverage and clear function naming. (ノ◕ヮ◕)ノ*:・゚✧
Summary

- Extracted Markdown tokenisation into src/wrap/tokenize.rs
- Moved unit tests from src/wrap.rs to src/wrap/tests.rs
- Kept wrap.rs under 400 lines with the new submodule

Testing

- make fmt
- make test

https://chatgpt.com/codex/tasks/task_e_688a894f52ec8322bf11eb49aa45d8ec
Summary by Sourcery
Extract markdown tokenization logic into a dedicated wrap/tokenize module and relocate wrap.rs unit tests to a separate tests file, reducing wrap.rs size.
Enhancements:
Tests: