Conversation
Reviewer's Guide
This PR introduces a helper for punctuation detection, updates the wrapping algorithm to attach trailing punctuation to preceding links, and adds a unit test to verify this behavior.
Class diagram for updated wrap logic and punctuation helper
classDiagram
class wrap {
+wrap_preserving_code(text: &str, width: usize) Vec<String>
+should_break_line(width: usize, current_width: usize, last_split: Option<usize>) bool
+is_punctuation(token: &str) bool
}
Flow diagram for updated line wrapping with punctuation handling
flowchart TD
A[Start wrapping text] --> B{Is token punctuation?}
B -- No --> C{Should break line?}
C -- Yes --> D[Push current line]
C -- No --> E[Continue accumulating]
D --> E
B -- Yes --> F{Is current line non-empty and no last split?}
F -- Yes --> G[Attach punctuation to current line]
G --> H[Push current line]
H --> I[Clear current]
F -- No --> E
I --> E
E --> B
Walkthrough
Introduce a helper function to detect ASCII punctuation tokens and update the line-wrapping logic in
Changes
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant wrap_preserving_code
participant is_punctuation
Caller->>wrap_preserving_code: Call with text and width
wrap_preserving_code->>is_punctuation: Check if token is punctuation
is_punctuation-->>wrap_preserving_code: Return true/false
alt No split point and token is punctuation
wrap_preserving_code->>wrap_preserving_code: Append token, flush line, clear buffer
else
wrap_preserving_code->>wrap_preserving_code: Continue normal wrapping
end
wrap_preserving_code-->>Caller: Return wrapped lines
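The flow in the diagrams above can be sketched roughly as follows. This is a simplified illustration, not the code in `src/wrap.rs`: `wrap_tokens` is a hypothetical stand-in for `wrap_preserving_code` that takes pre-split tokens, widths are counted in `char`s, and the punctuation set is assumed from the constant quoted later in the review.

```rust
// Assumed punctuation set, copied from the constant quoted in the review.
const TRAILING_PUNCTUATION: &[char] = &['.', ',', ';', ':', '?', '!', '…'];

fn is_punctuation(token: &str) -> bool {
    !token.is_empty() && token.chars().all(|c| TRAILING_PUNCTUATION.contains(&c))
}

/// Hypothetical greedy wrapper over pre-split tokens (words, whitespace
/// runs, punctuation). Not the real `wrap_preserving_code`.
fn wrap_tokens(tokens: &[&str], width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    // Byte offset of the most recent whitespace run in `current`, if any.
    let mut last_split: Option<usize> = None;

    for &token in tokens {
        let is_space = token.chars().all(char::is_whitespace);
        let overflow = current.chars().count() + token.chars().count() > width;
        if overflow && !current.is_empty() {
            match last_split {
                // No earlier break point: keep the punctuation with the
                // preceding token, flush the line and clear the buffer.
                None if is_punctuation(token) => {
                    current.push_str(token);
                    lines.push(std::mem::take(&mut current));
                    continue;
                }
                // Break at the last whitespace and carry the tail over.
                Some(pos) if !is_space => {
                    let tail = current[pos..].trim_start().to_string();
                    current.truncate(pos);
                    lines.push(current.trim_end().to_string());
                    current = tail;
                }
                // Otherwise flush the whole buffer.
                _ => lines.push(std::mem::take(&mut current).trim_end().to_string()),
            }
            last_split = None;
        }
        if is_space {
            if current.is_empty() {
                continue; // drop whitespace at the start of a line
            }
            last_split = Some(current.len());
        }
        current.push_str(token);
    }
    if !current.trim().is_empty() {
        lines.push(current.trim_end().to_string());
    }
    lines
}
```

With a width of 14, the tokens `"See"`, `" "`, `"[link](foo.md)"`, `"."` wrap to `See` and `[link](foo.md).`, so the full stop stays attached to the link at the cost of a one-column overflow.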
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~15 minutes
Hey @leynos - I've reviewed your changes and found some issues that need to be addressed.
- is_punctuation currently treats any ASCII punctuation token (e.g. ‘[’ or ‘]’) as a break—consider narrowing that to just the typical trailing punctuation (.,;:?!…) or using Unicode punctuation categories so you don’t misclassify markdown delimiters.
- Rather than a separate special‐case block for punctuation after wrapping, you could fold punctuation detection into your existing last_split logic to streamline the flow and avoid extra continues.
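A minimal sketch of the first point, assuming the original check resembled `char::is_ascii_punctuation`. The function names and the narrowed set are illustrative rather than taken from the PR.

```rust
// Narrowed set of characters that typically trail a link (an assumption).
const TRAILING_PUNCTUATION: &[char] = &['.', ',', ';', ':', '?', '!', '…'];

/// Broad check: true for any token made only of ASCII punctuation,
/// which includes Markdown delimiters such as `[`, `]`, `*` and `_`.
fn is_ascii_punctuation_token(token: &str) -> bool {
    !token.is_empty() && token.chars().all(|c| c.is_ascii_punctuation())
}

/// Narrow check: true only for tokens made of typical trailing marks.
fn is_trailing_punctuation(token: &str) -> bool {
    !token.is_empty() && token.chars().all(|c| TRAILING_PUNCTUATION.contains(&c))
}
```

The broad check accepts `"["` and rejects the non-ASCII `"…"`, while the narrow check does the opposite, which is the misclassification the comment describes.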
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `tests/wrap.rs:478` </location>
<code_context>
);
}
+
+/// Ensures that punctuation following a wrapped link is not orphaned.
+#[test]
+fn test_wrap_link_trailing_punctuation() {
+ let input = lines_vec.",
+ ];
+ let expected = lines_vec.",
+ ];
+ let output = process_stream(&input);
+ assert_eq!(output, expected);
+}
</code_context>
<issue_to_address>
Consider adding tests for multiple and leading punctuation cases.
Please add tests for multiple trailing punctuation marks, various punctuation types, and punctuation before the link to ensure comprehensive coverage.
Suggested implementation:
```rust
/// Ensures that punctuation following a wrapped link is not orphaned.
#[test]
fn test_wrap_link_trailing_punctuation() {
let input = lines_vec.",
];
let expected = lines_vec.",
];
let output = process_stream(&input);
assert_eq!(output, expected);
}
/// Ensures that multiple trailing punctuation marks after a wrapped link are not orphaned.
#[test]
fn test_wrap_link_multiple_trailing_punctuation() {
let input = lines_vec!?",
];
let expected = lines_vec!?",
];
let output = process_stream(&input);
assert_eq!(output, expected);
}
/// Ensures that various punctuation types after a wrapped link are not orphaned.
#[test]
fn test_wrap_link_various_trailing_punctuation() {
let input = lines_vec:",
"See [note](baz.md)...",
"Alert [warn](warn.md);",
];
let expected = lines_vec:",
"See",
"[note](baz.md)...",
"Alert",
"[warn](warn.md);",
];
let output = process_stream(&input);
assert_eq!(output, expected);
}
/// Ensures that punctuation before a link is handled correctly.
#[test]
fn test_wrap_link_leading_punctuation() {
let input = lines_vec\" is important.",
"([Parenthesized link](paren.md)) is here.",
];
let expected = lines_vec\" is important.",
"([Parenthesized link](paren.md)) is here.",
];
let output = process_stream(&input);
assert_eq!(output, expected);
}
/// Ensures that both leading and trailing punctuation around a link are handled.
#[test]
fn test_wrap_link_leading_and_trailing_punctuation() {
let input = lines_vec!\"",
"([Another](bar.md)?)",
];
let expected = lines_vec!\"",
"([Another](bar.md)?)",
];
let output = process_stream(&input);
assert_eq!(output, expected);
}
```
</issue_to_address>
@coderabbitai review
✅ Actions performed
Review triggered.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/wrap.rs (2 hunks)
tests/wrap.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
⚙️ CodeRabbit Configuration File
**/*.rs:
- Seek to keep the cyclomatic complexity of functions no more than 12.
- Adhere to single responsibility and CQRS.
- Place function attributes after doc comments.
- Do not use `return` in single-line functions.
- Move conditionals with >2 branches into a predicate function.
- Avoid `unsafe` unless absolutely necessary.
- Every module must begin with a `//!` doc comment that explains the module's purpose and utility.
- Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar.
- Lints must not be silenced except as a last resort.
  - `#[allow]` is forbidden.
  - Only narrowly scoped `#[expect(lint, reason = "...")]` is allowed.
  - No lint groups, no blanket or file-wide suppression.
  - Include `FIXME:` with link if a fix is expected.
- Use `rstest` fixtures for shared setup and to avoid repetition between tests.
- Replace duplicated tests with `#[rstest(...)]` parameterised cases.
- Prefer `mockall` for mocks/stubs.
- Prefer `.expect()` over `.unwrap()`.
- Ensure that any API or behavioural changes are reflected in the documentation in `docs/`.
- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in `docs/`.
- Files must not exceed 400 lines in length.
  - Large modules must be decomposed.
  - Long match statements or dispatch tables should be decomposed by domain and collocated with targets.
  - Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Files:
- src/wrap.rs
- tests/wrap.rs
🧬 Code Graph Analysis (1)
tests/wrap.rs (1)
src/process.rs (1)
process_stream(206-214)
🔇 Additional comments (6)
src/wrap.rs (1)
196-196: LGTM! The `is_punctuation` function correctly checks if all characters in a token are trailing punctuation marks. The implementation is concise and follows the single responsibility principle.
tests/wrap.rs (5)
478-492: LGTM! The test correctly verifies that trailing punctuation after a link remains attached to the link line rather than being orphaned. The test structure is clear and follows good testing practices.
494-504: LGTM! The test effectively verifies that multiple trailing punctuation marks (`!?`) are kept with the link. The assertions correctly check both positive (punctuation stays with link) and negative (punctuation not orphaned) conditions.
506-531: LGTM! The test comprehensively covers various punctuation types (colon, ellipsis, semicolon) and ensures none are orphaned on separate lines. The test structure efficiently validates multiple scenarios in a single test.
533-550: LGTM! The test properly verifies that leading punctuation (quotation marks, parentheses) before links is handled correctly without creating orphaned punctuation lines. The assertions are appropriate for the test scenario.
552-565: LGTM! The test effectively validates the combined scenario of both leading and trailing punctuation around links. This ensures comprehensive coverage of punctuation handling edge cases.
After flushing a line, wrap_preserving_code attempted to check for whitespace again even though the token could not be whitespace. The extra branch was unreachable and made last_split updates confusing. The value is now reset to None unconditionally.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/wrap.rs (2)
194-194: Include quotation marks and parentheses in the punctuation character set.
The current `TRAILING_PUNCTUATION` constant lacks quotation marks and parentheses, which are common trailing punctuation around links. Based on test cases and typical usage patterns, these characters should be considered trailing punctuation.
Apply this diff to include the missing punctuation characters:
-const TRAILING_PUNCTUATION: &[char] = &['.', ',', ';', ':', '?', '!', '…'];
+const TRAILING_PUNCTUATION: &[char] = &['.', ',', ';', ':', '?', '!', '…', '"', '\'', '(', ')'];
237-245: Add width constraint to prevent line overflow.
The punctuation attachment logic correctly prevents orphaning punctuation marks, but it doesn't verify that appending the punctuation token won't exceed the target width. This could result in lines longer than intended.
Apply this diff to add the width constraint:
-    if last_split.is_none() && is_punctuation(&token) {
+    if last_split.is_none() && is_punctuation(&token) && token_width <= width {
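To make the width discussion concrete, here is a small sketch under stated assumptions: widths are counted in `char`s, and `should_attach` and `overflow_after_attach` are hypothetical helpers that do not exist in the PR. `should_attach` mirrors the guard from the suggested diff. `overflow_after_attach` reports how far a line would exceed the target once the token is appended.

```rust
// Assumed punctuation set, copied from the constant quoted in the review.
const TRAILING_PUNCTUATION: &[char] = &['.', ',', ';', ':', '?', '!', '…'];

fn is_punctuation(token: &str) -> bool {
    !token.is_empty() && token.chars().all(|c| TRAILING_PUNCTUATION.contains(&c))
}

/// Hypothetical predicate mirroring the suggested guard: attach only when
/// there is no split point, the token is punctuation, and the token on its
/// own fits within `width`.
fn should_attach(last_split: Option<usize>, token: &str, width: usize) -> bool {
    let token_width = token.chars().count();
    last_split.is_none() && is_punctuation(token) && token_width <= width
}

/// Number of columns by which the line exceeds `width` after attaching
/// `token` to a line that is already `current_width` columns wide.
fn overflow_after_attach(current_width: usize, token: &str, width: usize) -> usize {
    (current_width + token.chars().count()).saturating_sub(width)
}
```

Note that a guard of the form `token_width <= width` bounds only the token itself. A 14-column link followed by `.` at width 14 still passes the guard and produces a 15-column line, which may be an acceptable trade-off for avoiding an orphaned full stop.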
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
src/wrap.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.rs: Comments must use en-GB-oxendict spelling and grammar.
Function documentation must include clear examples.
Every module must begin with a module level (`//!`) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (`///`) so documentation can be generated with cargo doc.
Place function attributes after doc comments.
Do not use `return` in single-line functions.
Prefer immutable data and avoid unnecessary `mut` bindings.
Handle errors with the `Result` type instead of panicking where feasible.
Avoid `unsafe` code unless absolutely necessary and document any usage clearly.
Clippy warnings MUST be disallowed.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer `expect` over `allow`.
Prefer `.expect()` over `.unwrap()`.
Use `concat!()` to combine long string literals rather than escaping newlines with a backslash.
Use predicate functions for conditional criteria with more than two branches.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using `Arc` to reduce the amount of data returned.
Comment why, not what. Explain assumptions, edge cases, trade-offs, or complexity. Don't echo the obvious.
Use functions and composition. Avoid repetition by extracting reusable logic. Prefer generators or comprehensions, and declarative code to imperative repetition when readable.
Small, meaningful functions. Functions must be small, clear in purpose, single responsibility, and obey command/query segregation.
Name things precisely. Use clear, descriptive variable and function names. For booleans, prefer names with `is`, `has`, or `should`.
Structure log...
Files:
src/wrap.rs
⏰ Context from checks skipped due to timeout of 240000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-test
🔇 Additional comments (1)
src/wrap.rs (1)
254-254: Verify unconditional `last_split` resets in `src/wrap.rs`
Unconditional resets at lines 243 and 254 clear any previously recorded split point after handling punctuation or at the end of each token iteration. Ensure this change does not regress whitespace-based breaks:
- Inspect the whitespace branch at lines 210–214 where `last_split` is set on all-whitespace tokens.
- Review the resets at line 243 (inside the punctuation handler) and line 254 (fallback for all other tokens).
- Run the wrapping test suite, paying special attention to scenarios with consecutive spaces or tabs:
cargo test -- wrap_text_unmatched_backticks wrap_text_preserves_hyphenated_words wrap_text_preserves_code_spans
cargo test -- test_wrap_list_item test_wrap_blockquote test_wrap_hard_linebreak_backslash
Confirm all tests pass to validate unchanged whitespace splitting behaviour.
const TRAILING_PUNCTUATION: &[char] = &['.', ',', ';', ':', '?', '!', '…'];

fn is_punctuation(token: &str) -> bool { token.chars().all(|c| TRAILING_PUNCTUATION.contains(&c)) }
🛠️ Refactor suggestion
Improve function documentation and consider edge cases.
The is_punctuation function lacks documentation and doesn't handle empty strings explicitly. Add rustdoc comments explaining the function's purpose and behaviour.
Apply this diff to add documentation and improve robustness:
-fn is_punctuation(token: &str) -> bool { token.chars().all(|c| TRAILING_PUNCTUATION.contains(&c)) }
+/// Check if a token consists entirely of trailing punctuation characters.
+///
+/// Returns `true` if all characters in the token are found in `TRAILING_PUNCTUATION`,
+/// and `false` for empty strings.
+///
+/// # Examples
+///
+/// ```
+/// # use mdtablefix::wrap::is_punctuation;
+/// assert!(is_punctuation(".,!"));
+/// assert!(!is_punctuation("word."));
+/// assert!(!is_punctuation(""));
+/// ```
+fn is_punctuation(token: &str) -> bool {
+ !token.is_empty() && token.chars().all(|c| TRAILING_PUNCTUATION.contains(&c))
+}
🤖 Prompt for AI Agents
In src/wrap.rs at line 196, the is_punctuation function lacks documentation and
does not explicitly handle empty strings. Add rustdoc comments above the
function to explain its purpose and usage with examples. Modify the function to
return false for empty strings by checking if the token is not empty before
verifying all characters are punctuation.
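The explicit empty-string check matters because `Iterator::all` returns `true` for an empty iterator. The sketch below contrasts the two forms. `is_punctuation_unchecked` is a hypothetical name for the pre-change behaviour, used here only for comparison.

```rust
// Punctuation set as quoted in the review.
const TRAILING_PUNCTUATION: &[char] = &['.', ',', ';', ':', '?', '!', '…'];

/// Pre-change form: `all` is vacuously true when the token has no chars,
/// so an empty token is reported as punctuation.
fn is_punctuation_unchecked(token: &str) -> bool {
    token.chars().all(|c| TRAILING_PUNCTUATION.contains(&c))
}

/// Suggested form: rule out the empty token before checking every char.
fn is_punctuation(token: &str) -> bool {
    !token.is_empty() && token.chars().all(|c| TRAILING_PUNCTUATION.contains(&c))
}
```

Both functions agree on every non-empty token and differ only on `""`.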
Summary
Testing
make fmt
make lint
make test
https://chatgpt.com/codex/tasks/task_e_687f3b34074c8322bef6bb579224f3f8
Summary by Sourcery
Fix link wrapping to prevent orphaned punctuation by introducing punctuation detection and updating the wrapping logic