Conversation
|
Warning Rate limit exceeded@leynos has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 0 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (3)
Summary by CodeRabbit
WalkthroughIntroduce logic to treat trailing punctuation after Markdown links and images as separate tokens, but merge them with the link during line wrapping to prevent undesirable line breaks. Update documentation to describe this behaviour and add comprehensive tests to verify correct handling of various punctuation scenarios. Changes
Sequence Diagram(s)sequenceDiagram
participant Input as Markdown Input
participant Tokenizer as Tokenizer
participant Wrapper as Line Wrapper
participant Output as Wrapped Output
Input->>Tokenizer: Parse Markdown line
Tokenizer->>Tokenizer: Detect link or image
Tokenizer->>Tokenizer: Identify trailing punctuation after link/image
Tokenizer->>Tokenizer: Tokenise link/image and punctuation separately
Tokenizer->>Wrapper: Pass token list
Wrapper->>Wrapper: Iterate tokens
Wrapper->>Wrapper: If link token followed by punctuation token, merge them
Wrapper->>Output: Emit wrapped line with link and punctuation together
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Possibly related PRs
Poem
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Reviewer's GuideThe PR enhances the link-wrapping logic to treat punctuation immediately following a link as part of the link token, preventing unwanted breaks, and adds a regression test to verify this behavior. Class diagram for updated link parsing logic in wrap.rsclassDiagram
class wrap {
+parse_link_or_image(chars: &[char], i: usize) -> (String, usize)
}
%% The parse_link_or_image function now includes logic to treat trailing punctuation (.,!?:;) as part of the link token.
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes and found some issues that need to be addressed.
- The new logic only handles a single trailing punctuation character, so consider extending it to capture sequences like multiple punctuation marks or ellipses.
- Rather than merging punctuation into the link token, you might treat it as a separate token during wrapping to avoid changing the actual link syntax.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new logic only handles a single trailing punctuation character, so consider extending it to capture sequences like multiple punctuation marks or ellipses.
- Rather than merging punctuation into the link token, you might treat it as a separate token during wrapping to avoid changing the actual link syntax.
## Individual Comments
### Comment 1
<location> `tests/wrap.rs:481` </location>
<code_context>
+/// Ensures punctuation immediately following a link remains attached when
+/// wrapping lines.
+#[test]
+fn test_wrap_link_with_trailing_punctuation() {
+ let input = lines_vec.",
+ ];
+ let output = process_stream(&input);
+ assert_eq!(output, input);
+}
</code_context>
<issue_to_address>
Consider adding more test cases for different punctuation and edge cases.
Please add tests for links followed by various punctuation marks, links at line ends, links without trailing punctuation, and links with punctuation inside the link text or URL to improve coverage.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
/// Ensures punctuation immediately following a link remains attached when
/// wrapping lines.
#[test]
fn test_wrap_link_with_trailing_punctuation() {
let input = lines_vec.",
];
let output = process_stream(&input);
assert_eq!(output, input);
}
=======
/// Ensures punctuation immediately following a link remains attached when
/// wrapping lines.
#[test]
fn test_wrap_link_with_trailing_punctuation() {
let input = lines_vec.",
];
let output = process_stream(&input);
assert_eq!(output, input);
}
/// Test links followed by various punctuation marks.
#[test]
fn test_wrap_link_with_various_trailing_punctuation() {
let punctuations = [".", ",", ";", ":", "!", "?", ")", "]", "\"", "'"];
for &punct in &punctuations {
let input = lines_vec{}",
punct
),
];
let output = process_stream(&input);
assert_eq!(output, input, "Failed for punctuation: {}", punct);
}
}
/// Test link at the end of a line (no trailing punctuation).
#[test]
fn test_wrap_link_at_line_end() {
let input = lines_vec",
];
let output = process_stream(&input);
assert_eq!(output, input);
}
/// Test link with punctuation inside the link text.
#[test]
fn test_wrap_link_with_punctuation_in_text() {
let input = lines_vec",
"[foo, bar](https://example.com)",
];
let output = process_stream(&input);
assert_eq!(output, input);
}
/// Test link with punctuation inside the URL.
#[test]
fn test_wrap_link_with_punctuation_in_url() {
let input = lines_vec",
"[link](https://example.com/hello!world)",
];
let output = process_stream(&input);
assert_eq!(output, input);
}
/// Test link with no trailing punctuation and no special cases.
#[test]
fn test_wrap_link_without_trailing_punctuation() {
let input = lines_vec",
];
let output = process_stream(&input);
assert_eq!(output, input);
}
>>>>>>> REPLACE
</suggested_fix>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
tests/wrap.rs (3)
478-488: LGTM! Comprehensive test for basic trailing punctuation.The test correctly verifies that links with trailing punctuation remain intact during wrapping.
490-503: LGTM! Excellent parameterised testing approach.The rstest parameterisation efficiently tests multiple punctuation marks, ensuring comprehensive coverage whilst avoiding code duplication.
505-527: LGTM! Thorough edge case coverage.The additional tests properly cover links at line ends, punctuation within link text, and punctuation within URLs. This comprehensive test suite ensures the feature works correctly across various scenarios.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
docs/architecture.md(1 hunks)src/wrap.rs(3 hunks)tests/wrap.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
docs/**/*.md
📄 CodeRabbit Inference Engine (AGENTS.md)
docs/**/*.md: Reference: Use the markdown files within thedocs/directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Update: When new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve, proactively update the relevant file(s) in thedocs/directory to reflect the latest state. Ensure the documentation remains accurate and current.
Files:
docs/architecture.md
**/*.md
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.md: Documentation must use en-GB-oxendict spelling and grammar. (EXCEPTION: the naming of the "LICENSE" file, which is to be left unchanged for community consistency.)
Validate Markdown files usingmake markdownlint.
Runmake fmtafter any documentation changes to format all Markdown files and fix table markup.
Validate Mermaid diagrams in Markdown files by runningmake nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks must be wrapped at 120 columns.
Tables and headings must not be wrapped.
Use dashes (-) for list bullets.
Use GitHub-flavoured Markdown footnotes ([^1]) for references and footnotes.
Files:
docs/architecture.md
⚙️ CodeRabbit Configuration File
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
- Code blocks should be wrapped to 120 columns.
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/architecture.md
**/*.rs
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.rs: Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments.
Do not usereturnin single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallow.
Prefer.expect()over.unwrap().
Useconcat!()to combine long string literals rather than escaping newlines with a backslash.
Prefer semantic error enums: Derivestd::error::Error(via thethiserrorcrate) for any condition the caller might inspect, retry, or map to an HTTP status.
Use an opaque error only at the app boundary: Useeyre::Reportfor human-readable logs; these should not be exposed in public APIs.
Never export the opaque type from a library: Convert to domain enums at API boundaries, and toeyreonly in the mainmain()entrypoint or top-level async task.
Files:
src/wrap.rstests/wrap.rs
⚙️ CodeRabbit Configuration File
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Files:
src/wrap.rstests/wrap.rs
🧬 Code Graph Analysis (1)
tests/wrap.rs (1)
src/process.rs (1)
process_stream(206-214)
🪛 LanguageTool
docs/architecture.md
[style] ~313-~313: Would you like to use the Oxford spelling “tokenized”? The spelling ‘tokenised’ is also correct.
Context: ...y following a Markdown link or image is tokenised separately and grouped with the link wh...
(OXFORD_SPELLING_Z_NOT_S)
[uncategorized] ~321-~321: Possible missing comma found.
Context: ...markdown link. ``` on a single line rather than splitting the punctuation o...
(AI_HYDRA_LEO_MISSING_COMMA)
🔇 Additional comments (3)
src/wrap.rs (3)
65-70: LGTM! Comprehensive punctuation character set.The function correctly identifies common trailing punctuation characters that should remain attached to links during wrapping.
115-125: LGTM! Proper separation and collection of trailing punctuation.The logic correctly collects consecutive trailing punctuation characters immediately following link parsing and adds them as separate tokens. This enables the wrapping logic to merge them back with the link token when needed.
217-233: Confirm link detection pattern robustnessTests in tests/wrap.rs exercise nested parentheses in URLs, image links, trailing punctuation, and links at line ends. No further adjustments required.
|
@coderabbitai Have these now been resolved:
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 18912 🏁 Script executed: Length of output: 21597 @leynos Looking at the current implementation, I can answer both questions: Issue 1 - Multiple punctuation handling: ✅ RESOLVED The logic now handles sequences of multiple punctuation characters correctly. In while new_i < chars.len() && is_trailing_punctuation(chars[new_i]) {
punct.push(chars[new_i]);
new_i += 1;
}And in while j < tokens.len() && tokens[j].chars().all(is_trailing_punctuation) {
token.push_str(&tokens[j]);
token_width += UnicodeWidthStr::width(tokens[j].as_str());
j += 1;
}The tests confirm this works - there's a parameterized test that includes Issue 2 - Separate tokens during wrapping: ❌ NOT RESOLVED The implementation still merges punctuation into the link token during wrapping rather than treating them as separate tokens. The current approach:
So while the tokenization creates separate tokens, the wrapping phase combines them to ensure they stay together, which maintains the original merging strategy you had suggested moving away from. |
461e191 to
3d9d19b
Compare
|
Please fix the issues in the code resulting in the following compile errors: error[E0425]: cannot find value |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain@leynos Looking at the compile errors, there are variable scope issues in Based on the error messages, the issue is that // Around lines 280-295, change the references from `token` to `tokens[i]`
if i < tokens.len()
&& tokens[i].len() == 1
&& ".?!,:;".contains(tokens[i].as_str())
{
// This is trailing punctuation, keep it with the link
current_line
.last_mut()
.unwrap()
.push_str(&tokens[i]);The issue is that the code is trying to access a variable Run 🏁 Script executed: Length of output: 3308 🏁 Script executed: Length of output: 184 Now I can see the issue clearly. The problem is on lines 282, 283, and 291 where if current.is_empty()
&& tokens[i].len() == 1
&& ".?!,:;".contains(tokens[i].as_str())
&& lines
.last()
.is_some_and(|l: &String| l.trim_end().ends_with('`'))
{
lines
.last_mut()
.expect("checked last line exists")
.push_str(&tokens[i]);
continue;
}The fix is simple: replace all three instances of Specifically:
Apply these changes and run |
Replaces incorrect local variable usage with correct token reference when appending punctuation to wrapped lines. Prevents potential logic errors when preserving code blocks containing punctuation.
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
src/wrap.rs (1)
282-293: Fix compilation error: undefined variabletoken.The variable
tokenis not defined in this scope. Usetokens[i]instead.if current.is_empty() - && token.len() == 1 - && ".?!,:;".contains(token.as_str()) + && tokens[i].len() == 1 + && ".?!,:;".contains(tokens[i].as_str()) && lines .last() .is_some_and(|l: &String| l.trim_end().ends_with('`')) { lines .last_mut() .expect("checked last line exists") - .push_str(&token); + .push_str(&tokens[i]); + i += 1; continue; }
♻️ Duplicate comments (1)
docs/architecture.md (1)
321-322: Add comma for improved readability.-on a single line rather than splitting the punctuation onto the next line when +on a single line, rather than splitting the punctuation onto the next line when
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
docs/architecture.md(1 hunks)src/wrap.rs(6 hunks)tests/wrap.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
docs/**/*.md
📄 CodeRabbit Inference Engine (AGENTS.md)
docs/**/*.md: Reference: Use the markdown files within thedocs/directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Update: When new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve, proactively update the relevant file(s) in thedocs/directory to reflect the latest state. Ensure the documentation remains accurate and current.
Files:
docs/architecture.md
**/*.md
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.md: Documentation must use en-GB-oxendict spelling and grammar. (EXCEPTION: the naming of the "LICENSE" file, which is to be left unchanged for community consistency.)
Validate Markdown files usingmake markdownlint.
Runmake fmtafter any documentation changes to format all Markdown files and fix table markup.
Validate Mermaid diagrams in Markdown files by runningmake nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks must be wrapped at 120 columns.
Tables and headings must not be wrapped.
Use dashes (-) for list bullets.
Use GitHub-flavoured Markdown footnotes ([^1]) for references and footnotes.
Files:
docs/architecture.md
⚙️ CodeRabbit Configuration File
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
- Code blocks should be wrapped to 120 columns.
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/architecture.md
**/*.rs
📄 CodeRabbit Inference Engine (AGENTS.md)
**/*.rs: Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments.
Do not usereturnin single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallow.
Prefer.expect()over.unwrap().
Useconcat!()to combine long string literals rather than escaping newlines with a backslash.
Prefer semantic error enums: Derivestd::error::Error(via thethiserrorcrate) for any condition the caller might inspect, retry, or map to an HTTP status.
Use an opaque error only at the app boundary: Useeyre::Reportfor human-readable logs; these should not be exposed in public APIs.
Never export the opaque type from a library: Convert to domain enums at API boundaries, and toeyreonly in the mainmain()entrypoint or top-level async task.
Files:
src/wrap.rstests/wrap.rs
⚙️ CodeRabbit Configuration File
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Files:
src/wrap.rstests/wrap.rs
🧬 Code Graph Analysis (1)
tests/wrap.rs (1)
src/process.rs (1)
process_stream(206-214)
🪛 LanguageTool
docs/architecture.md
[style] ~313-~313: Would you like to use the Oxford spelling “tokenized”? The spelling ‘tokenised’ is also correct.
Context: ...y following a Markdown link or image is tokenised separately and grouped with the link wh...
(OXFORD_SPELLING_Z_NOT_S)
🪛 GitHub Actions: CI
src/wrap.rs
[error] 282-282: Rust compiler error E0425: cannot find value token in this scope.
🔇 Additional comments (6)
tests/wrap.rs (2)
478-503: Comprehensive test coverage for link punctuation handling.The tests effectively cover the main use cases and edge cases for the new link punctuation handling feature, including multiple trailing punctuation characters as mentioned in the PR objectives.
505-527: Well-structured edge case tests.These tests appropriately cover important edge cases: links at line end without punctuation, punctuation within link text, and punctuation in URLs.
src/wrap.rs (4)
117-123: Well-defined trailing punctuation helper.The function correctly identifies common trailing punctuation characters using idiomatic pattern matching.
167-177: Correct implementation of trailing punctuation tokenization.The code properly collects consecutive trailing punctuation characters into a separate token after links/images, enabling the grouping behaviour during wrapping.
269-280: Effective grouping of links with trailing punctuation.The logic correctly identifies link tokens and groups them with subsequent punctuation-only tokens, calculating the combined width for atomic wrapping decisions.
295-305: Proper handling of grouped tokens within line width.The code correctly appends all tokens in the group when they fit within the line width, maintaining the last split position for whitespace tokens.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Summary
Testing
make fmtmake lintmake testhttps://chatgpt.com/codex/tasks/task_e_68880336f7b48322baf57d4482cc75eb
Summary by Sourcery
Include trailing punctuation in link tokens during wrapping and add a regression test to verify the behavior
Bug Fixes:
Tests: