Fix link wrapping #128

Merged
leynos merged 6 commits into main from codex/add-test-for-markdown-wrapping-behavior
Jul 30, 2025

Conversation

@leynos
Owner

@leynos leynos commented Jul 28, 2025

Summary

  • keep punctuation attached to links when wrapping text
  • add regression test to cover the behaviour

Testing

  • make fmt
  • make lint
  • make test

https://chatgpt.com/codex/tasks/task_e_68880336f7b48322baf57d4482cc75eb

Summary by Sourcery

Include trailing punctuation in link tokens during wrapping and add a regression test to verify the behavior

Bug Fixes:

  • Treat punctuation immediately following a link as part of the link token so it stays attached during text wrapping

Tests:

  • Add regression test to ensure trailing punctuation remains attached to links when wrapping lines

@coderabbitai
Contributor

coderabbitai Bot commented Jul 28, 2025

Warning

Rate limit exceeded

@leynos has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 0 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 3d9d19b and 9988759.

📒 Files selected for processing (3)
  • AGENTS.md (1 hunks)
  • docs/architecture.md (1 hunks)
  • src/wrap.rs (5 hunks)

Summary by CodeRabbit

  • Documentation
    • Added a new section explaining how trailing punctuation after Markdown links or images is handled during line wrapping, with examples.
  • Bug Fixes
    • Improved line wrapping to ensure punctuation marks following links or images remain grouped with the link, preventing awkward line breaks.
  • Tests
    • Introduced several new tests to verify correct handling of links with trailing punctuation and punctuation within link text or URLs.

Walkthrough

Introduce logic to treat trailing punctuation after Markdown links and images as separate tokens, but merge them with the link during line wrapping to prevent undesirable line breaks. Update documentation to describe this behaviour and add comprehensive tests to verify correct handling of various punctuation scenarios.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Documentation update: docs/architecture.md | Add a section explaining link punctuation handling, with examples and rationale for grouping punctuation with links during wrapping. |
| Punctuation tokenisation and wrapping logic: src/wrap.rs | Implement is_trailing_punctuation helper; update tokenisation to separate trailing punctuation after links/images; adjust wrapping logic to merge such punctuation with links. |
| Tests for link wrapping with punctuation: tests/wrap.rs | Add multiple tests verifying that links followed by punctuation, or with punctuation inside link text/URLs, are wrapped correctly and not split. |

Sequence Diagram(s)

sequenceDiagram
    participant Input as Markdown Input
    participant Tokenizer as Tokenizer
    participant Wrapper as Line Wrapper
    participant Output as Wrapped Output

    Input->>Tokenizer: Parse Markdown line
    Tokenizer->>Tokenizer: Detect link or image
    Tokenizer->>Tokenizer: Identify trailing punctuation after link/image
    Tokenizer->>Tokenizer: Tokenise link/image and punctuation separately
    Tokenizer->>Wrapper: Pass token list
    Wrapper->>Wrapper: Iterate tokens
    Wrapper->>Wrapper: If link token followed by punctuation token, merge them
    Wrapper->>Output: Emit wrapped line with link and punctuation together

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • Fix link wrapping #99: Implements atomic tokenisation of Markdown links, which this PR extends by introducing special handling for trailing punctuation tokens during wrapping.

Poem

When links and dots walk hand in hand,
No longer split across the land.
Punctuation hugs its friendly link,
Together wrapped, just as you think.
With tests and docs, the work is done—
Markdown tidied, line by line,
Wrapping links is now divine!


@sourcery-ai
Contributor

sourcery-ai Bot commented Jul 28, 2025

Reviewer's Guide

The PR enhances the link-wrapping logic to treat punctuation immediately following a link as part of the link token, preventing unwanted breaks, and adds a regression test to verify this behavior.

Class diagram for updated link parsing logic in wrap.rs

classDiagram
    class wrap {
        +parse_link_or_image(chars: &[char], i: usize) -> (String, usize)
    }
    %% The parse_link_or_image function now includes logic to treat trailing punctuation (.,!?:;) as part of the link token.

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Treat trailing punctuation as part of the link token | Extend parse_link_or_image to check for and include '.', ',', '!', '?', ':', ';' after a link | src/wrap.rs |
| Add regression test for link wrapping with trailing punctuation | Introduce test_wrap_link_with_trailing_punctuation to assert punctuation remains attached to links | tests/wrap.rs |
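
The guide above describes the first iteration of the fix, in which parse_link_or_image absorbs punctuation into the link token. A minimal sketch of that idea follows, assuming the signature shown in the class diagram. The body is an illustration, not the code in src/wrap.rs, and it deliberately absorbs only a single character, which is the limitation the review raises later in this thread.

```rust
/// Hypothetical sketch: parse `[text](url)` or `![alt](url)` starting at `i`.
/// Returns the token and the index just past it. Falls back to a
/// one-character token when no well-formed link starts at `i`.
/// Assumes `i < chars.len()`.
fn parse_link_or_image(chars: &[char], i: usize) -> (String, usize) {
    let fallback = (chars[i].to_string(), i + 1);
    let mut j = i;
    if chars[j] == '!' {
        j += 1;
    }
    if chars.get(j) != Some(&'[') {
        return fallback;
    }
    // Find the closing bracket of the link text.
    while j < chars.len() && chars[j] != ']' {
        j += 1;
    }
    if chars.get(j + 1) != Some(&'(') {
        return fallback;
    }
    // Walk the URL, tracking nested parentheses.
    j += 2;
    let mut depth = 1;
    while j < chars.len() && depth > 0 {
        match chars[j] {
            '(' => depth += 1,
            ')' => depth -= 1,
            _ => {}
        }
        j += 1;
    }
    if depth != 0 {
        return fallback;
    }
    // Absorb one trailing punctuation character into the link token.
    if chars.get(j).is_some_and(|c| ".,!?:;".contains(*c)) {
        j += 1;
    }
    (chars[i..j].iter().collect(), j)
}
```

With this shape, a link followed by an ellipsis keeps only the first dot attached, so the remaining dots can still be separated from the link at a line break.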

Contributor

@sourcery-ai sourcery-ai Bot left a comment

Hey @leynos - I've reviewed your changes and found some issues that need to be addressed.

  • The new logic only handles a single trailing punctuation character, so consider extending it to capture sequences like multiple punctuation marks or ellipses.
  • Rather than merging punctuation into the link token, you might treat it as a separate token during wrapping to avoid changing the actual link syntax.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new logic only handles a single trailing punctuation character, so consider extending it to capture sequences like multiple punctuation marks or ellipses.
- Rather than merging punctuation into the link token, you might treat it as a separate token during wrapping to avoid changing the actual link syntax.

## Individual Comments

### Comment 1
<location> `tests/wrap.rs:481` </location>
<code_context>
+/// Ensures punctuation immediately following a link remains attached when
+/// wrapping lines.
+#[test]
+fn test_wrap_link_with_trailing_punctuation() {
+    let input = lines_vec![
+        "[`rust-multithreaded-logging-framework-for-python-design.md`](./\
+         rust-multithreaded-logging-framework-for-python-design.md).",
+    ];
+    let output = process_stream(&input);
+    assert_eq!(output, input);
+}
</code_context>

<issue_to_address>
Consider adding more test cases for different punctuation and edge cases.

Please add tests for links followed by various punctuation marks, links at line ends, links without trailing punctuation, and links with punctuation inside the link text or URL to improve coverage.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
/// Ensures punctuation immediately following a link remains attached when
/// wrapping lines.
#[test]
fn test_wrap_link_with_trailing_punctuation() {
    let input = lines_vec![
        "[`rust-multithreaded-logging-framework-for-python-design.md`](./\
         rust-multithreaded-logging-framework-for-python-design.md).",
    ];
    let output = process_stream(&input);
    assert_eq!(output, input);
}
=======
/// Ensures punctuation immediately following a link remains attached when
/// wrapping lines.
#[test]
fn test_wrap_link_with_trailing_punctuation() {
    let input = lines_vec![
        "[`rust-multithreaded-logging-framework-for-python-design.md`](./\
         rust-multithreaded-logging-framework-for-python-design.md).",
    ];
    let output = process_stream(&input);
    assert_eq!(output, input);
}

/// Test links followed by various punctuation marks.
#[test]
fn test_wrap_link_with_various_trailing_punctuation() {
    let punctuations = [".", ",", ";", ":", "!", "?", ")", "]", "\"", "'"];
    for &punct in &punctuations {
        let input = lines_vec![
            &format!(
                "[link](https://example.com){}",
                punct
            ),
        ];
        let output = process_stream(&input);
        assert_eq!(output, input, "Failed for punctuation: {}", punct);
    }
}

/// Test link at the end of a line (no trailing punctuation).
#[test]
fn test_wrap_link_at_line_end() {
    let input = lines_vec![
        "[link](https://example.com)",
    ];
    let output = process_stream(&input);
    assert_eq!(output, input);
}

/// Test link with punctuation inside the link text.
#[test]
fn test_wrap_link_with_punctuation_in_text() {
    let input = lines_vec![
        "[link!](https://example.com)",
        "[foo, bar](https://example.com)",
    ];
    let output = process_stream(&input);
    assert_eq!(output, input);
}

/// Test link with punctuation inside the URL.
#[test]
fn test_wrap_link_with_punctuation_in_url() {
    let input = lines_vec![
        "[link](https://example.com/foo,bar)",
        "[link](https://example.com/hello!world)",
    ];
    let output = process_stream(&input);
    assert_eq!(output, input);
}

/// Test link with no trailing punctuation and no special cases.
#[test]
fn test_wrap_link_without_trailing_punctuation() {
    let input = lines_vec![
        "[link](https://example.com)",
    ];
    let output = process_stream(&input);
    assert_eq!(output, input);
}
>>>>>>> REPLACE

</suggested_fix>

Comment thread tests/wrap.rs
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
tests/wrap.rs (3)

478-488: LGTM! Comprehensive test for basic trailing punctuation.

The test correctly verifies that links with trailing punctuation remain intact during wrapping.


490-503: LGTM! Excellent parameterised testing approach.

The rstest parameterisation efficiently tests multiple punctuation marks, ensuring comprehensive coverage whilst avoiding code duplication.


505-527: LGTM! Thorough edge case coverage.

The additional tests properly cover links at line ends, punctuation within link text, and punctuation within URLs. This comprehensive test suite ensures the feature works correctly across various scenarios.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b8174d7 and 2dad8c6.

📒 Files selected for processing (3)
  • docs/architecture.md (1 hunks)
  • src/wrap.rs (3 hunks)
  • tests/wrap.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
docs/**/*.md

📄 CodeRabbit Inference Engine (AGENTS.md)

docs/**/*.md: Reference: Use the markdown files within the docs/ directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Update: When new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve, proactively update the relevant file(s) in the docs/ directory to reflect the latest state. Ensure the documentation remains accurate and current.

Files:

  • docs/architecture.md
**/*.md

📄 CodeRabbit Inference Engine (AGENTS.md)

**/*.md: Documentation must use en-GB-oxendict spelling and grammar. (EXCEPTION: the naming of the "LICENSE" file, which is to be left unchanged for community consistency.)
Validate Markdown files using make markdownlint.
Run make fmt after any documentation changes to format all Markdown files and fix table markup.
Validate Mermaid diagrams in Markdown files by running make nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks must be wrapped at 120 columns.
Tables and headings must not be wrapped.
Use dashes (-) for list bullets.
Use GitHub-flavoured Markdown footnotes ([^1]) for references and footnotes.

Files:

  • docs/architecture.md

⚙️ CodeRabbit Configuration File

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-GB-oxendict (-ize / -our) spelling and grammar
  • Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
  • Code blocks should be wrapped to 120 columns.
  • Headings must not be wrapped.
  • Documents must start with a level 1 heading
  • Headings must correctly increase or decrease by no more than one level at a time
  • Use GitHub-flavoured Markdown style for footnotes and endnotes.
  • Numbered footnotes must be numbered by order of appearance in the document.

Files:

  • docs/architecture.md
**/*.rs

📄 CodeRabbit Inference Engine (AGENTS.md)

**/*.rs: Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Use concat!() to combine long string literals rather than escaping newlines with a backslash.
Prefer semantic error enums: Derive std::error::Error (via the thiserror crate) for any condition the caller might inspect, retry, or map to an HTTP status.
Use an opaque error only at the app boundary: Use eyre::Report for human-readable logs; these should not be exposed in public APIs.
Never export the opaque type from a library: Convert to domain enums at API boundaries, and to eyre only in the main main() entrypoint or top-level async task.

Files:

  • src/wrap.rs
  • tests/wrap.rs
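
The concat!() guideline above applies to the regression test in this PR, which builds its long link with a backslash line continuation. A sketch of the same literal written with concat!() follows; the constant name LONG_LINK is hypothetical and is not taken from tests/wrap.rs.

```rust
/// The long link from the regression test, assembled with `concat!` instead
/// of a backslash line continuation. `LONG_LINK` is an illustrative name.
const LONG_LINK: &str = concat!(
    "[`rust-multithreaded-logging-framework-for-python-design.md`](./",
    "rust-multithreaded-logging-framework-for-python-design.md).",
);
```

Both spellings produce the same string at compile time; concat!() avoids relying on the rule that a backslash continuation strips the next line's leading whitespace.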

⚙️ CodeRabbit Configuration File

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.

Files:

  • src/wrap.rs
  • tests/wrap.rs
🧬 Code Graph Analysis (1)
tests/wrap.rs (1)
src/process.rs (1)
  • process_stream (206-214)
🪛 LanguageTool
docs/architecture.md

[style] ~313-~313: Would you like to use the Oxford spelling “tokenized”? The spelling ‘tokenised’ is also correct.
Context: ...y following a Markdown link or image is tokenised separately and grouped with the link wh...

(OXFORD_SPELLING_Z_NOT_S)


[uncategorized] ~321-~321: Possible missing comma found.
Context: ...markdown link. ``` on a single line rather than splitting the punctuation o...

(AI_HYDRA_LEO_MISSING_COMMA)

🔇 Additional comments (3)
src/wrap.rs (3)

65-70: LGTM! Comprehensive punctuation character set.

The function correctly identifies common trailing punctuation characters that should remain attached to links during wrapping.


115-125: LGTM! Proper separation and collection of trailing punctuation.

The logic correctly collects consecutive trailing punctuation characters immediately following link parsing and adds them as separate tokens. This enables the wrapping logic to merge them back with the link token when needed.


217-233: Confirm link detection pattern robustness

Tests in tests/wrap.rs exercise nested parentheses in URLs, image links, trailing punctuation, and links at line ends. No further adjustments required.

Comment thread docs/architecture.md
@leynos
Owner Author

leynos commented Jul 30, 2025

@coderabbitai Have these now been resolved:

  • The new logic only handles a single trailing punctuation character, so consider extending it to capture sequences like multiple punctuation marks or ellipses.

  • Rather than merging punctuation into the link token, you might treat it as a separate token during wrapping to avoid changing the actual link syntax.

@coderabbitai
Contributor

coderabbitai Bot commented Jul 30, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Let's examine the current implementation in src/wrap.rs to see how trailing punctuation is handled
echo "=== Examining src/wrap.rs for trailing punctuation handling ==="
cat src/wrap.rs

echo -e "\n=== Looking for tests with multiple punctuation ==="
# Check if there are tests for multiple punctuation scenarios
rg -A 5 -B 5 "\.\.\.|\!\!|\?\?" tests/wrap.rs

Length of output: 18912


🏁 Script executed:

#!/bin/bash
# Let's check the tests more carefully for multiple punctuation scenarios
echo "=== Looking for tests with ellipses or multiple punctuation ==="
rg -A 5 -B 5 '\.\.\.' tests/wrap.rs

echo -e "\n=== Looking for any test functions related to punctuation ==="
rg -A 10 "fn.*punctuation" tests/wrap.rs

echo -e "\n=== Examining all test cases to see what punctuation scenarios are covered ==="
cat tests/wrap.rs

Length of output: 21597


@leynos Looking at the current implementation, I can answer both questions:

Issue 1 - Multiple punctuation handling: ✅ RESOLVED

The logic now handles sequences of multiple punctuation characters correctly. In tokenize_inline, there's a while loop that collects all consecutive trailing punctuation:

while new_i < chars.len() && is_trailing_punctuation(chars[new_i]) {
    punct.push(chars[new_i]);
    new_i += 1;
}

And in wrap_preserving_code, another while loop merges all consecutive punctuation tokens:

while j < tokens.len() && tokens[j].chars().all(is_trailing_punctuation) {
    token.push_str(&tokens[j]);
    token_width += UnicodeWidthStr::width(tokens[j].as_str());
    j += 1;
}

The tests confirm this works - there's a parameterized test that includes #[case("...")] for ellipses.

Issue 2 - Separate tokens during wrapping: ❌ NOT RESOLVED

The implementation still merges punctuation into the link token during wrapping rather than treating them as separate tokens. The current approach:

  1. Creates separate tokens during tokenization (link + punctuation)
  2. Merges them back together during wrapping to prevent line breaks

So while the tokenization creates separate tokens, the wrapping phase combines them to ensure they stay together, which maintains the original merging strategy you had suggested moving away from.
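
The two phases described above can be sketched end to end. This is a simplified illustration, not the code in src/wrap.rs: the function names tokenize, link_end, is_link and wrap are hypothetical, the punctuation set is assumed to be the six characters named in the reviewer's guide, links are assumed to contain no nested brackets or parentheses, and width is counted in characters rather than with UnicodeWidthStr.

```rust
fn is_trailing_punctuation(c: char) -> bool {
    matches!(c, '.' | ',' | '!' | '?' | ':' | ';')
}

/// Index just past a `[text](url)` link starting at `i`, if one starts there.
fn link_end(chars: &[char], i: usize) -> Option<usize> {
    if chars[i] != '[' {
        return None;
    }
    let close = i + chars[i..].iter().position(|&c| c == ']')?;
    if chars.get(close + 1) != Some(&'(') {
        return None;
    }
    let paren = close + chars[close..].iter().position(|&c| c == ')')?;
    Some(paren + 1)
}

/// Phase 1: split on whitespace, keep each link atomic, and emit the
/// punctuation run that follows a link as its own token.
fn tokenize(text: &str) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let mut tokens: Vec<String> = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        if chars[i].is_whitespace() {
            i += 1;
        } else if let Some(end) = link_end(&chars, i) {
            tokens.push(chars[i..end].iter().collect());
            i = end;
            let start = i;
            while i < chars.len() && is_trailing_punctuation(chars[i]) {
                i += 1;
            }
            if i > start {
                tokens.push(chars[start..i].iter().collect());
            }
        } else {
            let start = i;
            while i < chars.len() && !chars[i].is_whitespace() {
                i += 1;
            }
            tokens.push(chars[start..i].iter().collect());
        }
    }
    tokens
}

fn is_link(token: &str) -> bool {
    token.starts_with('[') && token.ends_with(')')
}

/// Phase 2: greedy wrap that glues punctuation-only tokens onto a preceding
/// link. Simplification: a punctuation-only word separated from the link by
/// a space is glued on as well.
fn wrap(tokens: &[String], width: usize) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    let mut current = String::new();
    let mut i = 0;
    while i < tokens.len() {
        let mut token = tokens[i].clone();
        let mut j = i + 1;
        if is_link(&tokens[i]) {
            while j < tokens.len() && tokens[j].chars().all(is_trailing_punctuation) {
                token.push_str(&tokens[j]);
                j += 1;
            }
        }
        let gap = usize::from(!current.is_empty());
        if gap == 1 && current.chars().count() + gap + token.chars().count() > width {
            lines.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(&token);
        i = j;
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}
```

Wrapping "See the [design doc](./design.md). Then continue" at 20 columns with these two functions places the link and its full stop together on a line of their own, even though the tokenizer emitted them as two tokens.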

@leynos leynos force-pushed the codex/add-test-for-markdown-wrapping-behavior branch from 461e191 to 3d9d19b Compare July 30, 2025 20:33
@leynos
Owner Author

leynos commented Jul 30, 2025

@coderabbitai

Please fix the issues in the code resulting in the following compile errors:

Checking mdtablefix v0.1.0 (/home/runner/work/mdtablefix/mdtablefix)

error[E0425]: cannot find value token in this scope
Error: --> src/wrap.rs:282:16
|
282 | && token.len() == 1
| ^^^^^ help: a local variable with a similar name exists: tokens
error[E0425]: cannot find value token in this scope
Error: --> src/wrap.rs:283:34
|
283 | && ".?!,:;".contains(token.as_str())
| ^^^^^ help: a local variable with a similar name exists: tokens
error[E0425]: cannot find value token in this scope
Error: --> src/wrap.rs:291:28
|
291 | .push_str(&token);
| ^^^^^ help: a local variable with a similar name exists: tokens
For more information about this error, try rustc --explain E0425.
error: could not compile mdtablefix (lib) due to 3 previous errors
warning: build failed, waiting for other jobs to finish...
make: *** [Makefile:25: lint] Error 101
Error: Process completed with exit code 2.

@coderabbitai
Contributor

coderabbitai Bot commented Jul 30, 2025

🧩 Analysis chain

@leynos Looking at the compile errors, there are variable scope issues in src/wrap.rs. Let me examine the problematic code and provide the fix:

Based on the error messages, the issue is that token is being referenced but not defined in the current scope. The code should be referencing the current token from the tokens array. Here's the fix:

// Around lines 280-295, change the references from `token` to `tokens[i]`
if i < tokens.len()
    && tokens[i].len() == 1
    && ".?!,:;".contains(tokens[i].as_str())
{
    // This is trailing punctuation, keep it with the link
    current_line
        .last_mut()
        .unwrap()
        .push_str(&tokens[i]);

The issue is that the code is trying to access a variable token that doesn't exist. It should be accessing tokens[i] (or whatever the appropriate index variable is in that context) to get the current token from the tokens array.

Run make lint again after making this change to verify the compilation errors are resolved.


🏁 Script executed:

#!/bin/bash
# Examine the problematic lines in src/wrap.rs
echo "=== Examining src/wrap.rs around the error lines ==="
cat -n src/wrap.rs | sed -n '275,295p'

echo -e "\n=== Full context around the problematic function ==="
# Let's get more context to understand the function structure
cat -n src/wrap.rs | sed -n '260,310p'

Length of output: 3308


🏁 Script executed:

#!/bin/bash
# Get more context to understand the full function structure
echo "=== Examining the full wrap_preserving_code function ==="
ast-grep --pattern 'fn wrap_preserving_code($$$) {
  $$$
}'

Length of output: 184


Now I can see the issue clearly. The problem is on lines 282, 283, and 291 where token is referenced but should be tokens[i]. Here's the fix:

        if current.is_empty()
            && tokens[i].len() == 1
            && ".?!,:;".contains(tokens[i].as_str())
            && lines
                .last()
                .is_some_and(|l: &String| l.trim_end().ends_with('`'))
        {
            lines
                .last_mut()
                .expect("checked last line exists")
                .push_str(&tokens[i]);
            continue;
        }

The fix is simple: replace all three instances of token with tokens[i] since we're iterating through the tokens array with index i.

Specifically:

  • Line 282: token.len() == 1 → tokens[i].len() == 1
  • Line 283: ".?!,:;".contains(token.as_str()) → ".?!,:;".contains(tokens[i].as_str())
  • Line 291: .push_str(&token) → .push_str(&tokens[i])

Apply these changes and run make lint again to verify the compilation errors are resolved.

Replaces incorrect local variable usage with correct token reference when appending punctuation to wrapped lines. Prevents potential logic errors when preserving code blocks containing punctuation.
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (1)
src/wrap.rs (1)

282-293: Fix compilation error: undefined variable token.

The variable token is not defined in this scope. Use tokens[i] instead.

         if current.is_empty()
-            && token.len() == 1
-            && ".?!,:;".contains(token.as_str())
+            && tokens[i].len() == 1
+            && ".?!,:;".contains(tokens[i].as_str())
             && lines
                 .last()
                 .is_some_and(|l: &String| l.trim_end().ends_with('`'))
         {
             lines
                 .last_mut()
                 .expect("checked last line exists")
-                .push_str(&token);
+                .push_str(&tokens[i]);
+            i += 1;
             continue;
         }
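
The added `i += 1;` matters because the loop is driven by an index: a `continue` that skips the increment would revisit the same token indefinitely. The following is a minimal sketch of that pattern, not the code in src/wrap.rs. The function name `attach_punctuation` and the simplification of placing every other token on its own line are assumptions made for illustration.

```rust
/// Hypothetical, simplified stand-in for the wrapping loop.
/// Every token becomes its own line, except that a lone punctuation token
/// is appended to the previous line when that line ends with a backtick.
fn attach_punctuation(tokens: &[String]) -> Vec<String> {
    let mut lines: Vec<String> = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        let is_lone_punct = tokens[i].len() == 1 && ".?!,:;".contains(tokens[i].as_str());
        if is_lone_punct {
            if let Some(last) = lines.last_mut() {
                if last.trim_end().ends_with('`') {
                    last.push_str(&tokens[i]);
                    // Advance before `continue`; otherwise the same token
                    // would be processed again and the loop would never end.
                    i += 1;
                    continue;
                }
            }
        }
        lines.push(tokens[i].clone());
        i += 1;
    }
    lines
}
```

With the tokens `` `code` ``, `.` and `next`, the full stop joins the inline code span on the first line and `next` starts the second.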
♻️ Duplicate comments (1)
docs/architecture.md (1)

321-322: Add comma for improved readability.

-on a single line rather than splitting the punctuation onto the next line when
+on a single line, rather than splitting the punctuation onto the next line when
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 461e191 and 3d9d19b.

📒 Files selected for processing (3)
  • docs/architecture.md (1 hunks)
  • src/wrap.rs (6 hunks)
  • tests/wrap.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
docs/**/*.md

📄 CodeRabbit Inference Engine (AGENTS.md)

docs/**/*.md: Reference: Use the markdown files within the docs/ directory as a knowledge base and source of truth for project requirements, dependency choices, and architectural decisions.
Update: When new decisions are made, requirements change, libraries are added/removed, or architectural patterns evolve, proactively update the relevant file(s) in the docs/ directory to reflect the latest state. Ensure the documentation remains accurate and current.

Files:

  • docs/architecture.md
**/*.md

📄 CodeRabbit Inference Engine (AGENTS.md)

**/*.md: Documentation must use en-GB-oxendict spelling and grammar. (EXCEPTION: the naming of the "LICENSE" file, which is to be left unchanged for community consistency.)
Validate Markdown files using make markdownlint.
Run make fmt after any documentation changes to format all Markdown files and fix table markup.
Validate Mermaid diagrams in Markdown files by running make nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks must be wrapped at 120 columns.
Tables and headings must not be wrapped.
Use dashes (-) for list bullets.
Use GitHub-flavoured Markdown footnotes ([^1]) for references and footnotes.

Files:

  • docs/architecture.md

⚙️ CodeRabbit Configuration File

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-GB-oxendict (-ize / -our) spelling and grammar
  • Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
  • Code blocks should be wrapped to 120 columns.
  • Headings must not be wrapped.
  • Documents must start with a level 1 heading
  • Headings must correctly increase or decrease by no more than one level at a time
  • Use GitHub-flavoured Markdown style for footnotes and endnotes.
  • Numbered footnotes must be numbered by order of appearance in the document.

Files:

  • docs/architecture.md
**/*.rs

📄 CodeRabbit Inference Engine (AGENTS.md)

**/*.rs: Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Use concat!() to combine long string literals rather than escaping newlines with a backslash.
Prefer semantic error enums: Derive std::error::Error (via the thiserror crate) for any condition the caller might inspect, retry, or map to an HTTP status.
Use an opaque error only at the app boundary: Use eyre::Report for human-readable logs; these should not be exposed in public APIs.
Never export the opaque type from a library: Convert to domain enums at API boundaries, and to eyre only in the main main() entrypoint or top-level async task.

Files:

  • src/wrap.rs
  • tests/wrap.rs

⚙️ CodeRabbit Configuration File

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.

Files:

  • src/wrap.rs
  • tests/wrap.rs
🧬 Code Graph Analysis (1)
tests/wrap.rs (1)
src/process.rs (1)
  • process_stream (206-214)
🪛 LanguageTool
docs/architecture.md

[style] ~313-~313: Would you like to use the Oxford spelling “tokenized”? The spelling ‘tokenised’ is also correct.
Context: ...y following a Markdown link or image is tokenised separately and grouped with the link wh...

(OXFORD_SPELLING_Z_NOT_S)

🪛 GitHub Actions: CI
src/wrap.rs

[error] 282-282: Rust compiler error E0425: cannot find value token in this scope.

🔇 Additional comments (6)
tests/wrap.rs (2)

478-503: Comprehensive test coverage for link punctuation handling.

The tests effectively cover the main use cases and edge cases for the new link punctuation handling feature, including multiple trailing punctuation characters as mentioned in the PR objectives.


505-527: Well-structured edge case tests.

These tests appropriately cover important edge cases: links at line end without punctuation, punctuation within link text, and punctuation in URLs.

src/wrap.rs (4)

117-123: Well-defined trailing punctuation helper.

The function correctly identifies common trailing punctuation characters using idiomatic pattern matching.
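
A helper of this kind might look like the sketch below. The name `is_trailing_punctuation` and the exact character set are assumptions based on the `".?!,:;"` string quoted earlier in this review; the function in src/wrap.rs may differ.

```rust
/// Hypothetical helper: returns true for characters treated as trailing
/// punctuation. The character set is an assumption, not taken from src/wrap.rs.
fn is_trailing_punctuation(c: char) -> bool {
    matches!(c, '.' | ',' | ';' | ':' | '!' | '?')
}
```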


167-177: Correct implementation of trailing punctuation tokenization.

The code properly collects consecutive trailing punctuation characters into a separate token after links/images, enabling the grouping behaviour during wrapping.
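
The collection step can be pictured as splitting the text that follows a link into a run of punctuation and the remainder. This is a sketch under assumptions: `split_trailing_punctuation` is a hypothetical name, and the real tokenizer may well work on a character iterator rather than a string slice.

```rust
/// Hypothetical sketch: given the text immediately after a link or image,
/// return the leading run of trailing punctuation and the remaining text.
fn split_trailing_punctuation(rest: &str) -> (&str, &str) {
    // Find the byte offset of the first character that is not punctuation.
    let end = rest
        .char_indices()
        .find(|&(_, c)| !matches!(c, '.' | ',' | ';' | ':' | '!' | '?'))
        .map_or(rest.len(), |(idx, _)| idx);
    rest.split_at(end)
}
```

The punctuation run would then become its own token, directly after the link token.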


269-280: Effective grouping of links with trailing punctuation.

The logic correctly identifies link tokens and groups them with subsequent punctuation-only tokens, calculating the combined width for atomic wrapping decisions.
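
The grouping can be sketched roughly as follows. The names `is_link`, `is_punct` and `group_end_and_width` are hypothetical, the link check is deliberately naive, and width is counted in characters here, whereas the real code may measure display width differently.

```rust
/// Hypothetical punctuation check; the character set is an assumption.
fn is_punct(c: char) -> bool {
    matches!(c, '.' | ',' | ';' | ':' | '!' | '?')
}

/// Naive, hypothetical test for a Markdown link or image token.
fn is_link(token: &str) -> bool {
    (token.starts_with('[') || token.starts_with("![")) && token.contains("](") && token.ends_with(')')
}

/// Returns the exclusive end index of the group starting at `start` and the
/// combined width of the group, counted in characters.
fn group_end_and_width(tokens: &[String], start: usize) -> (usize, usize) {
    let mut end = start + 1;
    let mut width = tokens[start].chars().count();
    if is_link(&tokens[start]) {
        // Absorb every following token that consists solely of punctuation.
        while end < tokens.len()
            && !tokens[end].is_empty()
            && tokens[end].chars().all(is_punct)
        {
            width += tokens[end].chars().count();
            end += 1;
        }
    }
    (end, width)
}
```

Treating the group as one unit means the wrap decision uses the combined width, so the punctuation cannot land on a different line from its link.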


295-305: Proper handling of grouped tokens within line width.

The code correctly appends all tokens in the group when they fit within the line width, maintaining the last split position for whitespace tokens.
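
A sketch of that append step, under stated assumptions: `append_group` is a hypothetical name, and recording the split position as the byte length of the line just after a whitespace token is a guess at the bookkeeping, not a description of src/wrap.rs.

```rust
/// Hypothetical sketch: append every token of a group to the current line,
/// remembering the most recent position at which the line could be split.
fn append_group(current: &mut String, group: &[String], last_split: &mut Option<usize>) {
    for token in group {
        current.push_str(token);
        // Assumption: a split point is the byte offset just after whitespace.
        if !token.is_empty() && token.chars().all(char::is_whitespace) {
            *last_split = Some(current.len());
        }
    }
}
```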

Comment thread docs/architecture.md Outdated
leynos and others added 2 commits July 30, 2025 21:57
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@leynos leynos merged commit a007e15 into main Jul 30, 2025
2 checks passed
@leynos leynos deleted the codex/add-test-for-markdown-wrapping-behavior branch July 30, 2025 21:03