Skip to content

📝 CodeRabbit Chat: Implement requested code changes#274

Open
coderabbitai[bot] wants to merge 6 commits intomainfrom
coderabbitai/chat/acb7a03
Open

📝 CodeRabbit Chat: Implement requested code changes#274
coderabbitai[bot] wants to merge 6 commits intomainfrom
coderabbitai/chat/acb7a03

Conversation

@coderabbitai
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot commented Apr 23, 2026

Code changes was requested by @leynos.

The following files were modified:

  • docs/adrs/0002-textwrap-inline-wrapping.md
  • docs/developers-guide.md
  • docs/users-guide.md
  • src/wrap/inline.rs
  • src/wrap/inline/postprocess.rs
  • src/wrap/paragraph.rs

Summary by Sourcery

Document and test the refactored inline wrapping pipeline that delegates line fitting to textwrap, clarifying architecture and behaviour while keeping the public wrapping API stable.

Enhancements:

  • Improve inline wrapping code with richer Rustdoc comments around fragment classification, post-processing passes, and paragraph wrapping helpers.

Documentation:

  • Describe the wrap module architecture, key types, width accounting, and design decisions in the developers guide.
  • Document paragraph wrapping behaviour, including handling of inline code, links, task list indentation, fenced code, and Unicode display widths in the users guide.
  • Record an ADR describing the move to textwrap-based inline wrapping, the new fragment model, post-processing passes, and their consequences.

Tests:

  • Add unit tests covering InlineFragment classification, whitespace-line merging, and tail rebalancing behaviour for inline wrapping.

leynos and others added 6 commits April 23, 2026 12:05
…twrap and unicode-width

Add a detailed execution plan document describing the refactor to replace the bespoke line-wrapping engine in the mdtablefix project with the `textwrap` and `unicode-width` crates. This living document outlines the purpose, constraints, risks, test coverage, and stepwise implementation plan. It serves to guide the maintainability-focused refactor ensuring preserved observable behavior while modernizing and simplifying the wrapping engine.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
- Replaced the custom LineBuffer-based inline wrapping with a fragment-based approach leveraging textwrap::wrap_algorithms::wrap_first_fit for line fitting.
- Preserved Markdown-aware token grouping and whitespace semantics by adapting fragment merging and suffix rebalancing.
- Removed the now-obsolete src/wrap/line_buffer.rs module.
- Updated paragraph wrapping to use the new inline wrapper with continuation prefixes.
- Added active regression tests for nested blockquotes, footnote continuation alignment, and checkbox indentation.
- Updated documentation reflecting the new wrapping strategy and integration with textwrap and unicode-width crates.
- Added textwrap crate as a dependency and adjusted Cargo files accordingly.

This improves maintainability by using a well-tested wrapping library while preserving existing formatting behaviors through a thin adaptation layer.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
Added comprehensive Mermaid sequence and flow diagrams illustrating the control flow and sequence
of the wrap_text function and its components. Included detailed explanations of how the wrapper
decides on line classification, paragraph buffering, prefix-aware wrapping, use of unicode-width
for display width measurement, and integration with the textwrap crate. These additions improve
documentation clarity around the wrapping process and internal collaboration of components.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
Ensure that the post-wrap fragment rebalancing respects line width constraints by checking the width before moving trailing fragments across lines. This prevents overflow lines that violate the width guarantees of `wrap_first_fit` and maintains correct wrapping behavior after post-processing.

- Added width-aware checks in `rebalance_atomic_tails` to avoid creating overflow lines.
- Refactored fragment classification to track fragment kinds for better whitespace and tail handling.
- Moved whitespace-only line merging and tail rebalancing logic into a dedicated `postprocess` module.
- Updated tests to verify no overflow after tail rebalancing, including adding a regression test for the `a four five` / width `6` case.
- Updated documentation and sequence diagrams to reflect the new wrapping flow and helper module usage.

This fix improves stability and correctness of inline wrapping post-processing steps.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
- Added new tests to verify nested blockquote prefix repetition in wrap_text.
- Improved robustness in postprocess.rs by handling pop operations gracefully with else continue.
- Updated architecture and execplan docs to reflect wrapping improvements and dependency changes with textwrap.

Co-authored-by: devboxerhub[bot] <devboxerhub[bot]@users.noreply.github.com>
@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Apr 23, 2026

Reviewer's Guide

Refines the inline wrapping pipeline by documenting and tightening the fragment-based textwrap integration, adding comprehensive tests for inline fragment classification and post-processing, and expanding developer/user docs with architecture and behaviour details around wrapping, while keeping public APIs stable.

Sequence diagram for wrapping a prefixed paragraph line with post-processing

sequenceDiagram
    actor User
    participant CLI as mdtablefix
    participant WrapEntry as wrap_text
    participant PW as ParagraphWriter
    participant PI as PrefixLine
    participant WI as wrap_preserving_code
    participant TW as textwrap_wrap_first_fit
    participant PP1 as merge_whitespace_only_lines
    participant PP2 as rebalance_atomic_tails

    User->>CLI: run with --wrap
    CLI->>WrapEntry: process input lines
    WrapEntry->>PW: handle_prefix_line(state, prefix_line)
    note right of PW: state paragraph flushed if pending
    PW->>PI: read prefix, rest, repeat_prefix
    PW->>PW: append_wrapped_with_prefix(prefix_line)
    PW->>PW: compute continuation_prefix
    PW->>WI: wrap_with_prefix(prefix, continuation_prefix, rest)

    WI->>WI: tokenize::segment_inline(text)
    WI->>WI: build_fragments(tokens)
    WI->>TW: wrap_first_fit(fragments)
    TW-->>WI: raw_lines (Vec<Vec<InlineFragment>>)

    WI->>PP1: merge_whitespace_only_lines(raw_lines)
    PP1-->>WI: merged_lines
    WI->>PP2: rebalance_atomic_tails(merged_lines, width)
    PP2-->>WI: adjusted_lines

    WI-->>PW: wrapped content lines
    PW->>PW: prepend prefix and continuation_prefix
    PW-->>WrapEntry: push wrapped lines to output buffer
    WrapEntry-->>CLI: final wrapped lines
    CLI-->>User: emit reflowed Markdown
Loading

Class diagram for updated wrapping types and fragments

classDiagram
    direction LR

    class FragmentKind {
        <<enum>>
        Whitespace
        InlineCode
        Link
        Plain
    }

    class InlineFragment {
        +String text
        +usize width
        +FragmentKind kind
        +new(text String) InlineFragment
        +is_whitespace() bool
        +is_atomic() bool
        +is_plain() bool
    }

    class Fragment {
        <<trait>>
        +width() f64
        +whitespace_width() f64
        +penalty_width() f64
    }

    Fragment <|.. InlineFragment

    class ParagraphState {
        +Vec~(String,bool)~ buf
        +String indent
        +clear() void
        +note_indent(line &str) void
        +push(text String, hard_break bool) void
    }

    class PrefixLine {
        +Cow~str~ prefix
        +&str rest
        +bool repeat_prefix
    }

    class ParagraphWriter {
        +&mut Vec~String~ out
        +usize width
        +new(out &mut Vec~String~, width usize) ParagraphWriter
        +wrap_with_prefix(prefix &str, continuation_prefix &str, text &str) void
        +append_wrapped_with_prefix(line &PrefixLine) void
        +flush_paragraph(state &mut ParagraphState) void
        +push_wrapped_segment(indent &str, segment &str) void
        +push_verbatim(state &mut ParagraphState, line &str) void
        +handle_prefix_line(state &mut ParagraphState, prefix_line &PrefixLine) void
    }

    class WrapInlineModule {
        <<module>>
        +determine_token_span(tokens &Vec~String~, start usize) (usize, usize)
        +attach_punctuation_to_previous_line(lines &mut Vec~String~, token &str) bool
        +width_as_f64(width usize) f64
        +classify_fragment(text &str) FragmentKind
        +push_span_text(text &mut String, tokens &Vec~String~, span Range~usize~) void
        +build_fragments(tokens &Vec~String~) Vec~InlineFragment~
        +render_line(line &Vec~InlineFragment~, is_final_output_line bool) String
        +wrap_preserving_code(text &str, width usize) Vec~String~
    }

    class PostprocessModule {
        <<module>>
        +merge_whitespace_only_lines(lines &Vec~Vec~InlineFragment~~) Vec~Vec~InlineFragment~~
        +rebalance_atomic_tails(lines &mut Vec~Vec~InlineFragment~~, width usize) void
    }

    InlineFragment --> FragmentKind
    WrapInlineModule ..> InlineFragment
    WrapInlineModule ..> FragmentKind
    WrapInlineModule ..> ParagraphWriter
    PostprocessModule ..> InlineFragment
    ParagraphWriter ..> ParagraphState
    ParagraphWriter ..> PrefixLine
Loading

File-Level Changes

Change Details Files
Improve inline post-processing helpers and make their assumptions explicit
  • Expanded module- and function-level documentation for whitespace merging and tail rebalancing helpers to describe behaviour and invariants
  • Replaced fallible pop/last handling with expect calls that encode internal invariants for lines selected for special handling
  • Added a dedicated test module that exercises InlineFragment::new, whitespace handling, and merge_whitespace_only_lines/rebalance_atomic_tails behaviours across representative cases
src/wrap/inline/postprocess.rs
Document and clarify the fragment-based inline wrapping implementation
  • Added rich doc comments for span-kind helpers, token/fragment classification, and wrapping entry points to explain how tokens become InlineFragments, how widths are computed, and how textwrap is driven
  • Documented the semantics of FragmentKind and InlineFragment, including how atomic fragments, whitespace, and plain text are treated by post-processing
  • Clarified helper functions such as width_as_f64, push_span_text, build_fragments, render_line, and wrap_preserving_code with notes on Unicode width handling and preservation of Markdown semantics
src/wrap/inline.rs
Document paragraph wrapping, prefix handling, and paragraph state management
  • Documented PrefixLine, ParagraphState, and ParagraphWriter fields and methods to explain prefix semantics, hard line break handling, and how buffered segments are flushed
  • Clarified how continuation prefixes are derived, how indent is captured, and when lines are passed through verbatim vs wrapped
  • Explained how wrap_with_prefix computes available width based on prefix display width using unicode-width
src/wrap/paragraph.rs
Add and update documentation explaining wrap architecture and behaviour
  • Added an ADR describing the decision to delegate inline line fitting to textwrap, including the fragment model, algorithm choice, post-processing passes, prefix helper consolidation, and API-stability constraints
  • Extended the developer guide with an overview of the wrap module architecture, key types, width accounting, post-processing passes, API stability, and a pointer to the new ADR
  • Extended the user guide with a "Paragraph wrapping" section describing wrap behaviour for hyphenation, inline code/links, task list indentation, fenced code blocks, and Unicode display-width awareness
docs/adrs/0002-textwrap-inline-wrapping.md
docs/developers-guide.md
docs/users-guide.md

Possibly linked issues

  • #unknown: PR refactors wrapping to use textwrap/unicode-width and unifies prefix handling exactly as the issue proposes.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 4 issues, and left some high level feedback:

  • In is_trailing_punct, the Unicode punctuation string literal appears to be split incorrectly ("…—–»›)]】》」』、。,:;!?".'".contains(c)), which will not compile; this should be a single string literal containing all characters as before.
  • The test-only fragment constructors (plain, ws, code, link) currently set width using text.len() rather than UnicodeWidthStr::width, which can diverge from production behaviour for non-ASCII text; consider using the same width function as InlineFragment::new to keep tests aligned with real wrapping logic.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `is_trailing_punct`, the Unicode punctuation string literal appears to be split incorrectly (`"…—–»›)]】》」』、。,:;!?".'".contains(c)`), which will not compile; this should be a single string literal containing all characters as before.
- The test-only fragment constructors (`plain`, `ws`, `code`, `link`) currently set `width` using `text.len()` rather than `UnicodeWidthStr::width`, which can diverge from production behaviour for non-ASCII text; consider using the same width function as `InlineFragment::new` to keep tests aligned with real wrapping logic.

## Individual Comments

### Comment 1
<location path="src/wrap/inline.rs" line_range="34-37" />
<code_context>
 #[inline]
 fn is_trailing_punct(c: char) -> bool {
     // ASCII closers + common Unicode closers and word-final punctuation
     matches!(
         c,
         '.' | ',' | ';' | ':' | '!' | '?' | ')' | ']' | '"' | '\''
-    ) || "…—–»›)]】》」』、。,:;!?”.’".contains(c)
+    ) || "…—–»›)]】》」』、。,:;!?".'".contains(c)
 }

</code_context>
<issue_to_address>
**issue (bug_risk):** The `is_trailing_punct` Unicode set literal appears malformed and likely does not compile or behave as intended.

The new literal changed from `"…—–»›)]】》」』、。,:;!?”.’"` to `"…—–»›)]】》」』、。,:;!?".'"`, which puts the closing quote before `”.’` and leaves `.'"` outside the string. This will either fail to compile or exclude `` and `` from the punctuation set. Please restore the full literal so all intended ASCII and Unicode closers remain inside the string.
</issue_to_address>

### Comment 2
<location path="src/wrap/inline/postprocess.rs" line_range="139" />
<code_context>
+/// Moves trailing atomic or plain fragments to the following line when doing
+/// so keeps both lines within `width` display columns.
+///
+/// After `merge_whitespace_only_lines` normalises separator lines, a wrapped
+/// line may still end with an isolated code span or word that would read more
+/// naturally at the start of the next line. This function iterates over
</code_context>
<issue_to_address>
**nitpick (review_instructions):** The comment uses "normalises" instead of the required Oxford "-ize" spelling ("normalizes").

Per the guidelines, comments should follow en-GB-oxendict spelling with the -ize form. Here that means using "normalizes" rather than "normalises".

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.rs`

**Instructions:**
Comments must use en-GB-oxendict (-ize / -yse / -our) spelling and grammar.

</details>
</issue_to_address>

### Comment 3
<location path="src/wrap/inline/postprocess.rs" line_range="176" />
<code_context>
+    use super::super::{FragmentKind, InlineFragment};
+    use super::{merge_whitespace_only_lines, rebalance_atomic_tails};
+
+    #[test]
+    fn inline_fragment_new_classifies_whitespace() {
+        let fragment = InlineFragment::new(" ".to_string());
</code_context>
<issue_to_address>
**suggestion (review_instructions):** Several classification tests duplicate the same structure and should be expressed as a single rstest-parameterised case.

The `inline_fragment_new_*` tests all exercise the same constructor behaviour with different inputs and expected kinds/widths. The instructions call for replacing such duplicated tests with `#[rstest(...)]` parameterised cases. Consider collapsing these into one or two rstest-powered tests with a cases table instead of multiple near-identical `#[test]` functions.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.rs`

**Instructions:**
Replace duplicated tests with `#[rstest(...)]` parameterised cases.

</details>
</issue_to_address>

### Comment 4
<location path="docs/users-guide.md" line_range="24" />
<code_context>
+The `--wrap` flag reflows paragraphs and list items to a target column width
+(default 80). The wrap engine uses the `textwrap` crate for greedy line fitting
+and `unicode-width` for display-column measurements, so accented characters,
+CJK glyphs, and emoji are counted by their rendered width rather than their
+byte length.
+
</code_context>
<issue_to_address>
**issue (review_instructions):** "CJK" is introduced without expanding the acronym, which violates the requirement to define uncommon acronyms on first use.

Consider expanding this to something like "Chinese–Japanese–Korean (CJK) glyphs" the first time it appears, so the acronym is defined in line with the documentation style guidelines.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.md`

**Instructions:**
Define uncommon acronyms on first use.

</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src/wrap/inline.rs
Comment on lines 34 to +37
matches!(
c,
'.' | ',' | ';' | ':' | '!' | '?' | ')' | ']' | '"' | '\''
) || "…—–»›)]】》」』、。,:;!?”.’".contains(c)
) || "…—–»›)]】》」』、。,:;!?".'".contains(c)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): The is_trailing_punct Unicode set literal appears malformed and likely does not compile or behave as intended.

The new literal changed from "…—–»›)]】》」』、。,:;!?”.’" to "…—–»›)]】》」』、。,:;!?".'", which puts the closing quote before ”.’ and leaves .'" outside the string. This will either fail to compile or exclude and from the punctuation set. Please restore the full literal so all intended ASCII and Unicode closers remain inside the string.

/// Moves trailing atomic or plain fragments to the following line when doing
/// so keeps both lines within `width` display columns.
///
/// After `merge_whitespace_only_lines` normalises separator lines, a wrapped
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick (review_instructions): The comment uses "normalises" instead of the required Oxford "-ize" spelling ("normalizes").

Per the guidelines, comments should follow en-GB-oxendict spelling with the -ize form. Here that means using "normalizes" rather than "normalises".

Review instructions:

Path patterns: **/*.rs

Instructions:
Comments must use en-GB-oxendict (-ize / -yse / -our) spelling and grammar.

use super::super::{FragmentKind, InlineFragment};
use super::{merge_whitespace_only_lines, rebalance_atomic_tails};

#[test]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (review_instructions): Several classification tests duplicate the same structure and should be expressed as a single rstest-parameterised case.

The inline_fragment_new_* tests all exercise the same constructor behaviour with different inputs and expected kinds/widths. The instructions call for replacing such duplicated tests with #[rstest(...)] parameterised cases. Consider collapsing these into one or two rstest-powered tests with a cases table instead of multiple near-identical #[test] functions.

Review instructions:

Path patterns: **/*.rs

Instructions:
Replace duplicated tests with #[rstest(...)] parameterised cases.

Comment thread docs/users-guide.md
The `--wrap` flag reflows paragraphs and list items to a target column width
(default 80). The wrap engine uses the `textwrap` crate for greedy line fitting
and `unicode-width` for display-column measurements, so accented characters,
CJK glyphs, and emoji are counted by their rendered width rather than their
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (review_instructions): "CJK" is introduced without expanding the acronym, which violates the requirement to define uncommon acronyms on first use.

Consider expanding this to something like "Chinese–Japanese–Korean (CJK) glyphs" the first time it appears, so the acronym is defined in line with the documentation style guidelines.

Review instructions:

Path patterns: **/*.md

Instructions:
Define uncommon acronyms on first use.

@coderabbitai
Copy link
Copy Markdown
Contributor Author

coderabbitai Bot commented Apr 23, 2026

Important

Review skipped

This PR was authored by the user configured for CodeRabbit reviews. CodeRabbit does not review PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fd9e6943-7879-4b91-ba30-93f17e0150ab

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Comment @coderabbitai help to get the list of available commands and usage tips.

Base automatically changed from refactor-text-wrapping-to-use-textwrap-ek5oxh to main April 23, 2026 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant