Address review feedback for wrapping and footnotes by leynos · Pull Request #266 · leynos/mdtablefix

leynos · 2026-04-18T15:35:18Z

Centralize mutable state in the list, HTML-table, footnote, and paragraph-wrapping flows so the behaviour is easier to reason about from one place.

Fix the cargo-binstall Linux override, remove production expect usage from the reviewed paths, normalize whitespace-only wrapped lines, and add regression tests for mixed code-emphasis affixes, Unicode-width indentation, and indented multi-line HTML tables.

Summary by Sourcery

Centralize mutable state across paragraph wrapping, footnote renumbering, list renumbering, and HTML table conversion to simplify control flow and improve robustness.

Bug Fixes:

Fix cargo-binstall Linux override configuration to install binaries from the current directory.
Normalize whitespace-only wrapped lines to emit consistent empty lines.
Use Unicode display width when wrapping indented lines so full-width characters are handled correctly.
Ensure HTML table conversion correctly handles indented multi-line HTML tables.
Fix mixed code/emphasis affix handling so pending prefixes are cleared correctly in edge cases.

Enhancements:

Refactor paragraph wrapping to encapsulate buffer and indentation handling inside ParagraphWriter.
Refactor footnote renumbering to consolidate scan and numeric candidate state into a single DefinitionScanState struct.
Refactor list renumbering state management with helper methods for resetting and computing next list numbers.
Refactor HTML table parsing to manage buffer and depth via HtmlTableState and reuse it across conversion paths.
Introduce a SplitContext::new constructor to simplify split context creation in wrapping code and tests.

Build:

Adjust cargo-binstall Linux override to point bin-dir at the current directory instead of the binary path.

Tests:

Add regression tests for whitespace-only line normalization in wrapping.
Add regression tests for Unicode-width indentation handling in wrapping.
Add regression tests for mixed code/emphasis affix handling.
Add regression tests for indented multi-line HTML table conversion.


coderabbitai · 2026-04-18T15:35:26Z

Walkthrough

Refactor internal ownership for wrapping and HTML-table buffering; extract footnote-definition renumbering into a new submodule; consolidate list-numbering logic; update wrap-related APIs and tests; tweak Cargo bin-dir metadata.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Binstall configuration `Cargo.toml` | Change Linux GNU `package.metadata.binstall.bin-dir` to fixed `.` instead of template `{ bin }{ binary-ext }`. |
| Wrap & paragraph subsystem `src/wrap/paragraph.rs`, `src/wrap.rs`, `src/wrap/fence.rs`, `src/wrap/inline.rs`, `src/wrap/line_buffer.rs`, `src/wrap/tests.rs`, `tests/wrap_unit.rs` | Remove `ParagraphState`; introduce owning `ParagraphWriter` with `new`/`push_wrapped`/`push_verbatim`/`push_blank_line`; add `SplitContext::new`; update call sites and tests; add blank-line and Unicode-width wrap tests; remove some legacy wrap tests. |
| HTML table buffering `src/html.rs`, `tests/table/convert_html.rs` | Move buffering into `HtmlTableState`; derive `in_html()` from buffer occupancy; compute nesting depth per-line and flush when depth returns to zero; always flush remaining buffer; add indented/mixed HTML table tests. |
| Footnote renumbering extraction `src/footnotes/renumber.rs`, `src/footnotes/renumber/definitions.rs`, `src/footnotes/renumber/tests.rs` | Extract definition scanning/rewrite/reorder into new `definitions` submodule (`collect_definition_updates`, `rewrite_definition_headers`, `reorder_definition_block`, `numeric_candidate_from_line`); adapt `renumber_footnotes` to delegate; add tests for malformed candidates and renumbering. |
| List numbering `src/lists.rs` | Introduce `ListState::reset()` and `ListState::next_number(indent)` to centralise pruning/pushing/increment logic; use `ListState::default()` for initialisation. |
| Wrap fence handling `src/wrap/fence.rs` | Remove dependence on external paragraph state from `handle_fence_line`; call `writer.push_verbatim(line)` directly; adjust imports and signature. |
| Code emphasis test `src/code_emphasis.rs` | Add unit test `consume_code_affixes_clears_mixed_pending_prefix` asserting pending prefix is cleared and token stream mutated when markers mismatch. |
| Tests added/updated `tests/table/convert_html.rs`, `tests/wrap_unit.rs`, `src/wrap/tests.rs`, `src/footnotes/renumber/tests.rs` | Add/adjust tests: HTML table conversion cases (indented/mixed), wrap behaviour for blank lines and Unicode width, update `SplitContext` usage in tests, and add footnote renumbering tests; remove several legacy wrap tests. |

Possibly related issues

Refactor text wrapping and prefix logic to use textwrap and unicode-width crates #80 — Overlaps refactor of wrapping subsystem (ParagraphState removal, ParagraphWriter introduction, SplitContext changes); this PR implements those wrapping changes.

Possibly related PRs

Cargo binstall support #265 — Shares the same Cargo binstall bin-dir tweak and performs overlapping refactors across wrap/paragraph, line_buffer, HTML table buffering and footnote renumbering; shows strong code-level relation.

🚥 Pre-merge checks | ✅ 3 | ❌ 4

❌ Failed checks (4 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Testing | ⚠️ Warning | The PR introduces substantial architectural refactorings without adequate direct unit test coverage for critical refactored components. | Add focused unit tests for definitions module functions, ParagraphWriter methods, and prefix_line() edge cases, including snapshot tests for text transformations. |
| User-Facing Documentation | ⚠️ Warning | The pull request introduces user-facing behavioural improvements that lack documentation in CHANGELOG.md or README.md, including whitespace normalisation, Unicode display width calculations, HTML table handling, and cargo-binstall configuration. | Update CHANGELOG.md under the 'Fixed' section with entries documenting whitespace normalisation, Unicode display width calculations, indented HTML table handling, and cargo-binstall directory configuration. |
| Developer Documentation | ⚠️ Warning | Pull request introduces five significant internal APIs and architectural abstractions without corresponding documentation in docs/architecture.md. | Update docs/architecture.md to document ParagraphWriter, HtmlTableState, DefinitionScanState, ListState, and definitions submodule with their responsibilities and pipeline integration. |
| Module-Level Documentation | ⚠️ Warning | The file tests/wrap_unit.rs lacks a module-level docstring, starting directly with use statements instead of documentation explaining the module's purpose. | Add a module-level docstring to tests/wrap_unit.rs before the use statements, explaining that it contains unit tests for text wrapping functionality. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly summarises the primary intent of the PR: addressing review feedback across multiple refactored domains (wrapping and footnotes). |
| Description check | ✅ Passed | The description comprehensively relates to the changeset, detailing specific bug fixes, enhancements, and test additions across all modified modules. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

sourcery-ai · 2026-04-18T15:35:58Z

Reviewer's Guide

Refactors wrapping, list renumbering, footnote renumbering, and HTML table conversion to centralize mutable state and remove panicking expectations, while tightening wrapping behavior (Unicode-aware indentation, whitespace-only lines) and adding regression tests, plus a small cargo-binstall override fix.

Sequence diagram for updated wrap_text paragraph processing

sequenceDiagram
    participant WT as wrap_text
    participant FT as FenceTracker
    participant PW as ParagraphWriter

    loop for each line in lines
        WT->>FT: observe(line)
        alt fence line toggled
            FT-->>WT: true
            WT->>PW: push_verbatim(line)
            WT->>WT: continue
        else not fence
            FT-->>WT: false
            alt FT.in_fence()
                WT->>PW: push_verbatim(line)
                WT->>WT: continue
            else not in fence
                alt line.trim().is_empty()
                    WT->>PW: push_blank_line()
                    WT->>WT: continue
                else
                    alt passthrough block
                        WT->>PW: push_verbatim(line)
                        WT->>WT: continue
                    else prefix_line found
                        WT->>PW: handle_prefix_line(prefix_line)
                        WT->>WT: continue
                    else plain paragraph text
                        WT->>PW: note_indent(line)
                        WT->>WT: (text, hard_break) = line_break_parts(line)
                        WT->>PW: push_wrapped(text, hard_break)
                    end
                end
            end
        end
    end
    WT->>PW: flush_paragraph()
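
The dispatch order in the diagram above can be sketched in a few lines. This is a simplified illustration, not the code in `src/wrap.rs`: the writer here joins buffered text with spaces instead of wrapping to a width, fence detection is a naive check for a leading triple backtick, and the passthrough and prefix branches are omitted. The name `wrap_text_sketch` is hypothetical.

```rust
/// Simplified stand-in for the owning writer: it holds the paragraph buffer
/// and flushes it before any structural line is emitted.
struct ParagraphWriter<'a> {
    out: &'a mut Vec<String>,
    buf: Vec<String>,
}

impl<'a> ParagraphWriter<'a> {
    fn new(out: &'a mut Vec<String>) -> Self {
        Self { out, buf: Vec::new() }
    }

    /// Buffer paragraph text (the real writer also wraps to a width).
    fn push_wrapped(&mut self, text: &str) {
        self.buf.push(text.trim().to_string());
    }

    /// Emit the buffered paragraph, if any, as one joined line.
    fn flush_paragraph(&mut self) {
        if !self.buf.is_empty() {
            self.out.push(self.buf.join(" "));
            self.buf.clear();
        }
    }

    /// Flush, then emit the line unchanged.
    fn push_verbatim(&mut self, line: &str) {
        self.flush_paragraph();
        self.out.push(line.to_string());
    }

    /// Flush, then emit a normalized empty line.
    fn push_blank_line(&mut self) {
        self.flush_paragraph();
        self.out.push(String::new());
    }
}

/// Hypothetical driver mirroring the fence / blank / text branches only.
fn wrap_text_sketch(lines: &[String]) -> Vec<String> {
    let mut out = Vec::new();
    let mut in_fence = false;
    {
        let mut writer = ParagraphWriter::new(&mut out);
        for line in lines {
            if line.trim_start().starts_with("```") {
                in_fence = !in_fence;
                writer.push_verbatim(line);
            } else if in_fence {
                writer.push_verbatim(line);
            } else if line.trim().is_empty() {
                writer.push_blank_line();
            } else {
                writer.push_wrapped(line);
            }
        }
        writer.flush_paragraph();
    }
    out
}
```

Because every structural branch goes through a method that flushes first, the caller never has to remember to flush the paragraph buffer itself, which is the point of moving ownership into the writer.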

Class diagram for updated wrapping, list, footnote, and HTML table state structs

classDiagram
    class ParagraphWriter {
        - out : &mut Vec~String~
        - width : usize
        - buf : Vec~(String, bool)~
        - indent : String
        + new(out : &mut Vec~String~, width : usize) ParagraphWriter
        + note_indent(line : &str) void
        + push_wrapped(text : String, hard_break : bool) void
        + flush_paragraph() void
        + push_verbatim(line : &str) void
        + push_blank_line() void
        + handle_prefix_line(prefix_line : &PrefixLine) void
        - push_wrapped_segment(indent : &str, segment : &str) void
    }

    class PrefixLine {
        + prefix : String
        + rest : &str
        + repeat_prefix : bool
    }

    class SplitContext {
        + lines : &mut Vec~String~
        + width : usize
        + new(lines : &mut Vec~String~, width : usize) SplitContext
    }

    class HtmlTableState {
        + buf : Vec~String~
        + depth : usize
        + in_html() bool
        + flush_raw(out : &mut Vec~String~) void
        + push_html_line(line : &str, out : &mut Vec~String~) void
    }

    class ListState {
        + indent_stack : Vec~usize~
        + counters : HashMap~usize, usize~
        + reset() void
        + prune_deeper(indent : usize, inclusive : bool) void
        + next_number(indent : usize) usize
        + handle_paragraph_restart(indent : usize, line : &str, prev_blank : bool) bool
    }

    class DefinitionScanState {
        + mapping : &mut HashMap~usize, usize~
        + next_number : &mut usize
        + numeric_list_range : Option~(usize, usize)~
        + skip_numeric_conversion : bool
        + definitions : Vec~DefinitionLine~
        + is_definition_line : Vec~bool~
        + numeric_candidates : Vec~NumericCandidate~
    }

    class FenceTracker {
        + observe(line : &str) bool
        + in_fence() bool
    }

    ParagraphWriter --> PrefixLine : handles
    ParagraphWriter --> SplitContext : uses via wrap_preserving_code
    ParagraphWriter --> FenceTracker : used in wrap_text
    HtmlTableState ..> table_lines_to_markdown : calls
    ListState ..> FenceTracker : used in renumber_lists
    DefinitionScanState ..> DefinitionLine : owns
    DefinitionScanState ..> NumericCandidate : owns

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Centralize paragraph wrapping logic into a stateful writer and adjust wrap_text behavior, including whitespace-only lines and Unicode-width indentation. | Replace ParagraphState + ParagraphWriter split with a single ParagraphWriter that owns the paragraph buffer and indentation state. Ensure wrap_text treats empty/whitespace-only lines as blank lines and does not wrap them. Use UnicodeWidthStr to compute available width for indentation and prefix handling, including full-width spaces. Refactor fence handling to no longer depend on an external ParagraphState and to write verbatim lines directly via ParagraphWriter. Introduce ParagraphWriter::push_blank_line and push_verbatim to manage paragraph flushing before emitting structural lines. | `src/wrap/paragraph.rs` `src/wrap.rs` `src/wrap/fence.rs` `src/wrap/inline.rs` `src/wrap/line_buffer.rs` `src/wrap/tests.rs` |
| Make footnote renumbering scan stateful and remove panicking regex/write! assumptions. | Merge DefinitionScanContext and DefinitionAccumulator into a single DefinitionScanState struct that carries mapping, counters, collected definitions, and numeric candidates. Change numeric_candidate_from_line and prefix extraction to use optional regex captures instead of expect panics. Replace write!(...).expect(...) when building footnote definition lines with ignored write! results, avoiding panics on string write. Inline numeric candidate accumulation into DefinitionScanState and finalize them in-place when computing DefinitionUpdates. | `src/footnotes/renumber.rs` |
| Refine passthrough/wrapping classification and prefix handling in wrap.rs to avoid panics on missing captures. | Simplify is_passthrough_block to treat only tables, headings, markdownlint directives, and indented code as passthrough, and handle blank lines separately in wrap_text. Update prefix_line to return None on missing regex groups instead of panicking via expect, and to store prefixes as owned Strings in PrefixLine. Adjust fenced-block handling to push verbatim lines and continue without involving paragraph state. | `src/wrap.rs` |
| Refactor HTML table conversion to use an explicit HtmlTableState that properly supports indented multi-line tables. | Add HtmlTableState::in_html helper and move html/in-table tracking into the state struct based on buffer emptiness and depth. Reimplement push_html_line to trim leading indentation for tag depth counting while preserving original lines in the buffer, and to flush when depth returns to zero. Update html_table_to_markdown and convert_html_tables to drive conversion via HtmlTableState instead of separate buf/depth/in_html variables. Ensure flush_raw is called at the end of processing to emit any unterminated HTML table lines verbatim. | `src/html.rs` `tests/table/convert_html.rs` |
| Encapsulate list renumbering mutable state and add helpers for nesting and counter tracking. | Mark ListState as Default and add a reset method to clear indent stack and counters when encountering headings or thematic breaks. Introduce ListState::next_number to encapsulate pruning deeper levels, managing the indent stack, and incrementing counters. Update renumber_lists to use ListState::default and next_number, simplifying numbered-list handling logic. | `src/lists.rs` |
| Tighten inline wrapping and code/emphasis handling helpers and tests. | Introduce SplitContext::new constructor and use it in inline wrapping and line buffer tests to simplify call sites. Add regression test to ensure consume_code_affixes clears mixed pending prefix state and rewrites the token stream as expected. | `src/wrap/inline.rs` `src/wrap/line_buffer.rs` `src/wrap/tests.rs` `src/code_emphasis.rs` |
| Adjust cargo-binstall Linux override to install binaries from the current directory. | Change Cargo.toml bin-dir override for Linux GNU targets from a nested bin path to "." so installed binaries are found correctly. | `Cargo.toml` |
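
For the list renumbering change listed above, the following is a minimal sketch of how `reset`, `prune_deeper`, and `next_number` might fit together. The field and method names come from the class diagram, but the bodies are assumptions written for illustration and may differ from `src/lists.rs`.

```rust
use std::collections::HashMap;

/// Sketch of the list renumbering state: a stack of active indents plus one
/// counter per indent level.
#[derive(Default)]
struct ListState {
    indent_stack: Vec<usize>,
    counters: HashMap<usize, usize>,
}

impl ListState {
    /// Forget all nesting, e.g. after a heading or thematic break.
    fn reset(&mut self) {
        self.indent_stack.clear();
        self.counters.clear();
    }

    /// Drop levels deeper than `indent` (or equal to it when `inclusive`).
    fn prune_deeper(&mut self, indent: usize, inclusive: bool) {
        while let Some(&top) = self.indent_stack.last() {
            let should_pop = if inclusive { top >= indent } else { top > indent };
            if !should_pop {
                break;
            }
            self.indent_stack.pop();
            self.counters.remove(&top);
        }
    }

    /// Prune deeper levels, register this indent, and return its next number.
    fn next_number(&mut self, indent: usize) -> usize {
        self.prune_deeper(indent, false);
        if self.indent_stack.last() != Some(&indent) {
            self.indent_stack.push(indent);
        }
        let counter = self.counters.entry(indent).or_insert(0);
        *counter += 1;
        *counter
    }
}
```

In this sketch a nested list restarts at 1 whenever the outer list advances, because returning to the shallower indent prunes the deeper counter.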

Possibly linked issues

Implement header fallback for HTML tables #55: PR’s ListState and next_number refactor the list renumbering logic with stack+HashMap and entry-based counters as requested.

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/footnotes/renumber.rs (1)

337-344: ⚠️ Potential issue | 🟠 Major

Remove the panic path in numeric_candidate_from_line.

Parse from num_match.as_str() directly. Line 339 still indexes caps["num"], which panics when the named group is absent and defeats the Option-based fallback.

🔧 Proposed fix

 fn numeric_candidate_from_line(line: &str, index: usize) -> Option<NumericCandidate> {
     let caps = FOOTNOTE_LINE_RE.captures(line)?;
-    let number = caps["num"].parse::<usize>().ok()?;
+    let num_match = caps.name("num")?;
+    let number = num_match.as_str().parse::<usize>().ok()?;
     let indent = caps.name("indent").map_or("", |m| m.as_str()).to_string();
-    let rest = caps.name("rest").map_or("", |m| m.as_str()).to_string();
-    let num_match = caps.name("num")?;
-    let rest_match = caps.name("rest")?;
+    let rest_match = caps.name("rest")?;
+    let rest = rest_match.as_str().to_string();
     let whitespace = line[num_match.end() + 1..rest_match.start()].to_string();
     Some(NumericCandidate {
         index,
         number,
         indent,

As per coding guidelines: ".expect() and .unwrap() are forbidden outside of tests. Errors must be propagated."
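
To illustrate the `Option`-propagating style the guideline asks for, here is a standard-library-only sketch that parses a numeric candidate without any indexing that can panic. It does not use the `regex` crate or `FOOTNOTE_LINE_RE`; the function name `parse_numeric_candidate` and the accepted line shape (indent, digits, `.` or `:`, whitespace, non-empty rest) are assumptions made for the example.

```rust
/// Fields loosely follow the struct in the proposed fix; `whitespace` and
/// `rest` are assumed from the surrounding code.
#[derive(Debug, PartialEq)]
struct NumericCandidate {
    index: usize,
    number: usize,
    indent: String,
    whitespace: String,
    rest: String,
}

/// Hypothetical parser: every failure path returns `None` via `?`, so a
/// malformed line is skipped rather than causing a panic.
fn parse_numeric_candidate(line: &str, index: usize) -> Option<NumericCandidate> {
    let body = line.trim_start();
    let indent = &line[..line.len() - body.len()];
    // Position of the first non-digit; `None` if the line is digits only.
    let digits_len = body.find(|c: char| !c.is_ascii_digit())?;
    let number = body[..digits_len].parse::<usize>().ok()?;
    let after_number = &body[digits_len..];
    let after_sep = after_number
        .strip_prefix('.')
        .or_else(|| after_number.strip_prefix(':'))?;
    let rest = after_sep.trim_start();
    // Require both separating whitespace and some text after it.
    if rest.is_empty() || rest.len() == after_sep.len() {
        return None;
    }
    let whitespace = &after_sep[..after_sep.len() - rest.len()];
    Some(NumericCandidate {
        index,
        number,
        indent: indent.to_string(),
        whitespace: whitespace.to_string(),
        rest: rest.to_string(),
    })
}
```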

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/footnotes/renumber.rs` around lines 337 - 344, The function
numeric_candidate_from_line currently uses caps["num"] which can panic if the
named capture is absent; instead use the already-obtained num_match (from
caps.name("num")) and parse num_match.as_str() to get the usize (e.g., replace
the caps["num"].parse::<usize>() call with parsing num_match.as_str()), keeping
the .ok()? propagation so the function returns None on parse failure; also
remove any other direct indexing into caps[...] that can panic and rely on the
existing num_match/rest_match variables and safe slicing using their
start()/end() positions.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/footnotes/renumber.rs`:
- Around line 328-329: The code currently discards the Result from write! when
building the definition header (e.g., the line using write!(&mut line,
"[^{new_number}]:") before pushing rewritten_rest), creating a silent failure
path; replace those write! calls (both occurrences around the build of the
header) with infallible string assembly such as let header =
format!("[^{new_number}]:"); line.push_str(&header); (or build the header with
String::from and push_str) so you don't ignore errors and avoid using a fallible
fmt write into the mutable string; apply the same replacement for the second
occurrence near lines 401–402.

In `@src/wrap/tests.rs`:
- Around line 309-314: The test wrap_text_uses_display_width_for_unicode_indent
is too weak: it uses "　a" with width 2 which doesn't fail even if indent width
is computed incorrectly; update the test to exercise display-width by using an
input like "　a b" and call wrap_text(&input, 4) so the ideographic space (width
2) causes a wrap, and assert that the result equals vec!["　a".to_string(),
"　b".to_string()]—modify the test body in
wrap_text_uses_display_width_for_unicode_indent to use that input, width, and
expected assertion.
- Around line 302-314: The new regression tests
wrap_text_normalizes_whitespace_only_lines and
wrap_text_uses_display_width_for_unicode_indent should be moved out of the
already-large src/wrap/tests.rs into a dedicated test module file to keep files
under the 400-line limit; create a new test file (for example
tests/wrap_regressions.rs or src/wrap/regressions_tests.rs), copy those two
#[test] functions (which call wrap_text) into it, preserve any necessary
use/imports for wrap_text, and remove them from src/wrap/tests.rs so the
original file stays below the line limit.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d67b626d-6713-4149-8d79-2492c7423851

📥 Commits

Reviewing files that changed from the base of the PR and between b8962c0 and 55b5a15.

📒 Files selected for processing (12)

Cargo.toml
src/code_emphasis.rs
src/footnotes/renumber.rs
src/html.rs
src/lists.rs
src/wrap.rs
src/wrap/fence.rs
src/wrap/inline.rs
src/wrap/line_buffer.rs
src/wrap/paragraph.rs
src/wrap/tests.rs
tests/table/convert_html.rs

Keep HTML table depth tracking on a consistent trimmed view, restore borrowed prefix handling for common wrap prefixes, and add regression coverage for whitespace-only paragraph breaks and indented HTML table transitions. Document why numeric footnote candidates are finalized in reverse and add targeted renumbering tests covering existing definition headers, numeric candidates, and malformed candidate lines.
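
As a rough illustration of depth tracking on a single trimmed view, the sketch below counts opening and closing table tags from the same lowercased, left-trimmed copy of each line and reports where the depth returns to zero. It uses plain substring matching as a stand-in for `TABLE_START_RE` and `TABLE_END_RE`, whose actual patterns are not shown here; `tag_counts` and `table_block_end` are hypothetical helper names.

```rust
/// Count table open and close tags on one consistent view of the line.
fn tag_counts(line: &str) -> (usize, usize) {
    let view = line.trim_start().to_ascii_lowercase();
    (view.matches("<table").count(), view.matches("</table>").count())
}

/// Index of the line on which the table that starts at `lines[0]` closes,
/// or `None` if the first line opens no table or the table never closes.
fn table_block_end(lines: &[&str]) -> Option<usize> {
    let mut depth = 0usize;
    for (index, line) in lines.iter().enumerate() {
        let (opens, closes) = tag_counts(line);
        if index == 0 && opens == 0 {
            return None;
        }
        depth = (depth + opens).saturating_sub(closes);
        if depth == 0 {
            return Some(index);
        }
    }
    None
}
```

An unterminated table yields `None`, which corresponds to the case where the buffered lines are flushed verbatim at the end of processing.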

Replace the silent `write!` header assembly in the footnote renumberer with infallible string construction, and stop parsing numeric candidates through panic-prone capture indexing. Move the `wrap_text` regression coverage out of the oversized `src/wrap/tests.rs` file into the existing dedicated integration test surface, and strengthen the Unicode-indent case so it proves display width rather than byte length.
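
A minimal sketch of the infallible header assembly described above, assuming a hypothetical helper `definition_line` that takes the indent, the new number, and the already-rewritten remainder of the line. No `Result` is produced, so there is nothing to discard or unwrap.

```rust
/// Hypothetical helper: build a footnote definition line using only
/// `String` operations that cannot fail.
fn definition_line(indent: &str, new_number: usize, rewritten_rest: &str) -> String {
    let header = format!("[^{new_number}]:");
    let mut line = String::with_capacity(indent.len() + header.len() + rewritten_rest.len());
    line.push_str(indent);
    line.push_str(&header);
    line.push_str(rewritten_rest);
    line
}
```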

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

src/wrap/tests.rs (2)

302-324: 🛠️ Refactor suggestion | 🟠 Major

Move these regressions into a dedicated test module.

Lines 302-324 keep src/wrap/tests.rs at 489 lines, so this PR still breaches the repository cap. Move the new wrapping regressions out of this module and keep the file below the limit.

As per coding guidelines, "Files must not exceed 400 lines in length".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/wrap/tests.rs` around lines 302 - 324, The three new regression tests
(wrap_text_normalizes_whitespace_only_lines,
wrap_text_treats_whitespace_only_lines_as_paragraph_breaks,
wrap_text_uses_display_width_for_unicode_indent) should be moved out of the
oversized tests.rs into a dedicated test module/file (e.g., a new module named
wrap_regressions or regressions) so the original file stays under the 400-line
limit; extract those #[test] functions into the new module, ensure the new
file/module is compiled as part of the test suite (keeping the same function
names and signatures and any necessary use/imports such as wrap_text), and
remove the original copies from the large tests.rs.

320-324: ⚠️ Potential issue | 🟡 Minor

Strengthen the Unicode-width regression.

Lines 320-324 still pass when the ideographic space is mismeasured as one column, so the test does not prove the fix. Use an input that only wraps when that indent consumes two columns, such as "　a b" at width 4, and assert the split output.

Patch

 fn wrap_text_uses_display_width_for_unicode_indent() {
-    let input = vec!["　a".to_string()];
-    let wrapped = wrap_text(&input, 2);
-    assert_eq!(wrapped, vec!["　a".to_string()]);
+    let input = vec!["　a b".to_string()];
+    let wrapped = wrap_text(&input, 4);
+    assert_eq!(wrapped, vec!["　a".to_string(), "　b".to_string()]);
 }
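
The arithmetic behind this patch can be checked with a small stand-alone sketch. It assumes a greedy word wrap and a deliberately simplified `display_width` that treats only U+3000 as two columns (the real code uses the `unicode-width` crate); `wrap_with_indent` is a hypothetical helper, not the project's `wrap_text`.

```rust
/// Simplified stand-in for a display-width function: the ideographic space
/// (U+3000) counts as two columns, every other character as one.
fn display_width(s: &str) -> usize {
    s.chars().map(|c| if c == '\u{3000}' { 2 } else { 1 }).sum()
}

/// Greedy wrap of `text` under `indent`, where `indent_width` is however the
/// caller chose to measure the indent. The indent is repeated on every line.
fn wrap_with_indent(indent: &str, indent_width: usize, text: &str, width: usize) -> Vec<String> {
    let available = width.saturating_sub(indent_width).max(1);
    let mut lines = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let fits = current.is_empty()
            || display_width(&current) + 1 + display_width(word) <= available;
        if !fits {
            lines.push(format!("{indent}{current}"));
            current.clear();
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(format!("{indent}{current}"));
    }
    lines
}
```

With width 4, measuring the indent as two columns leaves two columns for text, so `a b` (three columns) must split. Measuring it as one column leaves three, and the line stays whole. The original input `"　a"` at width 2 produces the same output under both measurements, which is why it could not catch the regression.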

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/wrap/tests.rs` around lines 320 - 324, The test
wrap_text_uses_display_width_for_unicode_indent currently uses "　a" which still
passes if the ideographic space is mismeasured; change it to exercise the wrap
threshold by using input "　a b" and width 4 so the indent (ideographic space = 2
columns) forces a wrap; update the test (function
wrap_text_uses_display_width_for_unicode_indent) to call wrap_text(&vec!["　a
b".to_string()], 4) and assert the expected split output (e.g.
assert_eq!(wrapped, vec!["　a".to_string(), "　b".to_string()])).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/footnotes/renumber.rs`:
- Around line 490-528: This file is over the 400-line limit because the
#[cfg(test)] mod tests block is large; move the entire test module into a new
test file (either src/footnotes/renumber/tests.rs or
tests/footnotes_renumber.rs) preserving the imported functions
numeric_candidate_from_line and renumber_footnotes, and update module imports
accordingly; also replace the duplicated assertions in the
malformed_numeric_candidate_line_is_ignored test with a parameterised rstest
(use #[rstest] with two cases like "7." and "7:") to consolidate into a single
parameterised test function while keeping the existing test names for clarity.

In `@tests/table/convert_html.rs`:
- Around line 81-115: Collapse the two nearly-identical tests into a single
parameterised rstest: remove the functions
converts_indented_multiline_html_table and
converts_indented_table_without_touching_surrounding_content and replace them
with one #[rstest(...)] that supplies the two cases and a single test function
(e.g. converts_indented_html_table_cases) which takes input: Vec<String>,
expected: Vec<String> and asserts assert_eq!(convert_html_tables(&input),
expected); ensure the rstest attribute lists both case::... entries and the test
helper convert_html_tables is used unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4e0b8687-eb63-4e39-946c-a8b476a89d3d

📥 Commits

Reviewing files that changed from the base of the PR and between 55b5a15 and 58f905a.

📒 Files selected for processing (6)

src/footnotes/renumber.rs
src/html.rs
src/wrap.rs
src/wrap/paragraph.rs
src/wrap/tests.rs
tests/table/convert_html.rs

leynos · 2026-04-18T17:08:47Z

@coderabbitai Have the following now been resolved?

Annul any requirements that violate the en-GB-oxendict spelling (-ize / -yse / -our) conventions (for example a request to replace "normalize" with "normalise" or "artefact" with "artifact"), or where the requirement unnecessarily increases cyclomatic complexity.

Please address the comments from this code review:

## Overall Comments
- In `HtmlTableState::push_html_line` you increment `depth` using `TABLE_START_RE` on `trimmed` but decrement using `TABLE_END_RE` on the original `line`; consider applying both start and end regexes consistently on the same representation (trimmed or original) to avoid subtle mismatches with indented or oddly spaced tags.
- Now that `PrefixLine.prefix` is always a `String`, `prefix_line` unconditionally allocates even for simple matches; if performance on large files is a concern, you might consider retaining a borrowing representation for the common bullet/blockquote cases and only allocating when the prefix needs to be synthesized (e.g., footnotes).
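
One way to picture the borrowing representation mentioned in the second comment is `std::borrow::Cow`: borrow the prefix when it is a literal slice of the input, and allocate only when it has to be synthesized. The sketch below is an assumption-laden illustration, not the project's `prefix_line`: it uses plain string checks instead of regexes, recognises only `> `, `- `, `* `, and footnote labels, and its choice of `repeat_prefix` values and footnote normalisation is hypothetical.

```rust
use std::borrow::Cow;

/// Prefix that is borrowed from the input when possible.
struct PrefixLine<'a> {
    prefix: Cow<'a, str>,
    rest: &'a str,
    repeat_prefix: bool,
}

/// Hypothetical classifier: borrows for bullets and blockquotes, allocates
/// only for footnote labels whose spacing is normalised.
fn prefix_line(line: &str) -> Option<PrefixLine<'_>> {
    let indent_len = line.len() - line.trim_start().len();
    let body = &line[indent_len..];
    for marker in ["> ", "- ", "* "] {
        if body.starts_with(marker) {
            let split = indent_len + marker.len();
            return Some(PrefixLine {
                prefix: Cow::Borrowed(&line[..split]),
                rest: &line[split..],
                // Assumption: only blockquote markers repeat on wrapped lines.
                repeat_prefix: marker == "> ",
            });
        }
    }
    if !body.starts_with("[^") {
        return None;
    }
    let label_end = body.find("]:")? + 2;
    let rest = body[label_end..].trim_start();
    // Synthesized prefix: indent, label, and exactly one space.
    let prefix = format!("{}{} ", &line[..indent_len], &body[..label_end]);
    Some(PrefixLine {
        prefix: Cow::Owned(prefix),
        rest,
        repeat_prefix: false,
    })
}
```

Callers that only read the prefix can treat both variants as `&str` through `Deref`, so the allocation cost is paid only on the synthesized path.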

## Individual Comments

### Comment 1
<location path="src/footnotes/renumber.rs" line_range="393-394" />
<code_context>
-        let new_number = assign_new_number(ctx.mapping, candidate.number, ctx.next_number);
-        let rewritten_rest = rewrite_tokens(&candidate.rest, ctx.mapping);
+fn finalize_numeric_candidates(state: &mut DefinitionScanState<'_>) {
+    for candidate in state.numeric_candidates.drain(..).rev() {
+        let new_number = assign_new_number(state.mapping, candidate.number, state.next_number);
+        let rewritten_rest = rewrite_tokens(&candidate.rest, state.mapping);
</code_context>
<issue_to_address>
**suggestion:** Consider draining numeric candidates in-place without reversing if ordering is not semantically required.

This now drains `numeric_candidates` in reverse, preserving the previous `into_iter().rev()` behavior. If that ordering isn’t required for correctness, consider iterating in insertion order and dropping the `.rev()`. If it *is* required (e.g., to avoid index churn while mutating), please add a short comment explaining the dependency on reverse order where this iteration is defined.

```suggestion
fn finalize_numeric_candidates(state: &mut DefinitionScanState<'_>) {
    for candidate in state.numeric_candidates.drain(..) {
```
</issue_to_address>
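
To make the ordering question concrete, the sketch below contrasts the two drain forms on a plain `Vec<u32>`. The function names are hypothetical and the element type is a stand-in for the real candidate struct; both forms leave the vector empty, and only the visiting order differs.

```rust
/// Drain candidates in insertion order.
fn drain_forward(candidates: &mut Vec<u32>) -> Vec<u32> {
    candidates.drain(..).collect()
}

/// Drain candidates last-to-first, the same order as `into_iter().rev()`.
fn drain_reverse(candidates: &mut Vec<u32>) -> Vec<u32> {
    candidates.drain(..).rev().collect()
}
```

If later candidates must be numbered before earlier ones, the reversed form is the one to keep, and a comment at the loop is the cheapest way to record that dependency.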

### Comment 2
<location path="src/wrap/tests.rs" line_range="303-306" />
<code_context>
 }

+#[test]
+fn wrap_text_normalizes_whitespace_only_lines() {
+    let input = vec![String::new(), "   ".to_string(), "\t\t".to_string()];
+    let wrapped = wrap_text(&input, 80);
+    assert_eq!(wrapped, vec![String::new(), String::new(), String::new()]);
+}
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add a case with whitespace-only lines inside a paragraph to prove paragraph-splitting behavior.

Since `wrap_text` now treats `trim().is_empty()` as a paragraph break, it’d be helpful to cover a mixed case like `["foo", "   ", "bar"]`, asserting that the middle line normalizes to an empty string and that `foo` and `bar` end up in separate wrapped paragraphs. That directly exercises the paragraph-boundary behavior of `push_blank_line` and guards against whitespace normalization merging or dropping paragraphs.

Suggested implementation:

```rust
 }

 #[test]
 fn wrap_text_normalizes_whitespace_only_lines() {
     let input = vec![String::new(), "   ".to_string(), "\t\t".to_string()];
     let wrapped = wrap_text(&input, 80);
     assert_eq!(wrapped, vec![String::new(), String::new(), String::new()]);
 }

 #[test]
 fn wrap_text_treats_whitespace_only_lines_as_paragraph_breaks() {
     let input = vec![
         "foo".to_string(),
         "   ".to_string(),
         "bar".to_string(),
     ];
     let wrapped = wrap_text(&input, 80);

     // The whitespace-only line should normalize to an empty string and act as
     // a paragraph separator, so "foo" and "bar" remain in separate paragraphs.
     assert_eq!(
         wrapped,
         vec![
             "foo".to_string(),
             String::new(),
             "bar".to_string(),
         ]
     );
 }


```

If `wrap_text` is not already imported into this test module, ensure there is a `use` bringing it into scope (for example, `use crate::wrap::wrap_text;` or similar based on the existing conventions in `src/wrap/tests.rs`).
</issue_to_address>

### Comment 3
<location path="tests/table/convert_html.rs" line_range="82-78" />
<code_context>
 }
+
+#[test]
+fn converts_indented_multiline_html_table() {
+    let input = lines_vec![
+        "  <table>",
+        "    <tr><th>A</th><th>B</th></tr>",
+        "    <tr><td>1</td><td>2</td></tr>",
+        "  </table>",
+    ];
+    let expected = lines_vec![
+        "  | A | B |",
+        "  | --- | --- |",
+        "  | 1 | 2 |",
+    ];
+    assert_eq!(convert_html_tables(&input), expected);
+}
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test where an indented HTML table is followed by non-table HTML/content at the same indentation.

This test covers the core regression for indented multi-line tables and the refactored `HtmlTableState`. To better exercise the `buf`/`depth` state and `flush_raw`, please also add a case with:

- indented non-table content
- an indented `<table>…</table>` block
- more indented non-table content at the same indentation

and assert that only the table is converted and the surrounding lines pass through unchanged, verifying the `in_html()`/`flush_raw` transitions into and out of indented tables.
</issue_to_address>
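
The transitions this comment asks to test can be sketched with a much-simplified state machine. Everything here is an assumption for illustration: `TableState`, `push_line`, and `split_tables` are hypothetical names, plain substring matching stands in for the crate's regexes, and no Markdown conversion is attempted. The sketch only shows that surrounding lines pass through while the table lines are buffered, with open and close tags counted on the same trimmed view of each line.

```rust
/// Hypothetical, simplified stand-in for the PR's `HtmlTableState`.
#[derive(Default)]
struct TableState {
    buf: Vec<String>,
    depth: usize,
}

impl TableState {
    fn in_html(&self) -> bool {
        self.depth > 0
    }

    /// Buffer a line and update the nesting depth. Both tag counts use the
    /// same trimmed, lowercased view of the line. Returns the buffered block
    /// once the outermost table closes.
    fn push_line(&mut self, line: &str) -> Option<Vec<String>> {
        let view = line.trim_start().to_ascii_lowercase();
        let opens = view.matches("<table").count();
        let closes = view.matches("</table>").count();
        self.buf.push(line.to_string());
        self.depth = (self.depth + opens).saturating_sub(closes);
        if self.depth == 0 {
            Some(std::mem::take(&mut self.buf))
        } else {
            None
        }
    }
}

/// Split input into pass-through lines and buffered table blocks.
fn split_tables(lines: &[&str]) -> (Vec<String>, Vec<Vec<String>>) {
    let mut state = TableState::default();
    let mut passthrough = Vec::new();
    let mut tables = Vec::new();
    for line in lines {
        let starts_table = line.trim_start().to_ascii_lowercase().starts_with("<table");
        if state.in_html() || starts_table {
            if let Some(block) = state.push_line(line) {
                tables.push(block);
            }
        } else {
            passthrough.push((*line).to_string());
        }
    }
    // An unterminated table runs to the end of input, so flush it back raw.
    passthrough.append(&mut state.buf);
    (passthrough, tables)
}
```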

### Comment 4
<location path="src/footnotes/renumber.rs" line_range="328" />
<code_context>
     let mut line = String::with_capacity(parts.prefix.len() + rewritten_rest.len() + 8);
     line.push_str(parts.prefix);
-    write!(&mut line, "[^{new_number}]:").expect("write to string cannot fail");
+    let _ = write!(&mut line, "[^{new_number}]:");
     line.push_str(&rewritten_rest);
     DefinitionLine {
</code_context>
<issue_to_address>
**issue (review_instructions):** Add tests demonstrating the bug this change addresses (ignoring write! errors instead of panicking) and verifying that renumbering behaves correctly when formatting succeeds.

Previously this code used expect("write to string cannot fail"), which would panic if formatting failed; now it ignores the Result from write!. This is a behavioural change intended to avoid panics and is effectively a bug fix. Add targeted tests that:
- Exercise renumbering of footnotes and numeric candidates to confirm no panics occur and the output lines are correctly formatted.
- If applicable, cover edge cases that could previously trigger the panic, to show the bug is fixed and behaviour remains correct.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
Bug fixes must be demonstrated by a test.

</details>
</issue_to_address>
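
For context on the change under discussion, here is a minimal sketch of building a definition header with `write!` into a `String`. The function name `definition_line` and its parameters are hypothetical; only the `[^{new_number}]:` format is taken from the diff above.

```rust
use std::fmt::Write as _;

/// Build a footnote definition line such as `  [^3]: text`.
/// `definition_line` and its parameters are hypothetical names.
fn definition_line(prefix: &str, new_number: usize, rest: &str) -> String {
    let mut line = String::with_capacity(prefix.len() + rest.len() + 8);
    line.push_str(prefix);
    // Writing an integer into a `String` only appends to the buffer, so the
    // `Result` is discarded here instead of being unwrapped with `expect`.
    let _ = write!(&mut line, "[^{new_number}]:");
    line.push_str(rest);
    line
}
```

A test of the kind requested would assert on the formatted output, for example that prefix `"  "`, number `3`, and rest `" text"` produce `"  [^3]: text"`.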

### Comment 5
<location path="src/footnotes/renumber.rs" line_range="342" />
<code_context>
-    let rest_match = caps
-        .name("rest")
-        .expect("numeric list capture missing rest");
+    let num_match = caps.name("num")?;
+    let rest_match = caps.name("rest")?;
     let whitespace = line[num_match.end() + 1..rest_match.start()].to_string();
</code_context>
<issue_to_address>
**issue (review_instructions):** Add a regression test showing that malformed numeric list lines no longer panic due to missing regex captures.

numeric_candidate_from_line previously called expect on the "num" and "rest" captures, which would panic on malformed or unexpected input; now it returns None when captures are missing. This is a bug fix in error handling. Add a test that feeds a line matching NUMERIC_LIST_RE partially or unexpectedly (e.g., missing rest or number group) and assert that renumbering completes without panic and with the expected behaviour (either ignoring the line or handling it in a defined way).

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
Bug fixes must be demonstrated by a test.

</details>
</issue_to_address>
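
The `expect` to `?` pattern can be shown without the regex dependency. The sketch below is a hypothetical, regex-free stand-in (the name `numeric_candidate` and its return type are assumptions, and it does not reproduce `NUMERIC_LIST_RE`); it only demonstrates that every missing piece becomes `None` rather than a panic.

```rust
/// Hypothetical stand-in: parse `"7. text"` or `"7: text"` into
/// `(number, rest)`, returning `None` for anything malformed.
fn numeric_candidate(line: &str) -> Option<(usize, &str)> {
    let trimmed = line.trim_start();
    // No non-digit character means there is no marker at all.
    let digits_end = trimmed.find(|c: char| !c.is_ascii_digit())?;
    // An empty or overflowing digit run is rejected by `parse`.
    let number = trimmed[..digits_end].parse().ok()?;
    let after = &trimmed[digits_end..];
    let rest = after
        .strip_prefix('.')
        .or_else(|| after.strip_prefix(':'))?;
    let rest = rest.trim_start();
    if rest.is_empty() {
        None
    } else {
        Some((number, rest))
    }
}
```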

coderabbitai · 2026-04-18T17:10:09Z

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== src/footnotes/renumber.rs (lines 320-360, 385-410) ==="
sed -n '310,420p' src/footnotes/renumber.rs

echo ""
echo "=== src/wrap/tests.rs (last 80 lines) ==="
tail -100 src/wrap/tests.rs

echo ""
echo "=== tests/table/convert_html.rs ==="
cat tests/table/convert_html.rs

echo ""
echo "=== src/html.rs (push_html_line) ==="
grep -n "push_html_line\|TABLE_START_RE\|TABLE_END_RE\|trimmed\|depth\|trim_start" src/html.rs | head -60

echo ""
echo "=== src/wrap/paragraph.rs (PrefixLine prefix field) ==="
grep -n "prefix\|Cow\|String\|alloc\|borrow" src/wrap/paragraph.rs | head -40

Length of output: 12108

🏁 Script executed:

#!/bin/bash
echo "=== tests/wrap_unit.rs ==="
cat tests/wrap_unit.rs

echo ""
echo "=== src/footnotes/renumber.rs (test section) ==="
grep -n "#\[cfg(test\|#\[test\|mod tests\|fn test_\|fn renumber\|fn ignor\|fn malform\|fn numeric" src/footnotes/renumber.rs
sed -n '420,600p' src/footnotes/renumber.rs

Length of output: 7747

@leynos: Here is the status of each point.

Overall Comments

| # | Issue | Status |
| --- | --- | --- |
| 1 | `TABLE_START_RE`/`TABLE_END_RE` applied inconsistently (trimmed vs. original `line`) | ✅ Resolved: `push_html_line` now applies both regexes to `trimmed` (`line.trim_start()`). |
| 2 | `PrefixLine.prefix` unconditionally allocates; consider retaining a borrowing representation | ❌ Annulled: `prefix` remains `Cow<'a, str>`, which already permits borrowing. Converting common paths to avoid allocation would increase cyclomatic complexity without fixing a defect, so this requirement is annulled per your instruction. |

Individual Comments

| # | Location | Issue | Status |
| --- | --- | --- | --- |
| 1 | `src/footnotes/renumber.rs`, `finalize_numeric_candidates` | `.rev()` kept without explanation | ✅ Resolved: a prose comment ("Drain from the bottom so wrapped continuation lines stay attached…") explains the semantic dependency on reverse order. |
| 2 | `src/wrap/tests.rs` | Add a mixed `["foo", " ", "bar"]` case proving paragraph-splitting behaviour | ✅ Resolved: `wrap_text_treats_whitespace_only_lines_as_paragraph_breaks` was added to `tests/wrap_unit.rs` and asserts the expected `["foo", "", "bar"]` output. |
| 3 | `tests/table/convert_html.rs` | Add a test with indented non-table content surrounding an indented table | ✅ Resolved: `converts_indented_table_without_touching_surrounding_content` covers exactly this case. |
| 4 | `src/footnotes/renumber.rs` line 328 | Tests required for the `write!` → `format!` bug fix | ✅ Resolved: `renumber_footnotes_rewrites_existing_definition_headers` and `renumber_footnotes_rewrites_numeric_candidates` both exercise the `format!` path end-to-end and confirm correct output. |
| 5 | `src/footnotes/renumber.rs` line 342 | Regression test for the `expect` → `?` fix on missing regex captures | ✅ Resolved: `malformed_numeric_candidate_line_is_ignored` asserts `numeric_candidate_from_line` returns `None` for `"7."` and `"7:"`, confirming no panic occurs on malformed input. |

Move the footnote renumber unit tests into a dedicated submodule file and extract the dense definition-scanning and reordering helpers into a separate implementation module. This brings `renumber.rs` back under the repository file-size limit without changing the top-level renumbering flow. Collapse the duplicated indented HTML table regression tests into a single `rstest` so the behaviour stays covered with less repetition.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/footnotes/renumber/definitions.rs`:
- Line 100: Replace the odd double-super path by re-exporting or importing
DefinitionParts from the immediate parent: in renumber.rs add a re-export/import
for parsing::DefinitionParts (e.g., make DefinitionParts available as
super::parsing::DefinitionParts) and then update the type in definitions.rs to
use the simpler super::parsing::DefinitionParts<'_> (replace
super::super::parsing::DefinitionParts<'_> occurrences). This keeps the module
path clear and locates the change around the DefinitionParts type usage in
definitions.rs and the renumber.rs parent module.

In `@src/footnotes/renumber/tests.rs`:
- Around line 13-18: The test malformed_numeric_candidate_line_is_ignored uses
assert! without a diagnostic message; update the assertion to include a clear
failure message that shows the input, e.g. change the assertion around
numeric_candidate_from_line(line, 0).is_none() to include a message like
"expected None for malformed numeric candidate line: {line}" so failures surface
the offending case.
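
The second finding amounts to attaching the input to the assertion. A minimal sketch, assuming a hypothetical helper `check_all_ignored` that takes the parser as a closure so the example stays self-contained (the real test calls `numeric_candidate_from_line(line, 0)` directly):

```rust
/// Assert that every line is rejected, naming the offending input on failure.
/// `check_all_ignored` and the closure signature are hypothetical.
fn check_all_ignored(lines: &[&str], parse: impl Fn(&str) -> Option<usize>) {
    for &line in lines {
        assert!(
            parse(line).is_none(),
            "expected None for malformed numeric candidate line: {line}"
        );
    }
}
```

With the message in place, a failing case reports which of the inputs slipped through instead of a bare `assertion failed`.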

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3caf340c-8cb1-447f-8018-395c17e6be97

📥 Commits

Reviewing files that changed from the base of the PR and between 088948a and 6536e7a.

📒 Files selected for processing (4)

src/footnotes/renumber.rs
src/footnotes/renumber/definitions.rs
src/footnotes/renumber/tests.rs
tests/table/convert_html.rs

Expose `DefinitionParts` through the immediate `renumber` parent module so the definition-scanning helper module no longer reaches back through a double-super path. This keeps the module boundary clearer without changing the renumbering behaviour.
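
The module shape described in this commit can be sketched with inline modules. The layout below is an assumption reconstructed from the paths quoted in the review (`footnotes::parsing`, `footnotes::renumber`, `footnotes::renumber::definitions`); the `number` field and the `header` function are hypothetical and exist only to make the example compile.

```rust
mod footnotes {
    pub mod parsing {
        /// Field set is illustrative; the real struct is not shown in this thread.
        pub struct DefinitionParts<'a> {
            pub prefix: &'a str,
            pub number: usize,
        }
    }

    pub mod renumber {
        // Bring the type into the parent so child modules need only one `super`.
        pub(super) use super::parsing::DefinitionParts;

        pub mod definitions {
            // Before the change: `super::super::parsing::DefinitionParts<'_>`.
            use super::DefinitionParts;

            pub fn header(parts: &DefinitionParts<'_>) -> String {
                format!("{}[^{}]:", parts.prefix, parts.number)
            }
        }
    }
}
```

The import in `renumber` changes only how the name is reached; the type itself still lives in `parsing`.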

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/footnotes/renumber.rs (1)

70-87: ⚠️ Potential issue | 🟠 Major

Replace expect with fallible handling in production code.

Line 73 uses expect in production code, which violates the guideline forbidding .expect() outside tests. Return early when the capture is missing instead of panicking.

🔧 Proposed fix

 fn rewrite_refs_in_segment(text: &str, mapping: &HashMap<usize, usize>) -> String {
     FOOTNOTE_REF_RE
         .replace_all(text, |caps: &Captures| {
-            let mat = caps.get(0).expect("regex matched without capture");
+            let Some(mat) = caps.get(0) else {
+                return caps[0].to_string();
+            };
             if is_definition_like(text, &mat) {
                 return caps[0].to_string();
             }

As per coding guidelines: ".expect() and .unwrap() are forbidden outside of tests. Errors must be propagated."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/footnotes/renumber.rs` around lines 70 - 87, The closure in
rewrite_refs_in_segment currently calls caps.get(0).expect(...), which can
panic; replace that with fallible handling: match or if-let on caps.get(0)
(e.g., if let Some(mat) = caps.get(0) { ... } else { ... }), and in the None
branch return an appropriate safe fallback (the original matched text or
String::new()) instead of panicking; keep the rest of the logic (calling
is_definition_like(text, &mat), parsing caps["num"], and using mapping)
unchanged so the closure never uses .expect() or .unwrap().

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 63e834dd-352d-4c55-b0ab-ef615c11d4b0

📥 Commits

Reviewing files that changed from the base of the PR and between 6536e7a and 8c1c7a7.

📒 Files selected for processing (2)

src/footnotes/renumber.rs
src/footnotes/renumber/definitions.rs

leynos · 2026-04-18T18:18:41Z

@coderabbitai pause

coderabbitai · 2026-04-18T18:18:49Z

✅ Actions performed

Reviews paused.

Handle the impossible-but-fallible `caps.get(0)` path in `rewrite_refs_in_segment` without panicking. If the regex capture is missing, return a safe empty replacement instead of aborting the renumbering pass.

leynos marked this pull request as ready for review April 18, 2026 15:55

coderabbitai Bot requested changes Apr 18, 2026

Comment thread src/footnotes/renumber.rs Outdated

Comment thread src/wrap/tests.rs Outdated

Comment thread src/wrap/tests.rs Outdated

leynos added 2 commits April 18, 2026 17:24

coderabbitai Bot requested changes Apr 18, 2026

Comment thread src/footnotes/renumber.rs Outdated

Comment thread tests/table/convert_html.rs Outdated

coderabbitai Bot requested changes Apr 18, 2026

Comment thread src/footnotes/renumber/definitions.rs Outdated

Comment thread src/footnotes/renumber/tests.rs

coderabbitai Bot reviewed Apr 18, 2026

coderabbitai Bot approved these changes Apr 18, 2026

Avoid panic on missing footnote regex match

766c589
