Conversation
Declare the Linux GNU `cargo-binstall` package metadata in `Cargo.toml` and teach the release workflow to build matching `.tar.gz` archives for `x86_64-unknown-linux-gnu` and `aarch64-unknown-linux-gnu`. Document the new installation path in the README and release process notes so the published assets and the user-facing instructions stay aligned.
Break up the remaining high-nesting and high-arity helpers across wrapping, footnote, HTML, and list processing so `make lint` passes without suppressing the configured warnings. Reshape the affected tests and shared test helpers alongside the production refactors so the new helper boundaries remain covered while preserving existing behaviour.
Overview

This PR adds cargo-binstall support to mdtablefix while conducting a comprehensive internal refactoring of the markdown processing pipeline. The changes maintain backward compatibility and preserve all existing formatting behaviour.

Cargo-binstall Support

New Installation Method:
Release Workflow Updates:
Markdown Processing Refactoring

The PR extensively refactors internal helpers across multiple modules to improve code organization and satisfy linting constraints:

- Emphasis & Code Token Processing (…)
- Fence Handling (…)
- Footnote Processing (…)
- HTML Table Support (…)
- List Renumbering (…)
- Paragraph Wrapping (…)
- Inline Wrapping & Tokenisation (…)
- Stream Processing (…)
Test Refactoring
Documentation

Updated Walkthrough

The PR extends release infrastructure to support cargo-binstall distribution of prebuilt Linux GNU binaries whilst refactoring internal code through systematic extraction of helper functions and introduction of state-carrying structs to consolidate mutable parameter threading. No user-facing functional changes introduced.

Changes
🚥 Pre-merge checks | ✅ 5 | ❌ 2

❌ Failed checks (2 warnings)
✅ Passed checks (5 passed)
Reviewer's Guide

Refactors multiple markdown processing subsystems (footnotes, wrapping, inline tokenization, lists, HTML tables, process pipeline) into smaller helpers and state structs while adding cargo-binstall metadata and Linux tar.gz release artifacts, plus corresponding docs and test updates.

Class diagram for refactored paragraph wrapping and tokenization

classDiagram
class PrefixLine {
+prefix: Cow<str>
+rest: &str
+repeat_prefix: bool
}
class ParagraphState {
+buf: Vec<(String,bool)>
+indent: String
+clear()
+note_indent(line: &str)
+push(text: String, hard_break: bool)
}
class ParagraphWriter {
-out: &mut Vec<String>
-width: usize
+new(out: &mut Vec<String>, width: usize) ParagraphWriter
+flush_paragraph(state: &mut ParagraphState)
+push_verbatim(state: &mut ParagraphState, line: &str)
+handle_prefix_line(state: &mut ParagraphState, prefix_line: &PrefixLine)
-append_wrapped_with_prefix(line: &PrefixLine)
-push_wrapped_segment(indent: &str, segment: &str)
}
class LineBuffer {
-text: String
-width: usize
-last_split: Option<usize>
+new() LineBuffer
+push_token(token: &str)
+push_span(tokens: &[String], start: usize, end: usize)
+split_with_span(ctx: &mut SplitContext, tokens: &[String], span: Range<usize>) bool
+flush_trailing_whitespace(lines: &mut Vec<String>, tokens: &[String], span: Range<usize>) bool
+flush_into(lines: &mut Vec<String>)
+width() usize
}
class SplitContext {
+lines: &mut Vec<String>
+width: usize
}
class WrapInline {
<<module>>
+wrap_preserving_code(text: &str, width: usize) Vec<String>
+attach_punctuation_to_previous_line(lines: &mut Vec<String>)
-push_span_with_carry(buffer: &mut LineBuffer, tokens: &[String], span: Range<usize>, carried_whitespace: &mut String)
}
class WrapTokenizeMod {
<<module>>
+segment_inline(text: &str) Vec<String>
-append_escaped_backtick(tokens: &mut Vec<String>)
-scan_plain_text_end(text: &str, bytes: &[u8], index: usize) usize
-should_stop_plain_text(text: &str, bytes: &[u8], index: usize, current: (char,bool)) bool
}
class WrapTopLevel {
<<module>>
+wrap_text(lines: &[String], width: usize) Vec<String>
-is_table_or_separator(line: &str) bool
-is_passthrough_block(line: &str) bool
-prefix_line(line: &str) Option<PrefixLine>
-line_break_parts(line: &str) (String,bool)
}
WrapTopLevel ..> ParagraphState : manages
WrapTopLevel ..> ParagraphWriter : uses
WrapTopLevel ..> PrefixLine : constructs
WrapTopLevel ..> WrapInline : calls
WrapTopLevel ..> WrapTokenizeMod : via WrapInline
ParagraphWriter ..> PrefixLine : wraps
ParagraphWriter ..> WrapInline : uses
WrapInline ..> LineBuffer : buffers
WrapInline ..> SplitContext : configures
WrapTokenizeMod ..> WrapInline : token source
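To ground the `LineBuffer` role sketched in the diagram above, here is a deliberately simplified, hypothetical model of greedy width-limited token buffering. It ignores split-point tracking, code spans, carried whitespace, and Unicode display width — all of which the real buffer handles — and is not the crate's actual implementation:

```rust
/// Minimal greedy line buffer: accumulate whitespace-separated tokens,
/// flushing a line whenever appending the next token would exceed `width`.
/// Illustrative simplification only; uses byte length as a stand-in for
/// display width.
struct LineBuffer {
    text: String,
    width: usize,
}

impl LineBuffer {
    fn new(width: usize) -> Self {
        Self { text: String::new(), width }
    }

    fn push_token(&mut self, token: &str, lines: &mut Vec<String>) {
        // Width the current line would have after appending a space + token.
        let extra = if self.text.is_empty() { token.len() } else { token.len() + 1 };
        if !self.text.is_empty() && self.text.len() + extra > self.width {
            lines.push(std::mem::take(&mut self.text));
        }
        if !self.text.is_empty() {
            self.text.push(' ');
        }
        self.text.push_str(token);
    }

    fn flush_into(&mut self, lines: &mut Vec<String>) {
        if !self.text.is_empty() {
            lines.push(std::mem::take(&mut self.text));
        }
    }
}

fn wrap_tokens(text: &str, width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut buf = LineBuffer::new(width);
    for token in text.split_whitespace() {
        buf.push_token(token, &mut lines);
    }
    buf.flush_into(&mut lines);
    lines
}
```

An overlong token simply lands on its own line rather than being split, which matches the usual behaviour of greedy word wrapping.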
Class diagram for refactored footnote renumbering and inline handling

classDiagram
class DefinitionParts {
<<Copy>>
+prefix: &str
+number: usize
+rest: &str
}
class DefinitionLine {
+index: usize
+new_number: usize
+line: String
}
class DefinitionUpdates {
+definitions: Vec<DefinitionLine>
+is_definition_line: Vec<bool>
}
class DefinitionScanContext {
+mapping: &mut HashMap<usize,usize>
+next_number: &mut usize
+numeric_list_range: Option<(usize,usize)>
+skip_numeric_conversion: bool
}
class DefinitionAccumulator {
+definitions: Vec<DefinitionLine>
+is_definition_line: Vec<bool>
}
class NumericCandidate {
+index: usize
+number: usize
+indent: String
+whitespace: String
+rest: String
}
class FootnoteRenumberMod {
<<module>>
+collect_reference_mapping(lines: &[String]) HashMap<usize,usize>
-collect_reference_mapping_from_text(text: &str, mapping: &mut HashMap<usize,usize>, next: &mut usize)
-collect_definition_updates(lines: &[String], mapping: &mut HashMap<usize,usize>) DefinitionUpdates
-collect_scan_updates(lines: &[String], ctx: &mut DefinitionScanContext) (DefinitionAccumulator, Vec<NumericCandidate>)
-finalize_numeric_candidates(numeric_candidates: Vec<NumericCandidate>, ctx: &mut DefinitionScanContext, acc: &mut DefinitionAccumulator)
-definition_line_from_parts(index: usize, parts: DefinitionParts, mapping: &mut HashMap<usize,usize>, next_number: &mut usize) DefinitionLine
-numeric_candidate_from_line(line: &str, index: usize) Option<NumericCandidate>
-assign_new_number(mapping: &mut HashMap<usize,usize>, number: usize, next: &mut usize) usize
-rewrite_tokens(rest: &str, mapping: &HashMap<usize,usize>) String
}
class InlineFootnote {
+pre: &str
+punc: &str
+style: &str
+num: &str
+boundary: &str
}
class FootnoteInlineMod {
<<module>>
+convert_inline(text: &str) String
-capture_parts(caps: &Captures) InlineFootnote
-build_footnote(parts: InlineFootnote) String
}
FootnoteRenumberMod ..> DefinitionParts : uses
FootnoteRenumberMod ..> DefinitionScanContext : configures
FootnoteRenumberMod ..> DefinitionAccumulator : accumulates
FootnoteRenumberMod ..> NumericCandidate : converts
FootnoteRenumberMod ..> DefinitionUpdates : returns
DefinitionAccumulator *-- DefinitionLine : aggregates
DefinitionUpdates *-- DefinitionLine : aggregates
FootnoteInlineMod ..> InlineFootnote : builds
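The renumbering flow above amounts to assigning fresh sequential numbers in order of first appearance. A hypothetical, simplified version of the `assign_new_number` idea follows — the real helper threads this mapping through definition rewriting as well, so treat this as a model of the contract, not the crate's code:

```rust
use std::collections::HashMap;

/// Map an original footnote number to its renumbered value, allocating the
/// next sequential number the first time an original number is seen.
/// Simplified model of the `assign_new_number` helper in the diagram above.
fn assign_new_number(
    mapping: &mut HashMap<usize, usize>,
    number: usize,
    next: &mut usize,
) -> usize {
    *mapping.entry(number).or_insert_with(|| {
        let assigned = *next;
        *next += 1;
        assigned
    })
}
```

With references encountered in the order `[^7]`, `[^2]`, `[^7]`, this yields the stable renumbering 1, 2, 1.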
Class diagram for refactored processing pipeline, HTML tables, and lists

classDiagram
class ProcessBuffer {
+out: Vec<String>
+buf: Vec<String>
+in_table: bool
+flush()
+push_verbatim(line: &str)
+handle_fence_line(line: &str, fences: &mut FenceTracker) bool
+handle_table_line(line: &str) bool
}
class HtmlTableState {
+buf: Vec<String>
+depth: usize
+in_html: bool
+flush_raw(out: &mut Vec<String>)
+push_html_line(line: &str, out: &mut Vec<String>)
}
class HtmlMod {
<<module>>
+convert_html_tables(lines: &[String]) Vec<String>
+html_table_to_markdown(lines: &[String]) Vec<String>
-append_html_table_line(line: &str, buf: &mut Vec<String>, depth: &mut usize)
-flush_completed_html_table(buf: &mut Vec<String>, depth: usize, out: &mut Vec<String>) bool
-push_collapsed_text_char(ch: char, out: &mut String, last_space: &mut bool)
}
class ListState {
+indent_stack: Vec<usize>
+counters: HashMap<usize,usize>
+prune_deeper(indent: usize, inclusive: bool)
+handle_paragraph_restart(indent: usize, line: &str, prev_blank: bool) bool
}
class ListsMod {
<<module>>
+renumber_lists(lines: &[String]) Vec<String>
-is_plain_paragraph_line(line: &str) bool
-prune_deeper(indent: usize, inclusive: bool, indent_stack: &mut Vec<usize>, counters: &mut HashMap<usize,usize>)
}
class ProcessMod {
<<module>>
+process_stream_inner(lines: &[String], opts: Options) Vec<String>
}
class FenceTracker {
+observe(line: &str) bool
+in_fence() bool
}
class TableMod {
<<module>>
+reflow_table(lines: &[String]) Vec<String>
-calculate_and_format(parsed: &ParsedTable, indent: &str) Vec<String>
}
class ParsedTable {
+cleaned: Vec<Vec<String>>
+output_rows: Vec<Vec<String>>
+sep_cells: Option<Vec<String>>
+max_cols: usize
}
ProcessMod ..> ProcessBuffer : manages
ProcessMod ..> FenceTracker : tracks
ProcessMod ..> HtmlMod : calls convert_html_tables
ProcessBuffer ..> FenceTracker : uses
ProcessBuffer ..> TableMod : uses reflow_table
HtmlMod ..> HtmlTableState : uses
ListsMod ..> ListState : manages
TableMod ..> ParsedTable : consumes
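One plausible reading of the `FenceTracker` contract shown above: `observe` reports whether a line is a code-fence delimiter and toggles the open/closed state, so `ProcessBuffer`-style callers can skip table handling while `in_fence()` is true. The crate's real matcher is likely stricter (info strings, matching fence lengths); this sketch only captures the toggle semantics:

```rust
/// Toy model of the `FenceTracker` role from the diagram: a line opening
/// or closing a fenced code block toggles the tracker, and `in_fence`
/// reports whether subsequent lines are inside a fence.
#[derive(Default)]
struct FenceTracker {
    open: bool,
}

impl FenceTracker {
    /// Returns true if `line` is a fence delimiter (``` or ~~~), toggling
    /// the open/closed state as a side effect.
    fn observe(&mut self, line: &str) -> bool {
        let trimmed = line.trim_start();
        let backticks = trimmed.chars().take_while(|&c| c == '`').count();
        let tildes = trimmed.chars().take_while(|&c| c == '~').count();
        if backticks >= 3 || tildes >= 3 {
            self.open = !self.open;
            return true;
        }
        false
    }

    fn in_fence(&self) -> bool {
        self.open
    }
}
```

This explains why the pipeline routes every line through the tracker before any table detection: pipe characters inside a fenced block must never be treated as table rows.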
File-Level Changes
Hey - I've found 5 issues and left some high-level feedback:
- The new `ListState` type still exposes its `indent_stack` and `counters` to external mutation in a few places (e.g. direct `clear()` calls); consider encapsulating these operations behind `ListState` methods so list-renumbering behaviour remains centralized and easier to reason about.
- In the HTML table handling, `HtmlTableState::flush_raw` and the manual `if !html_state.buf.is_empty() { out.extend(html_state.buf); }` at the end of `convert_html_tables` now duplicate behaviour; you could reuse `flush_raw` there to keep the flushing logic in a single place.
- The introduction of multiple small helpers in `wrap::inline` and `wrap::line_buffer` (e.g. `SplitContext`, passing `Range` instead of start/end indices) improves clarity but also adds boilerplate at call sites; you might consider providing thin convenience wrappers for the common cases (e.g. constructing `SplitContext` once per wrap call) to keep the main wrapping loop easier to scan.
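As a minimal sketch of the encapsulation suggested in the first bullet: field names follow the reviewer's guide diagram, but the method bodies are illustrative assumptions, not the crate's actual renumbering logic.

```rust
use std::collections::HashMap;

/// Keep `indent_stack` and `counters` private and route every mutation
/// through methods, so callers cannot clear or pop state ad hoc.
/// Hypothetical sketch only; bodies are assumptions, not mdtablefix code.
struct ListState {
    indent_stack: Vec<usize>,
    counters: HashMap<usize, usize>,
}

impl ListState {
    fn new() -> Self {
        Self { indent_stack: Vec::new(), counters: HashMap::new() }
    }

    /// Replaces external `state.indent_stack.clear()` /
    /// `state.counters.clear()` calls with one named operation.
    fn reset(&mut self) {
        self.indent_stack.clear();
        self.counters.clear();
    }

    /// Drop tracking for levels deeper than `indent` (optionally including
    /// `indent` itself), mirroring the diagram's `prune_deeper`.
    fn prune_deeper(&mut self, indent: usize, inclusive: bool) {
        while let Some(&top) = self.indent_stack.last() {
            if top > indent || (inclusive && top == indent) {
                self.indent_stack.pop();
                self.counters.remove(&top);
            } else {
                break;
            }
        }
    }

    /// Advance and return the ordinal counter for a list level.
    fn next_number(&mut self, indent: usize) -> usize {
        if self.indent_stack.last() != Some(&indent) {
            self.indent_stack.push(indent);
        }
        let counter = self.counters.entry(indent).or_insert(0);
        *counter += 1;
        *counter
    }
}
```

With this shape, the renumbering loop never touches the stack or counter map directly, which keeps the invariants (stack entries always have a counter) enforceable in one place.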
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `ListState` type still exposes its `indent_stack` and `counters` to external mutation in a few places (e.g. direct `clear()` calls); consider encapsulating these operations behind `ListState` methods so list-renumbering behaviour remains centralized and easier to reason about.
- In the HTML table handling, `HtmlTableState::flush_raw` and the manual `if !html_state.buf.is_empty() { out.extend(html_state.buf); }` at the end of `convert_html_tables` now duplicate behaviour; you could reuse `flush_raw` there to keep the flushing logic in a single place.
- The introduction of multiple small helpers in `wrap::inline` and `wrap::line_buffer` (e.g. `SplitContext`, passing `Range` instead of start/end indices) improves clarity but also adds boilerplate at call sites; you might consider providing thin convenience wrappers for the common cases (e.g. constructing `SplitContext` once per wrap call) to keep the main wrapping loop easier to scan.
## Individual Comments
### Comment 1
<location path="Cargo.toml" line_range="16" />
<code_context>
+
+[package.metadata.binstall.overrides.'cfg(all(target_os = "linux", any(target_arch = "x86_64", target_arch = "aarch64"), target_env = "gnu"))']
+pkg-url = "{ repo }/releases/download/v{ version }/{ name }-{ version }-{ target }.tar.gz"
+bin-dir = "{ bin }{ binary-ext }"
+pkg-fmt = "tgz"
+
</code_context>
<issue_to_address>
**issue (bug_risk):** The `bin-dir` value in the cargo-binstall metadata looks like a file path instead of a directory, which may break installation.
In cargo-binstall metadata, `bin-dir` should point to the directory inside the archive that contains the binary (e.g. `"."` or `"bin"`), not the full binary path. With `bin-dir = "{ bin }{ binary-ext }"`, this expands to the full filename, while your `.tar.gz` places the binary at the archive root. To match the archive layout and let cargo-binstall find the binary, this should be `bin-dir = "."` (or another actual directory path if you change the archive structure).
</issue_to_address>
### Comment 2
<location path="src/footnotes/renumber.rs" line_range="282" />
<code_context>
is_definition_line: Vec<bool>,
}
+struct DefinitionScanContext<'a> {
+ mapping: &'a mut HashMap<usize, usize>,
+ next_number: &'a mut usize,
</code_context>
<issue_to_address>
**issue (complexity):** Consider collapsing the new scan context and accumulator into a single state struct and driving the scan/finalize flow directly from `collect_definition_updates` to reduce indirection.
You can keep the new helpers/behavior but reduce indirection by collapsing the scan context/accumulator and merging the two-phase scan/finalize into a single function-local flow.
### 1. Collapse `DefinitionScanContext` + `DefinitionAccumulator` into one local state
Instead of passing a `DefinitionScanContext` and `DefinitionAccumulator` between helpers, you can use a single local state struct inside `collect_definition_updates`:
```rust
struct ScanState<'a> {
mapping: &'a mut HashMap<usize, usize>,
next_number: &'a mut usize,
numeric_list_range: Option<(usize, usize)>,
skip_numeric_conversion: bool,
definitions: Vec<DefinitionLine>,
is_definition_line: Vec<bool>,
numeric_candidates: Vec<NumericCandidate>,
}
```
Then `collect_definition_updates` owns the control flow and state:
```rust
fn collect_definition_updates(
lines: &[String],
mapping: &mut HashMap<usize, usize>,
) -> DefinitionUpdates {
let mut next_number = mapping.values().copied().max().unwrap_or(0) + 1;
let numeric_list_range = footnote_block_range(lines);
let skip_numeric_conversion = numeric_list_range
.as_ref()
.is_some_and(|(start, _)| has_existing_footnote_block(lines, *start));
let mut state = ScanState {
mapping,
next_number: &mut next_number,
numeric_list_range,
skip_numeric_conversion,
definitions: Vec::new(),
is_definition_line: vec![false; lines.len()],
numeric_candidates: Vec::new(),
};
collect_scan_updates(lines, &mut state);
finalize_numeric_candidates(&mut state);
DefinitionUpdates {
definitions: state.definitions,
is_definition_line: state.is_definition_line,
}
}
```
### 2. Simplify `collect_scan_updates` / `finalize_numeric_candidates` signatures
With the unified `ScanState`, the helpers become simpler and avoid tuple juggling and separate accumulator types:
```rust
fn collect_scan_updates(lines: &[String], state: &mut ScanState<'_>) {
let mut in_fence = false;
for (index, line) in lines.iter().enumerate() {
if is_fence_line(line) {
in_fence = !in_fence;
continue;
}
if in_fence {
continue;
}
if let Some(parts) = parse_definition(line) {
state.definitions.push(definition_line_from_parts(
index,
parts,
state.mapping,
state.next_number,
));
state.is_definition_line[index] = true;
continue;
}
if !should_convert_numeric_line(
index,
state.numeric_list_range,
state.skip_numeric_conversion,
) {
continue;
}
if state.mapping.is_empty() && state.definitions.is_empty() {
continue;
}
if let Some(candidate) = numeric_candidate_from_line(line, index) {
state.numeric_candidates.push(candidate);
}
}
}
fn finalize_numeric_candidates(state: &mut ScanState<'_>) {
for candidate in state.numeric_candidates.drain(..).rev() {
let new_number = assign_new_number(state.mapping, candidate.number, state.next_number);
let rewritten_rest = rewrite_tokens(&candidate.rest, state.mapping);
let mut line = String::with_capacity(
candidate.indent.len() + candidate.whitespace.len() + rewritten_rest.len() + 8,
);
line.push_str(&candidate.indent);
write!(&mut line, "[^{new_number}]:").expect("write to string cannot fail");
line.push_str(&candidate.whitespace);
line.push_str(&rewritten_rest);
state.definitions.push(DefinitionLine {
index: candidate.index,
new_number,
line,
});
state.is_definition_line[candidate.index] = true;
}
}
```
This keeps:
- `definition_line_from_parts` and `numeric_candidate_from_line` as reusable helpers.
- The “scan then reverse-apply numeric candidates” behavior.
But it:
- Removes `DefinitionScanContext` and `DefinitionAccumulator` as separate concepts.
- Avoids `(acc, numeric_candidates)` tuples and multiple lifetimes being threaded around.
- Keeps all the logic reachable from one primary function (`collect_definition_updates`), which matches the original linear mental model while retaining your refactoring improvements.
</issue_to_address>
### Comment 3
<location path="src/wrap.rs" line_range="124" />
<code_context>
#[must_use]
pub fn wrap_text(lines: &[String], width: usize) -> Vec<String> {
let mut out = Vec::new();
- let mut buf: Vec<(String, bool)> = Vec::new();
</code_context>
<issue_to_address>
**issue (complexity):** Consider merging `ParagraphState` into `ParagraphWriter`, simplifying `PrefixLine`, and inlining passthrough checks to make `wrap_text`’s control flow and APIs more linear and self‑contained.
1. Merge `ParagraphState` into `ParagraphWriter` to reduce cross‑type coupling
Right now every call site has to thread `&mut writer` and `&mut state` together, and behavior depends on both. You can keep the same external behavior while collapsing them into a single type that owns the paragraph buffer and indentation, and exposes the same high‑level methods.
Example of the shape of the change:
```rust
// before
let mut state = ParagraphState::default();
let mut writer = ParagraphWriter::new(&mut out, width);
for line in lines {
if fence::handle_fence_line(line, &mut writer, &mut state, &mut fence_tracker) { … }
if fence_tracker.in_fence() {
writer.push_verbatim(&mut state, line);
continue;
}
if is_passthrough_block(line) {
writer.push_verbatim(&mut state, line);
continue;
}
if let Some(prefix_line) = prefix_line(line) {
writer.handle_prefix_line(&mut state, &prefix_line);
continue;
}
state.note_indent(line);
let (text, hard_break) = line_break_parts(line);
state.push(text, hard_break);
}
writer.flush_paragraph(&mut state);
```
```rust
// after: single struct owns the state
pub struct ParagraphWriter<'a> {
out: &'a mut Vec<String>,
width: usize,
// moved fields from ParagraphState:
buf: Vec<(String, bool)>,
indent: String,
// any other state...
}
impl<'a> ParagraphWriter<'a> {
pub fn new(out: &'a mut Vec<String>, width: usize) -> Self { … }
pub fn push_verbatim(&mut self, line: &str) { … } // internally flushes buf/indent as needed
pub fn handle_prefix_line(&mut self, prefix_line: &PrefixLine<'_>) { … }
pub fn note_indent(&mut self, line: &str) { … }
pub fn push_wrapped(&mut self, text: String, hard_break: bool) { … }
pub fn flush_paragraph(&mut self) { … }
}
```
```rust
// wrap_text becomes simpler to read:
let mut writer = ParagraphWriter::new(&mut out, width);
for line in lines {
if fence::handle_fence_line(line, &mut writer, &mut fence_tracker) {
continue;
}
if fence_tracker.in_fence() {
writer.push_verbatim(line);
continue;
}
if is_passthrough_block(line) {
writer.push_verbatim(line);
continue;
}
if let Some(prefix_line) = prefix_line(line) {
writer.handle_prefix_line(&prefix_line);
continue;
}
writer.note_indent(line);
let (text, hard_break) = line_break_parts(line);
writer.push_wrapped(text, hard_break);
}
writer.flush_paragraph();
```
This keeps all current semantics (the logic inside `ParagraphState` just moves into `ParagraphWriter`), but reduces the number of moving parts and hidden early returns (only one object mutates paragraph state).
---
2. Simplify `PrefixLine` by avoiding `Cow` unless you really need it
The only owned case is the footnote prefix `format!("{prefix}{marker}")`. If that optimization isn’t performance‑critical, you can drop `Cow` from public signatures and keep ownership concerns inside the writer. This makes lifetimes and method signatures easier to follow.
Example:
```rust
// before
pub struct PrefixLine<'a> {
pub prefix: Cow<'a, str>,
pub rest: &'a str,
pub repeat_prefix: bool,
}
```
```rust
// after: only borrows; writer can allocate when necessary
pub struct PrefixLine<'a> {
pub prefix: &'a str,
pub rest: &'a str,
pub repeat_prefix: bool,
}
```
```rust
fn prefix_line(line: &str) -> Option<PrefixLine<'_>> {
if let Some(cap) = BULLET_RE.captures(line) {
return Some(PrefixLine {
prefix: cap.get(1).expect("bullet regex capture").as_str(),
rest: cap.get(2).expect("bullet regex remainder capture").as_str(),
repeat_prefix: false,
});
}
if let Some(cap) = FOOTNOTE_RE.captures(line) {
let prefix = cap.get(1).expect("footnote prefix capture").as_str();
let marker = cap.get(2).expect("footnote marker capture").as_str();
let rest = cap
.get(3)
.expect("footnote regex remainder capture")
.as_str();
// pass prefix + marker separately; concatenate inside writer
return Some(PrefixLine {
prefix,
rest,
repeat_prefix: false,
});
}
BLOCKQUOTE_RE.captures(line).map(|cap| PrefixLine {
prefix: cap.get(1).expect("blockquote prefix capture").as_str(),
rest: cap.get(2).expect("blockquote regex remainder capture").as_str(),
repeat_prefix: true,
})
}
```
Then let `ParagraphWriter::handle_prefix_line` do the one allocation for footnotes when emitting the first line of the paragraph, instead of requiring `Cow` throughout the API.
---
3. Make passthrough control flow more linear in `wrap_text`
The small helpers are fine, but you can make the early‑return cases easier to scan by inlining `is_table_or_separator` into `is_passthrough_block` and keeping all passthrough checks together:
```rust
fn is_passthrough_block(line: &str) -> bool {
let trimmed = line.trim();
line.trim_start().starts_with('|')
|| crate::table::SEP_RE.is_match(trimmed)
|| matches!(
classify_block(line),
Some(BlockKind::Heading | BlockKind::MarkdownlintDirective)
)
|| trimmed.is_empty()
|| is_indented_code_line(line)
}
```
This keeps `wrap_text`’s top section as a clear sequence of “fence → passthrough → prefix → paragraph” checks without requiring the reader to jump between multiple tiny predicates.
</issue_to_address>
### Comment 4
<location path="src/wrap/paragraph.rs" line_range="18" />
<code_context>
+ true
+}
+
+#[derive(Default)]
+struct HtmlTableState {
+ buf: Vec<String>,
</code_context>
<issue_to_address>
**issue (complexity):** Consider merging `ParagraphState` into `ParagraphWriter`, inlining the prefix-wrapping logic into `handle_prefix_line`, and potentially simplifying `PrefixLine` to reduce cognitive overhead while preserving behavior.
You can retain the new behavior while reducing the number of moving parts by collapsing `ParagraphState` into `ParagraphWriter` and inlining some of the prefix handling. This removes the “two objects + lifetimes” cognitive load without changing semantics.
### 1. Merge `ParagraphState` into `ParagraphWriter`
Instead of passing `ParagraphState` around, let `ParagraphWriter` own the paragraph state. That way `wrap_text` only needs a single `&mut ParagraphWriter`.
```rust
pub(super) struct ParagraphWriter<'a> {
out: &'a mut Vec<String>,
width: usize,
buf: Vec<(String, bool)>,
indent: String,
}
impl<'a> ParagraphWriter<'a> {
pub(super) fn new(out: &'a mut Vec<String>, width: usize) -> Self {
Self {
out,
width,
buf: Vec::new(),
indent: String::new(),
}
}
pub(super) fn clear(&mut self) {
self.buf.clear();
self.indent.clear();
}
pub(super) fn note_indent(&mut self, line: &str) {
if self.buf.is_empty() {
self.indent = line.chars().take_while(|c| c.is_whitespace()).collect();
}
}
pub(super) fn push(&mut self, text: String, hard_break: bool) {
self.buf.push((text, hard_break));
}
pub(super) fn flush_paragraph(&mut self) {
if self.buf.is_empty() {
return;
}
let mut segment = String::new();
for (text, hard_break) in &self.buf {
if !segment.is_empty() {
segment.push(' ');
}
segment.push_str(text);
if *hard_break {
self.push_wrapped_segment(&self.indent, &segment);
segment.clear();
}
}
if !segment.is_empty() {
self.push_wrapped_segment(&self.indent, &segment);
}
self.clear();
}
fn push_wrapped_segment(&mut self, indent: &str, segment: &str) {
for line in wrap_preserving_code(segment, self.width - indent.len()) {
self.out.push(format!("{indent}{line}"));
}
}
pub(super) fn push_verbatim(&mut self, line: &str) {
self.flush_paragraph();
self.out.push(line.to_string());
}
}
```
Callers that currently pass `&mut ParagraphState` can instead just use `&mut ParagraphWriter` (e.g. `writer.push(...)`, `writer.note_indent(...)`, etc.), and `flush_paragraph`/`push_verbatim` no longer need a `state` parameter.
### 2. Inline `append_wrapped_with_prefix` into `handle_prefix_line`
If `append_wrapped_with_prefix` is only used from `handle_prefix_line`, you can inline it to avoid one extra layer of indirection:
```rust
impl<'a> ParagraphWriter<'a> {
pub(super) fn handle_prefix_line(&mut self, prefix_line: &PrefixLine<'_>) {
self.flush_paragraph();
let prefix = prefix_line.prefix.as_ref();
let prefix_width = UnicodeWidthStr::width(prefix);
let available = self.width.saturating_sub(prefix_width).max(1);
let indent_str: String =
prefix.chars().take_while(|c| c.is_whitespace()).collect();
let indent_width = UnicodeWidthStr::width(indent_str.as_str());
let wrapped_indent = if prefix_line.repeat_prefix {
prefix.to_string()
} else {
format!("{}{}", indent_str, " ".repeat(prefix_width - indent_width))
};
let lines = wrap_preserving_code(prefix_line.rest, available);
if lines.is_empty() {
self.out.push(prefix.to_string());
return;
}
for (index, wrapped_line) in lines.iter().enumerate() {
if index == 0 {
self.out.push(format!("{prefix}{wrapped_line}"));
} else {
self.out.push(format!("{wrapped_indent}{wrapped_line}"));
}
}
}
}
```
This keeps the behavior identical but localizes all “prefix” behavior in one method.
### 3. Consider simplifying `PrefixLine` if `Cow` isn’t pulling its weight
If you don’t actually need to pass owned prefixes frequently, you can simplify the type:
```rust
pub(super) struct PrefixLine<'a> {
pub(super) prefix: &'a str,
pub(super) rest: &'a str,
pub(super) repeat_prefix: bool,
}
```
Callers that need an owned prefix can still build a `String` and hold it locally, then pass `&prefix_string` into `PrefixLine`. This keeps the public API simpler while preserving all capabilities.
</issue_to_address>
### Comment 5
<location path="src/html.rs" line_range="228" />
<code_context>
+ true
+}
+
+#[derive(Default)]
+struct HtmlTableState {
+ buf: Vec<String>,
</code_context>
<issue_to_address>
**issue (complexity):** Consider consolidating table parsing into a single `HtmlTableState` abstraction that owns its helpers and state so callers no longer juggle parallel `buf + depth` logic and flags.
You now have two parallel abstractions for table state (`buf + depth` in `html_table_to_markdown` and `HtmlTableState` in `convert_html_tables`) plus tiny free helpers. You can keep the refactor while reducing moving parts by:
1. **Move the free helpers into `HtmlTableState` and hide the `depth` contract.**
```rust
impl HtmlTableState {
fn push_html_line(&mut self, line: &str, out: &mut Vec<String>) -> bool {
self.buf.push(line.to_string());
self.depth += TABLE_START_RE.find_iter(line).count();
if TABLE_END_RE.is_match(line) {
self.depth = self.depth.saturating_sub(TABLE_END_RE.find_iter(line).count());
}
if self.depth == 0 {
out.extend(table_lines_to_markdown(&self.buf));
self.buf.clear();
self.in_html = false;
return true;
}
true
}
}
```
Then `convert_html_tables` doesn’t need to know about `append_html_table_line` / `flush_completed_html_table`:
```rust
if html_state.in_html {
html_state.push_html_line(line, &mut out);
continue;
}
if TABLE_START_RE.is_match(line.trim_start()) {
html_state.in_html = true;
html_state.push_html_line(line, &mut out);
continue;
}
```
`append_html_table_line` and `flush_completed_html_table` can then be private helpers *inside* the impl or removed entirely.
2. **Reuse `HtmlTableState` in `html_table_to_markdown` to avoid duplicating the state machine.**
```rust
pub(crate) fn html_table_to_markdown(lines: &[String]) -> Vec<String> {
let mut out = Vec::new();
let mut state = HtmlTableState::default();
for line in lines {
if state.depth > 0 || TABLE_START_RE.is_match(line.trim_start()) {
state.in_html = true;
state.push_html_line(line, &mut out);
continue;
}
out.push(line.clone());
}
state.flush_raw(&mut out);
out
}
```
This removes the second manual `buf + depth` pair and keeps all table behavior in one place.
3. **Consider dropping `in_html` and derive it from state.**
If you don’t have any “active but empty” table state, you can use `depth > 0` or `!buf.is_empty()` instead of a separate flag:
```rust
impl HtmlTableState {
fn in_html(&self) -> bool {
self.depth > 0
}
}
```
Then `convert_html_tables` becomes:
```rust
if html_state.in_html() {
html_state.push_html_line(line, &mut out);
continue;
}
if TABLE_START_RE.is_match(line.trim_start()) {
html_state.push_html_line(line, &mut out);
continue;
}
```
This keeps the new organization but collapses the table handling into a single, self-contained abstraction with a clear API.
</issue_to_address>
|
|
||
| [package.metadata.binstall.overrides.'cfg(all(target_os = "linux", any(target_arch = "x86_64", target_arch = "aarch64"), target_env = "gnu"))'] | ||
| pkg-url = "{ repo }/releases/download/v{ version }/{ name }-{ version }-{ target }.tar.gz" | ||
| bin-dir = "{ bin }{ binary-ext }" |
There was a problem hiding this comment.
issue (bug_risk): The bin-dir value in the cargo-binstall metadata looks like a file path instead of a directory, which may break installation.
In cargo-binstall metadata, bin-dir should point to the directory inside the archive that contains the binary (e.g. "." or "bin"), not the full binary path. With bin-dir = "{ bin }{ binary-ext }", this expands to the full filename, while your .tar.gz places the binary at the archive root. To match the archive layout and let cargo-binstall find the binary, this should be bin-dir = "." (or another actual directory path if you change the archive structure).
| is_definition_line: Vec<bool>, | ||
| } | ||
|
|
||
| struct DefinitionScanContext<'a> { |
There was a problem hiding this comment.
issue (complexity): Consider collapsing the new scan context and accumulator into a single state struct and driving the scan/finalize flow directly from collect_definition_updates to reduce indirection.
You can keep the new helpers/behavior but reduce indirection by collapsing the scan context/accumulator and merging the two-phase scan/finalize into a single function-local flow.
1. Collapse DefinitionScanContext + DefinitionAccumulator into one local state
Instead of passing a DefinitionScanContext and DefinitionAccumulator between helpers, you can use a single local state struct inside collect_definition_updates:
struct ScanState<'a> {
mapping: &'a mut HashMap<usize, usize>,
next_number: &'a mut usize,
numeric_list_range: Option<(usize, usize)>,
skip_numeric_conversion: bool,
definitions: Vec<DefinitionLine>,
is_definition_line: Vec<bool>,
numeric_candidates: Vec<NumericCandidate>,
}Then collect_definition_updates owns the control flow and state:
fn collect_definition_updates(
lines: &[String],
mapping: &mut HashMap<usize, usize>,
) -> DefinitionUpdates {
let mut next_number = mapping.values().copied().max().unwrap_or(0) + 1;
let numeric_list_range = footnote_block_range(lines);
let skip_numeric_conversion = numeric_list_range
.as_ref()
.is_some_and(|(start, _)| has_existing_footnote_block(lines, *start));
let mut state = ScanState {
mapping,
next_number: &mut next_number,
numeric_list_range,
skip_numeric_conversion,
definitions: Vec::new(),
is_definition_line: vec![false; lines.len()],
numeric_candidates: Vec::new(),
};
collect_scan_updates(lines, &mut state);
finalize_numeric_candidates(&mut state);
DefinitionUpdates {
definitions: state.definitions,
is_definition_line: state.is_definition_line,
}
}2. Simplify collect_scan_updates / finalize_numeric_candidates signatures
With the unified ScanState, the helpers become simpler and avoid tuple juggling and separate accumulator types:
fn collect_scan_updates(lines: &[String], state: &mut ScanState<'_>) {
    let mut in_fence = false;
    for (index, line) in lines.iter().enumerate() {
        if is_fence_line(line) {
            in_fence = !in_fence;
            continue;
        }
        if in_fence {
            continue;
        }
        if let Some(parts) = parse_definition(line) {
            state.definitions.push(definition_line_from_parts(
                index,
                parts,
                state.mapping,
                state.next_number,
            ));
            state.is_definition_line[index] = true;
            continue;
        }
        if !should_convert_numeric_line(
            index,
            state.numeric_list_range,
            state.skip_numeric_conversion,
        ) {
            continue;
        }
        if state.mapping.is_empty() && state.definitions.is_empty() {
            continue;
        }
        if let Some(candidate) = numeric_candidate_from_line(line, index) {
            state.numeric_candidates.push(candidate);
        }
    }
}
fn finalize_numeric_candidates(state: &mut ScanState<'_>) {
    for candidate in state.numeric_candidates.drain(..).rev() {
        let new_number = assign_new_number(state.mapping, candidate.number, state.next_number);
        let rewritten_rest = rewrite_tokens(&candidate.rest, state.mapping);
        let mut line = String::with_capacity(
            candidate.indent.len() + candidate.whitespace.len() + rewritten_rest.len() + 8,
        );
        line.push_str(&candidate.indent);
        write!(&mut line, "[^{new_number}]:").expect("write to string cannot fail");
        line.push_str(&candidate.whitespace);
        line.push_str(&rewritten_rest);
        state.definitions.push(DefinitionLine {
            index: candidate.index,
            new_number,
            line,
        });
        state.is_definition_line[candidate.index] = true;
    }
}

This keeps:

- definition_line_from_parts and numeric_candidate_from_line as reusable helpers.
- The "scan then reverse-apply numeric candidates" behavior.

But it:

- Removes DefinitionScanContext and DefinitionAccumulator as separate concepts.
- Avoids (acc, numeric_candidates) tuples and multiple lifetimes being threaded around.
- Keeps all the logic reachable from one primary function (collect_definition_updates), which matches the original linear mental model while retaining your refactoring improvements.
/// # Panics
/// Panics if regex captures fail unexpectedly.
#[must_use]
pub fn wrap_text(lines: &[String], width: usize) -> Vec<String>
issue (complexity): Consider merging ParagraphState into ParagraphWriter, simplifying PrefixLine, and inlining passthrough checks to make wrap_text’s control flow and APIs more linear and self‑contained.
- Merge ParagraphState into ParagraphWriter to reduce cross-type coupling

Right now every call site has to thread &mut writer and &mut state together, and behavior depends on both. You can keep the same external behavior while collapsing them into a single type that owns the paragraph buffer and indentation, and exposes the same high-level methods.
Example of the shape of the change:
// before
let mut state = ParagraphState::default();
let mut writer = ParagraphWriter::new(&mut out, width);
for line in lines {
    if fence::handle_fence_line(line, &mut writer, &mut state, &mut fence_tracker) { … }
    if fence_tracker.in_fence() {
        writer.push_verbatim(&mut state, line);
        continue;
    }
    if is_passthrough_block(line) {
        writer.push_verbatim(&mut state, line);
        continue;
    }
    if let Some(prefix_line) = prefix_line(line) {
        writer.handle_prefix_line(&mut state, &prefix_line);
        continue;
    }
    state.note_indent(line);
    let (text, hard_break) = line_break_parts(line);
    state.push(text, hard_break);
}
writer.flush_paragraph(&mut state);

// after: single struct owns the state
pub struct ParagraphWriter<'a> {
    out: &'a mut Vec<String>,
    width: usize,
    // moved fields from ParagraphState:
    buf: Vec<(String, bool)>,
    indent: String,
    // any other state...
}

impl<'a> ParagraphWriter<'a> {
    pub fn new(out: &'a mut Vec<String>, width: usize) -> Self { … }
    pub fn push_verbatim(&mut self, line: &str) { … } // internally flushes buf/indent as needed
    pub fn handle_prefix_line(&mut self, prefix_line: &PrefixLine<'_>) { … }
    pub fn note_indent(&mut self, line: &str) { … }
    pub fn push_wrapped(&mut self, text: String, hard_break: bool) { … }
    pub fn flush_paragraph(&mut self) { … }
}

// wrap_text becomes simpler to read:
let mut writer = ParagraphWriter::new(&mut out, width);
for line in lines {
    if fence::handle_fence_line(line, &mut writer, &mut fence_tracker) {
        continue;
    }
    if fence_tracker.in_fence() {
        writer.push_verbatim(line);
        continue;
    }
    if is_passthrough_block(line) {
        writer.push_verbatim(line);
        continue;
    }
    if let Some(prefix_line) = prefix_line(line) {
        writer.handle_prefix_line(&prefix_line);
        continue;
    }
    writer.note_indent(line);
    let (text, hard_break) = line_break_parts(line);
    writer.push_wrapped(text, hard_break);
}
writer.flush_paragraph();

This keeps all current semantics (the logic inside ParagraphState just moves into ParagraphWriter), but reduces the number of moving parts and hidden early returns (only one object mutates paragraph state).
- Simplify PrefixLine by avoiding Cow unless you really need it

The only owned case is the footnote prefix format!("{prefix}{marker}"). If that optimization isn't performance-critical, you can drop Cow from public signatures and keep ownership concerns inside the writer. This makes lifetimes and method signatures easier to follow.
Example:
// before
pub struct PrefixLine<'a> {
    pub prefix: Cow<'a, str>,
    pub rest: &'a str,
    pub repeat_prefix: bool,
}

// after: only borrows; writer can allocate when necessary
pub struct PrefixLine<'a> {
    pub prefix: &'a str,
    pub rest: &'a str,
    pub repeat_prefix: bool,
}

fn prefix_line(line: &str) -> Option<PrefixLine<'_>> {
    if let Some(cap) = BULLET_RE.captures(line) {
        return Some(PrefixLine {
            prefix: cap.get(1).expect("bullet regex capture").as_str(),
            rest: cap.get(2).expect("bullet regex remainder capture").as_str(),
            repeat_prefix: false,
        });
    }
    if let Some(cap) = FOOTNOTE_RE.captures(line) {
        let prefix = cap.get(1).expect("footnote prefix capture").as_str();
        let marker = cap.get(2).expect("footnote marker capture").as_str();
        let rest = cap
            .get(3)
            .expect("footnote regex remainder capture")
            .as_str();
        // pass prefix + marker separately; concatenate inside writer
        return Some(PrefixLine {
            prefix,
            rest,
            repeat_prefix: false,
        });
    }
    BLOCKQUOTE_RE.captures(line).map(|cap| PrefixLine {
        prefix: cap.get(1).expect("blockquote prefix capture").as_str(),
        rest: cap.get(2).expect("blockquote regex remainder capture").as_str(),
        repeat_prefix: true,
    })
}

Then let ParagraphWriter::handle_prefix_line do the one allocation for footnotes when emitting the first line of the paragraph, instead of requiring Cow throughout the API.
- Make passthrough control flow more linear in wrap_text

The small helpers are fine, but you can make the early-return cases easier to scan by inlining is_table_or_separator into is_passthrough_block and keeping all passthrough checks together:

fn is_passthrough_block(line: &str) -> bool {
    let trimmed = line.trim();
    line.trim_start().starts_with('|')
        || crate::table::SEP_RE.is_match(trimmed)
        || matches!(
            classify_block(line),
            Some(BlockKind::Heading | BlockKind::MarkdownlintDirective)
        )
        || trimmed.is_empty()
        || is_indented_code_line(line)
}

This keeps wrap_text's top section as a clear sequence of "fence → passthrough → prefix → paragraph" checks without requiring the reader to jump between multiple tiny predicates.
pub(super) repeat_prefix: bool,
}

#[derive(Default)]
issue (complexity): Consider merging ParagraphState into ParagraphWriter, inlining the prefix-wrapping logic into handle_prefix_line, and potentially simplifying PrefixLine to reduce cognitive overhead while preserving behavior.
You can retain the new behavior while reducing the number of moving parts by collapsing ParagraphState into ParagraphWriter and inlining some of the prefix handling. This removes the “two objects + lifetimes” cognitive load without changing semantics.
1. Merge ParagraphState into ParagraphWriter
Instead of passing ParagraphState around, let ParagraphWriter own the paragraph state. That way wrap_text only needs a single &mut ParagraphWriter.
pub(super) struct ParagraphWriter<'a> {
    out: &'a mut Vec<String>,
    width: usize,
    buf: Vec<(String, bool)>,
    indent: String,
}

impl<'a> ParagraphWriter<'a> {
    pub(super) fn new(out: &'a mut Vec<String>, width: usize) -> Self {
        Self {
            out,
            width,
            buf: Vec::new(),
            indent: String::new(),
        }
    }

    pub(super) fn clear(&mut self) {
        self.buf.clear();
        self.indent.clear();
    }

    pub(super) fn note_indent(&mut self, line: &str) {
        if self.buf.is_empty() {
            self.indent = line.chars().take_while(|c| c.is_whitespace()).collect();
        }
    }

    pub(super) fn push(&mut self, text: String, hard_break: bool) {
        self.buf.push((text, hard_break));
    }

    pub(super) fn flush_paragraph(&mut self) {
        if self.buf.is_empty() {
            return;
        }
        let mut segment = String::new();
        for (text, hard_break) in &self.buf {
            if !segment.is_empty() {
                segment.push(' ');
            }
            segment.push_str(text);
            if *hard_break {
                self.push_wrapped_segment(&self.indent, &segment);
                segment.clear();
            }
        }
        if !segment.is_empty() {
            self.push_wrapped_segment(&self.indent, &segment);
        }
        self.clear();
    }

    fn push_wrapped_segment(&mut self, indent: &str, segment: &str) {
        for line in wrap_preserving_code(segment, self.width - indent.len()) {
            self.out.push(format!("{indent}{line}"));
        }
    }

    pub(super) fn push_verbatim(&mut self, line: &str) {
        self.flush_paragraph();
        self.out.push(line.to_string());
    }
}

Callers that currently pass &mut ParagraphState can instead just use &mut ParagraphWriter (e.g. writer.push(...), writer.note_indent(...), etc.), and flush_paragraph/push_verbatim no longer need a state parameter.
2. Inline append_wrapped_with_prefix into handle_prefix_line
If append_wrapped_with_prefix is only used from handle_prefix_line, you can inline it to avoid one extra layer of indirection:
impl<'a> ParagraphWriter<'a> {
    pub(super) fn handle_prefix_line(&mut self, prefix_line: &PrefixLine<'_>) {
        self.flush_paragraph();
        let prefix = prefix_line.prefix.as_ref();
        let prefix_width = UnicodeWidthStr::width(prefix);
        let available = self.width.saturating_sub(prefix_width).max(1);
        let indent_str: String =
            prefix.chars().take_while(|c| c.is_whitespace()).collect();
        let indent_width = UnicodeWidthStr::width(indent_str.as_str());
        let wrapped_indent = if prefix_line.repeat_prefix {
            prefix.to_string()
        } else {
            format!("{}{}", indent_str, " ".repeat(prefix_width - indent_width))
        };
        let lines = wrap_preserving_code(prefix_line.rest, available);
        if lines.is_empty() {
            self.out.push(prefix.to_string());
            return;
        }
        for (index, wrapped_line) in lines.iter().enumerate() {
            if index == 0 {
                self.out.push(format!("{prefix}{wrapped_line}"));
            } else {
                self.out.push(format!("{wrapped_indent}{wrapped_line}"));
            }
        }
    }
}

This keeps the behavior identical but localizes all "prefix" behavior in one method.
3. Consider simplifying PrefixLine if Cow isn’t pulling its weight
If you don’t actually need to pass owned prefixes frequently, you can simplify the type:
pub(super) struct PrefixLine<'a> {
    pub(super) prefix: &'a str,
    pub(super) rest: &'a str,
    pub(super) repeat_prefix: bool,
}

Callers that need an owned prefix can still build a String and hold it locally, then pass &prefix_string into PrefixLine. This keeps the public API simpler while preserving all capabilities.
true
}

#[derive(Default)]
issue (complexity): Consider consolidating table parsing into a single HtmlTableState abstraction that owns its helpers and state so callers no longer juggle parallel buf + depth logic and flags.
You now have two parallel abstractions for table state (buf + depth in html_table_to_markdown and HtmlTableState in convert_html_tables) plus tiny free helpers. You can keep the refactor while reducing moving parts by:
- Move the free helpers into HtmlTableState and hide the depth contract.
impl HtmlTableState {
    fn push_html_line(&mut self, line: &str, out: &mut Vec<String>) -> bool {
        self.buf.push(line.to_string());
        self.depth += TABLE_START_RE.find_iter(line).count();
        if TABLE_END_RE.is_match(line) {
            self.depth = self.depth.saturating_sub(TABLE_END_RE.find_iter(line).count());
        }
        if self.depth == 0 {
            out.extend(table_lines_to_markdown(&self.buf));
            self.buf.clear();
            self.in_html = false;
            return true;
        }
        true
    }
}

Then convert_html_tables doesn't need to know about append_html_table_line / flush_completed_html_table:
if html_state.in_html {
    html_state.push_html_line(line, &mut out);
    continue;
}
if TABLE_START_RE.is_match(line.trim_start()) {
    html_state.in_html = true;
    html_state.push_html_line(line, &mut out);
    continue;
}

append_html_table_line and flush_completed_html_table can then be private helpers inside the impl or removed entirely.
- Reuse HtmlTableState in html_table_to_markdown to avoid duplicating the state machine.
pub(crate) fn html_table_to_markdown(lines: &[String]) -> Vec<String> {
    let mut out = Vec::new();
    let mut state = HtmlTableState::default();
    for line in lines {
        if state.depth > 0 || TABLE_START_RE.is_match(line.trim_start()) {
            state.in_html = true;
            state.push_html_line(line, &mut out);
            continue;
        }
        out.push(line.clone());
    }
    state.flush_raw(&mut out);
    out
}

This removes the second manual buf + depth pair and keeps all table behavior in one place.
- Consider dropping in_html and deriving it from state.

If you don't have any "active but empty" table state, you can use depth > 0 or !buf.is_empty() instead of a separate flag:
impl HtmlTableState {
    fn in_html(&self) -> bool {
        self.depth > 0
    }
}

Then convert_html_tables becomes:
if html_state.in_html() {
    html_state.push_html_line(line, &mut out);
    continue;
}
if TABLE_START_RE.is_match(line.trim_start()) {
    html_state.push_html_line(line, &mut out);
    continue;
}

This keeps the new organization but collapses the table handling into a single, self-contained abstraction with a clear API.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3d51aed524
classify_block(line),
Some(BlockKind::Heading | BlockKind::MarkdownlintDirective)
)
|| line.trim().is_empty()
Treat whitespace-only passthrough lines as empty
Including line.trim().is_empty() in passthrough detection causes wrap_text to emit blank lines verbatim via writer.push_verbatim, so inputs like " " or "\t\t" are preserved instead of normalized to empty lines. That changes user-visible output (and conflicts with existing process_stream expectations for whitespace-only lines), leaving trailing whitespace in formatted documents.
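The fix this comment points at can be sketched in isolation. Everything below is a stand-in (the real is_passthrough_block and wrap_text live in src/wrap.rs, and the names here are illustrative): the key step is normalizing whitespace-only lines to truly empty lines before they are emitted verbatim.

```rust
// Illustrative stand-in for the crate's predicate, reduced to the cases
// relevant here: blank lines and table rows pass through untouched.
fn is_passthrough_block(line: &str) -> bool {
    line.trim().is_empty() || line.trim_start().starts_with('|')
}

// Decide what to emit verbatim: whitespace-only lines come out as empty
// strings, so "  " or "\t\t" never leave trailing whitespace in the output.
fn passthrough_line(line: &str) -> Option<&str> {
    if !is_passthrough_block(line) {
        return None;
    }
    if line.trim().is_empty() {
        Some("") // normalize instead of preserving the whitespace
    } else {
        Some(line)
    }
}
```

This keeps the passthrough short-circuit while restoring the blank-line normalization the comment says the current check loses.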
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/code_emphasis.rs`:
- Around line 101-128: Add a unit test that exercises the edge case in
consume_code_affixes where a non-empty pending prefix combines with a next Text
token whose split_marks(mid) is non-empty so the function resets prefix to empty
(the branch with "else { prefix = \"\"; }"). Construct tokens (using the Token
enum and IntoIter<Token<'a>> in the same way other tests do) so pending is a
non-empty marker string, the next Token::Text contains lead+mid+suffix (i.e.
both markers and content), and assert the returned tuple is ("" /*empty
prefix*/, lead or suffix as appropriate, true) and that the next text is sliced
by lead.len(); place the test alongside existing tests for code_emphasis.rs to
cover the documented "Mixed surrounding markers are left untouched" behavior.
In `@src/footnotes/renumber.rs`:
- Line 418: The use of .expect() on write!(&mut line, "[^{new_number}]:") must
be replaced with proper error propagation/handling: remove .expect(...) and use
the ? operator on the write! result (i.e. write!(...).? ) or map the error into
the function's existing error type, and update the enclosing function's return
type to Result<..., std::fmt::Error> (or convert via .map_err(...) if changing
the signature is not desired). Locate the write! invocation in renumber.rs (the
line variable and new_number) and apply the same fix as done earlier at the
other occurrence (replace .expect with ?/map_err and adjust the enclosing
function signature or error conversion accordingly).
- Around line 344-350: The code currently uses .expect() on regex captures
(caps.name("num") and caps.name("rest")) which is forbidden; replace both
.expect() calls with fallible handling using let-else guards (e.g., let
Some(num_match) = caps.name("num") else { return Err(...) } and similarly for
rest_match) or propagate a suitable error/result from the surrounding function
instead of panicking; then compute whitespace from line using the obtained
num_match and rest_match as before.
- Line 330: Replace the .expect() on the infallible write! call by ignoring the
fmt::Result instead of panicking; specifically update the write!(&mut line,
"[^{new_number}]:") usage (where `line` and `new_number` are defined) to use the
infallibility pattern (e.g., let _ = write!(...)) or otherwise handle the
fmt::Result, removing the .expect() call to comply with the no
.expect()/unwrap() guideline.
In `@src/html.rs`:
- Around line 211-216: The function append_html_table_line pushes the raw line
then counts opening <table> tags using TABLE_START_RE which is anchored at the
line start; trim the input before counting starts so indented/multiline tables
are recognized: replace the use of line when calling
TABLE_START_RE.find_iter(...) with a trimmed version (e.g., let trimmed =
line.trim_start()) and keep TABLE_END_RE matching on the original line as
before; update the increment to use TABLE_START_RE.find_iter(trimmed).count()
while leaving the decrement logic unchanged.
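The anchoring problem described in this prompt is easy to reproduce without the regex crate. These stand-in predicates (hypothetical names, not the crate's API; starts_with stands in for the anchored TABLE_START_RE) show why the count must run on the trimmed line:

```rust
// An anchored pattern only matches at the very start of the haystack,
// so indented "<table>" lines are missed.
fn starts_table(line: &str) -> bool {
    line.starts_with("<table")
}

// Trimming leading whitespace first recognizes indented tables too.
fn starts_table_trimmed(line: &str) -> bool {
    line.trim_start().starts_with("<table")
}
```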
In `@src/lists.rs`:
- Around line 100-103: Add a derived Default impl to the ListState struct and
use its default constructor when initializing: add #[derive(Default)] above the
ListState declaration (which contains indent_stack: Vec<usize> and counters:
HashMap<usize, usize>) and replace the manual initializer with let mut state =
ListState::default(); so initialization is centralized and boilerplate is
removed.
In `@src/wrap.rs`:
- Around line 65-98: The prefix_line function currently calls .expect() on regex
capture groups (BULLET_RE, FOOTNOTE_RE, BLOCKQUOTE_RE) which must be removed;
change each branch to use let-else guards or safe captures and return None when
a capture is missing: for the BULLET_RE and FOOTNOTE_RE branches extract
cap.get(1), cap.get(2) (and cap.get(3) for FOOTNOTE_RE) with let Some(...) = ...
else { return None } before building the PrefixLine, and for the BLOCKQUOTE_RE
branch similarly map the capture to a PrefixLine only if cap.get(1) and
cap.get(2) are Some; ensure PrefixLine fields (prefix, rest, repeat_prefix) are
set the same way but without any .expect() calls.
In `@src/wrap/paragraph.rs`:
- Around line 100-103: The push_wrapped_segment implementation uses indent.len()
(byte length) causing wrong available width and potential underflow; change it
to compute the display width via UnicodeWidthStr::width(indent), then compute
available = self.width.saturating_sub(indent_width).max(1) and pass that
available value into wrap_preserving_code instead of self.width - indent.len();
update references in push_wrapped_segment and keep prefixing lines with the
original indent as before.
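The width fix can be sketched without the unicode-width dependency; display_width below uses char count as a stand-in for UnicodeWidthStr::width, which is enough to show why byte length miscounts multi-byte indents and why the subtraction must saturate with a floor of one column:

```rust
// Stand-in for UnicodeWidthStr::width (the real fix uses the unicode-width
// crate); char count already diverges from byte length for multi-byte
// indents such as non-breaking spaces.
fn display_width(s: &str) -> usize {
    s.chars().count()
}

// saturating_sub avoids underflow when the indent exceeds the wrap width;
// max(1) guarantees at least one column for content.
fn available_width(total: usize, indent: &str) -> usize {
    total.saturating_sub(display_width(indent)).max(1)
}
```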
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: f781191a-957a-4f54-9cb6-154c1a435958
📒 Files selected for processing (23)
- .github/workflows/release.yml
- Cargo.toml
- README.md
- docs/release-process.md
- src/code_emphasis.rs
- src/fences.rs
- src/footnotes/inline.rs
- src/footnotes/parsing.rs
- src/footnotes/renumber.rs
- src/frontmatter.rs
- src/html.rs
- src/lists.rs
- src/process.rs
- src/table.rs
- src/wrap.rs
- src/wrap/fence.rs
- src/wrap/inline.rs
- src/wrap/line_buffer.rs
- src/wrap/paragraph.rs
- src/wrap/tests.rs
- src/wrap/tokenize/mod.rs
- src/wrap/tokenize/scanning.rs
- tests/common/mod.rs
fn consume_code_affixes<'a>(
    tokens: &mut Peekable<IntoIter<Token<'a>>>,
    pending: &mut &'a str,
) -> (&'a str, &'a str, bool) {
    let mut prefix = std::mem::take(pending);
    let mut suffix = "";
    let mut modified = !prefix.is_empty();

    let Some(Token::Text(next)) = tokens.peek_mut() else {
        return (prefix, suffix, modified);
    };

    let (lead, mid, _) = split_marks(next);
    if lead.is_empty() {
        return (prefix, suffix, modified);
    }

    modified = true;
    if prefix.is_empty() {
        prefix = lead;
    } else if mid.is_empty() {
        suffix = lead;
    } else {
        prefix = "";
    }
    *next = &next[lead.len()..];
    (prefix, suffix, modified)
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for test cases covering consume_code_affixes edge cases
rg -n "consume_code_affixes\|code_emphasis" --type rust -A 5 -B 2 src/code_emphasis.rs

Repository: leynos/mdtablefix
Length of output: 43
🏁 Script executed:
# First, let's confirm the file exists and check the repository structure
fd -type f -name "*.rs" | head -20

Repository: leynos/mdtablefix
Length of output: 231
🏁 Script executed:
# Search more broadly for the file and function
git ls-files | grep -E "(code|emphasis)"

Repository: leynos/mdtablefix
Length of output: 156
🏁 Script executed:
# Look for test files that might contain tests for this function
git ls-files | grep -i test

Repository: leynos/mdtablefix
Length of output: 3034
🏁 Script executed:
# Check the size of the source file
wc -l src/code_emphasis.rs

Repository: leynos/mdtablefix
Length of output: 85
🏁 Script executed:
# Read the source file
cat -n src/code_emphasis.rs

Repository: leynos/mdtablefix
Length of output: 8421
🏁 Script executed:
# Check test file size and read it
wc -l tests/code_emphasis.rs

Repository: leynos/mdtablefix
Length of output: 87
🏁 Script executed:
# Read the test file
cat -n tests/code_emphasis.rs

Repository: leynos/mdtablefix
Length of output: 5151
🏁 Script executed:
# Check if there are test files with mixed emphasis patterns
grep -r "\\*\\*" tests/code_emphasis.rs tests/data/ | head -20

Repository: leynos/mdtablefix
Length of output: 2305
🏁 Script executed:
# Look for any edge case patterns in test data files
find tests/data -name "*emphasis*" -o -name "*code*" | xargs cat 2>/dev/null | head -50

Repository: leynos/mdtablefix
Length of output: 43
🏁 Script executed:
# Let's create a focused test to understand the edge case
cat > /tmp/test_edge_case.rs << 'EOF'
// Testing the specific edge case:
// What happens when prefix is non-empty AND mid is non-empty?
//
// Example input that would trigger this:
// `code` followed by **bold** (where ** comes with text)
// split_marks("**bold**") returns ("**", "bold", "**")
// If pending = "**" from previous token
// Then: prefix="**" (non-empty)
// lead="**" (non-empty)
// mid="bold" (non-empty)
// This hits line 124: prefix = ""
// The question: is this the correct handling?
EOF
cat /tmp/test_edge_case.rs

Repository: leynos/mdtablefix
Length of output: 539
🏁 Script executed:
# Search for any documentation about the intended behaviour
grep -r "mixed" src/code_emphasis.rs

Repository: leynos/mdtablefix
Length of output: 43
🏁 Script executed:
# Look at the module-level documentation more carefully
head -20 src/code_emphasis.rs

Repository: leynos/mdtablefix
Length of output: 868
🏁 Script executed:
# Let me trace through a specific example to verify the logic
cat > /tmp/trace_logic.py << 'EOF'
# Simulating the consume_code_affixes logic
def split_marks(s):
    """Split emphasis markers at both ends"""
    first = next((i for i, c in enumerate(s) if c not in '*_'), len(s))
    last = next((i for i in range(len(s) - 1, -1, -1) if s[i] not in '*_'), first - 1) + 1
    if last <= first:
        last = first
    return s[:first], s[first:last], s[last:]

def consume_code_affixes_trace(pending_in, next_text):
    """Trace through the logic with examples"""
    prefix = pending_in
    suffix = ""
    modified = bool(pending_in)
    print(f"Input: pending='{pending_in}', next_text='{next_text}'")
    if not next_text:
        print("  No next text token")
        return prefix, suffix, modified
    lead, mid, trail = split_marks(next_text)
    print(f"  split_marks('{next_text}') = ('{lead}', '{mid}', '{trail}')")
    if not lead:
        print("  No leading marks, return early")
        return prefix, suffix, modified
    modified = True
    print("  Set modified=True")
    if not prefix:
        prefix = lead
        print(f"  prefix was empty, set prefix='{lead}'")
    elif not mid:
        suffix = lead
        print(f"  mid was empty, set suffix='{lead}'")
    else:
        prefix = ""
        print("  EDGE CASE: mid is not empty, reset prefix to '' (avoid mixed markers)")
    print(f"  Result: prefix='{prefix}', suffix='{suffix}', modified={modified}")
    return prefix, suffix, modified

# Test cases
print("=== Case 1: No pending, plain text ===")
consume_code_affixes_trace("", "text")
print("\n=== Case 2: Pending markers, next text is all markers ===")
consume_code_affixes_trace("**", "**")
print("\n=== Case 3: EDGE CASE - Pending markers, next text has markers AND content ===")
consume_code_affixes_trace("**", "**more**")
print("\n=== Case 4: EDGE CASE - Pending underscore, next text has asterisks ===")
consume_code_affixes_trace("_", "*more*")
EOF
python3 /tmp/trace_logic.py

Repository: leynos/mdtablefix
Length of output: 1048
🏁 Script executed:
# Let's create a comprehensive test to verify the actual behaviour with the example cases
cat > /tmp/verify_behavior.py << 'EOF'
# Test the actual scenario the review mentions
def trace_scenario(input_str, label):
    print(f"\n=== {label} ===")
    print(f"Input: {repr(input_str)}")
    # The tokenizer would break this into tokens
    # Then consume_code_affixes logic applies
    # We need to understand what the expected output should be
    print("(Requires understanding of tokenization)")

trace_scenario("**code**", "Simple case: both leading and trailing emphasis")
trace_scenario("`code`**text**", "Code followed by emphasis and text")
trace_scenario("**`code`**", "Emphasis wrapping code (no affixes needed)")
trace_scenario("`code`**", "Code followed by emphasis (no following text)")

# The review asks: does the edge case (line 124: prefix = "") handle **code** correctly?
# But **code** without surrounding backticks wouldn't be tokenized as code.
# The function only processes Text tokens adjacent to Code tokens.
# The actual concern seems to be: when there's pending emphasis from a Text token
# followed by a Code token, and then another Text token with both markers and content,
# should we keep or discard the pending emphasis?
print("\n=== Analysis of edge case ===")
print("When prefix is non-empty AND mid is non-empty (line 124):")
print("- prefix = '' clears the pending emphasis markers")
print("- modified = True still wraps the code")
print("- This avoids: **`code`**more** (ambiguous nesting)")
print("- Instead: `code`**more** (next text handles its own markers)")
EOF
python3 /tmp/verify_behavior.py

Repository: leynos/mdtablefix
Length of output: 804
🏁 Script executed:
# Let me check if there's test coverage specifically for the edge case scenario
grep -n "prefix\|mid\|suffix" src/code_emphasis.rs | head -20

Repository: leynos/mdtablefix
Length of output: 766
🏁 Script executed:
# Let me check if there are any tests that would trigger the line 124 edge case
# by looking for test inputs with pending emphasis + content + markers
cat > /tmp/analyze_tests.py << 'EOF'
# Analyzing test coverage for the edge case
test_cases = [
    ("`StepContext`** Enhancement (in **`crates/rstest-bdd/src/context.rs`**)**",
     "Complex case with nested code and emphasis"),
    ("**`code`**", "Emphasis only around code"),
    ("`code` text", "Standalone code"),
]
print("Test Coverage Analysis:")
print("=" * 60)
for input_str, description in test_cases:
    print(f"\nTest: {description}")
    print(f"Input: {repr(input_str)}")
    # The edge case occurs when:
    # 1. There's pending emphasis from previous text token (prefix non-empty)
    # 2. Next token is code
    # 3. After code, there's text with BOTH markers AND content
    #    (i.e., split_marks returns non-empty lead AND non-empty mid)
    if "**" in input_str and "`" in input_str:
        print("  ✓ Contains both emphasis and code")
        # Check if it would trigger the edge case
        # The edge case: prefix non-empty AND mid non-empty
        # This happens when: pending emphasis + code + text with markers+content
print("\n" + "=" * 60)
print("\nEdge Case Scenario:")
print("- Pending emphasis markers from previous text token")
print("- Code token is processed")
print("- Next text token has split_marks returning:")
print("  - lead (markers): non-empty")
print("  - mid (content): non-empty")
print("  - This triggers line 124: prefix = ''")
print("\nNo explicit test case found for this specific scenario!")
EOF
python3 /tmp/analyze_tests.py

Repository: leynos/mdtablefix
Length of output: 809
Add explicit test coverage for the edge case where pending emphasis combines with code and marked content.
The logic at lines 119–125 intentionally handles three scenarios. When prefix is non-empty and the next text token's mid (content between markers) is also non-empty, the code resets prefix to empty (line 124). This prevents ambiguous marker nesting—avoiding patterns like **code**more**.
This behaviour is by design per the module documentation ("Mixed surrounding markers are left untouched"), but it lacks explicit test coverage. Add a test case triggering the scenario where pending emphasis from one text token combines with code followed by another text token containing both markers and content. This confirms the edge case handling is correct.
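A standalone analogue of the traced logic (split_marks and the affix branches are re-implemented here from the trace above, not imported from the crate) shows the shape such a test could take:

```rust
// Re-implementation of the traced split_marks semantics: split a leading and
// trailing run of '*'/'_' markers off the text between them.
fn split_marks(s: &str) -> (&str, &str, &str) {
    let is_mark = |c: char| c == '*' || c == '_';
    let first = s.find(|c| !is_mark(c)).unwrap_or(s.len());
    let last = s.rfind(|c| !is_mark(c)).map_or(first, |i| i + 1);
    let last = last.max(first);
    (&s[..first], &s[first..last], &s[last..])
}

// Mirror of the branch under test: a non-empty pending prefix combined with a
// next token that has both leading markers and content resets the prefix.
fn affix_branch<'a>(pending: &'a str, next: &'a str) -> (&'a str, &'a str, bool) {
    let (lead, mid, _) = split_marks(next);
    if lead.is_empty() {
        return (pending, "", !pending.is_empty());
    }
    if pending.is_empty() {
        (lead, "", true)
    } else if mid.is_empty() {
        (pending, lead, true)
    } else {
        ("", "", true) // the documented "mixed markers" reset
    }
}
```

A test in the crate would build Token values instead, but the assertions would follow the same pattern: pending markers plus markers-and-content yields an empty prefix with modified set.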
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/code_emphasis.rs` around lines 101 - 128, Add a unit test that exercises
the edge case in consume_code_affixes where a non-empty pending prefix combines
with a next Text token whose split_marks(mid) is non-empty so the function
resets prefix to empty (the branch with "else { prefix = \"\"; }"). Construct
tokens (using the Token enum and IntoIter<Token<'a>> in the same way other tests
do) so pending is a non-empty marker string, the next Token::Text contains
lead+mid+suffix (i.e. both markers and content), and assert the returned tuple
is ("" /*empty prefix*/, lead or suffix as appropriate, true) and that the next
text is sliced by lead.len(); place the test alongside existing tests for
code_emphasis.rs to cover the documented "Mixed surrounding markers are left
untouched" behavior.
let rewritten_rest = rewrite_tokens(parts.rest, mapping);
let mut line = String::with_capacity(parts.prefix.len() + rewritten_rest.len() + 8);
line.push_str(parts.prefix);
write!(&mut line, "[^{new_number}]:").expect("write to string cannot fail");
Replace .expect() on infallible write with explicit handling.
Writing to a String via write! cannot fail, but the coding guidelines forbid .expect() outside tests. Use let _ = write!(...) or the write! macro's infallibility guarantee via fmt::Write for String.
🛠️ Proposed fix
- write!(&mut line, "[^{new_number}]:").expect("write to string cannot fail");
+ // Writing to String is infallible; discard the result.
+ let _ = write!(&mut line, "[^{new_number}]:");

As per coding guidelines: ".expect() and .unwrap() are forbidden outside of tests."
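For reference, the infallibility the fix relies on can be demonstrated in isolation; this is a minimal sketch (the `footnote_label` helper is hypothetical, not from the codebase):

```rust
use std::fmt::Write;

// Builds "[^n]:" by writing into a String; the fmt::Result is
// discarded because the String implementation of fmt::Write
// never returns Err.
fn footnote_label(prefix: &str, n: u32) -> String {
    let mut line = String::from(prefix);
    let _ = write!(&mut line, "[^{n}]:");
    line
}

fn main() {
    assert_eq!(footnote_label("  ", 3), "  [^3]:");
}
```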
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/footnotes/renumber.rs` at line 330, Replace the .expect() on the
infallible write! call by ignoring the fmt::Result instead of panicking;
specifically update the write!(&mut line, "[^{new_number}]:") usage (where
`line` and `new_number` are defined) to use the infallibility pattern (e.g., let
_ = write!(...)) or otherwise handle the fmt::Result, removing the .expect()
call to comply with the no .expect()/unwrap() guideline.
```rust
let num_match = caps
    .name("num")
    .expect("numeric list capture missing number");
let rest_match = caps
    .name("rest")
    .expect("numeric list capture missing rest");
let whitespace = line[num_match.end() + 1..rest_match.start()].to_string();
```
Replace .expect() with fallible handling or use let-else.
Per coding guidelines, .expect() is forbidden outside tests. Even though the regex guarantees these captures exist after a successful match, propagate the error or use a let-else guard to satisfy the lint.
🛠️ Proposed fix using let-else guards

```diff
-    let num_match = caps
-        .name("num")
-        .expect("numeric list capture missing number");
-    let rest_match = caps
-        .name("rest")
-        .expect("numeric list capture missing rest");
+    let Some(num_match) = caps.name("num") else {
+        return None;
+    };
+    let Some(rest_match) = caps.name("rest") else {
+        return None;
+    };
```

As per coding guidelines: ".expect() and .unwrap() are forbidden outside of tests. Errors must be propagated."
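The let-else pattern proposed above can be sketched standalone; the `named_part` helper below is a hypothetical stand-in for a regex capture lookup, purely for illustration:

```rust
// Hypothetical stand-in for caps.name(...): returns the named part
// of a numbered-list line, or None when it is absent.
fn named_part<'a>(line: &'a str, name: &str) -> Option<&'a str> {
    match name {
        "num" => line.split('.').next().filter(|s| !s.is_empty()),
        "rest" => line.split_once(". ").map(|(_, rest)| rest),
        _ => None,
    }
}

// Mirrors the reviewed pattern: let-else guards replace .expect(),
// returning None instead of panicking when a capture is missing.
fn split_numbered(line: &str) -> Option<(&str, &str)> {
    let Some(num) = named_part(line, "num") else {
        return None;
    };
    let Some(rest) = named_part(line, "rest") else {
        return None;
    };
    Some((num, rest))
}

fn main() {
    assert_eq!(split_numbered("1. item"), Some(("1", "item")));
    assert_eq!(split_numbered("no marker"), None);
}
```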
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/footnotes/renumber.rs` around lines 344 - 350, The code currently uses
.expect() on regex captures (caps.name("num") and caps.name("rest")) which is
forbidden; replace both .expect() calls with fallible handling using let-else
guards (e.g., let Some(num_match) = caps.name("num") else { return Err(...) }
and similarly for rest_match) or propagate a suitable error/result from the
surrounding function instead of panicking; then compute whitespace from line
using the obtained num_match and rest_match as before.
```rust
new_line.push_str(&rewritten_rest);
definitions.push(DefinitionLine {
line.push_str(&candidate.indent);
write!(&mut line, "[^{new_number}]:").expect("write to string cannot fail");
```
Same .expect() pattern requires adjustment.
Apply the same fix as suggested for line 330.
🛠️ Proposed fix

```diff
-    write!(&mut line, "[^{new_number}]:").expect("write to string cannot fail");
+    let _ = write!(&mut line, "[^{new_number}]:");
```

As per coding guidelines: ".expect() and .unwrap() are forbidden outside of tests."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/footnotes/renumber.rs` at line 418, The use of .expect() on write!(&mut
line, "[^{new_number}]:") must be replaced with proper error
propagation/handling: remove .expect(...) and use the ? operator on the write!
result (i.e. `write!(...)?`) or map the error into the function's existing error
type, and update the enclosing function's return type to Result<...,
std::fmt::Error> (or convert via .map_err(...) if changing the signature is not
desired). Locate the write! invocation in renumber.rs (the line variable and
new_number) and apply the same fix as done earlier at the other occurrence
(replace .expect with ?/map_err and adjust the enclosing function signature or
error conversion accordingly).
```rust
fn append_html_table_line(line: &str, buf: &mut Vec<String>, depth: &mut usize) {
    buf.push(line.to_string());
    *depth += TABLE_START_RE.find_iter(line).count();
    if TABLE_END_RE.is_match(line) {
        *depth = depth.saturating_sub(TABLE_END_RE.find_iter(line).count());
        if *depth == 0 {
            out.extend(html_table_to_markdown(buf));
            buf.clear();
            *in_html = false;
        }
```
🧩 Analysis chain

🏁 Script executed:

```shell
# First, examine the append_html_table_line function and surrounding context
sed -n '200,230p' src/html.rs
# Check the regex definitions
grep -n 'TABLE_START_RE\|TABLE_END_RE' src/html.rs | head -20
# Look at call sites mentioned in the review
sed -n '275,290p' src/html.rs
sed -n '340,355p' src/html.rs
```

🏁 Script executed:

```shell
# Get the regex definitions
sed -n '18,24p' src/html.rs
```

🏁 Script executed:

```shell
# Search for tests covering tables, especially indented ones
rg -n 'TABLE|table|indent' src/html.rs tests/ | grep -i 'test\|fn\|<table'
# Look for test files
find . -name '*test*' -o -name 'tests.rs' -o -path '*/tests/*' | head -20
```

🏁 Script executed:

```shell
# Search more specifically for indented HTML table tests
grep -n 'indent.*table\|table.*indent' tests/table/*.rs
# Check the convert_html tests more carefully
head -100 tests/table/convert_html.rs
# Look at process_stream_tests for HTML table handling
grep -B5 -A5 'TABLE_START_RE\|trim_start' src/html.rs | head -50
```

🏁 Script executed:

```shell
# Verify if there's a test for indented HTML tables
grep -r 'indented.*html\|html.*indented' tests/
# Check if any HTML table tests use indentation
grep -B2 -A2 'html_table' tests/table/convert_html.rs | grep -E '^\s{4}<'
# Let's look at the TABLE_END_RE usage in append_html_table_line
sed -n '211,216p' src/html.rs
# Verify the TABLE_END_RE pattern doesn't have anchoring
grep 'TABLE_END_RE' src/html.rs | head -5
```

🏁 Script executed:

```shell
# Check if any test includes an indented HTML table (4+ spaces before <table>)
grep -r '^\s\s\s\s<table' tests/
# Look at what indented_table fixture contains
sed -n '41,45p' tests/table/mod.rs
# Verify the actual behavior with a simple check of how depth tracking works
# Look at process_stream to see if it calls convert_html_tables
grep -n 'convert_html_tables\|html_table_to_markdown' src/html.rs
```

🏁 Script executed:

```shell
# Verify TABLE_END_RE doesn't have the same issue (no anchor)
grep -A1 'TABLE_END_RE.*lazy_regex' src/html.rs
# Create a minimal test case to confirm the bug
# Simulate what happens with an indented HTML table
python3 << 'EOF'
import re
TABLE_START_RE = re.compile(r"(?i)^<table(?:\s|>|$)")
TABLE_END_RE = re.compile(r"(?i)</table>")
# Indented table line
line = "    <table>"
trimmed = line.lstrip()
print(f"Raw line: '{line}'")
print(f"Trimmed line: '{trimmed}'")
print(f"TABLE_START_RE matches raw: {bool(TABLE_START_RE.search(line))}")
print(f"TABLE_START_RE matches trimmed: {bool(TABLE_START_RE.search(trimmed))}")
print(f"find_iter on raw: {len(TABLE_START_RE.findall(line))}")
print(f"find_iter on trimmed: {len(TABLE_START_RE.findall(trimmed))}")
# Simulating the flow
print("\n--- Simulating current code flow ---")
depth = 0
print(f"Initial depth: {depth}")
print(f"Check: depth > 0 or TABLE_START_RE.is_match('{trimmed}'): {depth > 0 or bool(TABLE_START_RE.search(trimmed))}")
if depth > 0 or bool(TABLE_START_RE.search(trimmed)):
    print("Entered table mode")
    depth += TABLE_START_RE.findall(line).__len__()  # Uses raw line
    print(f"After counting on raw line: depth = {depth}")
    if TABLE_END_RE.search(line):
        depth -= TABLE_END_RE.findall(line).__len__()
    print(f"Would flush? {depth == 0}")
EOF
```

🏁 Script executed:

```shell
# Verify TABLE_END_RE behavior with indented closing tag
python3 << 'EOF'
import re
TABLE_END_RE = re.compile(r"(?i)</table>")
# Test with indented closing tag
line = "    </table>"
print(f"Line: '{line}'")
print(f"TABLE_END_RE.is_match: {bool(TABLE_END_RE.search(line))}")
print(f"TABLE_END_RE.find_iter count: {len(TABLE_END_RE.findall(line))}")
print("\nCONCLUSION: TABLE_END_RE matches regardless of indentation because it has no anchor")
EOF
# Look at the complete flow in both callers
echo "=== html_table_to_markdown ==="
sed -n '275,290p' src/html.rs
echo -e "\n=== convert_html_tables ==="
sed -n '321,360p' src/html.rs
```
Trim the line before counting `<table>` openings.
Count start tags on `line.trim_start()`, not on the raw line. `TABLE_START_RE` is anchored at `^<table`, so an indented multiline table inside a list or block quote never increments `depth`; the callers at lines 283 and 348 then flush the buffer after the first line.
Patch

```diff
 fn append_html_table_line(line: &str, buf: &mut Vec<String>, depth: &mut usize) {
     buf.push(line.to_string());
-    *depth += TABLE_START_RE.find_iter(line).count();
+    let trimmed = line.trim_start();
+    *depth += TABLE_START_RE.find_iter(trimmed).count();
     if TABLE_END_RE.is_match(line) {
         *depth = depth.saturating_sub(TABLE_END_RE.find_iter(line).count());
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/html.rs` around lines 211 - 216, The function append_html_table_line
pushes the raw line then counts opening <table> tags using TABLE_START_RE which
is anchored at the line start; trim the input before counting starts so
indented/multiline tables are recognized: replace the use of line when calling
TABLE_START_RE.find_iter(...) with a trimmed version (e.g., let trimmed =
line.trim_start()) and keep TABLE_END_RE matching on the original line as
before; update the increment to use TABLE_START_RE.find_iter(trimmed).count()
while leaving the decrement logic unchanged.
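The anchoring pitfall can be illustrated without the regex crate; this sketch uses `str::starts_with` as a stand-in for the anchored `^<table` pattern (an approximation, since the real pattern is case-insensitive and also matches whitespace or `>` after the tag name):

```rust
// Stand-in for the anchored TABLE_START_RE: matches only when the
// tag sits at the very start of the string, like `^<table`.
fn starts_table(line: &str) -> bool {
    line.starts_with("<table")
}

fn main() {
    let indented = "    <table>";
    // The anchored check misses the indented opening tag...
    assert!(!starts_table(indented));
    // ...but matches once the leading whitespace is trimmed.
    assert!(starts_table(indented.trim_start()));
}
```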
```rust
let mut state = ListState {
    indent_stack: Vec::new(),
    counters: HashMap::new(),
};
```
🧹 Nitpick | 🔵 Trivial
Derive Default for ListState and initialize via default().
Remove boilerplate and centralize initialization behaviour.
Apply concise initialization

```diff
+#[derive(Default)]
 struct ListState {
     indent_stack: Vec<usize>,
     counters: HashMap<usize, usize>,
 }
@@
-    let mut state = ListState {
-        indent_stack: Vec::new(),
-        counters: HashMap::new(),
-    };
+    let mut state = ListState::default();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lists.rs` around lines 100 - 103, Add a derived Default impl to the
ListState struct and use its default constructor when initializing: add
#[derive(Default)] above the ListState declaration (which contains indent_stack:
Vec<usize> and counters: HashMap<usize, usize>) and replace the manual
initializer with let mut state = ListState::default(); so initialization is
centralized and boilerplate is removed.
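The derive-based initialization can be sketched in a standalone form; the field names mirror the ones quoted in the review:

```rust
use std::collections::HashMap;

// Deriving Default lets callers build an empty state without
// spelling out each field's zero value.
#[derive(Default)]
struct ListState {
    indent_stack: Vec<usize>,
    counters: HashMap<usize, usize>,
}

fn main() {
    let state = ListState::default();
    // Both containers start empty, matching the manual initializer.
    assert!(state.indent_stack.is_empty());
    assert!(state.counters.is_empty());
}
```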
```rust
fn prefix_line(line: &str) -> Option<PrefixLine<'_>> {
    if let Some(cap) = BULLET_RE.captures(line) {
        let prefix = cap.get(1).expect("bullet regex capture").as_str();
        let rest = cap.get(2).expect("bullet regex remainder capture").as_str();
        return Some(PrefixLine {
            prefix: Cow::Borrowed(prefix),
            rest,
            repeat_prefix: false,
        });
    }

    if let Some(cap) = FOOTNOTE_RE.captures(line) {
        let prefix = cap.get(1).expect("footnote prefix capture").as_str();
        let marker = cap.get(2).expect("footnote marker capture").as_str();
        let rest = cap
            .get(3)
            .expect("footnote regex remainder capture")
            .as_str();
        return Some(PrefixLine {
            prefix: Cow::Owned(format!("{prefix}{marker}")),
            rest,
            repeat_prefix: false,
        });
    }

    BLOCKQUOTE_RE.captures(line).map(|cap| PrefixLine {
        prefix: Cow::Borrowed(cap.get(1).expect("blockquote prefix capture").as_str()),
        rest: cap
            .get(2)
            .expect("blockquote regex remainder capture")
            .as_str(),
        repeat_prefix: true,
    })
}
```
Replace .expect() calls with propagation or let-else guards.
The prefix_line function uses .expect() on regex capture groups (lines 67-68, 77-82, 91-95). Per coding guidelines, .expect() is forbidden outside tests. Since the function returns Option, return None on capture failure.
🛠️ Proposed fix using let-else guards

```diff
 fn prefix_line(line: &str) -> Option<PrefixLine<'_>> {
     if let Some(cap) = BULLET_RE.captures(line) {
-        let prefix = cap.get(1).expect("bullet regex capture").as_str();
-        let rest = cap.get(2).expect("bullet regex remainder capture").as_str();
+        let prefix = cap.get(1)?.as_str();
+        let rest = cap.get(2)?.as_str();
         return Some(PrefixLine {
             prefix: Cow::Borrowed(prefix),
             rest,
             repeat_prefix: false,
         });
     }
     if let Some(cap) = FOOTNOTE_RE.captures(line) {
-        let prefix = cap.get(1).expect("footnote prefix capture").as_str();
-        let marker = cap.get(2).expect("footnote marker capture").as_str();
-        let rest = cap
-            .get(3)
-            .expect("footnote regex remainder capture")
-            .as_str();
+        let prefix = cap.get(1)?.as_str();
+        let marker = cap.get(2)?.as_str();
+        let rest = cap.get(3)?.as_str();
         return Some(PrefixLine {
             prefix: Cow::Owned(format!("{prefix}{marker}")),
             rest,
             repeat_prefix: false,
         });
     }
-    BLOCKQUOTE_RE.captures(line).map(|cap| PrefixLine {
-        prefix: Cow::Borrowed(cap.get(1).expect("blockquote prefix capture").as_str()),
-        rest: cap
-            .get(2)
-            .expect("blockquote regex remainder capture")
-            .as_str(),
+    let cap = BLOCKQUOTE_RE.captures(line)?;
+    Some(PrefixLine {
+        prefix: Cow::Borrowed(cap.get(1)?.as_str()),
+        rest: cap.get(2)?.as_str(),
         repeat_prefix: true,
     })
 }
```

As per coding guidelines: ".expect() and .unwrap() are forbidden outside of tests. Errors must be propagated."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/wrap.rs` around lines 65 - 98, The prefix_line function currently calls
.expect() on regex capture groups (BULLET_RE, FOOTNOTE_RE, BLOCKQUOTE_RE) which
must be removed; change each branch to use let-else guards or safe captures and
return None when a capture is missing: for the BULLET_RE and FOOTNOTE_RE
branches extract cap.get(1), cap.get(2) (and cap.get(3) for FOOTNOTE_RE) with
let Some(...) = ... else { return None } before building the PrefixLine, and for
the BLOCKQUOTE_RE branch similarly map the capture to a PrefixLine only if
cap.get(1) and cap.get(2) are Some; ensure PrefixLine fields (prefix, rest,
repeat_prefix) are set the same way but without any .expect() calls.
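The `?`-on-`Option` pattern the fix relies on can be shown in a minimal standalone form; the `find_marker` helper is hypothetical, purely for illustration:

```rust
// Hypothetical capture lookup: Some when the line carries a bullet
// marker, None otherwise.
fn find_marker(line: &str) -> Option<&str> {
    line.strip_prefix("- ")
}

// `?` on an Option propagates None to the caller, replacing a
// panicking .expect() call with a silent early return.
fn bullet_rest(line: &str) -> Option<&str> {
    let rest = find_marker(line)?;
    Some(rest.trim_start())
}

fn main() {
    assert_eq!(bullet_rest("- item"), Some("item"));
    assert_eq!(bullet_rest("plain"), None);
}
```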
```rust
fn push_wrapped_segment(&mut self, indent: &str, segment: &str) {
    for line in wrap_preserving_code(segment, self.width - indent.len()) {
        self.out.push(format!("{indent}{line}"));
    }
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
# Read the target function and surrounding context
echo "=== Lines 95-110 of src/wrap/paragraph.rs ==="
sed -n '95,110p' src/wrap/paragraph.rs
echo
echo "=== Search for UnicodeWidthStr usage and imports ==="
rg -n 'UnicodeWidthStr|use.*unicode' src/wrap/paragraph.rs | head -20
echo
echo "=== Search for wrap_preserving_code definition ==="
rg -n 'fn wrap_preserving_code' src/
echo
echo "=== Check the pattern mentioned in comment (saturating_sub.max) ==="
rg -n 'saturating_sub.*max|width.*saturating_sub' src/wrap/paragraph.rs
```

🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
# Read the wrap_preserving_code function signature and context
echo "=== wrap_preserving_code function (lines 195-210) ==="
sed -n '195,210p' src/wrap/inline.rs
echo
echo "=== flush_paragraph to see how push_wrapped_segment is called ==="
rg -n 'flush_paragraph|push_wrapped_segment' src/wrap/paragraph.rs | head -15
echo
echo "=== Check the full append_wrapped_with_prefix for context ==="
sed -n '45,65p' src/wrap/paragraph.rs
```
Use display width for wrapped indent to guard against underflow.
`indent.len()` counts bytes, not display columns, so for non-ASCII indentation it miscalculates the available width. When the indent exceeds `self.width`, the subtraction panics in checked builds and wraps in unchecked builds.
Adopt the pattern already used at line 52 in `append_wrapped_with_prefix`:
Patch

```diff
 fn push_wrapped_segment(&mut self, indent: &str, segment: &str) {
-    for line in wrap_preserving_code(segment, self.width - indent.len()) {
+    let indent_width = UnicodeWidthStr::width(indent);
+    let available = self.width.saturating_sub(indent_width).max(1);
+    for line in wrap_preserving_code(segment, available) {
         self.out.push(format!("{indent}{line}"));
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/wrap/paragraph.rs` around lines 100 - 103, The push_wrapped_segment
implementation uses indent.len() (byte length) causing wrong available width and
potential underflow; change it to compute the display width via
UnicodeWidthStr::width(indent), then compute available =
self.width.saturating_sub(indent_width).max(1) and pass that available value
into wrap_preserving_code instead of self.width - indent.len(); update
references in push_wrapped_segment and keep prefixing lines with the original
indent as before.
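The byte-versus-column distinction behind this fix can be seen with the standard library alone. The real code uses the `unicode-width` crate; this sketch approximates display width with `chars().count()`, an approximation that ignores wide and zero-width characters:

```rust
// Approximate display width by counting chars; enough to show why
// byte length overestimates width for non-ASCII indents.
fn approx_width(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    let indent = "→ "; // arrow + space: 4 bytes, roughly 2 columns
    assert_eq!(indent.len(), 4);
    assert_eq!(approx_width(indent), 2);

    // saturating_sub(..).max(1) guards against underflow when the
    // indent is wider than the wrap width.
    let width: usize = 1;
    let available = width.saturating_sub(approx_width(indent)).max(1);
    assert_eq!(available, 1);
}
```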
@coderabbitai please can you give me an assessment of the cargo-binstall support. This is the focus of this PR.
Tip: For best results, initiate chat on the files or code changes.
Cargo-binstall Support Assessment
1.
| Component | Value |
|---|---|
| Workflow produces | mdtablefix-0.3.1-x86_64-unknown-linux-gnu.tar.gz |
| pkg-url resolves to | .../v0.3.1/mdtablefix-0.3.1-x86_64-unknown-linux-gnu.tar.gz |
| Match | ✅ |
3. Release Job Sequencing
The release job correctly:
- Creates the GitHub release (via softprops/action-gh-release@v1)
- Downloads all build artifacts
- Uploads every file via gh release upload

The artifact directory glob artifacts/${{ env.REPO_NAME }}-* correctly enumerates all per-target directories and picks up the binary, its SHA-256, the tar.gz archive, and the archive SHA-256. ✅
4. Documentation
Both README.md and docs/release-process.md accurately document the new installation path and clarify which targets produce binstall archives. The documented target list matches the workflow matrix and the cfg override. ✅
Summary
The implementation is well-structured and internally consistent. The three moving parts — the Cargo.toml override, the workflow archive naming, and the release upload glob — are all aligned. No issues found.
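For context, cargo-binstall metadata of the kind assessed above typically takes the following shape. This is an illustrative sketch, not the crate's actual Cargo.toml: the pkg-url template and pkg-fmt value are assumptions inferred from the archive naming described in the assessment.

```toml
# Illustrative [package.metadata.binstall] override; values are
# assumed from the naming scheme mdtablefix-{version}-{target}.tar.gz.
[package.metadata.binstall]
pkg-url = "{ repo }/releases/download/v{ version }/{ name }-{ version }-{ target }{ archive-suffix }"
pkg-fmt = "tgz"
```

With this template, cargo-binstall would resolve the x86_64-unknown-linux-gnu archive URL shown in the table above for version 0.3.1.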
Summary by Sourcery
Add cargo-binstall packaging support and refactor core Markdown processing utilities while preserving existing behavior.
New Features:
Enhancements:
Build:
Documentation:
Tests: