
Preserve trailing spaces on final flush #211

Merged
leynos merged 5 commits into main from
codex/fix-inconsistency-in-wrap_preserving_code
Sep 14, 2025

Conversation

@leynos
Owner

@leynos leynos commented Sep 11, 2025

Summary

  • stop trimming trailing spaces when wrap_preserving_code flushes without a split

Testing

  • make fmt
  • make lint
  • make test

closes #66


https://chatgpt.com/codex/tasks/task_e_68c2b687e2488322a91ef6102f515f31

Summary by Sourcery

Bug Fixes:

  • Stop trimming trailing spaces when wrap_preserving_code flushes without a split

@sourcery-ai
Contributor

sourcery-ai Bot commented Sep 11, 2025

Reviewer's guide (collapsed on small PRs)

Reviewer's Guide

Adjust wrap_preserving_code to keep trailing spaces when emitting the final line and streamline buffer handling by using std::mem::take.

File-Level Changes

Change: Preserve trailing spaces on final flush and simplify buffer reset
Details:
  • Remove trim_end() and trimmed.to_string(), so trailing spaces are retained
  • Use std::mem::take(&mut current) to push the entire buffer and clear it in one step
  • Drop the explicit current.clear() call
  • Maintain empty-string check using current.is_empty()
Files: src/wrap.rs
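
The buffer change listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from src/wrap.rs: the helper names flush_trimmed and flush_preserving are hypothetical, and only lines, current, trim_end(), is_empty() and std::mem::take come from the guide.

```rust
use std::mem;

/// Old behaviour (hypothetical helper name): push a trimmed copy, then clear.
fn flush_trimmed(lines: &mut Vec<String>, current: &mut String) {
    if !current.is_empty() {
        lines.push(current.trim_end().to_string());
        current.clear();
    }
}

/// New behaviour (hypothetical helper name): move the whole buffer out,
/// leaving an empty `String` behind, so trailing spaces are retained.
fn flush_preserving(lines: &mut Vec<String>, current: &mut String) {
    if !current.is_empty() {
        lines.push(mem::take(current));
    }
}

fn main() {
    let mut lines = Vec::new();

    let mut old = String::from("hard break  ");
    flush_trimmed(&mut lines, &mut old);

    let mut new = String::from("hard break  ");
    flush_preserving(&mut lines, &mut new);

    // The first line lost its two trailing spaces; the second kept them.
    assert_eq!(lines, ["hard break", "hard break  "]);
    assert!(old.is_empty() && new.is_empty());
}
```

Both helpers leave the buffer empty, so the caller's is_empty() check keeps working; the difference is only whether the pushed line is a trimmed copy or the original buffer.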

Assessment against linked issues

Issue Objective Addressed Explanation
#66 Remove trimming of trailing spaces in the final flush operation of wrap_preserving_code, so that it matches the earlier loop behavior.


@coderabbitai
Contributor

coderabbitai Bot commented Sep 11, 2025

Summary by CodeRabbit

  • New Features
    • Improved text wrapping to keep inline code spans intact with adjacent punctuation.
  • Bug Fixes
    • Trailing spaces are now preserved in wrapped text, HTML-to-Markdown table conversion, and table reflow, preventing unintended trimming.
    • Preserves trailing spaces within code spans and at line ends across wrapping operations.
  • Tests
    • Added cases verifying trailing-space preservation in wrappers.
    • Added test ensuring HTML table cells retain trailing spaces in Markdown output.

Walkthrough

Update internal wrapping and whitespace handling to preserve trailing spaces, merge code spans, and reuse allocations; extend tests and HTML/table processing to preserve per-line trailing whitespace across wrapping, table buffering, and reflow collection.

Changes

Cohort / File(s) Summary
Wrap implementation
src/wrap.rs
Add helpers merge_code_span, flush_current, flush_trailing_whitespace; detect and merge code spans; preserve trailing spaces on mid-wrap and final flush; reuse current allocation via mem::take; keep punctuation glued after code spans. No public API changes.
Wrap tests
src/wrap/tests.rs
Import and call wrap_preserving_code; replace/expand trailing-space test with preserves_trailing_spaces (multiple cases); add no_split_forced_flush_no_trim to verify forced flush retains trailing spaces and avoids unwanted splits; assert against Vec<String>.
HTML text extraction
src/html.rs
Change node_text to preserve a single trailing space when original text ended with whitespace (use trim_start + append space instead of full trim); stop trimming each line before building HTML input in table_lines_to_markdown.
Table buffering / processing
src/process.rs
Preserve trailing whitespace when buffering table lines: replace line.trim_end().to_string() with line.to_string() in three handle_table_line code paths.
Reflow cell collection
src/reflow.rs
Preserve trailing whitespace when appending non-last chunks in collect_cells by not trimming before adding the ROW_END sentinel, so ROW_END follows any trailing spaces.
HTML table tests
tests/table/convert_html.rs
Add preserves_trailing_spaces_in_cells test to assert that convert_html_tables output retains trailing spaces inside table cells.
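
The node_text change summarised in the table (keep one trailing space, drop leading whitespace) might look roughly like the sketch below. It assumes that whitespace runs are collapsed to single spaces and tracked with a last_space flag; the function name collapse_keep_trailing is hypothetical, and the real node_text walks HTML nodes and may differ in detail.

```rust
/// Collapse whitespace runs to single spaces, drop leading whitespace,
/// and keep exactly one trailing space if the input ended with whitespace.
/// (Hypothetical helper; not the node_text function from src/html.rs.)
fn collapse_keep_trailing(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut last_space = false;
    for ch in text.chars() {
        if ch.is_whitespace() {
            last_space = true;
        } else {
            // Emit a single separator, but never at the very start.
            if last_space && !out.is_empty() {
                out.push(' ');
            }
            out.push(ch);
            last_space = false;
        }
    }
    // A full trim would stop here; instead keep one trailing space.
    if last_space && !out.is_empty() {
        out.push(' ');
    }
    out
}

fn main() {
    assert_eq!(collapse_keep_trailing("  cell   text  "), "cell text ");
    assert_eq!(collapse_keep_trailing("cell text"), "cell text");
    assert_eq!(collapse_keep_trailing("   "), "");
}
```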

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor Caller
    participant Wrap as wrap_preserving_code
    participant Helpers as merge/flush helpers
    participant Lines as Vec<String>

    Caller->>Wrap: call(tokens, width)
    rect rgba(230,240,255,0.9)
      note right of Wrap: iterate tokens, build `current` buffer
      alt token is "`" (code span)
        Wrap->>Helpers: merge_code_span(tokens, i, width)
        Helpers-->>Wrap: new index, adjusted width
        Wrap->>Wrap: append whole code-span (keep punctuation glued)
      else token normal
        Wrap->>Wrap: append token to current
      end
      alt overflow or explicit flush
        Wrap->>Helpers: flush_trailing_whitespace / flush_current
        Helpers->>Lines: push(mem::take(&mut current) [+ optional whitespace token])
        Helpers-->>Wrap: current (reused capacity)
      end
    end
    Wrap-->>Caller: return lines (preserving trailing spaces)
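
The flow in the diagram can be approximated with a much smaller greedy wrapper. The sketch below is an illustration only: tokenize and wrap_keep_trailing are hypothetical names, code-span merging is omitted, width is counted in chars, and mid-wrap lines are trimmed while the final line is pushed as-is (an assumption; the real wrap_preserving_code may treat mid-wrap whitespace differently).

```rust
/// Split text into alternating runs of whitespace and non-whitespace.
fn tokenize(text: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    let mut start = 0;
    let mut prev_ws: Option<bool> = None;
    for (i, ch) in text.char_indices() {
        let ws = ch.is_whitespace();
        // A change of run type closes the previous token.
        if matches!(prev_ws, Some(p) if p != ws) {
            tokens.push(&text[start..i]);
            start = i;
        }
        prev_ws = Some(ws);
    }
    if start < text.len() {
        tokens.push(&text[start..]);
    }
    tokens
}

/// Greedy wrap: break before a word that would overflow `width`,
/// never split a word, and keep trailing whitespace on the final line.
fn wrap_keep_trailing(text: &str, width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for token in tokenize(text) {
        let is_ws = token.chars().all(char::is_whitespace);
        let overflow = current.chars().count() + token.chars().count() > width;
        if !is_ws && overflow && !current.trim_end().is_empty() {
            // Mid-wrap flush (assumption: inter-word whitespace is dropped).
            lines.push(current.trim_end().to_string());
            current.clear();
        }
        current.push_str(token);
    }
    if !current.is_empty() {
        // Final flush: push the buffer untouched, trailing spaces included.
        lines.push(current);
    }
    lines
}

fn main() {
    assert_eq!(wrap_keep_trailing("alpha beta  ", 5), ["alpha", "beta  "]);
    // A single over-long word is flushed whole rather than split.
    assert_eq!(wrap_keep_trailing("aaaaaaaaaaaa", 5), ["aaaaaaaaaaaa"]);
}
```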

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

Keep the trailing spaces safe and sound,
Merge code-spans, reuse the ground.
Flush the buffer, let whitespace stay,
Tables and wraps now keep their way.
Tiny fixes, tidy array — hooray!

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
  • Out of Scope Changes Check (⚠️ Warning)
    Explanation: Flag that the PR also modifies html.rs, process.rs, reflow.rs and adds a table conversion test to preserve trailing spaces in HTML table handling, which is outside the narrowly scoped objective in issue #66 that referenced only wrap_preserving_code and is not documented in the PR description.
    Resolution: Split the table/HTML whitespace changes into a separate PR or update the description to justify and document these wider behavioural changes, and add focused tests and review notes showing their intent and impact before merging.
  • Docstring Coverage (⚠️ Warning)
    Explanation: Docstring coverage is 76.92%, which is insufficient. The required threshold is 80.00%.
    Resolution: You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (3 passed)
  • Title Check (✅ Passed)
    Explanation: Confirm that the title accurately and concisely summarises the primary change: it states that trailing spaces are preserved on the final flush, which matches the changes to wrap_preserving_code and related whitespace-preservation updates.
  • Linked Issues Check (✅ Passed)
    Explanation: Confirm that the changes implement linked issue #66: wrap_preserving_code no longer trims trailing spaces on final flush, helper functions and tests were added to exercise preserved trailing-space behaviour, and the primary objective from the issue is addressed.
  • Description Check (✅ Passed)
    Explanation: Confirm that the PR description relates to the changeset by describing the stop-trimming change to wrap_preserving_code, listing test commands, and referencing the closed issue, thereby satisfying the lenient requirement.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a09c9cc and 9858859.

📒 Files selected for processing (5)
  • src/html.rs (2 hunks)
  • src/process.rs (3 hunks)
  • src/reflow.rs (1 hunks)
  • src/wrap.rs (5 hunks)
  • tests/table/convert_html.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Use concat!() to combine long string literals rather than escaping newlines with a backslash.
Prefer semantic error enums: Derive std::error::Error (via the thiserror crate) for any condition the caller might inspect, retry, or map to an HTTP status.
Use an opaque error only at the app boundary: Use eyre::Report for human-readable logs; these should not be exposed in public APIs.
Never export the opaque type from a library: Convert to domain enums at API boundaries, and to eyre only in the main main() entrypoint or top-level async task.

Files:

  • src/reflow.rs
  • src/html.rs
  • tests/table/convert_html.rs
  • src/process.rs
  • src/wrap.rs

⚙️ CodeRabbit configuration file

**/*.rs:

  • Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.

  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
  • Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such

    • For testing of functionality depending upon environment variables, dependency injection and the mockable crate are the preferred option.
    • If mockable cannot be used, env mutations in tests ...

Files:

  • src/reflow.rs
  • src/html.rs
  • tests/table/convert_html.rs
  • src/process.rs
  • src/wrap.rs
🧬 Code graph analysis (1)
tests/table/convert_html.rs (1)
src/html.rs (1)
  • convert_html_tables (303-346)
🔍 Remote MCP Ref

Summary of additional relevant facts for reviewing PR #211

  • Files changed and key edits:

    • src/wrap.rs — adds helpers merge_code_span(tokens,i,width) -> usize, flush_current(lines,current) (uses mem::take to reuse allocation), and flush_trailing_whitespace(lines,current,token). wrap_preserving_code updated to use these helpers; final-flush path now pushes current as-is (preserving trailing spaces) and reuses allocation instead of trim_end() + to_string(). Behavior: trailing spaces preserved on mid-wrap and final flush; code-span punctuation handling preserved via merge_code_span and extend_punctuation.
    • src/wrap/tests.rs — tests updated and added:
      • preserves_trailing_spaces (rstest with cases: "trail " width 80; "code span " width 12; "foo " width 3; "x " width 1).
      • no_split_forced_flush_no_trim (cases: "aaaaaaaaaaaa" w=5; "abcde" w=3; "codespan" w=6) asserting forced flush does not split nor trim trailing spaces.
    • src/html.rs, src/process.rs, src/reflow.rs — several whitespace-preservation changes:
      • src/html.rs: node_text no longer fully trims end; appends single space when last_space true and uses trim_start() only; table_lines_to_markdown no longer trims per-line trailing spaces (uses join as-is).
      • src/process.rs: handle_table_line now pushes line.to_string() instead of line.trim_end().to_string() in three buffering locations (preserves trailing spaces in table buffering).
      • src/reflow.rs: collect_cells no longer trims trailing whitespace before appending ROW_END sentinel; ROW_END appended after original chunk (preserves trailing whitespace into split_cells).
  • Tests added/changed to cover behavior:

    • New unit test preserves_trailing_spaces_in_cells in tests/table/convert_html.rs verifying HTML->Markdown table conversion preserves trailing spaces in cells.
    • wrap tests (above) cover both normal final flush and forced-flush (small width) behaviors to ensure no trimming and no unintended splitting.
  • Behavior and impact notes for review:

    • Purpose: fix the inconsistency noted in issue #66 ("Inconsistent trailing space handling in wrap_preserving_code function") by preserving trailing spaces on final flush to match loop behavior; the PR closes #66. No public API/signature changes.
    • Memory behavior: use of mem::take(&mut current) in flush_current reuses capacity and leaves current emptied; ensure no code assumes the previous buffer contents after mem::take (no visible outward API, but verify tests and other callsites).
    • Potential ripple effects: several modules now preserve trailing spaces earlier trimmed (html parsing, table buffering, reflow). Reviewer should verify downstream behavior (table formatting, split_cells/split_into_rows consumers, output diffs) and confirm tests cover these interactions (new test for convert_html_tables added).

Tools/sources used

  • Repository source files read: raw src files (wrap.rs, wrap/tests.rs, html.rs, process.rs, reflow.rs, tests/table/convert_html.rs) via repository raw URL reads.
  • Documentation search attempted for wrap_preserving_code references.
🔇 Additional comments (12)
src/process.rs (1)

91-92: Stop trimming table lines — correct alignment with whitespace‑preservation goal.

Switching to line.to_string() in all table‑buffering paths keeps per‑line trailing spaces intact for reflow_table.

Also applies to: 101-102, 115-116

src/html.rs (2)

32-36: Preserve a single trailing space from HTML text nodes — good.

Appending one space when last_space is true and only trimming the start matches the PR’s intent without over‑collapsing.


184-184: Join HTML lines without trimming — correct.

Joining with lines.join("\n") keeps trailing spaces per source line for the parser.

tests/table/convert_html.rs (1)

65-79: Add trailing‑space preservation test — good coverage.

The case asserts the end‑to‑end preservation from HTML → Markdown table output.

src/wrap.rs (7)

101-106: Capacity‑preserving flush — good.

Reusing capacity via std::mem::take avoids thrash on long wraps.


108-115: Whitespace‑only flush path — good precondition.

debug_assert! on whitespace tokens plus delegating to the shared flush keeps this path safe and DRY.


166-192: Mid‑wrap trimming vs end‑wrap preservation — correct split.

Discarding trailing spaces on mid‑wrap lines while preserving them at paragraph end matches Markdown hard‑break semantics.


195-204: Final‑token whitespace flush — handles “forced flush” without trimming.

Flushing the accumulated buffer including the last whitespace token fixes the inconsistency called out in #66.


206-208: Use shared flush helper — avoids duplicate capacity logic.

Calling flush_current here keeps the implementation consistent.


223-226: Final flush preserves trailing spaces — aligns with PR objective.

Pushing current as‑is ensures hard breaks survive wrapping.
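
For context on why trimming matters here: in CommonMark, two or more spaces at the end of a line (when it is not the last line of a block) mark a hard line break. The sketch below uses hypothetical helper names trailing_spaces and has_hard_break_marker to show that a trimmed line loses that marker; it is not part of the PR.

```rust
/// Count the literal spaces at the end of a line.
fn trailing_spaces(line: &str) -> usize {
    line.len() - line.trim_end_matches(' ').len()
}

/// True if the line ends with the two-or-more-space hard-break marker.
fn has_hard_break_marker(line: &str) -> bool {
    trailing_spaces(line) >= 2
}

fn main() {
    let wrapped = vec![String::from("first line  "), String::from("second line")];
    assert!(has_hard_break_marker(&wrapped[0]));
    assert!(!has_hard_break_marker(&wrapped[1]));

    // Trimming on flush would silently remove the marker.
    assert!(!has_hard_break_marker(wrapped[0].trim_end()));
}
```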


81-99: Merge inline‑code spans with punctuation — confirm multi‑backtick behaviour

Confirm that multi‑backtick fences (e.g. ``code``, ```code```) are tokenised as single tokens and not split or mis‑measured by merge_code_span; tests reference double‑backtick spans but the sandbox test run failed.

rg -nP '\x60{2,}code\x60{2,}' -g 'src/**' -g 'tests/**' || true
sed -n '200,230p' src/textproc.rs
cargo test -- --nocapture wrap_preserving_code

Inspect src/wrap.rs (lines ~81–99) and src/textproc.rs (around lines 211–216); run the commands locally/CI and paste the test output.

src/reflow.rs (1)

29-29: Preserve trailing spaces on cell collection — LGTM; verify no off-by-one.
Verify that appending the sentinel with a leading space (ch.push_str(" |ROW_END|") in src/reflow.rs) preserves trailing whitespace and does not inflate trailing-space counts in cells.
Run the following test locally and paste the output:

#!/bin/bash
set -euo pipefail
cargo test -q -- --exact preserves_trailing_spaces_in_cells
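
The off-by-one concern above can be made concrete with a small sketch. Only the literal " |ROW_END|" comes from the review comment; the constant name SENTINEL and the helpers mark_row_end and strip_row_end are hypothetical, and how the real split_cells consumes the sentinel is not shown in this thread.

```rust
/// Sentinel text as quoted in the review comment (note the leading space).
const SENTINEL: &str = " |ROW_END|";

/// Append the sentinel without trimming the chunk first.
fn mark_row_end(chunk: &str) -> String {
    let mut ch = String::with_capacity(chunk.len() + SENTINEL.len());
    ch.push_str(chunk);
    ch.push_str(SENTINEL);
    ch
}

/// Recover the chunk by removing the full sentinel, leading space included.
fn strip_row_end(marked: &str) -> Option<&str> {
    marked.strip_suffix(SENTINEL)
}

fn main() {
    // The chunk ends with two spaces; the sentinel contributes a third.
    let marked = mark_row_end("| a  ");
    assert_eq!(marked, "| a   |ROW_END|");

    // Stripping the whole sentinel restores exactly two trailing spaces.
    assert_eq!(strip_row_end(&marked), Some("| a  "));
}
```

A consumer that removes only "|ROW_END|" and not the leading space would see three trailing spaces instead of two, which is the inflation the comment asks to rule out.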

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@sourcery-ai sourcery-ai Bot left a comment


Hey there - I've reviewed your changes and found some issues that need to be addressed.



@leynos
Owner Author

leynos commented Sep 12, 2025

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented Sep 12, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/wrap.rs (1)

383-385: Add regression tests to lock in trailing‑space behaviour.

Cover:

  • Final flush preserves trailing spaces.
  • No‑split overflow flush does not trim.

Sample tests using rstest:

use super::wrap_preserving_code;
use rstest::rstest;

#[rstest]
#[case("trail  ", 80, &["trail  "])]
#[case("`code span`  ", 12, &["`code span`  "])]
fn preserves_trailing_spaces(#[case] input: &str, #[case] width: usize, #[case] expected: &[&str]) {
    let out = wrap_preserving_code(input, width);
    assert_eq!(out, expected.iter().map(|s| s.to_string()).collect::<Vec<_>>());
}

#[rstest]
#[case("aaaaaaaaaaaa", 5, &["aaaaaaaaaaaa"])] // forced flush without split
fn no_split_forced_flush_no_trim(#[case] input: &str, #[case] width: usize, #[case] expected: &[&str]) {
    let out = wrap_preserving_code(input, width);
    assert_eq!(out, expected.iter().map(|s| s.to_string()).collect::<Vec<_>>());
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e7a3dd and af32d7e.

📒 Files selected for processing (1)
  • src/wrap.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

Files:

  • src/wrap.rs
🔍 Remote MCP Ref

Additional context relevant to reviewing PR #211

  • PR metadata — "Preserve trailing spaces on final flush" (PR #211) by leynos; branch codex/fix-inconsistency-in-wrap_preserving_code → main; closes issue #66; testing commands: make fmt, make lint, make test.

  • Code change (high level) — src/wrap.rs: when flushing due to width the code now pushes the whole current segment by using std::mem::take(&mut current) instead of pushing a trimmed/cloned string and calling current.clear(). This preserves trailing whitespace on mid-wrap flushes; the final flush likewise stops using trim_end() and pushes current as-is.

  • Motivating issue — fixes inconsistency reported in issue #66 where the final flush used trim_end() (losing trailing spaces) while the earlier loop did not. The PR aligns final-flush behavior with the loop.

  • Related changes to review for overlap/regressions — related PRs/issues (#193, #62) also touch wrap_preserving_code and trailing-space handling; check for merge conflicts or duplicated fixes and verify the intended behavior is consistent across all code paths.

  • Suggested review checklist (concrete items)

    • Confirm the exact hunk in src/wrap.rs (ownership move via mem::take) matches the description and contains no accidental logic changes beyond whitespace handling.
    • Verify unit tests exist (or add) that assert preservation of trailing spaces for both mid-wrap flush and final flush; run make test / CI.
    • Run make fmt and make lint locally or confirm CI passes to ensure style/lint rules are satisfied.
    • Review for unintended downstream effects where callers rely on trimmed output (search for other uses of wrap_preserving_code or expectations about trailing-space removal).
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Sourcery review
🔇 Additional comments (3)
src/wrap.rs (3)

159-161: Stop trimming on no‑split flush: correct and aligned with the objective.

Pushing the moved current via std::mem::take preserves trailing whitespace and avoids clone+clear. LGTM.


176-179: Final flush preserves trailing spaces.

This matches the requirement to keep Markdown hard breaks intact.


149-154: Preserve trailing spaces on the other forced‑flush or document the exception.

Preserve trailing spaces in this branch to match the other forced‑flush path, or add a short comment explaining why trim_end() is intentional.

File: src/wrap.rs — lines 149–154

-            if current_width > width {
-                lines.push(current.trim_end().to_string());
-                current.clear();
-                current_width = 0;
-                last_split = None;
-            }
+            if current_width > width {
+                let cap = current.capacity();
+                lines.push(std::mem::take(&mut current));
+                current = String::with_capacity(cap);
+                current_width = 0;
+                last_split = None;
+            }
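
The capacity handling in the suggested diff is worth spelling out. std::mem::take replaces the value with Default::default(), which for String is an empty string with no allocation, so the original buffer and its capacity move into the pushed line. The sketch below wraps the diff's three steps in a hypothetical helper, take_with_capacity, to show the effect.

```rust
use std::mem;

/// Move the buffer out and leave a fresh, empty one with the same capacity.
/// (Hypothetical helper mirroring the three lines in the suggested diff.)
fn take_with_capacity(current: &mut String) -> String {
    let cap = current.capacity();
    let line = mem::take(current);
    *current = String::with_capacity(cap);
    line
}

fn main() {
    let mut current = String::with_capacity(64);
    current.push_str("some words  ");
    let cap = current.capacity();

    // Plain take: the allocation travels with the returned value.
    let line = mem::take(&mut current);
    assert_eq!(line, "some words  ");
    assert_eq!(line.capacity(), cap);
    assert_eq!(current.capacity(), 0);

    // With the helper, the buffer is re-reserved after the move.
    current.push_str("next line ");
    let next = take_with_capacity(&mut current);
    assert_eq!(next, "next line ");
    assert!(current.is_empty());
}
```

Note that String::with_capacity allocates a new buffer rather than recycling the old one, so this pattern avoids repeated growth of current but still costs one allocation per flushed line.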

Comment thread src/wrap.rs
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af32d7e and 3df0c18.

📒 Files selected for processing (2)
  • src/wrap.rs (2 hunks)
  • src/wrap/tests.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

Files:

  • src/wrap/tests.rs
  • src/wrap.rs
🧬 Code graph analysis (2)
src/wrap/tests.rs (1)
src/wrap.rs (1)
  • wrap_preserving_code (88-195)
src/wrap.rs (1)
src/html.rs (1)
  • lines (181-184)
🔍 Remote MCP Ref

Summary of additional relevant facts for reviewing PR #211 (preserve trailing spaces)

  • File changes (branch codex/fix-inconsistency-in-wrap_preserving_code):

    • src/wrap.rs: adds a private helper flush_trailing_whitespace(lines, current, token); replaces mid-wrap/current final flush logic to use std::mem::take(&mut current) and reinitialize current with previous capacity instead of current.clear(); final flush now pushes current as-is (no trim_end), preserving trailing spaces.
  • Tests:

    • src/wrap/tests.rs: imports wrap_preserving_code directly; renames/adjusts preserves_trailing_spaces test cases to assert preserved trailing spaces (e.g., "trail " and "code span "); adds new test no_split_forced_flush_no_trim asserting forced flush at small width does not split and preserves trailing spaces for input "aaaaaaaaaaaa".
  • Relation to main branch:

    • Compared to main/src/wrap.rs behavior, the PR branch changes the final-flush behavior to match in-loop behavior (no trimming). The changed code is confirmed to target the same function reported in issue #66.
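The difference between the two final-flush behaviours summarised above can be sketched as follows. This is a minimal illustration, not the code in src/wrap.rs: the helper names `final_flush_trimmed` and `final_flush_preserving` are hypothetical, and the "before" version is only an assumption based on the description of the removed `trim_end()` and `trimmed.to_string()` calls.

```rust
/// Assumed shape of the old final flush: trim trailing whitespace,
/// copy the trimmed text into `lines`, then clear the buffer.
fn final_flush_trimmed(lines: &mut Vec<String>, current: &mut String) {
    let trimmed = current.trim_end();
    if !trimmed.is_empty() {
        lines.push(trimmed.to_string());
    }
    current.clear();
}

/// Assumed shape of the new final flush: move the buffer into `lines`
/// unchanged, so trailing spaces survive. `std::mem::take` leaves an
/// empty `String` behind, which replaces the explicit `clear()`.
fn final_flush_preserving(lines: &mut Vec<String>, current: &mut String) {
    if !current.is_empty() {
        lines.push(std::mem::take(current));
    }
}
```

With a buffer holding "trail" followed by two spaces, the first helper would emit "trail" and the second would emit the text with both spaces intact.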

Review checklist items supported by these findings

  • Confirm mem::take usage only moves current and does not change tokenization/width logic (verify no other logic branches altered). [source: src/wrap.rs change above]
  • Ensure added tests cover both mid-wrap and final-flush scenarios (tests file shows direct cases and the new forced-flush case). Run CI/tests to validate. [source: src/wrap/tests.rs]
  • Search repo for other callers that expect trimmed output; if any exist, they may require adjustment (no callers were changed in this PR). Suggested quick grep in repo for wrap_preserving_code usages before merging.

Sources:

  • Updated wrap.rs (PR branch):
  • Tests (PR branch):
  • Current main wrap.rs for comparison:
🔇 Additional comments (2)
src/wrap/tests.rs (1)

9-9: Import the SUT explicitly

Keep the test surface tight by importing wrap_preserving_code directly. LGTM.

src/wrap.rs (1)

160-169: Approve changes — preserve trailing whitespace on final whitespace-only token; verify downstream trimming behaviour

Tests (src/wrap/tests.rs) and docs (docs/trailing-spaces.md) assert the new behaviour; wrap_preserving_code is used in src/wrap.rs. Confirm no other code assumes trimmed trailing spaces (especially HTML output).

Run the script below from the repository root and paste the output.

#!/bin/bash
set -euo pipefail

echo "Repo root: $(git rev-parse --show-toplevel 2>/dev/null || pwd)"
echo

echo "Call sites of wrap_preserving_code:"
rg -nP --hidden --no-ignore -g '!target/**' -C2 '\bwrap_preserving_code\s*\(' || git ls-files | xargs -r rg -nP -C2 '\bwrap_preserving_code\s*\(' || echo "No wrap_preserving_code matches found."

echo
echo "Search for trim usages (.trim, .trim_end, trim_end_matches):"
rg -nP --hidden --no-ignore -g '!target/**' -n -C2 '\.trim(_end(_matches)?)?\s*\(' || git ls-files | xargs -r rg -nP -C2 '\.trim(_end(_matches)?)?\s*\(' || echo "No .trim* matches found."

echo
echo "Show src/html.rs if present (inspect trimming for HTML output):"
if [ -f src/html.rs ]; then
  nl -ba src/html.rs | sed -n '1,240p'
else
  git ls-files | rg 'html\.rs' || echo "No src/html.rs found."
fi

Comment thread src/wrap.rs
Comment thread src/wrap.rs
Comment thread src/wrap/tests.rs
Comment thread src/wrap/tests.rs
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/wrap.rs (1)

112-221: Fix trailing-space inconsistency; ensure preserved trailing spaces survive HTML rendering

  • Document that mid-wrap lines are intentionally trimmed in src/wrap.rs (lines 165–168 and 180–181).
  • Prevent the HTML renderer from discarding preserved trailing spaces: remove or conditionalise the .map(|l| l.trim_end()) in src/html.rs (around line 183), or explicitly convert trailing spaces to <br> at render time.
  • Audit other trim_end() sites that could drop authorial hard breaks: src/reflow.rs:29, src/process.rs:91–116, src/lists.rs:129, src/breaks.rs:55, src/footnotes.rs:102.
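One way to conditionalise the renderer's trimming, as suggested in the second point above, is sketched below. It is an assumption about what "conditionalise" could mean, not the code in src/html.rs: the helpers `has_hard_break` and `trim_unless_hard_break` are hypothetical, and the rule used (two or more trailing spaces after some content mark a hard break) follows the usual Markdown convention.

```rust
/// Returns true when `line` has content followed by two or more trailing
/// spaces, which Markdown treats as a hard line break.
fn has_hard_break(line: &str) -> bool {
    let content = line.trim_end_matches(' ');
    !content.is_empty() && line.len() - content.len() >= 2
}

/// Trims trailing whitespace unless doing so would discard a hard break.
fn trim_unless_hard_break(line: &str) -> &str {
    if has_hard_break(line) {
        line
    } else {
        line.trim_end()
    }
}
```

A call such as `text.lines().map(trim_unless_hard_break)` could then stand in for an unconditional `.map(|l| l.trim_end())`, keeping ordinary lines tidy while leaving authorial hard breaks alone.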
♻️ Duplicate comments (1)
src/wrap.rs (1)

103-110: Whitespace‑only trailing token flush: LGTM (precondition enforced)

The debug_assert and delegation to flush_current are spot on.
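A rough sketch of the pattern being approved here is given below. The helper names come from the review, but the bodies are assumptions: in particular, appending the whitespace token to the buffer before flushing is a guess at the behaviour, and the real signatures in src/wrap.rs may differ.

```rust
/// Simplified stand-in for the PR's `flush_current`: moves the buffer
/// into `lines` unchanged and leaves an empty buffer behind.
fn flush_current(lines: &mut Vec<String>, current: &mut String) {
    if !current.is_empty() {
        lines.push(std::mem::take(current));
    }
}

/// Assumed behaviour: keep a trailing whitespace-only token on the line
/// and then flush. The precondition is checked in debug builds only.
fn flush_trailing_whitespace(lines: &mut Vec<String>, current: &mut String, token: &str) {
    debug_assert!(
        token.chars().all(char::is_whitespace),
        "token must contain only whitespace"
    );
    current.push_str(token);
    flush_current(lines, current);
}
```

Because `debug_assert!` compiles away in release builds, the precondition costs nothing in production while still catching misuse during tests.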

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3df0c18 and a09c9cc.

📒 Files selected for processing (2)
  • src/wrap.rs (3 hunks)
  • src/wrap/tests.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Use concat!() to combine long string literals rather than escaping newlines with a backslash.
Prefer semantic error enums: Derive std::error::Error (via the thiserror crate) for any condition the caller might inspect, retry, or map to an HTTP status.
Use an opaque error only at the app boundary: Use eyre::Report for human-readable logs; these should not be exposed in public APIs.
Never export the opaque type from a library: Convert to domain enums at API boundaries, and to eyre only in the main main() entrypoint or top-level async task.

Files:

  • src/wrap.rs
  • src/wrap/tests.rs

🧬 Code graph analysis (2)
src/wrap.rs (1)
src/html.rs (1)
  • lines (181-184)
src/wrap/tests.rs (1)
src/wrap.rs (1)
  • wrap_preserving_code (112-221)
🔍 Remote MCP Ref

Summary of additional relevant facts for reviewing PR #211 (Preserve trailing spaces on final flush)

  • Core change: wrap_preserving_code no longer trims trailing spaces on final flush; it now pushes current as-is and preserves trailing whitespace behavior that the loop already maintained. This is implemented by replacing the final trim+push with a direct push and by introducing helpers (merge_code_span, flush_current, flush_trailing_whitespace) and mem::take usage to reuse allocations.

  • Tests added/updated:

    • preserves_trailing_spaces: expanded cases to include inputs with trailing spaces and code spans (e.g., "trail ", "code span ") verifying preservation at widths that won’t force splits.
    • no_split_forced_flush_no_trim: new test asserting forced flush at small width does not split lines and does not trim trailing spaces for inputs like "aaaaaaaaaaaa" (w=5), "abcde" (w=3), "codespan" (w=6). These exercise both mid-wrap and final-flush behavior.
  • Memory/allocation change: use of std::mem::take(&mut current) in flush_current to reuse allocated capacity instead of trimming/clearing, reducing allocation churn while preserving contents pushed to lines. Verify this does not accidentally expose moved-from buffers.

  • No public API changes: all changes are internal/private. No exported signatures altered.

  • Related/closing references:

    • This PR closes issue #66 (inconsistent trailing-space handling) and is related to PR #193 which also adjusted final-flush behavior. Confirm no other call-sites expect trimmed output.

Notes about retrieval attempts

  • Attempted to read PR page and search repository for related code/details; the retrieval tools returned rate-limit/access errors during the attempt. Useful context above was drawn from the provided PR context and previous research summary; the tool calls attempted are listed here for traceability:
🔇 Additional comments (4)
src/wrap.rs (1)

96-101: Capacity‑preserving flush: LGTM

Reuse of allocation via mem::take + with_capacity is correct and addresses prior capacity thrash.
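The capacity-preserving flush can be sketched as below. This is an illustration under assumptions rather than the PR's code: the body of `flush_current` is inferred from the phrase "mem::take + with_capacity", and the real helper may differ in detail.

```rust
/// Moves the buffered text into `lines` without copying, then replaces
/// the buffer with an empty `String` of at least the same capacity.
fn flush_current(lines: &mut Vec<String>, current: &mut String) {
    if current.is_empty() {
        return;
    }
    let capacity = current.capacity();
    // `take` hands the existing allocation to `lines`; `current` is now
    // an empty, unallocated `String`.
    lines.push(std::mem::take(current));
    // Pre-size the replacement so later pushes do not regrow from zero.
    *current = String::with_capacity(capacity);
}
```

Strictly speaking, the original allocation travels with the pushed line and a fresh one is reserved for the buffer; the benefit is that the buffer does not have to grow step by step again on the next line.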

src/wrap/tests.rs (3)

117-127: Strengthen final‑flush coverage: LGTM

Cases exercise wide and narrow widths and confirm trailing spaces survive.


129-142: Over‑width token behaviour: LGTM

Asserts no splitting or trimming for long tokens and code spans. Good guard against regressions.
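The behaviour these tests guard can be pictured with a toy greedy wrapper. Everything below is hypothetical: `split_runs` and `toy_wrap` are not the project's functions, code spans are ignored, and dropping separator whitespace at a wrap point is a simplifying assumption. The sketch only shows that an over-width token stays on one line and that the final flush keeps trailing spaces.

```rust
/// Splits `text` into alternating whitespace and non-whitespace runs.
fn split_runs(text: &str) -> Vec<&str> {
    let mut runs = Vec::new();
    let mut start = 0;
    let mut previous: Option<bool> = None;
    for (index, ch) in text.char_indices() {
        let is_ws = ch.is_whitespace();
        if previous.is_some() && previous != Some(is_ws) {
            runs.push(&text[start..index]);
            start = index;
        }
        previous = Some(is_ws);
    }
    if start < text.len() {
        runs.push(&text[start..]);
    }
    runs
}

/// Toy wrapper: never splits a token, never trims on the final flush.
fn toy_wrap(text: &str, width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for run in split_runs(text) {
        let is_ws = run.chars().all(char::is_whitespace);
        let fits = current.chars().count() + run.chars().count() <= width;
        if !fits && !is_ws && !current.trim_end().is_empty() {
            // Wrap point: emit the line without its separator whitespace
            // (a simplification made for this sketch).
            lines.push(current.trim_end().to_string());
            current.clear();
        }
        current.push_str(run);
    }
    if !current.is_empty() {
        // Final flush: push the buffer as-is, trailing spaces included.
        lines.push(current);
    }
    lines
}
```

For instance, `toy_wrap("aaaaaaaaaaaa", 5)` returns the single unsplit token, and `toy_wrap("abcde ", 3)` returns one line that still ends with its trailing space.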


9-9: Avoid wildcard import in tests

Import only what’s used to keep lints quiet.

-use super::super::*;
+use super::wrap_text;
 use super::wrap_preserving_code;

Likely an incorrect or invalid review comment.

Comment thread src/wrap.rs
@leynos leynos merged commit 02cb3a9 into main Sep 14, 2025
3 checks passed
@leynos leynos deleted the codex/fix-inconsistency-in-wrap_preserving_code branch September 14, 2025 19:57
Development

Successfully merging this pull request may close these issues.

Inconsistent trailing space handling in wrap_preserving_code function

1 participant