Context
The current implementation for text wrapping and prefix handling involves custom tokenization and manual logic, which increases code complexity and maintenance burden. The suggestion is to replace this with the textwrap and unicode-width crates, which are well-tested and handle Unicode, word boundaries, and hyphens more robustly.
Proposed Improvements
-
Replace Custom Tokenization and Wrapping:
- Remove the
tokenize_markdown and wrap_preserving_code custom logic.
- Use
textwrap's wrap function with appropriate options (e.g., break_words(false) and UnicodeBreakWordSeparator) to handle word boundaries and Unicode correctly.
-
Consolidate Prefix Handling:
- Merge the logic of
append_wrapped_with_prefix and handle_prefix_line into a single helper function (e.g., wrap_with_prefix).
- Compute the available width once, wrap the text, and then prepend the prefix or indent as needed for each line.
-
Simplify Main Loop:
- In the main wrapping function (e.g.,
wrap_text), call the new wrap_with_prefix helper instead of the old manual functions.
- Remove all manual tokenization and buffer-flushing code.
Benefits
- Reduces code size from ~400 lines to ~50 lines.
- Leverages well-tested libraries for Unicode and word handling.
- Maintains all existing test cases unchanged.
Suggested Action Items
Example Implementation
See the following code snippets for reference:
use textwrap::{wrap, Options};
fn wrap_preserving_code(text: &str, width: usize) -> Vec<String> {
let opts = Options::new(width)
.break_words(false)
.word_separator(textwrap::word_separator::UnicodeBreakWordSeparator::new());
wrap(text, &opts)
.into_iter()
.map(String::from)
.collect()
}
use unicode_width::UnicodeWidthStr;
fn wrap_with_prefix(
out: &mut Vec<String>,
prefix: &str,
text: &str,
width: usize,
repeat_prefix: bool,
) {
let pw = UnicodeWidthStr::width(prefix);
let avail = width.saturating_sub(pw).max(1);
let lines = wrap_preserving_code(text, avail);
for (i, line) in lines.into_iter().enumerate() {
if i == 0 {
out.push(format!("{prefix}{line}"));
} else {
let indent = if repeat_prefix {
prefix.to_string()
} else {
" ".repeat(pw)
};
out.push(format!("{indent}{line}"));
}
}
}
This refactor will greatly simplify the codebase and improve maintainability. Please consider implementing these changes.
I created this issue for @leynos from #79 (comment).
Tips and commands
Getting Help
Context
The current implementation for text wrapping and prefix handling involves custom tokenization and manual logic, which increases code complexity and maintenance burden. The suggestion is to replace this with the
textwrapandunicode-widthcrates, which are well-tested and handle Unicode, word boundaries, and hyphens more robustly.Proposed Improvements
Replace Custom Tokenization and Wrapping:
tokenize_markdownandwrap_preserving_codecustom logic.textwrap'swrapfunction with appropriate options (e.g.,break_words(false)andUnicodeBreakWordSeparator) to handle word boundaries and Unicode correctly.Consolidate Prefix Handling:
append_wrapped_with_prefixandhandle_prefix_lineinto a single helper function (e.g.,wrap_with_prefix).Simplify Main Loop:
wrap_text), call the newwrap_with_prefixhelper instead of the old manual functions.Benefits
Suggested Action Items
textwrapandunicode-widthfor wrapping and prefix handling.Example Implementation
See the following code snippets for reference:
This refactor will greatly simplify the codebase and improve maintainability. Please consider implementing these changes.
I created this issue for @leynos from #79 (comment).
Tips and commands
Getting Help