Merged
23 changes: 13 additions & 10 deletions README.md
@@ -4,8 +4,8 @@
 uniform width. It can wrap paragraphs and list items to 80 columns when the
 `--wrap` option is used. Hyphenated words are treated as single units during
 wrapping, so `very-long-word` moves to the next line rather than splitting at
-the hyphen. The tool ignores fenced code blocks and respects escaped pipes
-(`\|`), making it safe for mixed content.
+the hyphen. The tool ignores fenced code blocks and respects escaped pipes (`\|
+`), making it safe for mixed content.

 ## Installation

@@ -28,10 +28,13 @@ mdtablefix [--wrap] [--renumber] [--breaks] [--in-place] [FILE...]
 - With file paths provided, the corrected tables are printed to stdout.
 - Use `--wrap` to also reflow paragraphs and list items to 80 columns.
 - Use `--renumber` to rewrite ordered lists with sequential numbering.
+- Tabs are interpreted as four spaces when counting indentation for
+`--renumber`.
 - Use `--breaks` to normalize thematic breaks to a line of 70 underscores
 (configurable via the `THEMATIC_BREAK_LEN` constant).
 - Use `--in-place` to overwrite files.
-- If no files are supplied, input is read from stdin and results are written to stdout.
+- If no files are supplied, input is read from stdin and results are written
+to stdout.

 ### Example

@@ -76,14 +79,14 @@ fn main() -> std::io::Result<()> {

 ## HTML table support

-`mdtablefix` recognises simple `<table>` elements embedded in Markdown.
-Before the main table reflow runs these HTML tables are converted to Markdown in
-a preprocessing stage handled by `convert_html_tables`.
+`mdtablefix` recognises simple `<table>` elements embedded in Markdown. Before
+the main table reflow runs these HTML tables are converted to Markdown in a
+preprocessing stage handled by `convert_html_tables`.

 Only basic tables composed of `<tr>`, `<th>` and `<td>` tags are detected, and
 attributes or tag casing do not matter. After conversion the regular reflow
-logic aligns them alongside Markdown tables. See
-[`docs/html-table-support.md`](docs/html-table-support.md) for details.
+logic aligns them alongside Markdown tables. See [`docs/html-table-support.md`]
+(docs/html-table-support.md) for details.

 ## Testing

@@ -92,5 +95,5 @@ is organised using the [`rstest`](https://crates.io/crates/rstest) crate.

 ## License

-This project is licensed under the ISC license.
-See the [LICENSE](LICENSE) file for details.
+This project is licensed under the ISC license. See the [LICENSE](LICENSE) file
+for details.
Comment on lines +98 to +99
Contributor

@coderabbitai coderabbitai Bot Jul 15, 2025


🧹 Nitpick (assertive)

Correct licence spelling for British English.

Apply the British English spelling correction:

-This project is licensed under the ISC license. See the [LICENSE](LICENSE) file
+This project is licensed under the ISC licence. See the [LICENCE](LICENCE) file
 for details.
🧰 Tools
🪛 LanguageTool

[locale-violation] ~98-~98: license must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ... This project is licensed under the ISC license. See the LICENSE file for de...

(LICENCE_LICENSE_NOUN_SINGULAR)


[locale-violation] ~98-~98: LICENSE must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ...icensed under the ISC license. See the LICENSE file for details.

(LICENCE_LICENSE_NOUN_SINGULAR)

🤖 Prompt for AI Agents
In README.md around lines 98 to 99, the word "license" should be corrected to
the British English spelling "licence". Update the text to use "licence" instead
of "license" to reflect the correct British English spelling.

Owner Author


"LICENSE" is used here for community consistency

Contributor


@leynos, understood. The LICENSE filename is indeed a standard convention in the open source community, and maintaining this consistency makes perfect sense. I'll keep that in mind for future reviews.


✏️ Learnings added
Learnt from: leynos
PR: leynos/mdtablefix#69
File: README.md:98-99
Timestamp: 2025-07-15T07:15:47.096Z
Learning: The LICENSE filename should be kept as-is for community consistency in open source projects, even when using British English spelling elsewhere in documentation.


85 changes: 66 additions & 19 deletions src/lib.rs
@@ -226,6 +226,34 @@ static BULLET_RE: std::sync::LazyLock<Regex> =
 static NUMBERED_RE: std::sync::LazyLock<Regex> =
     std::sync::LazyLock::new(|| Regex::new(r"^(\s*)([1-9][0-9]*)\.(\s+)(.*)").unwrap());

+/// Parses a line beginning with a numbered list marker.
+///
+/// Returns the indentation prefix, separator following the number, and the
+/// remainder of the line if `line` matches the numbered list pattern.
+#[doc(hidden)]
+fn parse_numbered(line: &str) -> Option<(&str, &str, &str)> {
+    let cap = NUMBERED_RE.captures(line)?;
+    let indent = cap.get(1)?.as_str();
+    let sep = cap.get(3)?.as_str();
+    let rest = cap.get(4)?.as_str();
+    Some((indent, sep, rest))
+}
+
+/// Returns the effective indentation length treating tabs as four spaces.
+#[doc(hidden)]
+fn indent_len(indent: &str) -> usize {
+    indent
+        .chars()
+        .fold(0, |acc, ch| acc + if ch == '\t' { 4 } else { 1 })
+}
+
+#[doc(hidden)]
+fn drop_deeper(indent: usize, counters: &mut Vec<(usize, usize)>) {
+    while counters.last().is_some_and(|(d, _)| *d > indent) {
+        counters.pop();
+    }
+}
+
 fn tokenize_markdown(text: &str) -> Vec<String> {
     let mut tokens = Vec::new();
     let chars: Vec<char> = text.chars().collect();
@@ -595,6 +623,27 @@ fn process_stream_inner(lines: &[String], wrap: bool) -> Vec<String> {
 /// Lines matching `^\s*[1-9][0-9]*\.\s+` are renumbered sequentially within
 /// their indentation level. Numbering continues across fenced code blocks
 /// without resetting.
+///
+/// # Examples
+/// ```
+/// use mdtablefix::renumber_lists;
+///
+/// let lines = vec!["1. foo", "4. bar"]
+///     .into_iter()
+///     .map(str::to_string)
+///     .collect::<Vec<_>>();
+/// assert_eq!(
+///     renumber_lists(&lines),
+///     vec!["1. foo", "2. bar"]
+///         .into_iter()
+///         .map(str::to_string)
+///         .collect::<Vec<_>>()
+/// );
+/// ```
+///
+/// # Panics
+/// Panics if the internal counter stack is empty when a numbered line is
+/// encountered. This indicates a logic error.
 pub fn renumber_lists(lines: &[String]) -> Vec<String> {
     let mut out = Vec::with_capacity(lines.len());
     let mut counters: Vec<(usize, usize)> = Vec::new();
@@ -612,28 +661,26 @@ pub fn renumber_lists(lines: &[String]) -> Vec<String> {
             continue;
         }

-        if let Some(cap) = NUMBERED_RE.captures(line) {
-            let indent = cap.get(1).map_or("", |m| m.as_str());
-            let indent_len = indent.len();
-            while counters.last().is_some_and(|(i, _)| *i > indent_len) {
-                counters.pop();
-            }
-            if counters.last().is_none_or(|(i, _)| *i < indent_len) {
-                counters.push((indent_len, 1));
-            }
-            let idx = counters.len() - 1;
-            let num = counters[idx].1;
-            counters[idx].1 += 1;
-            let spaces = cap.get(3).map_or("", |m| m.as_str());
-            let rest = cap.get(4).map_or("", |m| m.as_str());
-            out.push(format!("{indent}{num}.{spaces}{rest}"));
+        if let Some((indent_str, sep, rest)) = parse_numbered(line) {
+            let indent = indent_len(indent_str);
+            drop_deeper(indent, &mut counters);
+            let current = match counters.last_mut() {
+                Some((d, cnt)) if *d == indent => {
+                    *cnt += 1;
+                    *cnt
+                }
+                _ => {
+                    counters.push((indent, 1));
+                    1
+                }
+            };
+            out.push(format!("{indent_str}{current}.{sep}{rest}"));
             continue;
         }

-        let indent_len = line.chars().take_while(|c| c.is_whitespace()).count();
-        while counters.last().is_some_and(|(i, _)| *i > indent_len) {
-            counters.pop();
-        }
+        let indent_part: String = line.chars().take_while(|c| c.is_whitespace()).collect();
Contributor


issue (complexity): Consider avoiding allocation by slicing the leading whitespace as a &str instead of collecting it into a String before measuring indentation.

Here’s a quick win: you don’t need to allocate a String just to measure the indent on non-numbered lines. Instead, slice off the leading whitespace and pass that &str to indent_len. This keeps one allocation out of the hot path and makes the non-numbered branch as straightforward as the numbered one.

// before
let indent_part: String = line.chars().take_while(|c| c.is_whitespace()).collect();
let indent = indent_len(&indent_part);
drop_deeper(indent, &mut counters);

// after
let indent_end = line
    .char_indices()
    .find(|&(_, c)| !c.is_whitespace())
    .map_or(line.len(), |(i, _)| i);
let indent_str = &line[..indent_end];
let indent = indent_len(indent_str);
drop_deeper(indent, &mut counters);

This preserves all existing behavior but avoids the intermediate allocation.
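
As a rough, self-contained check of the suggestion above, the sketch below compares the borrowed slice with the allocating version. `indent_len` is copied from the diff; `leading_whitespace` and `leading_whitespace_owned` are hypothetical helper names introduced only for this illustration and are not part of the PR.

```rust
/// Effective indentation width, counting each tab as four columns
/// (same logic as the `indent_len` helper added in this PR).
fn indent_len(indent: &str) -> usize {
    indent
        .chars()
        .fold(0, |acc, ch| acc + if ch == '\t' { 4 } else { 1 })
}

/// Hypothetical helper: borrows the leading whitespace of `line`
/// as a `&str`, so no `String` is allocated.
fn leading_whitespace(line: &str) -> &str {
    // Byte offset of the first non-whitespace char, or the whole line
    // if it contains only whitespace.
    let end = line
        .char_indices()
        .find(|&(_, c)| !c.is_whitespace())
        .map_or(line.len(), |(i, _)| i);
    &line[..end]
}

/// Hypothetical helper: the allocating variant from the diff,
/// kept only for comparison.
fn leading_whitespace_owned(line: &str) -> String {
    line.chars().take_while(|c| c.is_whitespace()).collect()
}
```

Both helpers should yield the same text for any input line, so `indent_len(leading_whitespace(line))` can stand in for `indent_len(&indent_part)`. Slicing at an offset taken from `char_indices` always lands on a character boundary, so the slice cannot panic on multi-byte whitespace.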

+        let indent = indent_len(&indent_part);
+        drop_deeper(indent, &mut counters);
         out.push(line.clone());
     }

15 changes: 15 additions & 0 deletions tests/integration.rs
@@ -839,6 +839,21 @@ fn test_renumber_nested_lists() {
     assert_eq!(renumber_lists(&input), expected);
 }

+#[test]
+fn test_renumber_tabs_in_indent() {
+    let input = vec!["1. first", "\t1. sub first", "\t5. sub second", "2. second"]
+        .into_iter()
+        .map(str::to_string)
+        .collect::<Vec<_>>();
+
+    let expected = vec!["1. first", "\t1. sub first", "\t2. sub second", "2. second"]
+        .into_iter()
+        .map(str::to_string)
+        .collect::<Vec<_>>();
+
+    assert_eq!(renumber_lists(&input), expected);
+}
+
 #[test]
 fn test_renumber_mult_paragraph_items() {
     let input = vec!["1. first", "", " still first paragraph", "", "2. second"]
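
For readers who want to try the tab-aware renumbering outside the crate, here is a minimal standalone sketch of the counter-stack idea in this PR. It is an approximation under stated assumptions, not the crate's implementation. `renumber_sketch` is a hypothetical name. The regex is replaced by a hand-rolled parser so that only the standard library is needed. The earlier branches of `renumber_lists` that the diff does not show (such as fenced code handling) are omitted.

```rust
/// Effective indentation width, counting each tab as four columns.
fn indent_len(indent: &str) -> usize {
    indent.chars().map(|ch| if ch == '\t' { 4 } else { 1 }).sum()
}

/// Splits a line into (indent, separator, rest) when it starts with `N.`
/// followed by whitespace. Hand-rolled approximation of the regex
/// `^(\s*)([1-9][0-9]*)\.(\s+)(.*)` used by the crate.
fn parse_numbered(line: &str) -> Option<(&str, &str, &str)> {
    let body = line.trim_start();
    let indent = &line[..line.len() - body.len()];
    let digits = body.bytes().take_while(u8::is_ascii_digit).count();
    if digits == 0 || body.starts_with('0') {
        return None; // no number, or a leading zero
    }
    let after = body[digits..].strip_prefix('.')?;
    let rest = after.trim_start();
    let sep = &after[..after.len() - rest.len()];
    if sep.is_empty() {
        return None; // the marker must be followed by whitespace
    }
    Some((indent, sep, rest))
}

/// Hypothetical sketch: renumbers ordered list items per indentation level.
fn renumber_sketch(lines: &[&str]) -> Vec<String> {
    let mut out = Vec::with_capacity(lines.len());
    // Stack of (indent width, last number issued at that depth).
    let mut counters: Vec<(usize, usize)> = Vec::new();
    for &line in lines {
        let Some((indent_str, sep, rest)) = parse_numbered(line) else {
            // Non-numbered line: close any lists nested deeper than it.
            let body = line.trim_start();
            let indent = indent_len(&line[..line.len() - body.len()]);
            while counters.last().is_some_and(|(d, _)| *d > indent) {
                counters.pop();
            }
            out.push(line.to_string());
            continue;
        };
        let indent = indent_len(indent_str);
        // Leaving a nested list discards its counters.
        while counters.last().is_some_and(|(d, _)| *d > indent) {
            counters.pop();
        }
        // Open a new level unless one already exists at this exact depth.
        if counters.last().map(|(d, _)| *d) != Some(indent) {
            counters.push((indent, 0));
        }
        let top = counters.last_mut().expect("a level was just ensured");
        top.1 += 1;
        out.push(format!("{indent_str}{}.{sep}{rest}", top.1));
    }
    out
}
```

Given the input from `test_renumber_tabs_in_indent`, the sketch renumbers `\t5. sub second` to `\t2. sub second` and leaves the other three lines unchanged. Because `indent_len` counts a tab as four columns, an item indented with four spaces continues the numbering of a sibling indented with one tab. Each line's original indent string is written back unchanged, so tabs are never converted to spaces in the output.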