Conversation
Reviewer's GuideIntroduces an optional fence normalization stage that standardizes code block delimiters to three backticks and merges orphaned language specifiers, exposed via a new Class diagram for new and updated modules (fences, process, lib)classDiagram
class fences {
<<module>>
+compress_fences(lines: &[String]) Vec<String>
+attach_orphan_specifiers(lines: &[String]) Vec<String>
}
class process {
<<module>>
+process_stream_inner(lines: &[String], opts: Options) Vec<String>
+process_stream(lines: &[String]) Vec<String>
+process_stream_no_wrap(lines: &[String]) Vec<String>
+process_stream_opts(lines: &[String], opts: Options) Vec<String>
+Options
}
class lib {
<<module>>
+compress_fences()
+attach_orphan_specifiers()
+process_stream_opts()
+Options
}
process ..> fences : uses compress_fences, attach_orphan_specifiers
lib --> fences
lib --> process
fences <|-- process : normalization step
process <|-- lib : public API
Class diagram for Options struct changesclassDiagram
class Options {
+bool wrap
+bool ellipsis
+bool fences
+bool footnotes
}
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Warning Rate limit exceeded@leynos has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 5 minutes and 26 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (2)
Summary by CodeRabbit
WalkthroughIntroduce a new Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI
participant Process
participant Fences
User->>CLI: Run mdtablefix with --fences
CLI->>Process: process_stream_opts(..., fences=true)
Process->>Fences: compress_fences(lines)
Fences-->>Process: Normalised lines
Process->>Fences: attach_orphan_specifiers(lines)
Fences-->>Process: Lines with specifiers attached
Process->>Process: Further processing (tables, wrapping, ellipsis)
Process-->>CLI: Output lines
CLI-->>User: Display result
Poem
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes - here's some feedback:
- Consider refactoring the multiple boolean flags (wrap, ellipsis, fences) into a single
Optionsstruct to simplify theprocess_stream_innersignature and reduce the chance of passing arguments in the wrong order. - Add documentation and a test case that explicitly shows tilde-based fences are unaffected by
--fencesso users know that only backtick delimiters are normalized. - Include an end-to-end test that combines fence normalization with other options (e.g. wrapping or renumbering) to verify the processing order remains correct.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider refactoring the multiple boolean flags (wrap, ellipsis, fences) into a single `Options` struct to simplify the `process_stream_inner` signature and reduce the chance of passing arguments in the wrong order.
- Add documentation and a test case that explicitly shows tilde-based fences are unaffected by `--fences` so users know that only backtick delimiters are normalized.
- Include an end-to-end test that combines fence normalization with other options (e.g. wrapping or renumbering) to verify the processing order remains correct.
## Individual Comments
### Comment 1
<location> `src/fences.rs:76` </location>
<code_context>
+ if in_fence {
+ in_fence = false;
+ } else {
+ if trimmed == "```"
+ && let Some(idx) = out
+ .iter()
+ .enumerate()
</code_context>
<issue_to_address>
The logic for attaching orphan specifiers only checks the most recent lines.
If you want to avoid attaching specifiers across blank lines, restrict the check to only the line directly above.
</issue_to_address>
### Comment 2
<location> `src/fences.rs:67` </location>
<code_context>
+/// assert_eq!(fixed[0], "```Rust");
+/// ```
+#[must_use]
+pub fn attach_orphan_specifiers(lines: &[String]) -> Vec<String> {
+ let mut out: Vec<String> = Vec::with_capacity(lines.len());
+ let mut in_fence = false;
</code_context>
<issue_to_address>
Consider rewriting the loop as a state machine using a Peekable iterator to simplify the orphan specifier attachment logic.
You can avoid the reverse-search + `remove(idx)` hack by turning your loop into a small state machine over a `Peekable` iterator. That way you just look ahead once for a “language only” line followed by exactly ````` and swallow the fence, instead of digging back into your output buffer. This preserves the exact same behavior but is much easier to follow:
```rust
pub fn attach_orphan_specifiers(lines: &[String]) -> Vec<String> {
let mut out = Vec::with_capacity(lines.len());
let mut in_fence = false;
let mut iter = lines.iter().peekable();
while let Some(line) = iter.next() {
let trimmed = line.trim();
// If we’re not already inside a fence, and we see a
// bare language line + next line is exactly "```",
// emit "```<lang>" and consume the fence in one shot.
if !in_fence && ORPHAN_LANG_RE.is_match(trimmed) {
if iter.peek().map(|next| next.trim() == "```").unwrap_or(false) {
out.push(format!("```{trimmed}"));
iter.next(); // consume the fence
in_fence = true; // now inside
continue;
}
}
// Toggle fence state on every line that starts a fence
if trimmed.starts_with("```") {
in_fence = !in_fence;
}
// Otherwise just copy the line verbatim
out.push(line.clone());
}
out
}
```
This:
- Eliminates the `.rev().find(…)` + `out.remove(idx)` back-splice.
- Uses one pass with `peekable()` to match “lang” + “```” pairs.
- Keeps all existing behavior and tests intact.
</issue_to_address>
### Comment 3
<location> `src/process.rs:11` </location>
<code_context>
};
#[must_use]
-pub fn process_stream_inner(lines: &[String], wrap: bool, ellipsis: bool) -> Vec<String> {
- let pre = convert_html_tables(lines);
+pub fn process_stream_inner(
</code_context>
<issue_to_address>
Consider replacing the multiple boolean arguments with a single self-documenting options struct to clarify function calls and improve maintainability.
Here’s a small refactoring that keeps all behavior exactly the same but collapses those four `bool` arguments into one self-documenting `Options` struct. You can still call your helpers with different presets, but you’ll never need to remember “what’s the 3rd `bool`” again.
```rust
// 1. Define a small Options struct
#[derive(Clone, Copy)]
pub struct ProcessOptions {
pub wrap: bool,
pub ellipsis: bool,
pub fences: bool,
}
impl ProcessOptions {
/// Default behavior: wrap text, no ellipsis, no fence-handling
pub const DEFAULT: Self = ProcessOptions {
wrap: true,
ellipsis: false,
fences: false,
};
/// Like DEFAULT but with wrapping turned off
pub const NO_WRAP: Self = ProcessOptions {
wrap: false,
..ProcessOptions::DEFAULT
};
}
```
```rust
// 2. Change the inner API to take that struct
#[must_use]
pub fn process_stream_inner(lines: &[String], opts: ProcessOptions) -> Vec<String> {
// fence handling is now explicit
let lines = if opts.fences {
let tmp = compress_fences(lines);
attach_orphan_specifiers(&tmp)
} else {
lines.to_vec()
};
let pre = convert_html_tables(&lines);
// …rest of your existing logic…
let mut out = if opts.wrap {
wrap_text(&out, 80)
} else {
out
};
if opts.ellipsis {
out = replace_ellipsis(&out);
}
out
}
```
```rust
// 3. Update your public helpers to use those named presets
#[must_use]
pub fn process_stream(lines: &[String]) -> Vec<String> {
process_stream_inner(lines, ProcessOptions::DEFAULT)
}
#[must_use]
pub fn process_stream_no_wrap(lines: &[String]) -> Vec<String> {
process_stream_inner(lines, ProcessOptions::NO_WRAP)
}
#[must_use]
pub fn process_stream_opts(
lines: &[String],
wrap: bool,
ellipsis: bool,
fences: bool,
) -> Vec<String> {
process_stream_inner(lines, ProcessOptions { wrap, ellipsis, fences })
}
```
Benefits:
- callers no longer need to remember the order or meaning of each `bool`
- adding future flags is just adding fields to `ProcessOptions`
- your internal logic (“if fences…”, “if wrap…”) stays exactly the same
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 7
🔭 Outside diff range comments (2)
src/main.rs (1)
24-27: Update the clippy expect annotation to reflect the current number of flags.The annotation mentions "four independent flags" but there are now five flags in the struct (wrap, renumber, breaks, ellipsis, fences).
-#[expect( - clippy::struct_excessive_bools, - reason = "CLI exposes four independent flags" -)] +#[expect( + clippy::struct_excessive_bools, + reason = "CLI exposes five independent flags" +)]README.md (1)
159-163: Use en-GB spelling: "licence" instead of "license" when used as a noun.The coding guidelines require en-GB-oxendict spelling. Replace "License" with "Licence" when used as a noun.
-## License +## Licence -This project is licensed under the ISC License. See the [LICENSE](LICENSE) file +This project is licensed under the ISC Licence. See the [LICENCE](LICENCE) file
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
README.md(5 hunks)docs/module-relationships.md(3 hunks)src/fences.rs(1 hunks)src/lib.rs(2 hunks)src/main.rs(1 hunks)src/process.rs(2 hunks)tests/cli.rs(1 hunks)tests/fences.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
docs/**/*.md
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
**/*.md
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
**/*.rs
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
🧬 Code Graph Analysis (2)
src/main.rs (1)
src/process.rs (1)
process_stream_opts(104-111)
src/lib.rs (1)
src/fences.rs (2)
attach_orphan_specifiers(67-96)compress_fences(28-45)
🪛 LanguageTool
README.md
[style] ~43-~43: Would you like to use the Oxford spelling “standardize”? The spelling ‘standardise’ is also correct.
Context: ...indentation level. - Use --breaks to standardise thematic breaks to a line of 70 undersc...
(OXFORD_SPELLING_Z_NOT_S)
[style] ~46-~46: Consider using the typographical ellipsis character here instead.
Context: ...ipsis to replace groups of three dots (...) with the ellipsis character (…`). ...
(ELLIPSIS)
[style] ~49-~49: Would you like to use the Oxford spelling “normalize”? The spelling ‘normalise’ is also correct.
Context: ...dots are preserved. - Use --fences to normalise code block delimiters to three backtick...
(OXFORD_SPELLING_Z_NOT_S)
[style] ~54-~54: You have used the passive voice repeatedly in nearby sentences. To make your writing clearer and easier to read, consider using active voice.
Context: ...ed, input is read from stdin and output is written to stdout. ### Example: Table Reflow...
(REP_PASSIVE_VOICE)
[style] ~109-~109: Since ownership is already implied, this phrasing may be redundant.
Context: ...for embedding the table reflow logic in your own Rust project: ```rust use mdtablefix::...
(PRP_OWN)
[style] ~132-~132: Would you like to use the Oxford spelling “normalization”? The spelling ‘normalisation’ is also correct.
Context: ...apping, ellipsis substitution and fence normalisation. - `rewrite(path: &Path) -> std::io::R...
(OXFORD_SPELLING_Z_NOT_S)
[style] ~139-~139: Would you like to use the Oxford spelling “recognizes”? The spelling ‘recognises’ is also correct.
Context: ...e. ## HTML table support mdtablefix recognises basic HTML <table> elements embedded ...
(OXFORD_SPELLING_Z_NOT_S)
[style] ~141-~141: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...sing stage using convert_html_tables, prior to reflow. Only simple tables composed of...
(EN_WORDINESS_PREMIUM_PRIOR_TO)
[locale-violation] ~159-~159: License must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ...th-rstest-fixtures.md) for details. ## License This project is licensed under the ISC...
(LICENCE_LICENSE_NOUN_SINGULAR)
[locale-violation] ~161-~161: License must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ... This project is licensed under the ISC License. See the LICENSE file for fu...
(LICENCE_LICENSE_NOUN_SINGULAR)
[locale-violation] ~161-~161: LICENSE must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ...icensed under the ISC License. See the LICENSE file for full details.
(LICENCE_LICENSE_NOUN_SINGULAR)
🔇 Additional comments (21)
src/lib.rs (1)
15-15: Clean module integration follows established patterns.The new
fencesmodule declaration and public re-exports are consistent with the existing codebase structure and conventions.Also applies to: 32-32
src/main.rs (1)
41-47: Proper integration of the new fences option.The CLI flag definition and integration into the processing pipeline follows the established pattern and is correctly implemented.
tests/fences.rs (2)
1-1: Module documentation follows the required pattern.The module-level documentation correctly uses
//!comments and explains the module's purpose.
7-33: Comprehensive test coverage for fence normalisation.The tests cover the main functionality well with clear, descriptive names and appropriate test scenarios.
tests/cli.rs (3)
137-151: Solid CLI test for fence normalisation.The test properly verifies that the
--fencesoption normalises backtick fences from four to three backticks whilst preserving language specifiers.
153-167: Excellent test for processing order verification.This test ensures that fence normalisation occurs before ellipsis processing, which is crucial for maintaining the correct pipeline behaviour. Good practice for testing feature interactions.
169-183: Comprehensive test for orphan specifier handling.The test correctly verifies that orphaned language specifiers are properly attached to fence lines when the
--fencesoption is enabled.README.md (4)
28-30: LGTM! Clear documentation of the new CLI option.The command-line usage documentation clearly shows the new
--fencesoption in the correct position amongst the other flags.
112-128: LGTM! Library usage example properly demonstrates the new parameter.The Rust code example correctly shows the new
fencesparameter in theprocess_stream_optsfunction call and maintains consistency with the existing parameter documentation style.
46-48: Replace three dots with typographical ellipsis character.Use the ellipsis character (…) instead of three dots (...) for consistency with the feature being documented.
-- Use `--ellipsis` to replace groups of three dots (`...`) with the ellipsis +- Use `--ellipsis` to replace groups of three dots (`…`) with the ellipsisLikely an incorrect or invalid review comment.
139-139: Use en-GB spelling: "recognizes" instead of "recognises".The coding guidelines require en-GB-oxendict spelling. Replace "recognises" with "recognizes".
-`mdtablefix` recognises basic HTML `<table>` elements embedded in Markdown. +`mdtablefix` recognizes basic HTML `<table>` elements embedded in Markdown.Likely an incorrect or invalid review comment.
src/fences.rs (5)
10-14: LGTM! Well-structured regex patterns with proper static initialization.The regex patterns correctly identify backtick fences and orphan language specifiers. Using
LazyLockfor static regex compilation is the appropriate approach for performance.
28-45: LGTM! Robust fence compression implementation.The function correctly handles indentation preservation, language specifier retention, and uses functional programming patterns appropriately. The logic properly distinguishes between fences with and without language specifiers.
67-96: LGTM! Sophisticated orphan specifier attachment logic.The implementation correctly:
- Tracks fence state to avoid processing specifiers inside code blocks
- Uses reverse iteration to find the most recent orphan specifier
- Properly removes the orphan line and attaches it to the fence
- Handles the complex control flow with appropriate early returns
The logic is sound and handles edge cases well.
98-146: LGTM! Comprehensive test coverage.The test suite covers all key scenarios:
- Basic fence compression
- Indented fences with language specifiers
- Tilde fence exclusion (correct behaviour)
- Orphan specifier attachment
The tests are well-structured and verify the expected behaviour thoroughly.
1-5: Use en-GB spelling: "normalizing" instead of "normalising".The coding guidelines require en-GB-oxendict spelling in comments and documentation.
-//! Pre-processing utilities for normalising fenced code block delimiters. +//! Pre-processing utilities for normalizing fenced code block delimiters.Likely an incorrect or invalid review comment.
src/process.rs (5)
5-5: LGTM! Proper import of fence processing functions.The import correctly brings in the new fence processing functions from the fences module.
12-23: LGTM! Clean integration of fence processing into the pipeline.The fence processing is correctly:
- Added as a new parameter to the function signature
- Conditionally applied only when the
fencesflag is true- Positioned before HTML table conversion in the processing pipeline
- Implemented with proper ownership semantics
The two-step approach with
compress_fencesfollowed byattach_orphan_specifiersmatches the expected behaviour.
94-96: LGTM! Consistent parameter passing in wrapper functions.The
process_streamfunction correctly passesfalsefor the newfencesparameter, maintaining backward compatibility.
99-101: LGTM! Consistent parameter passing maintained.The
process_stream_no_wrapfunction correctly passesfalsefor thefencesparameter.
104-111: LGTM! Public API correctly exposes the new fences parameter.The
process_stream_optsfunction properly:
- Adds the new
fencesparameter to the public API- Passes all parameters through to the internal function
- Maintains the established parameter ordering pattern
This enables CLI and library users to control fence normalization behaviour.
| if trimmed == "```" | ||
| && let Some(idx) = out | ||
| .iter() | ||
| .enumerate() | ||
| .rev() | ||
| .find(|(_, l)| !l.trim().is_empty() && ORPHAN_LANG_RE.is_match(l.trim())) | ||
| .map(|(i, _)| i) | ||
| { |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Consider extracting the orphan detection logic into a helper function.
The nested iterator chain with let Some(idx) = out.iter().enumerate().rev().find(...) is complex and could benefit from extraction for improved readability.
fn find_orphan_specifier_index(lines: &[String]) -> Option<usize> {
lines
.iter()
.enumerate()
.rev()
.find(|(_, line)| !line.trim().is_empty() && ORPHAN_LANG_RE.is_match(line.trim()))
.map(|(i, _)| i)
}Then use it in the main function:
if trimmed == "```" && let Some(idx) = find_orphan_specifier_index(&out) {
// ... rest of the logic
}🤖 Prompt for AI Agents
In src/fences.rs around lines 76 to 83, the complex nested iterator chain used
to detect orphan specifiers should be extracted into a helper function for
better readability. Create a function named find_orphan_specifier_index that
takes a slice of strings and returns an Option<usize> by performing the reversed
enumeration and find operation with the existing conditions. Replace the
original iterator chain in the if condition with a call to this new helper
function, passing &out as the argument.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
tests/fences.rs (1)
28-40: Add dedicated unit test forattach_orphan_specifiers.These tests combine both
compress_fencesandattach_orphan_specifiersfunctions. Whilst they validate integration behaviour, add separate unit tests forattach_orphan_specifiersto maintain test isolation and improve debugging clarity.Add dedicated unit tests:
+#[test] +fn attaches_orphan_specifier_unit() { + let input = lines_vec!["Rust", "```", "fn main() {}", "```"]; + let out = attach_orphan_specifiers(&input); + assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]); +} + +#[test] +fn attaches_orphan_specifier_with_blank_line_unit() { + let input = lines_vec!["Rust", "", "```", "fn main() {}", "```"]; + let out = attach_orphan_specifiers(&input); + assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]); +}src/fences.rs (1)
67-99: Simplify the orphan specifier attachment logic using a state machine.The current implementation uses complex reverse iteration and removal operations. Rewrite using a
Peekableiterator to improve readability and maintainability.Replace the complex logic with this cleaner state machine approach:
-pub fn attach_orphan_specifiers(lines: &[String]) -> Vec<String> { - let mut out: Vec<String> = Vec::with_capacity(lines.len()); - let mut in_fence = false; - for line in lines { - let trimmed = line.trim(); - if trimmed.starts_with("```") { - if in_fence { - in_fence = false; - } else { - if trimmed == "```" - && let Some(idx) = out - .iter() - .enumerate() - .rev() - .find(|(_, l)| !l.trim().is_empty() && ORPHAN_LANG_RE.is_match(l.trim())) - .map(|(i, _)| i) - { - let lang = out[idx].trim().to_string(); - while out.len() > idx + 1 && out[idx + 1].trim().is_empty() { - out.remove(idx + 1); - } - out.remove(idx); - out.push(format!("```{lang}")); - in_fence = true; - continue; - } - in_fence = true; - } - } - out.push(line.clone()); - } - out -} +pub fn attach_orphan_specifiers(lines: &[String]) -> Vec<String> { + let mut out = Vec::with_capacity(lines.len()); + let mut in_fence = false; + let mut iter = lines.iter().peekable(); + + while let Some(line) = iter.next() { + let trimmed = line.trim(); + + // Handle orphan language specifier followed by opening fence + if !in_fence && ORPHAN_LANG_RE.is_match(trimmed) { + // Look ahead for fence, potentially with blank lines in between + let mut peek_ahead = iter.clone(); + let mut found_fence = false; + + // Skip blank lines + while let Some(next_line) = peek_ahead.peek() { + let next_trimmed = next_line.trim(); + if next_trimmed.is_empty() { + peek_ahead.next(); + } else if next_trimmed == "```" { + found_fence = true; + break; + } else { + break; + } + } + + if found_fence { + // Skip blank lines in the actual iterator + while let Some(next_line) = iter.peek() { + if next_line.trim().is_empty() { + iter.next(); + } else { + break; + } + } + // Consume the fence line and emit combined line + iter.next(); + out.push(format!("```{trimmed}")); + in_fence = true; + continue; + } + } + + // Toggle fence state on fence lines + if trimmed.starts_with("```") { + in_fence = !in_fence; + } + + out.push(line.clone()); + } + out +}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/fences.rs(1 hunks)tests/fences.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
🔇 Additional comments (8)
tests/fences.rs (4)
1-6: LGTM: Well-structured test module with proper documentation.The module documentation follows the required
//!format and clearly describes the module's purpose. The imports are correctly structured.
7-12: LGTM: Clear and focused unit test.This test properly validates the core functionality of
compress_fenceswith a straightforward case.
14-19: LGTM: Good coverage of indented fence handling.This test ensures that indentation and language specifiers are preserved during compression, which is important for maintaining markdown structure.
21-26: LGTM: Proper negative test case.This test verifies that non-backtick content remains unchanged, ensuring the function doesn't modify unrelated lines.
src/fences.rs (4)
1-9: LGTM: Excellent module documentation and imports.The module documentation clearly explains the purpose and functionality. The use of
LazyLockfor regex compilation is appropriate for performance.
10-14: LGTM: Well-designed regex patterns.The regex patterns correctly capture backtick fences with optional language specifiers and orphaned language identifiers. The patterns are comprehensive and handle the expected use cases.
16-45: LGTM: Clean and efficient implementation.The
compress_fencesfunction is well-documented with examples and handles all edge cases correctly. The implementation is functional and efficient.
47-66: LGTM: Comprehensive documentation with examples.The function documentation clearly explains the purpose and includes a practical example showing the expected input and output format.
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
|
|
||
| #[test] | ||
| fn compresses_simple_fence() { | ||
| let input = vec![ | ||
| "````rust".to_string(), | ||
| "code".to_string(), | ||
| "````".to_string(), | ||
| ]; | ||
| let out = compress_fences(&input); | ||
| assert_eq!(out[0], "```rust"); | ||
| assert_eq!(out[2], "```"); | ||
| } | ||
|
|
||
| #[test] | ||
| fn compresses_indented_fence() { | ||
| let input = vec![" `````foo,bar ".to_string()]; | ||
| let out = compress_fences(&input); | ||
| assert_eq!(out[0], " ```foo,bar"); | ||
| } | ||
|
|
||
| #[test] | ||
| fn ignores_tilde_fence() { | ||
| let input = vec!["~~~~".to_string()]; | ||
| let out = compress_fences(&input); | ||
| assert_eq!(out, input); | ||
| } | ||
|
|
||
| #[test] | ||
| fn attaches_orphan_specifier() { | ||
| let input = vec![ | ||
| "Rust".to_string(), | ||
| "```".to_string(), | ||
| "fn main() {}".to_string(), | ||
| "```".to_string(), | ||
| ]; | ||
| let out = attach_orphan_specifiers(&compress_fences(&input)); | ||
| assert_eq!( | ||
| out, | ||
| vec![ | ||
| "```Rust".to_string(), | ||
| "fn main() {}".to_string(), | ||
| "```".to_string(), | ||
| ] | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
LGTM: Comprehensive unit tests with good coverage.
The tests cover the main functionality and edge cases. However, consider moving these to the dedicated test file to maintain consistency with the project structure.
Move these tests to tests/fences.rs to consolidate all fence-related tests in one location and maintain consistency with the project's testing structure.
🤖 Prompt for AI Agents
In src/fences.rs around lines 101 to 149, the unit tests for fence compression
and specifier attachment should be moved to the dedicated test file
tests/fences.rs. Cut the entire tests module from src/fences.rs and paste it
into tests/fences.rs, ensuring the module imports and test annotations remain
intact. This will consolidate all fence-related tests in one place and maintain
project testing consistency.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
README.md (3)
43-43: Past review comment is incorrect - keep en-GB spelling.The word "standardise" is correct for en-GB-oxendict spelling as required by the coding guidelines. The past review comment suggesting "standardize" is incorrect.
49-49: Past review comment is incorrect - keep en-GB spelling.The word "normalise" is correct for en-GB-oxendict spelling as required by the coding guidelines. The past review comment suggesting "normalize" is incorrect.
132-132: Past review comment is incorrect - keep en-GB spelling.The word "normalisation" is correct for en-GB-oxendict spelling as required by the coding guidelines. The past review comment suggesting "normalization" is incorrect.
src/fences.rs (1)
77-84: Complexity issue already tracked in GitHub issue #103.The complex iterator chain for orphan detection is already documented in the existing GitHub issue for refactoring to use a Peekable iterator approach.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
README.md(4 hunks)docs/module-relationships.md(3 hunks)src/fences.rs(1 hunks)tests/fences.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
docs/**/*.md
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
**/*.md
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
**/*.rs
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
🪛 LanguageTool
README.md
[style] ~46-~46: Consider using the typographical ellipsis character here instead.
Context: ...ipsis to replace groups of three dots (...) with the ellipsis character (…`). ...
(ELLIPSIS)
[style] ~54-~54: You have used the passive voice repeatedly in nearby sentences. To make your writing clearer and easier to read, consider using active voice.
Context: ...ed, input is read from stdin and output is written to stdout. ### Example: Table Reflow...
(REP_PASSIVE_VOICE)
[style] ~109-~109: Since ownership is already implied, this phrasing may be redundant.
Context: ...for embedding the table reflow logic in your own Rust project: ```rust use mdtablefix::...
(PRP_OWN)
[style] ~139-~139: Would you like to use the Oxford spelling “recognizes”? The spelling ‘recognises’ is also correct.
Context: ...e. ## HTML table support mdtablefix recognises basic HTML <table> elements embedded ...
(OXFORD_SPELLING_Z_NOT_S)
[style] ~141-~141: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...sing stage using convert_html_tables, prior to reflow. Only simple tables composed of...
(EN_WORDINESS_PREMIUM_PRIOR_TO)
[locale-violation] ~159-~159: License must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ...th-rstest-fixtures.md) for details. ## License This project is licensed under the ISC...
(LICENCE_LICENSE_NOUN_SINGULAR)
[locale-violation] ~161-~161: License must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ... This project is licensed under the ISC License. See the LICENSE file for fu...
(LICENCE_LICENSE_NOUN_SINGULAR)
[locale-violation] ~161-~161: LICENSE must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ...icensed under the ISC License. See the LICENSE file for full details.
(LICENCE_LICENSE_NOUN_SINGULAR)
🔇 Additional comments (8)
docs/module-relationships.md (2)
39-43: LGTM! Both functions are now properly documented.The fences module correctly shows both
compress_fences()andattach_orphan_specifiers()functions as requested in previous reviews.
71-71: LGTM! Usage relationship correctly reflects both functions.The process module usage relationship now properly indicates use of both
compress_fencesandattach_orphan_specifiersfrom the fences module.README.md (1)
118-123: LGTM! API documentation correctly reflects the new Options struct.The example properly demonstrates the new
Optionsstruct with thefencesfield and updated function signature.tests/fences.rs (1)
1-55: LGTM! Comprehensive test coverage with good separation.The test suite provides excellent coverage of fence compression and orphan specifier attachment functionality. The addition of separate unit tests for
attach_orphan_specifiers(lines 43-47) alongside integration tests addresses the previous review feedback about test isolation.src/fences.rs (4)
1-6: LGTM! Excellent module documentation.The module-level documentation clearly explains the module's purpose and utility as required by the coding guidelines.
17-27: LGTM! Clear function documentation with examples.The function documentation includes clear examples as required by the coding guidelines.
29-46: LGTM! Clean implementation following Rust best practices.The function correctly uses
#[must_use], handles the regex matching properly, and follows the guidelines for immutable data and clear logic flow.
68-100: LGTM! Implementation correctly handles fence state tracking.The function properly manages fence state and handles the orphan specifier attachment logic, despite the complexity noted in previous reviews.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/fences.rs (1)
68-114: Complex but correct orphan specifier attachment logic.The implementation correctly handles orphan specifier attachment with proper fence state tracking. The complexity concerns raised in previous reviews have been acknowledged and tracked in GitHub issue #103 for future refactoring to use a Peekable iterator state machine approach.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
README.md(4 hunks)docs/module-relationships.md(3 hunks)src/fences.rs(1 hunks)src/html.rs(1 hunks)src/lib.rs(2 hunks)tests/breaks.rs(1 hunks)tests/cli.rs(1 hunks)tests/fences.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
Instructions used from:
Sources:
⚙️ CodeRabbit Configuration File
**/*.md
Instructions used from:
Sources:
⚙️ CodeRabbit Configuration File
🧬 Code Graph Analysis (1)
src/lib.rs (3)
src/fences.rs (2)
attach_orphan_specifiers(68-114)compress_fences(29-46)src/io.rs (2)
rewrite(11-16)rewrite_no_wrap(22-27)src/process.rs (3)
process_stream(100-109)process_stream_no_wrap(112-121)process_stream_opts(124-126)
🪛 LanguageTool
README.md
[style] ~46-~46: Consider using the typographical ellipsis character here instead.
Context: ...ipsis to replace groups of three dots (...) with the ellipsis character (…`). ...
(ELLIPSIS)
[style] ~54-~54: You have used the passive voice repeatedly in nearby sentences. To make your writing clearer and easier to read, consider using active voice.
Context: ...ed, input is read from stdin and output is written to stdout. ### Example: Table Reflow...
(REP_PASSIVE_VOICE)
[style] ~109-~109: Since ownership is already implied, this phrasing may be redundant.
Context: ...for embedding the table reflow logic in your own Rust project: ```rust use mdtablefix::...
(PRP_OWN)
[style] ~141-~141: ‘prior to’ might be wordy. Consider a shorter alternative.
Context: ...sing stage using convert_html_tables, prior to reflow. Only simple tables composed of...
(EN_WORDINESS_PREMIUM_PRIOR_TO)
[locale-violation] ~159-~159: License must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ...th-rstest-fixtures.md) for details. ## License This project is licensed under the ISC...
(LICENCE_LICENSE_NOUN_SINGULAR)
[locale-violation] ~161-~161: License must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ... This project is licensed under the ISC License. See the LICENSE file for fu...
(LICENCE_LICENSE_NOUN_SINGULAR)
[locale-violation] ~161-~161: LICENSE must be spelled with a “c” when used as a noun in British English. Use “licence”.
Context: ...icensed under the ISC License. See the LICENSE file for full details.
(LICENCE_LICENSE_NOUN_SINGULAR)
🔇 Additional comments (14)
docs/module-relationships.md (2)
39-43: LGTM!The fences module documentation correctly includes both exported functions and properly integrates with the module relationship diagram.
60-61: LGTM!The dependency and usage relationships correctly reflect the integration of the fences module into the processing pipeline.
Also applies to: 71-71
src/lib.rs (3)
15-15: LGTM!The new fences module is properly declared and integrated into the crate structure.
32-32: LGTM!The fences module functions are correctly re-exported for public use.
36-36: LGTM!The Options struct export properly supports the new unified configuration API.
README.md (4)
29-30: LGTM!The CLI usage documentation correctly includes the new
--fencesoption with proper formatting.
49-50: LGTM!The
--fencesoption is clearly documented with appropriate description of its functionality.
118-123: LGTM!The Options struct usage example correctly demonstrates the new API with proper field assignments.
130-132: LGTM!The function signature documentation accurately reflects the updated API that accepts the Options struct.
tests/fences.rs (1)
1-61: Excellent comprehensive test coverage for fence normalisation functionality.The test suite effectively covers all scenarios including basic compression, indentation preservation, tilde conversion, orphan specifier attachment, and edge cases with blank lines. The separation of unit and integration tests addresses previous feedback appropriately.
tests/cli.rs (1)
137-209: Well-structured CLI integration tests for the new --fences option.The tests thoroughly validate the CLI functionality including basic fence normalisation, processing order with other options, orphan specifier attachment, and combinations with existing features. The test for ensuring fence processing occurs before ellipsis replacement is particularly valuable for verifying the correct processing pipeline order.
src/fences.rs (3)
1-6: Excellent module documentation following coding guidelines.The module doc comment clearly explains the purpose and utility of the fence normalisation functionality, adhering to the requirement for module documentation.
11-15: Proper use of LazyLock for regex compilation efficiency.The static regex patterns are correctly implemented using
LazyLockto ensure one-time compilation and good performance characteristics.
17-46: Clean and well-documented fence compression implementation.The
compress_fencesfunction is straightforward, well-documented with examples, and correctly handles the regex capture groups to preserve indentation and language specifiers whilst normalising fence delimiters.
6381a6b to
846b37c
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes - here's some feedback:
- Consider implementing Default for the Options struct so you can simplify calls in process_stream and process_stream_no_wrap (e.g. Options::default().with_fences(true)) instead of manually building identical structs each time.
- It would be good to add a CLI integration test for tilde-based fences (e.g.
~~~~) to verify that the --fences flag correctly normalizes both backtick and tilde fences. - The attach_orphan_specifiers function has a fairly complex peekable loop; you might simplify its logic or extract smaller helpers to make the specifier-attachment state transitions clearer and easier to maintain.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider implementing Default for the Options struct so you can simplify calls in process_stream and process_stream_no_wrap (e.g. Options::default().with_fences(true)) instead of manually building identical structs each time.
- It would be good to add a CLI integration test for tilde-based fences (e.g. `~~~~`) to verify that the --fences flag correctly normalizes both backtick and tilde fences.
- The attach_orphan_specifiers function has a fairly complex peekable loop; you might simplify its logic or extract smaller helpers to make the specifier-attachment state transitions clearer and easier to maintain.
## Individual Comments
### Comment 1
<location> `tests/fences.rs:22` </location>
<code_context>
+}
+
+#[test]
+fn compresses_tilde_fences() {
+ let input = lines_vec!["~~~~rust", "code", "~~~~"];
+ let out = compress_fences(&input);
+ assert_eq!(out, lines_vec!["```rust", "code", "```"]);
+}
+
</code_context>
<issue_to_address>
Missing test for mixed backtick and tilde fences.
Add a test where the opening and closing fences use different characters to verify correct handling or rejection of mixed fences.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
#[test]
fn compresses_tilde_fences() {
let input = lines_vec!["~~~~rust", "code", "~~~~"];
let out = compress_fences(&input);
assert_eq!(out, lines_vec!["```rust", "code", "```"]);
}
=======
#[test]
fn compresses_tilde_fences() {
let input = lines_vec!["~~~~rust", "code", "~~~~"];
let out = compress_fences(&input);
assert_eq!(out, lines_vec!["```rust", "code", "```"]);
}
#[test]
fn does_not_compress_mixed_fences() {
// Opening with tildes, closing with backticks
let input = lines_vec!["~~~rust", "code", "```"];
let out = compress_fences(&input);
// Should not compress or should leave as-is, depending on intended behavior
assert_eq!(out, lines_vec!["~~~rust", "code", "```"]);
// Opening with backticks, closing with tildes
let input2 = lines_vec!["```rust", "code", "~~~"];
let out2 = compress_fences(&input2);
assert_eq!(out2, lines_vec!["```rust", "code", "~~~"]);
}
>>>>>>> REPLACE
</suggested_fix>
### Comment 2
<location> `tests/fences.rs:36` </location>
<code_context>
+}
+
+#[test]
+fn fixes_orphaned_specifier() {
+ let input = lines_vec!["Rust", "```", "fn main() {}", "```"];
+ let out = attach_orphan_specifiers(&compress_fences(&input));
+ assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]);
+}
+
</code_context>
<issue_to_address>
Missing test for multiple orphan specifiers in a single document.
Add a test case with multiple orphan specifiers in one input to verify correct attachment for each.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
#[test]
fn fixes_orphaned_specifier() {
let input = lines_vec!["Rust", "```", "fn main() {}", "```"];
let out = attach_orphan_specifiers(&compress_fences(&input));
assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]);
}
=======
#[test]
fn fixes_orphaned_specifier() {
let input = lines_vec!["Rust", "```", "fn main() {}", "```"];
let out = attach_orphan_specifiers(&compress_fences(&input));
assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]);
}
#[test]
fn fixes_multiple_orphaned_specifiers() {
let input = lines_vec![
"Rust", "```", "fn main() {}", "```",
"Python", "```", "print('hi')", "```"
];
let out = attach_orphan_specifiers(&compress_fences(&input));
assert_eq!(
out,
lines_vec![
"```Rust", "fn main() {}", "```",
"```Python", "print('hi')", "```"
]
);
}
>>>>>>> REPLACE
</suggested_fix>
### Comment 3
<location> `tests/fences.rs:50` </location>
<code_context>
+}
+
+#[test]
+fn attaches_orphan_specifier_with_blank_line_unit() {
+ let input = lines_vec!["Rust", "", "```", "fn main() {}", "```"];
+ let out = attach_orphan_specifiers(&input);
+ assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]);
+}
+
</code_context>
<issue_to_address>
Missing test for non-orphan lines before fences.
Add a test to ensure lines resembling orphan specifiers (with spaces, punctuation, or unrelated words) are not mistakenly attached to fences.
</issue_to_address>
<suggested_fix>
<<<<<<< SEARCH
#[test]
fn attaches_orphan_specifier_unit() {
let input = lines_vec!["Rust", "```", "fn main() {}", "```"];
let out = attach_orphan_specifiers(&input);
assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]);
}
=======
#[test]
fn attaches_orphan_specifier_unit() {
let input = lines_vec!["Rust", "```", "fn main() {}", "```"];
let out = attach_orphan_specifiers(&input);
assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]);
}
#[test]
fn does_not_attach_non_orphan_lines_before_fences() {
// Lines that look like specifiers but are not: have spaces, punctuation, or unrelated words
let input = lines_vec![
"Rust code", // has a space
"```",
"fn main() {}",
"```",
"rust!", // has punctuation
"```",
"println!(\"hi\");",
"```",
"notaspec", // unrelated word
"```",
"let x = 1;",
"```"
];
let out = attach_orphan_specifiers(&input);
assert_eq!(
out,
lines_vec![
"Rust code",
"```",
"fn main() {}",
"```",
"rust!",
"```",
"println!(\"hi\");",
"```",
"notaspec",
"```",
"let x = 1;",
"```"
]
);
}
>>>>>>> REPLACE
</suggested_fix>
### Comment 4
<location> `src/fences.rs:68` </location>
<code_context>
+/// assert_eq!(fixed[0], "```Rust");
+/// ```
+#[must_use]
+pub fn attach_orphan_specifiers(lines: &[String]) -> Vec<String> {
+ let mut out = Vec::with_capacity(lines.len());
+ let mut in_fence = false;
</code_context>
<issue_to_address>
Consider rewriting attach_orphan_specifiers as a single-pass function that backtracks on the output buffer to merge orphan language specifiers, avoiding iterator cloning and lookahead loops.
Here’s a one-pass version of `attach_orphan_specifiers` that avoids the `peekable().clone()` and explicit “look ahead” loops. Instead it just backtracks on the output buffer to merge the orphan line when you hit an opening fence:
```rust
pub fn attach_orphan_specifiers(lines: &[String]) -> Vec<String> {
let mut out = Vec::with_capacity(lines.len());
let mut in_fence = false;
for line in lines {
let trimmed = line.trim();
if trimmed.starts_with("```") {
if !in_fence {
// drop any blank lines before us
while matches!(out.last(), Some(l) if l.trim().is_empty()) {
out.pop();
}
// if the last non-empty line is an orphan lang spec, merge it
if let Some(prev) = out.last() {
let lang = prev.trim();
if ORPHAN_LANG_RE.is_match(lang) {
out.pop();
out.push(format!("```{lang}"));
in_fence = true;
continue; // skip the normal push below
}
}
// otherwise, this is just a normal opening fence
} else {
// closing the fence
in_fence = false;
}
}
out.push(line.clone());
}
out
}
```
Key changes:
1. Single pass over `lines` (no iterator cloning or nested loops).
2. When you see an opening fence and are not already in one, you “rewind” over any blank lines in `out`, check the last non‐empty line for `ORPHAN_LANG_RE`, pop & merge if it matches.
3. Otherwise you fall back to toggling `in_fence` as before and pushing the line.
This preserves the behavior (skips empties between orphan and fence, toggles fence state) but is much simpler.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| #[test] | ||
| fn compresses_tilde_fences() { | ||
| let input = lines_vec!["~~~~rust", "code", "~~~~"]; | ||
| let out = compress_fences(&input); | ||
| assert_eq!(out, lines_vec!["```rust", "code", "```"]); | ||
| } |
There was a problem hiding this comment.
suggestion (testing): Missing test for mixed backtick and tilde fences.
Add a test where the opening and closing fences use different characters to verify correct handling or rejection of mixed fences.
| #[test] | |
| fn compresses_tilde_fences() { | |
| let input = lines_vec!["~~~~rust", "code", "~~~~"]; | |
| let out = compress_fences(&input); | |
| assert_eq!(out, lines_vec!["```rust", "code", "```"]); | |
| } | |
| #[test] | |
| fn compresses_tilde_fences() { | |
| let input = lines_vec!["~~~~rust", "code", "~~~~"]; | |
| let out = compress_fences(&input); | |
| assert_eq!(out, lines_vec!["```rust", "code", "```"]); | |
| } | |
| #[test] | |
| fn does_not_compress_mixed_fences() { | |
| // Opening with tildes, closing with backticks | |
| let input = lines_vec!["~~~rust", "code", "```"]; | |
| let out = compress_fences(&input); | |
| // Should not compress or should leave as-is, depending on intended behavior | |
| assert_eq!(out, lines_vec!["~~~rust", "code", "```"]); | |
| // Opening with backticks, closing with tildes | |
| let input2 = lines_vec!["```rust", "code", "~~~"]; | |
| let out2 = compress_fences(&input2); | |
| assert_eq!(out2, lines_vec!["```rust", "code", "~~~"]); | |
| } |
| #[test] | ||
| fn fixes_orphaned_specifier() { | ||
| let input = lines_vec!["Rust", "```", "fn main() {}", "```"]; | ||
| let out = attach_orphan_specifiers(&compress_fences(&input)); | ||
| assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]); | ||
| } |
There was a problem hiding this comment.
suggestion (testing): Missing test for multiple orphan specifiers in a single document.
Add a test case with multiple orphan specifiers in one input to verify correct attachment for each.
| #[test] | |
| fn fixes_orphaned_specifier() { | |
| let input = lines_vec!["Rust", "```", "fn main() {}", "```"]; | |
| let out = attach_orphan_specifiers(&compress_fences(&input)); | |
| assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]); | |
| } | |
| #[test] | |
| fn fixes_orphaned_specifier() { | |
| let input = lines_vec!["Rust", "```", "fn main() {}", "```"]; | |
| let out = attach_orphan_specifiers(&compress_fences(&input)); | |
| assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]); | |
| } | |
| #[test] | |
| fn fixes_multiple_orphaned_specifiers() { | |
| let input = lines_vec![ | |
| "Rust", "```", "fn main() {}", "```", | |
| "Python", "```", "print('hi')", "```" | |
| ]; | |
| let out = attach_orphan_specifiers(&compress_fences(&input)); | |
| assert_eq!( | |
| out, | |
| lines_vec![ | |
| "```Rust", "fn main() {}", "```", | |
| "```Python", "print('hi')", "```" | |
| ] | |
| ); | |
| } |
| #[test] | ||
| fn attaches_orphan_specifier_unit() { | ||
| let input = lines_vec!["Rust", "```", "fn main() {}", "```"]; | ||
| let out = attach_orphan_specifiers(&input); | ||
| assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]); | ||
| } |
There was a problem hiding this comment.
suggestion (testing): Missing test for non-orphan lines before fences.
Add a test to ensure lines resembling orphan specifiers (with spaces, punctuation, or unrelated words) are not mistakenly attached to fences.
| #[test] | |
| fn attaches_orphan_specifier_unit() { | |
| let input = lines_vec!["Rust", "```", "fn main() {}", "```"]; | |
| let out = attach_orphan_specifiers(&input); | |
| assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]); | |
| } | |
| #[test] | |
| fn attaches_orphan_specifier_unit() { | |
| let input = lines_vec!["Rust", "```", "fn main() {}", "```"]; | |
| let out = attach_orphan_specifiers(&input); | |
| assert_eq!(out, lines_vec!["```Rust", "fn main() {}", "```"]); | |
| } | |
| #[test] | |
| fn does_not_attach_non_orphan_lines_before_fences() { | |
| // Lines that look like specifiers but are not: have spaces, punctuation, or unrelated words | |
| let input = lines_vec![ | |
| "Rust code", // has a space | |
| "```", | |
| "fn main() {}", | |
| "```", | |
| "rust!", // has punctuation | |
| "```", | |
| "println!(\"hi\");", | |
| "```", | |
| "notaspec", // unrelated word | |
| "```", | |
| "let x = 1;", | |
| "```" | |
| ]; | |
| let out = attach_orphan_specifiers(&input); | |
| assert_eq!( | |
| out, | |
| lines_vec![ | |
| "Rust code", | |
| "```", | |
| "fn main() {}", | |
| "```", | |
| "rust!", | |
| "```", | |
| "println!(\"hi\");", | |
| "```", | |
| "notaspec", | |
| "```", | |
| "let x = 1;", | |
| "```" | |
| ] | |
| ); | |
| } |
| /// assert_eq!(fixed[0], "```Rust"); | ||
| /// ``` | ||
| #[must_use] | ||
| pub fn attach_orphan_specifiers(lines: &[String]) -> Vec<String> { |
There was a problem hiding this comment.
issue (complexity): Consider rewriting attach_orphan_specifiers as a single-pass function that backtracks on the output buffer to merge orphan language specifiers, avoiding iterator cloning and lookahead loops.
Here’s a one-pass version of attach_orphan_specifiers that avoids the peekable().clone() and explicit “look ahead” loops. Instead it just backtracks on the output buffer to merge the orphan line when you hit an opening fence:
pub fn attach_orphan_specifiers(lines: &[String]) -> Vec<String> {
let mut out = Vec::with_capacity(lines.len());
let mut in_fence = false;
for line in lines {
let trimmed = line.trim();
if trimmed.starts_with("```") {
if !in_fence {
// drop any blank lines before us
while matches!(out.last(), Some(l) if l.trim().is_empty()) {
out.pop();
}
// if the last non-empty line is an orphan lang spec, merge it
if let Some(prev) = out.last() {
let lang = prev.trim();
if ORPHAN_LANG_RE.is_match(lang) {
out.pop();
out.push(format!("```{lang}"));
in_fence = true;
continue; // skip the normal push below
}
}
// otherwise, this is just a normal opening fence
} else {
// closing the fence
in_fence = false;
}
}
out.push(line.clone());
}
out
}Key changes:
- Single pass over
lines(no iterator cloning or nested loops). - When you see an opening fence and are not already in one, you “rewind” over any blank lines in
out, check the last non‐empty line forORPHAN_LANG_RE, pop & merge if it matches. - Otherwise you fall back to toggling
in_fenceas before and pushing the line.
This preserves the behavior (skips empties between orphan and fence, toggles fence state) but is much simpler.
|
@coderabbitai Please create a GitHub issue with a proposed solution for the following concern: The attach_orphan_specifiers function has a fairly complex peekable loop; you might simplify its logic or extract smaller helpers to make the specifier-attachment state transitions clearer and easier to maintain. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainI'll examine the current implementation of the @leynos I've created a GitHub issue titled "Refactor attach_orphan_specifiers function to reduce complexity" that addresses your concern about the complex peekable loop in the The issue includes:
The issue has been assigned to you as requested. This refactoring would improve the maintainability and testability of the fence normalization functionality while preserving the existing behavior. 🏁 Script executed: Length of output: 1457 🏁 Script executed: Length of output: 1718 @leynos Perfect! I've successfully created GitHub issue #109 titled "Refactor attach_orphan_specifiers function to reduce complexity" at #109. After examining the current implementation, I can see why this function needs refactoring. The complexity issues include:
The issue I created proposes a comprehensive solution with:
The issue has been assigned to you and includes backlinks to this PR for proper traceability. |
Summary
Testing
make fmtmake lintmake testmake markdownlintmake nixiehttps://chatgpt.com/codex/tasks/task_e_687b7f50db348322ae3409707771c7d6
Summary by Sourcery
Introduce an optional fence normalization step that compresses code block delimiters to three backticks and attaches orphaned language specifiers before further processing. Expose this behavior through a new CLI flag and library option, update the public API, documentation, and add comprehensive tests.
New Features:
Enhancements:
Documentation:
Tests: