Conversation
Reviewer's GuideThis PR introduces a new ellipsis module that replaces '...' with the Unicode ellipsis character, exposes a --ellipsis flag in the CLI to apply this transformation in the processing pipeline, and updates documentation and tests to cover the feature. Entity relationship diagram for FormatOpts struct updateerDiagram
FORMATOPTS {
BOOL wrap
BOOL renumber
BOOL breaks
BOOL ellipsis
}
Class diagram for the new ellipsis module and CLI integrationclassDiagram
class ellipsis {
<<module>>
+replace_ellipsis(lines: &[String]) Vec<String>
}
class wrap {
<<module>>
+is_fence(line: &str) bool
+tokenize_markdown(line: &str) Vec<String>
}
class Cli {
+FormatOpts
}
class FormatOpts {
+wrap: bool
+renumber: bool
+breaks: bool
+ellipsis: bool
}
class process_lines {
+process_lines(lines: &[String], opts: FormatOpts) Vec<String>
}
process_lines ..> ellipsis : uses replace_ellipsis
ellipsis ..> wrap : uses is_fence, tokenize_markdown
Cli o-- FormatOpts
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Summary by CodeRabbit
WalkthroughExpand the command-line tool with a new Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI
participant ellipsis
User->>CLI: Run mdtablefix with --ellipsis
CLI->>CLI: Parse input and options
CLI->>CLI: Process lines (wrap, renumber, breaks)
alt --ellipsis flag enabled
CLI->>ellipsis: replace_ellipsis(lines)
ellipsis-->>CLI: Return processed lines
end
CLI-->>User: Output formatted text with ellipses replaced
Possibly related PRs
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🧰 Additional context used📓 Path-based instructions (2)**/*.mdInstructions used from: Sources: **/*.rsInstructions used from: Sources: 🔇 Additional comments (9)
✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Hey @leynos - I've reviewed your changes and found some issues that need to be addressed.
- Consider integrating the ellipsis replacement into the main processing pipeline (e.g. in process_stream) to avoid an extra buffer clone and make the order of transformations clearer.
- It may be more robust to detect and replace three-or-more consecutive dots with a single regex rather than repeated string replacements within each token, which could simplify edge-case handling.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider integrating the ellipsis replacement into the main processing pipeline (e.g. in process_stream) to avoid an extra buffer clone and make the order of transformations clearer.
- It may be more robust to detect and replace three-or-more consecutive dots with a single regex rather than repeated string replacements within each token, which could simplify edge-case handling.
## Individual Comments
### Comment 1
<location> `tests/integration.rs:956` </location>
<code_context>
+mod tests {
+ use super::*;
+
+ #[test]
+ fn replaces_simple_text() {
+ let input = vec!["wait...".to_string()];
</code_context>
<issue_to_address>
Consider adding integration tests for edge cases, such as input with code spans or fenced code blocks.
Please add tests with code spans and fenced code blocks containing ellipses to verify the CLI preserves ellipses inside code.
</issue_to_address>
### Comment 2
<location> `src/ellipsis.rs:10` </location>
<code_context>
+
+/// Replace `...` with `…` outside code spans and fences.
+#[must_use]
+pub fn replace_ellipsis(lines: &[String]) -> Vec<String> {
+ let mut out = Vec::with_capacity(lines.len());
+ let mut in_code = false;
</code_context>
<issue_to_address>
Consider refactoring to use a tokenizer that emits typed tokens for fences, code, and text, enabling a single-pass, flag-free ellipsis replacement.
Here’s one way to remove your manual fence‐tracking and per‐line cloning entirely by pushing both “fence” and “inline‐code” into your tokenizer. Once your `tokenize_markdown` yields a small `enum Token { Fence(&str), Code(&str), Text(&str) }` you can rewrite `replace_ellipsis` in one pass over tokens:
1. Change your tokenizer to tag fences, code spans, and text:
```rust
// in crate::wrap
/// A markdown token that knows whether it’s a fence, inline‐code, or just text.
pub enum Token<'a> {
Fence(&'a str),
Code(&'a str),
Text(&'a str),
}
/// Now handles both code‐spans and fence‐blocks for you.
/// (implementation folds in your old `is_fence` and inline‐code logic)
pub fn tokenize_markdown(input: &str) -> Vec<Token<'_>> {
let mut out = Vec::new();
let mut in_fence = false;
for line in input.lines() {
if line.starts_with("```") {
in_fence = !in_fence;
out.push(Token::Fence(line));
} else if in_fence {
out.push(Token::Fence(line));
} else {
// break this line into backtick‐wrapped vs plain slices…
let mut rest = line;
while let Some(pos) = rest.find('`') {
if pos > 0 {
out.push(Token::Text(&rest[..pos]));
}
// find closing backtick
let (backticked, rem) = match rest[pos+1..].find('`') {
Some(end) => (&rest[pos+1..pos+1+end], &rest[pos+2+end..]),
None => (&rest[pos+1..], ""),
};
out.push(Token::Code(backticked));
rest = rem;
}
if !rest.is_empty() {
out.push(Token::Text(rest));
}
}
}
out
}
```
2. Then collapse your outer loops into one simple map over tokens:
```rust
pub fn replace_ellipsis(lines: &[String]) -> Vec<String> {
// join lines so fences spanning lines still work
let doc = lines.join("\n");
let mut out = String::with_capacity(doc.len());
for token in tokenize_markdown(&doc) {
match token {
Token::Text(t) => out.push_str(&t.replace("...", "…")),
Token::Code(c) | Token::Fence(c) => {
out.push_str(c);
// re‐emit the backticks for code spans
if let Token::Code(_) = token {
out.push('`');
}
out.push('\n');
}
}
}
// split back to per‐line Vec<String>
out.lines().map(|l| l.to_string()).collect()
}
```
Benefits:
- No `in_code` flag or manual `is_fence` calls in `replace_ellipsis`.
- Single pass over a well‐typed `Token` enum.
- All functionality (inline spans, fenced blocks, ellipsis replacement) stays intact.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (6)
README.md(2 hunks)docs/module-relationships.md(2 hunks)src/ellipsis.rs(1 hunks)src/lib.rs(1 hunks)src/main.rs(4 hunks)tests/integration.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
**/*.md
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
docs/**/*.md
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
- docs/html-table-support.md
🧬 Code Graph Analysis (1)
src/ellipsis.rs (1)
src/wrap.rs (2)
is_fence(152-152)tokenize_markdown(20-71)
🪛 LanguageTool
docs/module-relationships.md
[style] ~69-~69: Would you like to use the Oxford spelling “normalization”? The spelling ‘normalisation’ is also correct.
Context: ...es. The ellipsis module performs text normalisation. The process module provides streamin...
(OXFORD_SPELLING_Z_NOT_S)
🔇 Additional comments (8)
README.md (1)
25-25: Documentation updates are accurate and well-integrated.The command-line usage and option descriptions correctly reflect the new ellipsis functionality. The formatting is consistent with existing documentation patterns.
Also applies to: 35-36
tests/integration.rs (1)
956-966: Integration test is well-implemented and comprehensive.The test properly validates the end-to-end CLI behaviour for the new ellipsis option, following established patterns in the test suite.
docs/module-relationships.md (1)
35-38: Documentation accurately reflects the new module structure.The Mermaid diagram and textual descriptions correctly document the ellipsis module and its relationships. The British spelling "normalisation" is correct per the coding guidelines specifying en-GB usage.
Also applies to: 54-54, 61-61, 69-71
src/main.rs (2)
41-43: CLI integration is correctly implemented.The ellipsis flag is properly integrated with appropriate documentation and processing order. The placement after other transformations ensures code blocks remain protected.
Also applies to: 61-63
9-15: Reformat the import statement for consistency.The multi-line import formatting is unnecessarily verbose. Format it more compactly to match existing patterns.
-use mdtablefix::{ - ellipsis::replace_ellipsis, - format_breaks, - process_stream, - process_stream_no_wrap, - renumber_lists, -}; +use mdtablefix::{ + ellipsis::replace_ellipsis, format_breaks, process_stream, process_stream_no_wrap, + renumber_lists, +};Likely an incorrect or invalid review comment.
src/ellipsis.rs (3)
1-5: Excellent module documentation.The module documentation clearly explains the purpose and key behaviour whilst adhering to the required
//!format and en-GB spelling conventions.
6-10: Proper function signature and documentation.The imports are appropriate, the
#[must_use]attribute is correctly positioned after the doc comment, and the function signature follows Rust conventions.
11-34: Well-implemented ellipsis replacement logic.The function efficiently handles both fenced code blocks and inline code spans whilst maintaining clean control flow. The pre-allocation and use of existing tokenization functionality demonstrates good performance considerations.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/ellipsis.rs (1)
48-105: Comprehensive test coverage addressing previous feedback.The test suite covers all critical scenarios:
- Basic ellipsis replacement
- Code span preservation
- Fenced block preservation
- Long sequence handling with proper grouping
- Empty input handling
- Multiple fenced blocks
The
handles_empty_inputandhandles_multiple_fenced_blockstests directly address the suggestions from past review comments.src/lib.rs (1)
30-30: Proper public re-export addressing previous feedback.The
replace_ellipsisfunction is correctly re-exported, making it available in the public API. This addresses the missing re-export identified in past review comments.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
README.md(3 hunks)src/ellipsis.rs(1 hunks)src/lib.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
**/*.md
Instructions used from:
Sources:
📄 CodeRabbit Inference Engine
- AGENTS.md
⚙️ CodeRabbit Configuration File
🧬 Code Graph Analysis (1)
src/lib.rs (2)
src/ellipsis.rs (1)
replace_ellipsis(17-46)src/process.rs (3)
process_stream(81-81)process_stream_no_wrap(84-86)process_stream_opts(89-91)
🪛 LanguageTool
README.md
[uncategorized] ~36-~36: Use a comma before ‘so’ if it connects two independent clauses (unless they are closely connected and short).
Context: ... Longer runs are processed left-to-right so leftover dots remain unchanged. - Use...
(COMMA_COMPOUND_SENTENCE_2)
🔇 Additional comments (6)
src/ellipsis.rs (3)
1-7: Excellent module documentation following Rust conventions.The module documentation clearly explains the purpose, behaviour for long sequences, and preservation of code formatting.
8-14: Solid use of LazyLock for regex compilation.The static regex with LazyLock ensures the pattern is compiled once and reused efficiently. The regex pattern correctly matches sequences of three or more dots.
15-46: Well-implemented ellipsis replacement with proper tokenisation.The function correctly:
- Handles empty input early
- Uses tokenisation to preserve code formatting
- Applies regex replacement only to text tokens
- Reconstructs inline code spans with backticks
- Preserves fenced code blocks unchanged
The replacement logic properly handles long sequences by converting groups of three dots to ellipsis characters and preserving remainder dots.
src/lib.rs (3)
9-9: Proper documentation update for the new module.The ellipsis module description accurately reflects its purpose for normalising textual ellipses.
14-14: Correct module declaration following alphabetical ordering.The ellipsis module is properly declared in the correct alphabetical position amongst other modules.
34-34: Enhanced process module exports for configuration options.The addition of
process_stream_optsto the public exports provides external callers with configurable wrapping and ellipsis replacement options, complementing the existing convenience functions.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
README.md(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.md
Instructions used from:
Sources:
⚙️ CodeRabbit Configuration File
🔇 Additional comments (2)
README.md (2)
25-25: CLI synopsis correctly advertises the new flag
The command-line summary now lists--ellipsis, keeping the documentation in
sync with the binary’s behaviour.
35-37: Option description reads well and follows house style
Bullet is wrapped within 80 columns and avoids personal pronouns.
Summary
ellipsismodule to replace...with the Unicode ellipsis--ellipsisCLI flagTesting
make fmtmake lintmake testmake markdownlintmake nixiehttps://chatgpt.com/codex/tasks/task_e_68797c95ec4c832285890f6b5ce9b73a
Summary by Sourcery
Add ellipsis command-line option to replace sequences of three dots with the Unicode ellipsis character
New Features:
Enhancements:
Documentation:
Tests: