diff --git a/AGENTS.md b/AGENTS.md index e3f084c2..f79536ab 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -29,7 +29,7 @@ documentation should omit examples where the example serves only to reiterate the test logic. - **Keep file size manageable.** No single code file may be longer than 400 - lines. Long switch statements or dispatch tables should be broken up by + lines. Long switch statements or dispatch tables should be broken up by feature and constituents colocated with targets. Large blocks of test data should be moved to external data files. @@ -118,6 +118,7 @@ project: - Run `make check-fmt`, `make lint`, and `make test` before committing. These targets wrap the following commands, so contributors understand the exact behaviour and policy enforced: + - `make check-fmt` executes: ```sh @@ -125,6 +126,7 @@ project: ``` validating formatting across the entire workspace without modifying files. + - `make lint` executes: ```sh @@ -133,6 +135,7 @@ project: linting every target with all features enabled and denying all Clippy warnings. + - `make test` executes: ```sh @@ -142,37 +145,60 @@ project: running the full workspace test suite. Use `make fmt` (`cargo fmt --workspace`) to apply formatting fixes reported by the formatter check. + - Clippy warnings MUST be disallowed. + - Fix any warnings emitted during tests in the code itself rather than silencing them. + - Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS. + - Where a function has too many parameters, group related parameters in meaningfully named structs. + - Where a function is returning a large error, consider using `Arc` to reduce the amount of data returned. + - Write unit and behavioural tests for new functionality. Run both before and after making any change. + - Every module **must** begin with a module level (`//!`) comment explaining the module's purpose and utility. 
+ - Document public APIs using Rustdoc comments (`///`) so documentation can be generated with cargo doc. + - Prefer immutable data and avoid unnecessary `mut` bindings. + - Handle errors with the `Result` type instead of panicking where feasible. + - Use explicit version ranges in `Cargo.toml` and keep dependencies up-to-date. + - Avoid `unsafe` code unless absolutely necessary, and document any usage clearly. + - Place function attributes **after** doc comments. + - Do not use `return` in single-line functions. + - Use predicate functions for conditional criteria with more than two branches. + - Lints must not be silenced except as a **last resort**. + - Lint rule suppressions must be tightly scoped and include a clear reason. + - Prefer `expect` over `allow`. + - Use `rstest` fixtures for shared setup. + - Replace duplicated tests with `#[rstest(...)]` parameterized cases. + - Prefer `mockall` for mocks/stubs. + - Use `concat!()` to combine long string literals rather than escaping newlines with a backslash. + - Prefer single line versions of functions where appropriate. i.e., ```rust diff --git a/README.md b/README.md index fcece86a..1f932fb5 100644 --- a/README.md +++ b/README.md @@ -74,6 +74,38 @@ mdtablefix [--version] [--wrap] [--renumber] [--breaks] [--ellipsis] [--fences] - If no files are specified, input is read from stdin and output is written to stdout. +## YAML frontmatter + +Documents that begin with a YAML frontmatter block have that block preserved +exactly while the remainder of the document is formatted. A frontmatter block +starts with a line containing exactly `---` and ends with a line containing +exactly `---` or `...`. Only a block at the very beginning of the document is +recognized as frontmatter. 
+ +Before: + +```markdown +--- +title: My Document +author: Jane Doe +--- +|Character|Catchphrase| +|---|---| +|Speedy|Here come the cats!| +``` + +After running `mdtablefix`: + +```markdown +--- +title: My Document +author: Jane Doe +--- +| Character | Catchphrase | +| --------- | ------------------- | +| Speedy | Here come the cats! | +``` + ## Concurrency When multiple file paths are supplied, `mdtablefix` processes them in parallel diff --git a/docs/architecture.md b/docs/architecture.md index 8f5c8850..efe6f6a2 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -20,6 +20,12 @@ pub fn process_stream_inner(lines: &[String], opts: Options) -> Vec The function combines several helpers documented in `docs/`: +- `frontmatter::split_leading_yaml_frontmatter` detects and splits a leading + YAML frontmatter block from the document body. A valid frontmatter block + starts with `---` on the first line and ends with `---` or `...` before any + body content. The prefix is preserved verbatim while only the body is + processed. This shielding also applies to CLI-only transforms such as + `renumber_lists` and `format_breaks`. - `fences::compress_fences` and `attach_orphan_specifiers` normalize code block delimiters. The latter keeps indentation from the language line when the fence lacks it. Language specifiers explicitly set to `null` @@ -27,8 +33,8 @@ The function combines several helpers documented in `docs/`: `compress_fences` also tolerates spaces within comma-separated specifiers, e.g. `TOML, Ini` becomes `toml,ini`. - `html::convert_html_tables` transforms basic HTML tables into Markdown so \ - they can be reflowed like regular tables. See \ - [HTML table support](#html-table-support-in-mdtablefix). + they can be reflowed like regular tables. See \ + [HTML table support](#html-table-support-in-mdtablefix). - `wrap::wrap_text` applies optional line wrapping. It relies on the `unicode-width` crate for accurate character widths. 
- `wrap::tokenize_markdown` emits `Token` values for custom processing. @@ -48,7 +54,7 @@ incoming lines are buffered or emitted. Once the end of a table or fence is reached, buffered lines are flushed and possibly reformatted. The simplified behaviour is illustrated below. -```mermaid +````mermaid stateDiagram-v2 [*] --> Streaming: Start @@ -72,7 +78,7 @@ stateDiagram-v2 InHtmlTable --> InHtmlTable: Line inside table tag InCodeFence --> Streaming: Line is a fence delimiter -``` +```` Before: diff --git a/docs/documentation-style-guide.md b/docs/documentation-style-guide.md index 7807f229..fe343b8e 100644 --- a/docs/documentation-style-guide.md +++ b/docs/documentation-style-guide.md @@ -9,9 +9,9 @@ Apply these rules to keep the documentation clear and consistent for developers. [Oxford English Dictionary](https://public.oed.com/) locale `en-GB`, which denotes English for the Great Britain market: - suffix -ize in words like _realize_ and _organization_ instead of - -ise endings, + -ise endings, - suffix ‑lyse in words not traced to the Greek ‑izo, ‑izein suffixes, - such as _analyse_, _paralyse_ and _catalyse_, + such as _analyse_, _paralyse_ and _catalyse_, - suffix -our in words such as _colour_, _behaviour_ and _neighbour_, - suffix -re in words such as _calibre_, _centre_ and _fibre_, - double "l" in words such as _cancelled_, _counsellor_ and _cruellest_, @@ -87,7 +87,7 @@ contents of the manual. they do not execute during documentation tests. - Put function attributes after the doc comment. -```rust,no_run +````rust,no_run /// Returns the sum of `a` and `b`. /// /// # Parameters @@ -106,7 +106,7 @@ contents of the manual. 
pub fn add(a: i32, b: i32) -> i32 { a + b } -``` +```` ## Diagrams and images diff --git a/docs/execplans/yaml-frontmatter.md b/docs/execplans/yaml-frontmatter.md new file mode 100644 index 00000000..ca5156f7 --- /dev/null +++ b/docs/execplans/yaml-frontmatter.md @@ -0,0 +1,344 @@ +# Preserve leading YAML frontmatter while formatting Markdown + +This ExecPlan (execution plan) is a living document. The sections +`Constraints`, `Tolerances`, `Risks`, `Progress`, `Surprises & Discoveries`, +`Decision Log`, and `Outcomes & Retrospective` must be kept up to date as work +proceeds. + +Status: DELIVERED + +## Purpose / big picture + +After this change, `mdtablefix` must accept a Markdown document that begins +with a YAML frontmatter block and leave that block byte-for-byte unchanged +while continuing to format the Markdown body normally. A user should be able +to run the formatter with flags such as `--wrap`, `--breaks`, or `--in-place` +and still see the opening delimiter, YAML keys, and closing delimiter exactly +as they were written. + +The observable success case is a file that starts with: + +```plaintext +--- +name: weaver +description: Preserve this YAML metadata exactly. +--- +``` + +and ends with the same frontmatter block unchanged after formatting, while the +Markdown that follows is still reflowed, renumbered, or otherwise rewritten +according to the selected options. + +## Constraints + +- Only a leading document block counts as YAML frontmatter. A `---` line later + in the document must keep its existing Markdown meaning. +- The detected frontmatter block must be copied verbatim, including both + delimiter lines and all interior lines. +- No new external dependencies may be introduced. +- Public CLI flags and existing library entry points in `src/lib.rs` must stay + stable. +- The implementation must keep files below the repository's 400-line limit. +- The change must include both focused unit tests and behavioural CLI tests. 
+- The user guide must be updated. In this repository that means at minimum + `README.md`, and `docs/architecture.md` should also be updated if the + processing pipeline changes materially. + +## Tolerances (exception triggers) + +- Scope: if the work requires changes to more than 8 files or roughly 250 net + lines of code, stop and re-evaluate the design. +- Interfaces: if preserving frontmatter requires changing a public function + signature or adding a new CLI flag, stop and escalate. +- Dependencies: if a new crate seems necessary, stop and escalate. +- Ambiguity: if the required delimiter rules are not satisfied by the standard + leading YAML forms (`---` opener with `---` or `...` closer), stop and ask + for clarification before coding. +- Iterations: if the new or updated tests still fail after 3 focused fix + cycles, stop and document the blocker. +- Time: if one milestone takes more than 2 hours, stop and record why. + +## Risks + +- Risk: frontmatter might still be modified by CLI-only transforms such as + `renumber_lists` or `format_breaks` after the main stream processor returns. + Severity: high + Likelihood: medium + Mitigation: protect the body split at the highest shared pipeline boundary + and add a CLI regression that includes `--breaks`. + +- Risk: delimiter detection can become too permissive and accidentally treat a + thematic break or ordinary `---` block as frontmatter. + Severity: medium + Likelihood: medium + Mitigation: only detect frontmatter when the very first line is a delimiter + and require a matching closing delimiter before shielding the block. + +- Risk: `src/process.rs` is already close to the repository's file-length + ceiling. + Severity: medium + Likelihood: high + Mitigation: place the detector and splitter logic in a new small module + instead of extending `src/process.rs` significantly. 
+ +## Progress + +- [x] (2026-04-05 22:45Z) Reviewed the current processing pipeline, test + layout, and user-facing documentation surfaces. +- [x] (2026-04-09) Add a shared helper for detecting and splitting leading YAML + frontmatter. +- [x] (2026-04-09) Thread the helper through the library and CLI formatting pipeline + so all transforms skip the frontmatter prefix. +- [x] (2026-04-09) Add unit and behavioural regression tests covering detection, + wrapping, and `--breaks`. +- [x] (2026-04-09) Update `README.md` and `docs/architecture.md`. +- [x] (2026-04-09) Run `make check-fmt`, `make lint`, `make test`, `make markdownlint`, + and `make nixie` if Mermaid content changes. + +## Surprises & discoveries + +- Observation: `src/main.rs` applies `renumber_lists` and `format_breaks` + after `process_stream_opts`, so shielding only `process_stream_inner` would + still allow the frontmatter delimiters to be rewritten. + Evidence: `process_lines` in `src/main.rs`. + Impact: the plan must protect the body before or around CLI-only transforms, + not just inside `src/process.rs`. + +- Observation: `src/process.rs` is 343 lines before this feature. + Evidence: `leta files` output for `src/process.rs`. + Impact: new helper logic should live in its own module to stay within the + repository limit and keep tests readable. + +## Decision log + +- Decision: use a shared internal splitter for leading YAML frontmatter rather + than adding special cases separately in each transform. + Rationale: one detector keeps the delimiter rules consistent and reduces the + chance that a later pipeline stage mutates the protected prefix. + Date/Author: 2026-04-05 22:45Z / Droid + +- Decision: treat unmatched opening delimiters as ordinary Markdown instead of + partially shielding the document. + Rationale: this avoids swallowing the entire file into a special mode and + preserves current behaviour for malformed input. 
+ Date/Author: 2026-04-05 22:45Z / Droid + +## Outcomes & retrospective + +The frontmatter splitter was successfully implemented in the `frontmatter` +module and integrated through both the `process` module and `main` module. +Test coverage was added covering detection, wrapping, and `--breaks` flags +for both library and CLI paths. All transforms now correctly skip the +frontmatter prefix, preserving the leading YAML block exactly while +formatting the Markdown body. + +## Context and orientation + +The main formatting pipeline lives in `src/process.rs`. It handles table +reflow, fence tracking, HTML table conversion, heading conversion, wrapping, +ellipsis replacement, and footnote conversion through +`process_stream_inner(lines, opts)`. + +The CLI entry point lives in `src/main.rs`. Its `process_lines` function first +calls `process_stream_opts`, then applies ordered-list renumbering and +thematic-break formatting. This is important because the YAML delimiter `---` +looks like a thematic break, so protecting the frontmatter only in +`src/process.rs` is insufficient. + +In-place file rewriting is handled by `src/io.rs`, which delegates to the +library functions and therefore benefits automatically once the shared library +pipeline preserves frontmatter correctly. + +Focused library tests already live beside the implementation in +`src/process.rs` and `src/io.rs`. Behavioural CLI tests live in `tests/cli.rs` +and `tests/wrap/cli.rs`. The user-facing guide is `README.md`. The processing +pipeline is described in `docs/architecture.md`. + +## Plan of work + +Stage A is a small, isolated detector module. Add `src/frontmatter.rs` with a +module-level comment and a private helper that splits the input into an +unchanged frontmatter prefix and a Markdown body slice. The helper should only +match when `lines.first()` is exactly the YAML opener and a closing delimiter +is found before the body begins. 
If no valid closer exists, return an empty +prefix and the original input as the body. + +Stage B wires the helper through the library pipeline. Update `src/lib.rs` to +declare the new module. In `src/process.rs`, split the input first, run the +existing processing logic only on the body slice, and then prepend the +unchanged prefix to the processed body. Keep the current ordering of fences, +HTML tables, wrapping, headings, ellipsis, and footnotes for the body. + +Stage C wires the same protection through the CLI-only transforms. In +`src/main.rs`, split the original input once in `process_lines`, pass only the +body slice through `process_stream_opts`, `renumber_lists`, and `format_breaks` +as needed, then prepend the original prefix before returning the final lines. +This ensures `--breaks` cannot rewrite the `---` delimiters in the frontmatter +block. + +Stage D adds regression coverage. Put detector-specific unit tests in +`src/frontmatter.rs` and a pipeline regression in `src/process.rs` or a small +new test module. Add at least one behavioural CLI test in `tests/cli.rs` +covering a document with leading frontmatter plus a paragraph or table body. +The CLI test should enable `--breaks` and one ordinary formatting option such +as `--wrap` so it proves both preservation and continued formatting. + +Stage E updates the docs. Add a short YAML frontmatter note and example to +`README.md` so users know the block is preserved. Update +`docs/architecture.md` to describe the leading-frontmatter split before the +rest of the formatting pipeline. + +Each stage ends with focused validation before moving on. 
+ +## Concrete steps + +Work from the repository root: + +```bash +pwd +``` + +Expected: + +```plaintext +/home/leynos/Projects/mdtablefix.worktrees/yaml-frontmatter +``` + +Add the detector and its focused tests, then run the smallest relevant test +set first: + +```bash +cargo test frontmatter --lib +``` + +Expected: + +```plaintext +running tests +test ...frontmatter... ok +``` + +After wiring the library and CLI paths, run focused regressions: + +```bash +cargo test process::tests:: --lib +cargo test --test cli yaml_frontmatter +``` + +Manually verify the user-visible behaviour with a CLI example: + +```bash +printf '%s\n' \ + '---' \ + 'name: weaver' \ + 'description: short example' \ + '---' \ + '' \ + '|A|B|' \ + '|1|2|' | cargo run -- --wrap --breaks +``` + +Expected: + +```plaintext +--- +name: weaver +description: short example +--- + +| A | B | +| 1 | 2 | +``` + +Finish with repository validators, run sequentially: + +```bash +make check-fmt +make lint +make test +make markdownlint +make nixie +``` + +If `docs/architecture.md` does not change any Mermaid content, `make nixie` +may be skipped. + +## Validation and acceptance + +Acceptance means all of the following are true: + +- A document that starts with a valid YAML frontmatter block keeps that block + exactly unchanged after formatting. +- The Markdown body that follows is still formatted normally, including table + reflow and optional wrapping. +- `--breaks` does not rewrite the frontmatter delimiters. +- The new detector rejects malformed or non-leading cases without changing + existing behaviour elsewhere. +- The README explains the feature clearly enough for a user to discover it. + +Quality criteria: + +- Tests: the new unit tests and CLI regression tests pass, and `make test` + passes for the full suite. +- Lint: `make lint` passes with no warnings. +- Formatting: `make check-fmt` passes. +- Docs: `make markdownlint` passes, and `make nixie` passes if Mermaid content + changed. 
+ +## Idempotence and recovery + +The planned edits are safe to repeat because the detector only changes control +flow, not persisted state outside the repository. If a step goes wrong, revert +the affected file and rerun the focused tests before continuing. The manual +CLI example is read-only and may be rerun as many times as needed. + +## Artifacts and notes + +Key repository evidence gathered before implementation: + +```plaintext +src/process.rs -> core stream processor, already 343 lines +src/main.rs -> CLI-only post-processing for renumbering and thematic breaks +tests/cli.rs -> behavioural CLI coverage +README.md -> current user guide +docs/architecture.md -> pipeline description +``` + +The most failure-prone path is `--breaks`, because it can legally rewrite a +plain `---` line outside frontmatter. The tests must therefore include that +flag. + +## Interfaces and dependencies + +Do not add dependencies. + +Add a new internal module at `src/frontmatter.rs` with a helper shaped like: + +```rust +#[doc(hidden)] +pub mod frontmatter; +#[doc(hidden)] +pub use frontmatter::split_leading_yaml_frontmatter; +``` + +The helper `split_leading_yaml_frontmatter` returns `(prefix, body)`, where +`prefix` is the untouched leading YAML block, or an empty slice if no valid +block exists. The module and helper are marked `#[doc(hidden)]` to keep them +out of the public API documentation while remaining accessible to the binary +crate. + +`src/process.rs` calls the helper in `process_stream`, `process_stream_no_wrap`, +and `process_stream_opts` before existing body processing. `src/main.rs` calls +the same helper in `process_lines` before CLI-only transforms (`--renumber`, +`--breaks`). + +Interface note: The `frontmatter` module is exported as `pub` with +`#[doc(hidden)]` rather than `pub(crate)` because the binary crate (`main.rs`) +requires access to `split_leading_yaml_frontmatter`. 
The binary and library are +separate crate targets, so `pub(crate)` would not allow the binary to access +the symbol. Using `#[doc(hidden)]` prevents the API from appearing in docs +while maintaining the necessary visibility. + +Revision note: Delivered. The implementation follows the plan with the +visibility adjustment noted above. All tests pass and the feature is ready +for use. diff --git a/docs/rust-doctest-dry-guide.md b/docs/rust-doctest-dry-guide.md index 8ac95f11..31d04264 100644 --- a/docs/rust-doctest-dry-guide.md +++ b/docs/rust-doctest-dry-guide.md @@ -18,13 +18,13 @@ running within the library's own context, but as an entirely separate, temporary crate.[^1] When a developer executes `cargo test --doc`, `rustdoc` initiates a multi-stage process for every code -block found in the documentation comments [^3]: +block found in the documentation comments[^3]: 1. **Parsing and Extraction**: `rustdoc` first parses the source code of the library, resolving conditional compilation attributes (`#[cfg]`) to determine which items are active and should be documented for the current target.[^2] It then extracts all code examples enclosed in triple-backtick - fences (\`\`\`\`). + fences (\`\`\`). 2. **Code Generation**: For each extracted code block, `rustdoc` performs a textual transformation to create a complete, self-contained Rust program. If @@ -55,9 +55,9 @@ This "separate crate" paradigm has two immediate and significant consequences that shape all advanced doctesting patterns. First, **API visibility is strictly limited to public items**. Because the -doctest is compiled as an external crate, it can only access functions. -Structs, traits and modules are marked with the `pub` keyword. It has no access -to private items or even crate-level public items (e.g., `pub(crate)`). This is +doctest is compiled as an external crate, it can only access functions, +structs, traits, and modules marked with the `pub` keyword. 
It has no access to +private items or even crate-level public items (e.g., `pub(crate)`). This is not a bug or an oversight but a fundamental aspect of the design, enforcing the perspective of an external consumer.[^1] @@ -71,7 +71,7 @@ CI/CD cycle, a common pain point in the Rust community.[^2] The architectural purity of the `rustdoc` model—its insistence on simulating an external user—creates a fundamental trade-off. On one hand, it provides an unparalleled guarantee that the public documentation is accurate and that the -examples work as advertised, creating true "living documentation".[^8] On the +examples work as advertised, creating true "living documentation".[^7] On the other hand, this same purity prevents the use of doctests for verifying documentation of internal, private APIs. This forces a bifurcation of documentation strategy. Public-facing documentation can be tied directly to @@ -97,32 +97,30 @@ clear, illustrative, and robust. Doctests reside within documentation comments. Rust recognizes two types: -- **Outer doc comments (**`///`**)**: These document the item that follows them - (e.g., a function, struct, or module). This is the most common type.[^8] +- **Outer doc comments (`///`)**: These document the item that follows them + (e.g., a function, struct, or module). This is the most common type.[^7] -- **Inner doc comments (**`//!`**)**: These document the item they are inside - of (e.g., a module or the crate itself). They are typically used at the top - of `lib.rs` or `mod.rs` to provide crate- or module-level documentation.[^9] +- **Inner doc comments (`//!`)**: These document the item they are inside + (e.g., a module or the crate itself). They are typically used at the top of + `lib.rs` or `mod.rs` to provide crate- or module-level documentation.[^8] -Within these comments, a code block is denoted by triple backticks. 
while -`rustdoc` defaults to assuming the language is Rust, explicitly adding the rust -language specifier (e.g., `rust`) is considered good practice for clarity.[^3] - -A doctest is considered to "pass" if it compiles successfully and runs to -completion without panicking. To verify that a function produces a specific -output, developers should use the standard assertion macros, such as `assert!`, -`assert_eq!`, and `assert_ne!`.[^3] +Within these comments, a code block is denoted by triple back-ticks (```). +While `rustdoc` defaults to Rust syntax, explicitly add the `rust` language +specifier for clarity.[^3] A doctest "passes" when it compiles and runs +without panicking. To assert specific outcomes, use the standard macros +`assert!`, `assert_eq!`, and `assert_ne!`.[^3] ### 2.2 The Philosophy of a Good Example The purpose of a documentation example extends beyond merely demonstrating syntax. A reader can typically be expected to understand the mechanics of calling a function or instantiating a struct. A truly valuable example -illustrates *why* and in *what context* an item should be used.[^10] It should +illustrates *why* and in *what context* an item should be used.[^9] It should tell a small story or solve a miniature problem that illuminates the item's -purpose. For instance, an example for `String::clone()` should not just show -`hello.clone();` but should demonstrate a scenario where ownership rules -necessitate creating a copy.[^10] +purpose. For instance, an example for + +`String::clone()` should not just show `hello.clone();`, but should demonstrate +a scenario where ownership rules necessitate creating a copy.[^9] To achieve this, examples must be clear and concise. Any code that is not directly relevant to the point being made—such as complex setup, boilerplate, @@ -138,18 +136,18 @@ type of `()`, while the `?` operator can only be used in a function that returns a `Result` or `Option`. 
This mismatch leads to a compilation error.[^3] Using `.unwrap()` or `.expect()` in examples is strongly discouraged. It is -considered an antipattern because users often copy example code verbatim, and -encouraging panicking on errors is contrary to robust application design.[^10] +considered an anti-pattern because users often copy example code verbatim, and +encouraging panicking on errors is contrary to robust application design.[^9] Instead, two canonical solutions exist. -Solution [^1]: The Explicit main Function +Solution 1: The Explicit main Function The most transparent and recommended approach is to manually write a main function within the doctest that returns a Result. This leverages the Termination trait, which is implemented for Result. The surrounding boilerplate can then be hidden from the rendered documentation. -```rust +```Rust /// # Examples /// /// ``` @@ -159,21 +157,21 @@ can then be hidden from the rendered documentation. /// let config = "key=value".parse::()?; /// assert_eq!(config.get("key"), Some("value")); /// # -/// # OK(()) +/// # Ok(()) /// # } /// ``` ``` In this pattern, the reader only sees the core, fallible code, while the test -itself is a complete, well-behaved program.[^10] +itself is a complete, well-behaved program.[^9] -Solution [^2]: The Implicit Result-Returning main +Solution 2: The Implicit Result-Returning main rustdoc provides a lesser-known but more concise shorthand for this exact scenario. If a code block ends with the literal token (()), rustdoc will automatically wrap the code in a main function that returns a Result. -```rust +```Rust /// # Examples /// /// ``` @@ -201,8 +199,9 @@ human-readable example and what constitutes a complete, compilable program. Its primary use cases include: 1. 
**Hiding** `main` **Wrappers**: As demonstrated in the error-handling - examples, the entire `fn main() -> Result<…> {… }` and `OK(())` scaffolding - can be hidden, presenting the user with only the relevant code.[^10] + examples, the entire `fn main() -> Result<...> {... }` and `Ok(())` + scaffolding can be hidden, presenting the user with only the relevant + code.[^9] 2. **Hiding Setup Code**: If an example requires some preliminary setup—like creating a temporary file, defining a helper struct for the test, or @@ -210,24 +209,24 @@ primary use cases include: on the API item being documented.[^3] 3. **Hiding** `use` **Statements**: While often useful to show which types are - involved, `use` statements can sometimes be hidden to de-clutter very simple + involved, `use` statements can sometimes be hidden to declutter simple examples. The existence of features like hidden lines and the `(())` shorthand reveals a core tension in `rustdoc`'s design. The compilation model is rigid: every test must be a valid, standalone program.[^2] However, the ideal documentation example is often just a small, illustrative snippet that is not a valid program -on its own.[^10] These ergonomic features are pragmatic "patches" designed to +on its own.[^9] These ergonomic features are pragmatic "patches" designed to resolve this conflict. They allow the developer to inject the necessary boilerplate to satisfy the compiler without burdening the human reader with irrelevant details. Understanding them as clever workarounds, rather than as first-class language features, helps explain their sometimes quirky, text-based -behavior. +behaviour. ## Advanced Doctest Control and Attributes Beyond basic pass/fail checks, `rustdoc` provides a suite of attributes to -control doctest behavior with fine-grained precision. These attributes, placed +control doctest behaviour with fine-grained precision. 
These attributes, placed in the header of a code block (e.g., \`\`\`\`ignore\`), allow developers to handle expected failures, non-executable examples, and other complex scenarios. @@ -237,13 +236,13 @@ Choosing the correct attribute is critical for communicating the intent of an example and ensuring the test suite provides meaningful feedback. The following table provides a comparative reference for the most common doctest attributes. -| Attribute | Action | Test Outcome | Primary Use Case & Warnings | -| ------------ | ------------------------------------------------------------------- | -------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| ignore | Skips both compilation and execution. | ignored | Use Case: For pseudocode, examples known to be broken, or to temporarily disable a test. Warning: Provides no guarantee that the code is even syntactically correct. Generally discouraged in favour of more specific attributes.3 | -| should_panic | Compiles and runs the code. The test passes if the code panics. | OK on panic, failed if it does not panic. | Use Case: Demonstrating functions that are designed to panic on invalid input (e.g., indexing out of bounds). | -| compile_fail | Attempts to compile the code. The test passes if compilation fails. | OK on compilation failure, failed if it compiles successfully. | Use Case: Illustrating language rules, such as the borrow checker or type system constraints. Warning: Highly brittle. A future Rust version might make the code valid, causing the test to unexpectedly fail.4 | -| no_run | Compiles the code but does not execute it. | OK if compilation succeeds. 
| Use Case: Essential for examples with undesirable side effects in a test environment, such as network requests, filesystem I/O, or launching a GUI. Guarantees the example is valid Rust code without running it.5 | -| edition2021 | Compiles the code using the specified Rust edition's rules. | OK on success. | Use Case: Demonstrating syntax or idioms that are specific to a particular Rust edition (e.g., edition2018, edition2021).4 | +| Attribute | Action | Test Outcome | Primary Use Case & Warnings | +| ------------ | ------------------------------------------------------------------- | -------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| ignore | Skips both compilation and execution. | ignored | Use Case: For pseudocode, examples known to be broken, or to temporarily disable a test. Warning: Provides no guarantee that the code is even syntactically correct. Generally discouraged in favour of more specific attributes.[^3] | +| should_panic | Compiles and runs the code. The test passes if the code panics. | OK on panic, failed if it does not panic. | Use Case: Demonstrating functions that are designed to panic on invalid input (e.g., indexing out of bounds). | +| compile_fail | Attempts to compile the code. The test passes if compilation fails. | OK on compilation failure, failed if it compiles successfully. | Use Case: Illustrating language rules, such as the borrow checker or type system constraints. Warning: Highly brittle. A future Rust version might make the code valid, causing the test to unexpectedly fail.[^4] | +| no_run | Compiles the code but does not execute it. | OK if compilation succeeds. 
| Use Case: Essential for examples with undesirable side effects in a test environment, such as network requests, filesystem I/O, or launching a GUI. Guarantees the example is valid Rust code without running it.[^5] | +| edition20xx | Compiles the code using the specified Rust edition's rules. | OK on success. | Use Case: Demonstrating syntax or idioms that are specific to a particular Rust edition (e.g., edition2018, edition2021).[^4] | ### 3.2 Detailed Attribute Breakdown @@ -251,10 +250,10 @@ table provides a comparative reference for the most common doctest attributes. to do nothing with the code block. It is almost always better to either fix the example using hidden lines or use a more descriptive attribute like `no_run`.[^3] Its main legitimate use is for non-Rust code blocks or - illustrative pseudo-code. + illustrative pseudocode. - `should_panic`: This attribute inverts the normal test condition. It is used - to document and verify behavior that intentionally results in a panic. The + to document and verify behaviour that intentionally results in a panic. The test will fail if the code completes successfully or panics for a reason other than the one expected (if a specific panic message is asserted).[^3] @@ -267,10 +266,10 @@ table provides a comparative reference for the most common doctest attributes. - `no_run`: This attribute strikes a crucial balance between test verification and practicality. For an example that demonstrates how to download a file - from the internet, developers want to ensure the example code is - syntactically correct and uses the API properly, but CI servers should not - actually perform a network request every time tests run. `no_run` provides - this guarantee by compiling the code without executing it.[^5] + from the internet, the example code must be syntactically correct and use the + API properly, but it is undesirable for the CI server to perform a network + request during every test run. 
`no_run` provides this guarantee by compiling + the code without executing it.[^5] - `edition20xx`: This attribute allows an example to be tested against a specific Rust edition. This is important for crates that support multiple @@ -301,7 +300,7 @@ flag provided by `rustdoc`: `doctest`. A common mistake is to try to place shared test logic in a block guarded by `#[cfg(test)]`. This will not work, because `rustdoc` does not enable the `test` configuration flag during its compilation process; `#[cfg(test)]` is reserved for unit and integration tests -run directly by `cargo test`.[^12] +run directly by `cargo test`.[^11] Instead, `rustdoc` sets its own unique `doctest` flag. By guarding a module or function with `#[cfg(doctest)]`, developers can write helper code that is @@ -310,9 +309,9 @@ excluded from normal production builds and standard unit test runs, preventing any pollution of the final binary or the public API. The typical implementation pattern is to create a private helper module within -your library: +the library: -```rust +```Rust // In lib.rs or a submodule /// A function that requires a complex environment to test. @@ -325,12 +324,12 @@ your library: /// let mut ctx = setup_test_environment()?; /// let result = my_func_that_needs_env(&mut ctx); /// assert!(result.is_ok()); -/// # OK(()) +/// # Ok(()) /// # } /// ``` pub fn my_func_that_needs_env(ctx: &mut TestContext) -> Result<(), ()> { - // … function logic… - OK(()) + //... function logic... + Ok(()) } // This module and its contents are only compiled for doctests. @@ -341,24 +340,24 @@ mod doctest_helpers { use std::io::Result; pub struct TestContext { - // … fields for the test context… + //... fields for the test context... } pub fn setup_test_environment() -> Result<TestContext> { // All the complex, shared setup logic lives here once. - println!("Setting up test environment…"); - OK(TestContext { /*… */ }) + println!("Setting up test environment..."); + Ok(TestContext { /*...
*/ }) } } // A struct that might be needed by the public function signature. // It can be defined normally. -pub struct TestContext { /*… */ } +pub struct TestContext { /*... */ } ``` This pattern is the most effective way to achieve DRY doctests. It centralizes setup logic, improves maintainability, and cleanly separates testing concerns -from production code.[^12] +from production code.[^11] ### 4.3 Advanced DRY: Programmatic Doctest Generation @@ -372,7 +371,7 @@ Crates like `quote-doctest` address this by allowing developers to programmatically construct a doctest from a `TokenStream`. This enables the generation of doctests from the same source of truth that generates the code they are intended to test, representing the ultimate application of the DRY -principle in this domain.[^14] +principle in this domain.[^12] ## Conditional Compilation Strategies for Doctests @@ -394,11 +393,11 @@ a Windows machine). **The Mechanism**: `rustdoc` always invokes the compiler with the `--cfg doc` flag set. By adding `doc` to an item's `#[cfg]` attribute, a developer can instruct the compiler to include that item specifically for documentation -builds.[^15] +builds.[^13] **The Pattern**: -```rust +```Rust /// A socket that is only available on Unix platforms. #[cfg(any(unix, doc))] pub struct UnixSocket; @@ -413,7 +412,7 @@ This distinction highlights the "cfg duality." The `#[cfg(doc)]` attribute controls the *table of contents* of the documentation; it determines which items are parsed and rendered. The actual compilation of a doctest, however, happens in a separate, later stage. In that stage, the `doc` cfg is *not* -passed to the compiler.[^15] The compiler only sees the host +passed to the compiler.[^13] The compiler only sees the host `cfg` (e.g., `target_os = "windows"`), so the `UnixSocket` type is not available, and the test fails to compile. `#[cfg(doc)]` affects what is @@ -427,12 +426,12 @@ via `cargo test --doc --features "serde"`.
Two primary patterns exist to achieve this. -Pattern [^1]: #\[cfg\] Inside the Code Block +Pattern 1: #\[cfg\] Inside the Code Block This pattern involves placing a #\[cfg\] attribute directly on the code within the doctest itself. -```rust +```Rust /// This example only runs if the "serde" feature is enabled. /// /// ``` @@ -447,16 +446,16 @@ the doctest itself. When the `"serde"` feature is disabled, the code inside the block is compiled out. The doctest becomes an empty program that runs, does nothing, and is -reported as `OK`. While simple to write, this can be misleading, as the test -suite reports a "pass" for a test that was effectively skipped.[^16] +reported as `ok`. While simple to write, this can be misleading, as the test +suite reports a "pass" for a test that was effectively skipped.[^14] -Pattern [^2]: cfg_attr to Conditionally ignore the Test +Pattern 2: cfg_attr to Conditionally ignore the Test A more explicit and accurate pattern uses the cfg_attr attribute to conditionally add the ignore flag to the doctest's header. This is typically done with inner doc comments (//!). -```rust +```Rust //! #![cfg_attr(not(feature = "serde"), doc = "```ignore")] //! #![cfg_attr(feature = "serde", doc = "```")] //! // Example code that requires the "serde" feature. @@ -471,29 +470,29 @@ With this pattern, if the `"serde"` feature is disabled, the test is marked as the feature is enabled, the `ignore` is omitted, and the test runs normally. 
This approach provides clearer feedback but is significantly more verbose and less ergonomic, especially when applied to outer (`///`) doc comments, as the -`cfg_attr` must be applied to every single line of the comment.[^16] +`cfg_attr` must be applied to every single line of the comment.[^14] -### 5.3 Displaying Feature Requirements in Docs: `#[doc(cfg(…))]` +### 5.3 Displaying Feature Requirements in Docs: `#[doc(cfg(...))]` To complement conditional execution, Rust provides a way to visually flag feature-gated items in the generated documentation. This is achieved with the -`#[doc(cfg(…))]` attribute, which requires enabling the `#![feature(doc_cfg)]` -feature gate at the crate root. +`#[doc(cfg(...))]` attribute, which requires enabling the +`#![feature(doc_cfg)]` feature gate at the crate root. -```rust +```Rust // At the crate root (lib.rs) #![feature(doc_cfg)] // On the feature-gated item #[cfg(feature = "serde")] #[doc(cfg(feature = "serde"))] -pub fn function_requiring_serde() { /*… */ } +pub fn function_requiring_serde() { /*... */ } ``` This will render a banner in the documentation for `function_requiring_serde` that reads, "This is only available when the `serde` feature is enabled." This attribute is purely for documentation generation and is independent of, but -often used alongside, the conditional test execution patterns.[^16] +often used alongside, the conditional test execution patterns.[^14] ## Doctests in the Wider Project Ecosystem @@ -506,15 +505,15 @@ limitations is key to maintaining a healthy and well-tested Rust project. A robust testing strategy leverages three distinct types of tests, each with its own purpose: -- **Doctests**: These are ideal for simple, "happy-path" examples of your +- **Doctests**: These are ideal for simple, "happy-path" examples of the public API. Their dual purpose is to provide clear documentation for users and to act as a basic sanity check that the examples remain correct over time. 
They should be easy to read and focused on illustrating a single concept.[^6] -- **Unit Tests (**`#[test]` **in** `src/`**)**: These are for testing the - nitty-gritty details of your implementation. They are placed in submodules - within your source files (often `mod tests {…}`) and are compiled only with +- **Unit tests (`#[test]` in `src/`)**: These are for testing the + nitty-gritty details of the implementation. They are placed in submodules + within the source files (often `mod tests {... }`) and are compiled only with `#[cfg(test)]`. Because they live inside the crate, they can access private functions and modules, making them perfect for testing internal logic, edge cases, and specific error conditions.[^1] @@ -523,13 +522,13 @@ its own purpose: from a completely external perspective, much like doctests. However, they are not constrained by the need to be readable documentation. They are suited for testing complex user workflows, interactions between multiple API entry - points, and the overall behavior of the library as a black box.[^6] + points, and the overall behaviour of the library as a black box.[^6] ### 6.2 The Unsolved Problem: Testing Private APIs As established, the `rustdoc` compilation model makes testing private items in doctests impossible by design.[^1] The community has developed several -workarounds, but each comes with significant trade-offs [^1]: +workarounds, but each comes with significant trade-offs[^1]: 1. `ignore` **the test**: This allows the example to exist in the documentation but sacrifices the guarantee of correctness. It is the least desirable @@ -543,8 +542,8 @@ workarounds, but each comes with significant trade-offs [^1]: 3. **Use** `cfg_attr` **to conditionally make items public**: This involves adding an attribute like `#[cfg_attr(feature = "doctest-private", visibility::make(pub))]` to every - private item you wish to test. While robust, it is highly invasive and adds - significant boilerplate throughout the codebase. 
+ private item that requires testing. While robust, it is highly invasive and + adds significant boilerplate throughout the codebase. The expert recommendation is to acknowledge this limitation and not fight the tool. Do not compromise a clean API design for the sake of doctests. Use @@ -559,28 +558,28 @@ Beyond architectural considerations, developers face several practical, real-world challenges when working with doctests. - **The** `README.md` **Dilemma**: A project's `README.md` file serves multiple - audiences. It needs to render cleanly on platforms like GitHub and crates.io, - where hidden lines (`#…`) look like ugly, commented-out code. At the same - time, it should contain testable examples, which often require hidden lines - for setup.[^11] The best practice is to avoid maintaining the README - manually. Instead, use a tool like + audiences. It needs to render cleanly on platforms like GitHub and + [crates.io](http://crates.io), where hidden lines (`#...`) look like ugly, + commented-out code. At the same time, it should contain testable examples, + which often require hidden lines for setup.[^10] The best practice is to + avoid maintaining the README manually. Instead, use a tool like - `cargo-readme`. This tool generates a `README.md` file from your crate-level + `cargo-readme`. This tool generates a `README.md` file from the crate-level documentation (in `lib.rs`), automatically stripping out the hidden lines. This provides a single source of truth that is both fully testable via `cargo test --doc` and produces a clean, professional README for external - sites.[^11] + sites.[^10] - **Developer Ergonomics in IDEs**: Writing code inside documentation comments can be a subpar experience. 
IDEs and tools like `rust-analyzer` often provide limited or no autocompletion, real-time error checking, or refactoring - support for code within a comment block.[^18] A common and effective workflow + support for code within a comment block.[^15] A common and effective workflow to mitigate this is to first write and debug the example as a standard `#[test]` function in a temporary file or test module. This allows the developer to leverage the full power of the IDE. Once the code is working correctly, it can be copied into the doc comment, and the necessary - formatting (`///`, `#`, etc.) can be applied.[^18] + formatting (`///`, `#`, etc.) can be applied.[^15] ## Conclusion and Recommendations @@ -591,23 +590,23 @@ have evolved to manage its constraints, developers can write doctests that are effective, ergonomic, and maintainable. To summarize the key principles for mastering doctests: -1. **Embrace the Model**: Always remember that a doctest is an external - integration test compiled in a separate crate. This mental model explains - nearly all of its behavior. +1. **Embrace the Model**: Treat a doctest as an external integration test + compiled in a separate crate; this mental model explains nearly all of its + behaviour. 2. **Prioritize Clarity**: Write examples that teach the *why*, not just the *how*. Use hidden lines (`#`) ruthlessly to eliminate boilerplate and focus the reader's attention on the relevant code. -3. **Handle Errors Gracefully**: For fallible functions, always use - the `fn main() -> Result<…>` pattern. Hide the boilerplate and avoid +3. **Handle Errors Gracefully**: For examples of fallible functions, always use + the `fn main() -> Result<...>` pattern, hiding the boilerplate. Avoid `.unwrap()` to promote robust error-handling practices. 4. **Be DRY**: When setup logic is shared across multiple examples, centralize it in a helper module guarded by `#[cfg(doctest)]` to avoid repetition. -5. 
**Master** `cfg`: Use `#[cfg(doc)]` to control an item's *visibility* in - the final documentation. Use `#[cfg(feature = "…")]` or other `cfg` flags +5. **Master** `cfg`: Use `#[cfg(doc)]` to control an item's *visibility* in the + final documentation. Use `#[cfg(feature = "...")]` or other `cfg` flags *inside* the test block to control its conditional *execution*. Do not confuse the two. @@ -617,63 +616,45 @@ mastering doctests: unit or integration test. Do not compromise your API design or test clarity by forcing a square peg into a round hole. Use the right tool for the job. -### Works cited +### **Works cited** [^1]: rust - How can I write documentation tests for private modules …, - accessed on July 15, 2025, - - -[^2]: Rustdoc doctests need fixing — Swatinem, accessed on 15 July 2025, - - -[^3]: Documentation tests - The rustdoc book - Rust Documentation, accessed on - July 15, 2025, - - +accessed on July 15, 2025, + +[^2]: Rustdoc doctests need fixing - Swatinem, accessed on July 15, 2025, + +[^3]: Documentation tests - The rustdoc book - Rust Documentation, accessed on +July 15, 2025, [^4]: Documentation tests - - GitHub Pages, accessed on July 15, 2025, - - + [^5]: Documentation tests - - MIT, accessed on July 15, 2025, - - + [^6]: How to organize your Rust tests - LogRocket Blog, accessed on July 15, - 2025, - - - -[^8]: Writing Rust Documentation - DEV Community, accessed on July 15, 2025, - - -[^9]: The rustdoc book, accessed on July 15, 2025, - - -[^10]: Documentation - Rust API Guidelines, accessed on July 15, 2025, - - -[^11]: Best practice for doc testing README - help - The Rust Programming +2025, + +[^7]: Writing Rust Documentation - DEV Community, accessed on July 15, 2025, + +[^8]: The rustdoc book, accessed on July 15, 2025, + +[^9]: Documentation - Rust API Guidelines, accessed on July 15, 2025, + +[^10]: Best practice for doc testing README - help - The Rust Programming Language Forum, accessed on July 15, 2025, - -[^12]: Compile_fail doc
test ignored in cfg(test) - help - The Rust Programming - Language Forum, accessed on July 15, 2025, - - - accessed on July 15, 2025, - - -[^14]: quote_doctest - Rust - [Docs.rs](http://Docs.rs), accessed on July 15, - 2025, - -[^15]: Advanced features - The rustdoc book - Rust Documentation, accessed on +[^11]: Compile_fail doc test ignored in cfg(test) - help - The Rust Programming +Language Forum, accessed on July 15, 2025, + + accessed on July 15, 2025, + +[^12]: quote_doctest - Rust - [Docs.rs](http://Docs.rs), accessed on July 15, +2025, +[^13]: Advanced features - The rustdoc book - Rust Documentation, accessed on July 15, 2025, - -[^16]: rust - How can I conditionally execute a module-level doctest based …, - accessed on July 15, 2025, - - - have doctests?, accessed on July 15, 2025, - - -[^18]: How do you write your doc tests? : r/rust - Reddit, accessed on July 15, - 2025, - +[^14]: rust - How can I conditionally execute a module-level doctest based …, +accessed on July 15, 2025, + + have doctests?, accessed on July 15, 2025, + +[^15]: How do you write your doc tests? : r/rust - Reddit, accessed on July 15, +2025, + diff --git a/docs/rust-testing-with-rstest-fixtures.md b/docs/rust-testing-with-rstest-fixtures.md index 8c517b15..e570c886 100644 --- a/docs/rust-testing-with-rstest-fixtures.md +++ b/docs/rust-testing-with-rstest-fixtures.md @@ -546,6 +546,7 @@ When using `#[once]`, there are critical caveats 12: the end of the test suite. This makes `#[once]` fixtures best suited for truly passive data or resources whose cleanup is managed by the operating system upon process exit. + 2. **Functional Limitations:** `#[once]` fixtures cannot be `async` functions and cannot be generic functions (neither with generic type parameters nor using `impl Trait` in arguments or return types). @@ -1197,13 +1198,13 @@ The following table summarizes key differences: **Table 1:** `rstest` **vs.
Standard Rust** `#[test]` **for Fixture Management and Parameterization** -| Feature | Standard #[test] Approach | rstest Approach | +| Feature | Standard #[test] Approach | rstest Approach | | ---------------------------------------- | ------------------------------------------------------------- | -------------------------------------------------------------------------------- | -| Fixture Injection | Manual calls to setup functions within each test. | Fixture name as argument in #[rstest] function; fixture defined with #[fixture]. | -| Parameterized Tests (Specific Cases) | Loop inside one test, or multiple distinct #[test] functions. | #[case(…)] attributes on #[rstest] function. | -| Parameterized Tests (Value Combinations) | Nested loops inside one test, or complex manual generation. | #[values(…)] attributes on arguments of #[rstest] function. | -| Async Fixture Setup | Manual async block and .await calls inside test. | async fn fixtures, with #[future] and #[awt] for ergonomic .awaiting. | -| Reusing Parameter Sets | Manual duplication of cases or custom helper macros. | rstest_reuse crate with #[template] and #[apply] attributes. | +| Fixture Injection | Manual calls to setup functions within each test. | Fixture name as argument in #[rstest] function; fixture defined with #[fixture]. | +| Parameterized Tests (Specific Cases) | Loop inside one test, or multiple distinct #[test] functions. | #[case(…)] attributes on #[rstest] function. | +| Parameterized Tests (Value Combinations) | Nested loops inside one test, or complex manual generation. | #[values(…)] attributes on arguments of #[rstest] function. | +| Async Fixture Setup | Manual async block and .await calls inside test. | async fn fixtures, with #[future] and #[awt] for ergonomic .awaiting. | +| Reusing Parameter Sets | Manual duplication of cases or custom helper macros. | rstest_reuse crate with #[template] and #[apply] attributes. 
| This comparison highlights how `rstest`'s attribute-based, declarative approach streamlines common testing patterns, reducing manual effort and improving the @@ -1341,20 +1342,20 @@ provided by `rstest`: **Table 2: Key** `rstest` **Attributes Quick Reference** -| Attribute | Core Purpose | +| Attribute | Core Purpose | | ---------------------------- | -------------------------------------------------------------------------------------------- | -| #[rstest] | Marks a function as an rstest test; enables fixture injection and parameterization. | -| #[fixture] | Defines a function that provides a test fixture (setup data or services). | -| #[case(…)] | Defines a single parameterized test case with specific input values. | -| #[values(…)] | Defines a list of values for an argument, generating tests for each value or combination. | -| #[once] | Marks a fixture to be initialized only once and shared (as a static reference) across tests. | -| #[future] | Simplifies async argument types by removing impl Future boilerplate. | -| #[awt] | (Function or argument level) Automatically .awaits future arguments in async tests. | -| #[from(original_name)] | Allows renaming an injected fixture argument in the test function. | -| #[with(…)] | Overrides default arguments of a fixture for a specific test. | -| #[default(…)] | Provides default values for arguments within a fixture function. | -| #[timeout(…)] | Sets a timeout for an asynchronous test. | -| #[files("glob_pattern",…)] | Injects file paths (or contents, with mode=) matching a glob pattern as test arguments. | +| #[rstest] | Marks a function as a rstest test; enables fixture injection and parameterization. | +| #[fixture] | Defines a function that provides a test fixture (setup data or services). | +| #[case(…)] | Defines a single parameterized test case with specific input values. | +| #[values(…)] | Defines a list of values for an argument, generating tests for each value or combination. 
| +| #[once] | Marks a fixture to be initialized only once and shared (as a static reference) across tests. | +| #[future] | Simplifies async argument types by removing impl Future boilerplate. | +| #[awt] | (Function or argument level) Automatically .awaits future arguments in async tests. | +| #[from(original_name)] | Allows renaming an injected fixture argument in the test function. | +| #[with(…)] | Overrides default arguments of a fixture for a specific test. | +| #[default(…)] | Provides default values for arguments within a fixture function. | +| #[timeout(…)] | Sets a timeout for an asynchronous test. | +| #[files("glob_pattern",…)] | Injects file paths (or contents, with mode=) matching a glob pattern as test arguments. | Mastering `rstest` can significantly elevate the quality and efficiency of testing practices for Rust developers, leading to more reliable and @@ -1363,75 +1364,66 @@ maintainable software. #### Works cited [^1]: rstest - Rust - [Docs.rs](http://Docs.rs), accessed on June 12, 2025, - + [^2]: rstest - Rust Package Registry - [Crates.io](http://Crates.io), accessed - on June 12, 2025, +on June 12, 2025, [^3]: rstest_macros - [crates.io](http://crates.io): Rust Package Registry, - accessed on June 12, 2025, +accessed on June 12, 2025, [^4]: la10736/rstest: Fixture-based test framework for Rust - GitHub, accessed - on June 12, 2025, +on June 12, 2025, [^5]: It's Not Out Yet… But Rstest Has Me HYPED - YouTube, accessed on June 12, - 2025, - - GitHub, accessed on June 12, 2025, - +2025, [^7]: Iterating on Testing in Rust - Hacker News, accessed on June 12, 2025, - + -[^8]: Feature request: Support for debugging parameterized tests using rstest : - RUST-12206, accessed on June 12, 2025, - +[^8]: Feature request: Support for debugging parameterized tests using +rstest: RUST-12206, accessed on June 12, 2025, + [^9]: rstest - [crates.io](http://crates.io): Rust Package Registry, accessed - on June 12, 2025, +on June 12, 2025, [^10]: rstest - 
[crates.io](http://crates.io): Rust Package Registry, accessed - on June 12, 2025, +on June 12, 2025, [^11]: Test Organization - The Rust Programming Language, accessed on June 12, - 2025, +2025, [^12]: fixture in rstest - Rust - [Docs.rs](http://Docs.rs), accessed on June - 12, 2025, +12, 2025, [^13]: rstest in rstest - Rust - [Docs.rs](http://Docs.rs), accessed on June - 12, 2025, - - [Shuttle.dev](http://Shuttle.dev), accessed on June 12, 2025, - +12, 2025, [^15]: Very long build time · Issue #184 · la10736/rstest - GitHub, accessed on - June 12, 2025, +June 12, 2025, [^16]: test-temp-dir - [crates.io](http://crates.io): Rust Package Registry, - accessed on June 12, 2025, +accessed on June 12, 2025, [^17]: Mistakes to avoid while writing unit test for your rust codebase? - - Reddit, accessed on June 12, 2025, - +Reddit, accessed on June 12, 2025, + [^18]: rstest_reuse - [crates.io](http://crates.io): Rust Package Registry, - accessed on June 12, 2025, +accessed on June 12, 2025, [^19]: crates or tips on how to organize test better? : r/rust - Reddit, - accessed on June 12, 2025, - +accessed on June 12, 2025, + [^20]: Is there any point in avoiding std when testing a no_std library? 
- Rust - Users Forum, accessed on June 12, 2025, - +Users Forum, accessed on June 12, 2025, + [^21]: rstest-log - [crates.io](http://crates.io): Rust Package Registry, - accessed on June 12, 2025, - +accessed on June 12, 2025, + [^22]: test-with - [crates.io](http://crates.io): Rust Package Registry, - accessed on June 12, 2025, - - accessed on June 12, 2025, - +accessed on June 12, 2025, diff --git a/rust-toolchain.toml b/rust-toolchain.toml index edc4de49..1f901c81 100644 --- a/rust-toolchain.toml +++ b/rust-toolchain.toml @@ -1,4 +1,4 @@ [toolchain] -channel = "1.89.0" -components = ["rustfmt", "clippy"] +channel = "nightly-2026-03-26" +components = ["rustfmt", "clippy", "rust-analyzer"] profile = "minimal" diff --git a/src/code_emphasis.rs b/src/code_emphasis.rs index 68ba9312..f64610f8 100644 --- a/src/code_emphasis.rs +++ b/src/code_emphasis.rs @@ -10,8 +10,10 @@ //! transformation should run before wrapping and footnote conversion so marker //! adjacency is evaluated on the raw input. -use crate::textproc::process_text; -use crate::wrap::{Token, tokenize_markdown}; +use crate::{ + textproc::process_text, + wrap::{Token, tokenize_markdown}, +}; /// Split emphasis markers at both ends of `s`. /// diff --git a/src/fences.rs b/src/fences.rs index a0eceffe..26176023 100644 --- a/src/fences.rs +++ b/src/fences.rs @@ -17,7 +17,8 @@ static ORPHAN_LANG_RE: LazyLock = LazyLock::new(|| { /// Determine whether a language specifier denotes an absent language. /// -/// A language is absent when it is empty or the case-insensitive string `null`, with surrounding whitespace ignored. +/// A language is absent when it is empty or the case-insensitive string `null`, with surrounding +/// whitespace ignored. 
/// /// # Examples /// diff --git a/src/footnotes/inline.rs b/src/footnotes/inline.rs index ae74a4fe..82744eb0 100644 --- a/src/footnotes/inline.rs +++ b/src/footnotes/inline.rs @@ -77,6 +77,4 @@ pub(super) fn convert_inline(text: &str) -> String { } /// Determine whether a string is the prefix of an ATX heading. -pub(super) fn is_atx_heading_prefix(s: &str) -> bool { - ATX_HEADING_RE.is_match(s) -} +pub(super) fn is_atx_heading_prefix(s: &str) -> bool { ATX_HEADING_RE.is_match(s) } diff --git a/src/footnotes/mod.rs b/src/footnotes/mod.rs index f3db22c4..eecec284 100644 --- a/src/footnotes/mod.rs +++ b/src/footnotes/mod.rs @@ -9,12 +9,12 @@ mod lists; mod parsing; mod renumber; -use crate::textproc::{Token, push_original_token, tokenize_markdown}; - use inline::{convert_inline, is_atx_heading_prefix}; use lists::convert_block; use renumber::renumber_footnotes; +use crate::textproc::{Token, push_original_token, tokenize_markdown}; + /// Convert bare numeric footnote references to Markdown footnote syntax. #[must_use] pub fn convert_footnotes(lines: &[String]) -> Vec<String> { diff --git a/src/footnotes/renumber.rs b/src/footnotes/renumber.rs index 516ffabc..9d7655b6 100644 --- a/src/footnotes/renumber.rs +++ b/src/footnotes/renumber.rs @@ -1,16 +1,15 @@ //! Sequential renumbering of footnote references and definitions.
-use std::collections::HashMap; -use std::fmt::Write; -use std::sync::LazyLock; +use std::{collections::HashMap, fmt::Write, sync::LazyLock}; use regex::{Captures, Match, Regex}; +use super::{ + lists::{footnote_block_range, has_existing_footnote_block, trimmed_range}, + parsing::{FOOTNOTE_LINE_RE, is_definition_continuation, parse_definition}, +}; use crate::textproc::{Token, push_original_token, tokenize_markdown}; -use super::lists::{footnote_block_range, has_existing_footnote_block, trimmed_range}; -use super::parsing::{FOOTNOTE_LINE_RE, is_definition_continuation, parse_definition}; - static FOOTNOTE_REF_RE: LazyLock = lazy_regex!( r"\[\^(?P\d+)\]", "footnote reference pattern should compile", diff --git a/src/frontmatter.rs b/src/frontmatter.rs new file mode 100644 index 00000000..12cb5c87 --- /dev/null +++ b/src/frontmatter.rs @@ -0,0 +1,191 @@ +//! YAML frontmatter detection and preservation. +//! +//! This module provides a helper to detect and split a leading YAML frontmatter +//! block from a Markdown document. The frontmatter block is defined as starting +//! with a line containing exactly `---` (the YAML opener) and ending with a line +//! containing `---` or `...` with optional trailing whitespace (the YAML closer). +//! Only a block at the very beginning of the document counts as frontmatter. + +/// Splits the input into a leading YAML frontmatter prefix and the remaining body. +/// +/// A valid frontmatter block must: +/// - Start with the first line being exactly `---` +/// - End with a line that is `---` or `...` with optional trailing whitespace before any body +/// content (matching is done after `trim_end()`) +/// +/// If no valid closer is found, the entire input is returned as the body with an +/// empty prefix. This preserves existing behaviour for malformed or non-frontmatter +/// documents. 
+/// +/// # Examples +/// +/// ``` +/// use mdtablefix::frontmatter::split_leading_yaml_frontmatter; +/// +/// let lines = vec![ +/// "---".to_string(), +/// "title: Example".to_string(), +/// "---".to_string(), +/// "# Heading".to_string(), +/// ]; +/// let (prefix, body) = split_leading_yaml_frontmatter(&lines); +/// assert_eq!(prefix.len(), 3); +/// assert_eq!(body.len(), 1); +/// assert_eq!(body[0], "# Heading"); +/// ``` +#[must_use] +pub fn split_leading_yaml_frontmatter(lines: &[String]) -> (&[String], &[String]) { + if lines.is_empty() { + return (&[], &[]); + } + + // First line must be exactly the YAML opener (no leading/trailing whitespace) + if lines[0] != "---" { + return (&[], lines); + } + + // Look for a closing delimiter after the opener + // Only trim trailing whitespace to preserve leading whitespace + // (indented lines inside YAML block scalars should not be treated as closers) + for (idx, line) in lines.iter().enumerate().skip(1) { + let trimmed_end = line.trim_end(); + if trimmed_end == "---" || trimmed_end == "..." { + // Found valid closer - split after this line + let split_at = idx + 1; + return (&lines[..split_at], &lines[split_at..]); + } + } + + // No valid closer found - treat as ordinary Markdown + (&[], lines) +} + +#[cfg(test)] +mod tests { + use rstest::rstest; + + use super::*; + + /// Helper to convert `&[&str]` → `Vec<String>`. + fn s(v: &[&str]) -> Vec<String> { v.iter().copied().map(str::to_string).collect() } + + /// Cases where `prefix` is empty (no frontmatter detected).
+ #[rstest] + #[case::empty_input_returns_empty_slices( + s(&[]), + true, // body_is_empty + false // check_body_equality + )] + #[case::no_frontmatter_returns_empty_prefix( + s(&["# Heading", "Some text"]), + false, + true // check body == input lines + )] + #[case::unmatched_opener_treated_as_body( + s(&["---", "Some text", "More text"]), + false, + false + )] + #[case::indented_opener_not_recognized( + s(&[" ---", "title: Example", " ---"]), + false, + false + )] + #[case::later_dash_block_not_frontmatter( + s(&["# Heading", "", "---", "Not frontmatter", "---"]), + false, + false + )] + #[case::indented_closer_not_recognized( + s(&["---", "title: Example", " --- ", "# Heading"]), + false, + false + )] + fn prefix_empty_cases( + #[case] lines: Vec<String>, + #[case] body_is_empty: bool, + #[case] check_body_equality: bool, + ) { + let (prefix, body) = split_leading_yaml_frontmatter(&lines); + assert!(prefix.is_empty()); + if body_is_empty { + assert!(body.is_empty()); + } else if check_body_equality { + assert_eq!(body, &lines); + } else { + assert!(!body.is_empty()); + } + } + + /// Cases where frontmatter is detected (non-empty `prefix`).
+ #[rstest] + #[case::detects_frontmatter_with_triple_dash_closer( + s(&["---", "title: Example", "author: Test", "---", "# Heading", "Body text"]), + 4, // prefix_len + 2, // body_len + Some((0, "---")), + Some((3, "---")), + Some("# Heading") + )] + #[case::detects_frontmatter_with_triple_dot_closer( + s(&["---", "title: Example", "...", "# Heading"]), + 3, + 1, + Some((2, "...")), + None, + Some("# Heading") + )] + #[case::frontmatter_with_empty_body( + s(&["---", "title: Example", "---"]), + 3, + 0, + None, + None, + None + )] + #[case::frontmatter_only_no_body( + s(&["---", "---"]), + 2, + 0, + Some((1, "---")), + None, + None + )] + #[case::trailing_whitespace_on_closer_is_trimmed( + s(&["---", "title: Example", "--- ", "# Heading"]), + 3, + 1, + None, + None, + None + )] + #[case::multiline_yaml_values_preserved( + s(&["---", "description: |", " This is a multi-line", " YAML value", "---", "# Content"]), + 5, + 1, + None, + None, + Some("# Content") + )] + fn frontmatter_split_cases( + #[case] lines: Vec<String>, + #[case] prefix_len: usize, + #[case] body_len: usize, + #[case] prefix_spot_check: Option<(usize, &str)>, + #[case] prefix_spot_check_2: Option<(usize, &str)>, + #[case] body_spot_check: Option<&str>, + ) { + let (prefix, body) = split_leading_yaml_frontmatter(&lines); + assert_eq!(prefix.len(), prefix_len); + assert_eq!(body.len(), body_len); + if let Some((idx, expected)) = prefix_spot_check { + assert_eq!(prefix[idx], expected); + } + if let Some((idx, expected)) = prefix_spot_check_2 { + assert_eq!(prefix[idx], expected); + } + if let Some(expected) = body_spot_check { + assert_eq!(body[0], expected); + } + } +} diff --git a/src/headings.rs b/src/headings.rs index 2f93ba2c..4956c3b0 100644 --- a/src/headings.rs +++ b/src/headings.rs @@ -150,9 +150,10 @@ fn needs_space_after(prefix: &str) -> bool { #[cfg(test)] mod tests { - use super::*; use rstest::rstest; + use super::*; + #[rstest] #[case(vec!["Heading".into(), "===".into()], vec!["# 
Heading".into()])] #[case(vec!["Heading".into(), "----".into()], vec!["## Heading".into()])] diff --git a/src/html.rs b/src/html.rs index c5a73c53..a3179eb8 100644 --- a/src/html.rs +++ b/src/html.rs @@ -90,9 +90,7 @@ fn is_element(handle: &Handle, tag: &str) -> bool { } /// Returns `true` if `handle` represents a `<td>` or `<th>` element. -fn is_table_cell(handle: &Handle) -> bool { - is_element(handle, "td") || is_element(handle, "th") -} +fn is_table_cell(handle: &Handle) -> bool { is_element(handle, "td") || is_element(handle, "th") } /// Walks the DOM tree collecting `<table>` nodes under `handle`. fn collect_tables(handle: &Handle, tables: &mut Vec<Handle>) { @@ -260,7 +258,7 @@ pub(crate) fn html_table_to_markdown(lines: &[String]) -> Vec<String> { for line in lines { if depth > 0 || TABLE_START_RE.is_match(line.trim_start()) { - buf.push(line.to_string()); + buf.push(line.clone()); depth += TABLE_START_RE.find_iter(line).count(); if TABLE_END_RE.is_match(line) { depth = depth.saturating_sub(TABLE_END_RE.find_iter(line).count()); @@ -272,7 +270,7 @@ pub(crate) fn html_table_to_markdown(lines: &[String]) -> Vec<String> { continue; } - out.push(line.to_string()); + out.push(line.clone()); } if !buf.is_empty() { @@ -321,12 +319,12 @@ pub fn convert_html_tables(lines: &[String]) -> Vec<String> { depth = 0; } in_code = !in_code; - out.push(line.to_string()); + out.push(line.clone()); continue; } if in_code { - out.push(line.to_string()); + out.push(line.clone()); continue; } @@ -341,7 +339,7 @@ pub fn convert_html_tables(lines: &[String]) -> Vec<String> { continue; } - out.push(line.to_string()); + out.push(line.clone()); } if !buf.is_empty() { diff --git a/src/io.rs b/src/io.rs index e9bd9c17..cb30bea4 100644 --- a/src/io.rs +++ b/src/io.rs @@ -30,9 +30,7 @@ where /// /// # Errors /// Returns an error if reading or writing the file fails.
-pub fn rewrite(path: &Path) -> std::io::Result<()> { - rewrite_with(path, process_stream) -} +pub fn rewrite(path: &Path) -> std::io::Result<()> { rewrite_with(path, process_stream) } /// Rewrite a file in place without wrapping text. /// diff --git a/src/lib.rs b/src/lib.rs index 3bd009f1..00c6e863 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -27,6 +27,8 @@ pub mod code_emphasis; pub mod ellipsis; pub mod fences; pub mod footnotes; +#[doc(hidden)] +pub mod frontmatter; pub mod headings; mod html; pub mod io; @@ -48,6 +50,8 @@ pub use code_emphasis::fix_code_emphasis; pub use ellipsis::replace_ellipsis; pub use fences::{attach_orphan_specifiers, compress_fences}; pub use footnotes::convert_footnotes; +#[doc(hidden)] +pub use frontmatter::split_leading_yaml_frontmatter; pub use headings::convert_setext_headings; pub use html::convert_html_tables; pub use io::{rewrite, rewrite_no_wrap}; diff --git a/src/lists.rs b/src/lists.rs index 3e4fa98e..1431a8a0 100644 --- a/src/lists.rs +++ b/src/lists.rs @@ -1,8 +1,9 @@ //! Ordered list renumbering utilities. -use regex::Regex; use std::collections::HashMap; +use regex::Regex; + use crate::{breaks::THEMATIC_BREAK_RE, wrap::FenceTracker}; /// Characters that mark formatted text at the start of a line. @@ -80,7 +81,8 @@ fn handle_paragraph_restart( /// Renumber ordered Markdown list items across the given lines. /// - Preserve code fences; do not renumber inside them. /// - Reset numbering on headings and thematic breaks. -/// - Restart numbering after a blank line followed by a plain paragraph at the same or a shallower indent. +/// - Restart numbering after a blank line followed by a plain paragraph at the same or a shallower +/// indent. 
#[must_use] pub fn renumber_lists(lines: &[String]) -> Vec<String> { let mut out = Vec::with_capacity(lines.len()); diff --git a/src/main.rs b/src/main.rs index 5722c6b1..62639334 100644 --- a/src/main.rs +++ b/src/main.rs @@ -14,7 +14,13 @@ use std::{ use anyhow::Context; use clap::Parser; -use mdtablefix::{Options, format_breaks, process_stream_opts, renumber_lists}; +use mdtablefix::{ + Options, + format_breaks, + process::process_stream_inner, + renumber_lists, + split_leading_yaml_frontmatter, +}; use rayon::prelude::*; #[derive(Parser)] @@ -76,7 +82,11 @@ impl From<FormatOpts> for Options { } fn process_lines(lines: &[String], opts: FormatOpts) -> Vec<String> { - let mut out = process_stream_opts(lines, opts.into()); + // Split off leading YAML frontmatter to preserve it from all transforms + let (frontmatter_prefix, body) = split_leading_yaml_frontmatter(lines); + + // Use process_stream_inner directly since we've already split frontmatter + let mut out = process_stream_inner(body, opts.into()); if opts.renumber { out = renumber_lists(&out); } @@ -86,7 +96,11 @@ fn process_lines(lines: &[String], opts: FormatOpts) -> Vec<String> { .map(Cow::into_owned) .collect(); } - out + + // Prepend the preserved frontmatter prefix + let mut result = frontmatter_prefix.to_vec(); + result.extend(out); + result } fn handle_file(path: &Path, in_place: bool, opts: FormatOpts) -> anyhow::Result> { diff --git a/src/process.rs b/src/process.rs index 84f0e3b4..1dd34a05 100644 --- a/src/process.rs +++ b/src/process.rs @@ -4,6 +4,7 @@ use crate::{ ellipsis::replace_ellipsis, fences::{attach_orphan_specifiers, compress_fences}, footnotes::convert_footnotes, + frontmatter::split_leading_yaml_frontmatter, html::convert_html_tables, table::reflow_table, wrap::{FenceTracker, classify_block, wrap_text}, @@ -168,7 +169,7 @@ pub fn process_stream_inner(lines: &[String], opts: Options) -> Vec<String> { } if fence_tracker.in_fence() { - out.push(line.to_string()); + out.push(line.clone()); continue; } 
@@ -177,7 +178,7 @@ pub fn 
process_stream_inner(lines: &[String], opts: Options) -> Vec<String> { } flush_buffer(&mut buf, &mut in_table, &mut out); - out.push(line.to_string()); + out.push(line.clone()); } flush_buffer(&mut buf, &mut in_table, &mut out); @@ -200,6 +201,7 @@ pub fn process_stream_inner(lines: &[String], opts: Options) -> Vec<String> { if opts.footnotes { out = convert_footnotes(&out); } + out } @@ -219,7 +221,7 @@ pub fn process_stream_inner(lines: &[String], opts: Options) -> Vec<String> { /// ``` #[must_use] pub fn process_stream(lines: &[String]) -> Vec<String> { - process_stream_inner( + process_with_frontmatter( lines, Options { wrap: true, @@ -241,9 +243,8 @@ pub fn process_stream(lines: &[String]) -> Vec<String> { /// assert!(out.iter().any(|l| l.contains("| a | b |"))); /// ``` #[must_use] -#[inline] pub fn process_stream_no_wrap(lines: &[String]) -> Vec<String> { - process_stream_inner(lines, Options::default()) + process_with_frontmatter(lines, Options::default()) } /// Runs [`process_stream_inner`] with custom [`Options`]. @@ -271,7 +272,16 @@ pub fn process_stream_no_wrap(lines: &[String]) -> Vec<String> { /// ``` #[must_use] pub fn process_stream_opts(lines: &[String], opts: Options) -> Vec<String> { - process_stream_inner(lines, opts) + process_with_frontmatter(lines, opts) +} + +/// Helper to split frontmatter, process body, and rejoin. +fn process_with_frontmatter(lines: &[String], opts: Options) -> Vec<String> { + let (frontmatter_prefix, body) = split_leading_yaml_frontmatter(lines); + let out = process_stream_inner(body, opts); + let mut result = frontmatter_prefix.to_vec(); + result.extend(out); + result } #[cfg(test)] diff --git a/src/wrap.rs b/src/wrap.rs index 2fbd3334..fbad449b 100644 --- a/src/wrap.rs +++ b/src/wrap.rs @@ -130,7 +130,8 @@ pub fn wrap_text(lines: &[String], width: usize) -> Vec<String> { } if is_indented_code_line(line) { - // Preserve indented code blocks verbatim so wrapping does not merge them into paragraphs. 
+ // Preserve indented code blocks verbatim so wrapping does not merge them into + // paragraphs.
flush_paragraph(&mut out, &buf, &indent, width); buf.clear(); indent.clear(); diff --git a/src/wrap/block.rs b/src/wrap/block.rs index c8fea348..1e71357e 100644 --- a/src/wrap/block.rs +++ b/src/wrap/block.rs @@ -126,9 +126,10 @@ pub(super) fn is_markdownlint_directive(line: &str) -> bool { #[cfg(test)] mod tests { - use super::*; use rstest::rstest; + use super::*; + #[rstest( line, expected, diff --git a/src/wrap/fence.rs b/src/wrap/fence.rs index d176b66d..19d543d7 100644 --- a/src/wrap/fence.rs +++ b/src/wrap/fence.rs @@ -84,9 +84,7 @@ pub struct FenceTracker { impl FenceTracker { /// Create a new tracker with no active fence. #[must_use] - pub fn new() -> Self { - Self::default() - } + pub fn new() -> Self { Self::default() } /// Update the tracker with a potential fence line. /// @@ -122,7 +120,5 @@ impl FenceTracker { /// Check whether the tracker is currently inside a fenced block. #[must_use] - pub fn in_fence(&self) -> bool { - self.state.is_some() - } + pub fn in_fence(&self) -> bool { self.state.is_some() } } diff --git a/src/wrap/inline.rs b/src/wrap/inline.rs index a48a78fa..78f261a2 100644 --- a/src/wrap/inline.rs +++ b/src/wrap/inline.rs @@ -30,13 +30,9 @@ fn looks_like_link(token: &str) -> bool { && token.ends_with(')') } -fn is_whitespace_token(token: &str) -> bool { - token.chars().all(char::is_whitespace) -} +fn is_whitespace_token(token: &str) -> bool { token.chars().all(char::is_whitespace) } -fn is_inline_code_token(token: &str) -> bool { - token.starts_with('`') && token.ends_with('`') -} +fn is_inline_code_token(token: &str) -> bool { token.starts_with('`') && token.ends_with('`') } fn extend_punctuation(tokens: &[String], mut j: usize, width: &mut usize) -> usize { while j < tokens.len() && tokens[j].chars().all(is_trailing_punct) { diff --git a/src/wrap/line_buffer.rs b/src/wrap/line_buffer.rs index 9db351b6..feff9c55 100644 --- a/src/wrap/line_buffer.rs +++ b/src/wrap/line_buffer.rs @@ -13,17 +13,11 @@ pub(crate) struct LineBuffer { } 
impl LineBuffer { - pub(crate) fn new() -> Self { - Self::default() - } + pub(crate) fn new() -> Self { Self::default() } - pub(crate) fn text(&self) -> &str { - self.text.as_str() - } + pub(crate) fn text(&self) -> &str { self.text.as_str() } - pub(crate) fn width(&self) -> usize { - self.width - } + pub(crate) fn width(&self) -> usize { self.width } pub(crate) fn push_token(&mut self, token: &str) { if token.len() == 1 && ".?!,:;".contains(token) && self.text.trim_end().ends_with('`') { diff --git a/src/wrap/tokenize/mod.rs b/src/wrap/tokenize/mod.rs index 22a74490..c7501861 100644 --- a/src/wrap/tokenize/mod.rs +++ b/src/wrap/tokenize/mod.rs @@ -7,10 +7,16 @@ mod parsing; mod scanning; use parsing::{ - handle_backtick_fence, is_trailing_punctuation, looks_like_image_start, parse_link_or_image, + handle_backtick_fence, + is_trailing_punctuation, + looks_like_image_start, + parse_link_or_image, }; use scanning::{ - bracket_follows_escaped_bang, collect_range, has_odd_backslash_escape_bytes, scan_while, + bracket_follows_escaped_bang, + collect_range, + has_odd_backslash_escape_bytes, + scan_while, }; /// Markdown token emitted by the `segment_inline` tokenizer. 
@@ -223,7 +229,11 @@ where /// tokens, /// vec![ /// Token::Text("Example with "), -/// Token::Code { raw: "`code`", fence: "`", code: "code" }, +/// Token::Code { +/// raw: "`code`", +/// fence: "`", +/// code: "code" +/// }, /// ] /// ); /// ``` diff --git a/src/wrap/tokenize/scanning.rs b/src/wrap/tokenize/scanning.rs index 1b67b7f8..ef0edfdb 100644 --- a/src/wrap/tokenize/scanning.rs +++ b/src/wrap/tokenize/scanning.rs @@ -78,9 +78,10 @@ pub(super) fn bracket_follows_escaped_bang(bytes: &[u8], idx: usize) -> bool { #[cfg(test)] mod tests { - use super::*; use rstest::rstest; + use super::*; + #[rstest] #[case::alpha_prefix("abc123", 0, Some(char::is_alphabetic as fn(char) -> bool), None, Some(3), None)] #[case::numeric_suffix("abc123", 3, Some(char::is_numeric as fn(char) -> bool), None, Some("abc123".len()), None)] diff --git a/tests/cli_frontmatter.rs b/tests/cli_frontmatter.rs new file mode 100644 index 00000000..4b9eb694 --- /dev/null +++ b/tests/cli_frontmatter.rs @@ -0,0 +1,187 @@ +//! CLI tests for YAML frontmatter handling. + +use assert_cmd::Command; +use rstest::{fixture, rstest}; + +/// Fixture providing an in-place test runner closure. +#[fixture] +fn in_place_runner() -> impl Fn(&[&str], &str, &str) { + |args: &[&str], input: &str, expected: &str| { + let temp = tempfile::NamedTempFile::new().expect("create temp file"); + std::fs::write(temp.path(), input).expect("write temp file"); + + let mut cmd = Command::cargo_bin("mdtablefix").expect("find binary"); + cmd.arg("--in-place").args(args).arg(temp.path()); + cmd.assert().success(); + + let actual = std::fs::read_to_string(temp.path()).expect("read temp file"); + assert_eq!(actual, expected, "in-place content mismatch"); + } +} + +/// Stdin→stdout equality cases for YAML frontmatter handling. 
+#[rstest] +#[case::preserved(&[], concat!( + "---\n", + "title: Example\n", + "author: Test\n", + "---\n", + "\n", + "|A|B|\n", + "|1|2|\n", +), concat!( + "---\n", + "title: Example\n", + "author: Test\n", + "---\n", + "\n", + "| A | B |\n", + "| 1 | 2 |\n", +))] +#[case::dot_closer(&[], concat!( + "---\n", + "title: Example\n", + "...\n", + "# Heading\n", +), concat!( + "---\n", + "title: Example\n", + "...\n", + "# Heading\n", +))] +#[case::later_dash_block_not_frontmatter(&[], concat!( + "# Heading\n", + "\n", + "---\n", + "\n", + "Text after break\n", +), concat!( + "# Heading\n", + "\n", + "---\n", + "\n", + "Text after break\n", +))] +#[case::with_renumber(&["--renumber"], concat!( + "---\n", + "title: Example\n", + "---\n", + "\n", + "3. Third item\n", + "5. Fifth item\n", +), concat!( + "---\n", + "title: Example\n", + "---\n", + "\n", + "1. Third item\n", + "2. Fifth item\n", +))] +#[case::malformed_treated_as_body(&[], concat!( + "---\n", + "This is not valid YAML frontmatter\n", + "and there is no closing delimiter.\n", +), concat!( + "---\n", + "This is not valid YAML frontmatter\n", + "and there is no closing delimiter.\n", +))] +fn test_cli_yaml_frontmatter_stdin( + #[case] args: &[&str], + #[case] input: &str, + #[case] expected: &str, +) { + let mut cmd = Command::cargo_bin("mdtablefix").expect("find binary"); + cmd.args(args) + .write_stdin(input) + .assert() + .success() + .stdout(expected.to_string()); +} + +/// In-place file modification cases for YAML frontmatter handling. 
+#[rstest] +#[case::basic(&[], concat!( + "---\n", + "title: Example\n", + "---\n", + "\n", + "|A|B|\n", + "|1|2|\n", +), concat!( + "---\n", + "title: Example\n", + "---\n", + "\n", + "| A | B |\n", + "| 1 | 2 |\n", +))] +fn test_cli_yaml_frontmatter_in_place_variants( + #[case] args: &[&str], + #[case] input: &str, + #[case] expected: &str, + in_place_runner: impl Fn(&[&str], &str, &str), +) { + in_place_runner(args, input, expected); +} + +// Cannot be parameterized: uses partial/line-level assertions rather than stdout equality. +/// Tests that YAML frontmatter is preserved with `--wrap` option. +#[test] +fn test_cli_yaml_frontmatter_with_wrap() { + let input = concat!( + "---\n", + "title: Example\n", + "---\n", + "\n", + "This is a very long paragraph that should be wrapped to 80 columns when the wrap option \ + is enabled.\n", + ); + let cmd_result = Command::cargo_bin("mdtablefix") + .expect("Failed to create cargo command") + .arg("--wrap") + .write_stdin(input) + .assert() + .success(); + let output = String::from_utf8_lossy(&cmd_result.get_output().stdout); + assert!(output.starts_with("---\ntitle: Example\n---\n")); +} + +// Cannot be parameterized: uses partial/line-level assertions rather than stdout equality. +/// Tests that YAML frontmatter delimiters are not rewritten by `--breaks`. 
+#[test] +fn test_cli_yaml_frontmatter_with_breaks() { + let input = concat!( + "---\n", + "title: Example\n", + "---\n", + "\n", + "Text\n", + "\n", + "---\n", + "\n", + "More text\n", + ); + let cmd_result = Command::cargo_bin("mdtablefix") + .expect("Failed to create cargo command") + .args(["--breaks", "--wrap"]) + .write_stdin(input) + .assert() + .success(); + let output = String::from_utf8_lossy(&cmd_result.get_output().stdout); + // Frontmatter delimiters should be preserved + let lines: Vec<&str> = output.lines().collect(); + assert!( + lines.len() >= 3, + "expected at least 3 lines in output: {output}" + ); + assert_eq!(lines[0], "---"); + assert_eq!(lines[1], "title: Example"); + assert_eq!(lines[2], "---"); + // The later --- should be converted to underscores (thematic break) + let later_dashes = lines.iter().position(|l| l.starts_with("___")); + assert!( + later_dashes.is_some(), + "thematic break should be underscores" + ); +} diff --git a/tests/code_emphasis.rs b/tests/code_emphasis.rs index 674b27fb..8cc7cacd 100644 --- a/tests/code_emphasis.rs +++ b/tests/code_emphasis.rs @@ -3,8 +3,9 @@ //! Verifies that emphasis markers adjacent to inline code are normalised. mod prelude; -use prelude::{run_cli_with_args, run_cli_with_stdin}; use std::fs; + +use prelude::{run_cli_with_args, run_cli_with_stdin}; use tempfile::tempdir; #[test] @@ -101,8 +102,10 @@ fn cli_in_place_preserves_inner_backticks() { #[test] fn cli_code_emphasis_with_wrap_and_renumber() { - let input = "8. `StepContext`** Enhancement (in **`crates/rstest-bdd/src/context.rs`**)**\n10. Second item\n"; - let expected = "1. **`StepContext` Enhancement (in `crates/rstest-bdd/src/context.rs`)**\n2. Second item\n"; + let input = "8. `StepContext`** Enhancement (in \ + **`crates/rstest-bdd/src/context.rs`**)**\n10. Second item\n"; + let expected = "1. **`StepContext` Enhancement (in `crates/rstest-bdd/src/context.rs`)**\n2. 
\ + Second item\n"; run_cli_with_stdin(&["--code-emphasis", "--wrap", "--renumber"], input) .success() .stdout(expected); diff --git a/tests/common/mod.rs b/tests/common/mod.rs index 40f701ce..663f68f0 100644 --- a/tests/common/mod.rs +++ b/tests/common/mod.rs @@ -19,7 +19,7 @@ macro_rules! lines_vec { /// /// Example: /// ``` -/// let input: Vec = include_lines!("data/bold_header_input.txt"); +/// let input: Vec = include_lines!("data/bold_header_input.txt"); /// ``` #[expect(unused_macros, reason = "macros are optional helpers across modules")] macro_rules! include_lines { diff --git a/tests/parallel.rs b/tests/parallel.rs index 593f12d2..e9622c6c 100644 --- a/tests/parallel.rs +++ b/tests/parallel.rs @@ -10,9 +10,7 @@ mod prelude; use prelude::*; #[rstest] -fn test_cli_parallel_empty_file_list() { - run_cli_with_args(&[]).success().stdout("\n"); -} +fn test_cli_parallel_empty_file_list() { run_cli_with_args(&[]).success().stdout("\n"); } #[rstest] fn test_cli_parallel_multiple_files() { diff --git a/tests/process_frontmatter.rs b/tests/process_frontmatter.rs new file mode 100644 index 00000000..2518778c --- /dev/null +++ b/tests/process_frontmatter.rs @@ -0,0 +1,79 @@ +//! Tests for YAML frontmatter handling in process functions. 
+ +use mdtablefix::process::{Options, process_stream, process_stream_inner}; +use rstest::rstest; + +#[rstest] +#[case( + vec!["---", "title: Example", "author: Test", "---", "# Heading", "|A|B|", "|1|2|"], + true, + Some(vec!["---", "title: Example", "author: Test", "---"]), +)] +#[case( + vec!["---", "title: Example", "...", "Body text"], + true, + Some(vec!["---", "title: Example", "...", "Body text"]), +)] +#[case( + vec!["# Heading", "|A|B|", "|1|2|"], + false, + None, +)] +#[case( + vec!["---", "Not frontmatter", "More text"], + false, + None, +)] +fn frontmatter_detection_behaviour( + #[case] raw: Vec<&str>, + #[case] has_frontmatter: bool, + #[case] expected_prefix: Option<Vec<&str>>, +) { + let first_line = raw[0].to_string(); + let input: Vec<String> = raw.into_iter().map(str::to_string).collect(); + let out = process_stream(&input); + assert!(!out.is_empty()); + + if has_frontmatter { + if let Some(prefix) = expected_prefix { + for (i, expected_line) in prefix.iter().enumerate() { + assert_eq!(&out[i], *expected_line); + } + } + } else if first_line == "---" { + // Unmatched opener case: --- is treated as body content + let joined = out.join("\n"); + assert!(out[0].contains("---")); + assert!(joined.contains("Not frontmatter")); + assert!(joined.contains("More text")); + } else { + // No frontmatter case: body processed normally + assert_eq!(out[0], "# Heading"); + assert!(out.len() >= 2); + } +} + +#[test] +fn process_stream_inner_does_not_handle_frontmatter() { + // process_stream_inner should NOT handle frontmatter - it's the caller's + // responsibility. This test verifies that behavior.
+ let input = vec![ + "---".to_string(), + "title: Example".to_string(), + "---".to_string(), + "# Heading".to_string(), + ]; + let out = process_stream_inner( + &input, + Options { + headings: false, + ..Default::default() + }, + ); + // process_stream_inner doesn't split frontmatter, so --- is treated as body + // With headings: false, lines should pass through unchanged + assert_eq!(out[0], "---"); + assert_eq!(out[1], "title: Example"); + assert_eq!(out[2], "---"); + assert_eq!(out[3], "# Heading"); +}
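The split-then-rejoin flow introduced by this diff can be tried in isolation. The following is a minimal standalone sketch that assumes only the standard library: `split_frontmatter` follows the rule stated in `src/frontmatter.rs` (opener exactly `---`, closer `---` or `...` after `trim_end()`), while `format_body` and `process_preserving_frontmatter` are hypothetical stand-ins for `process_stream_inner` and `process_with_frontmatter`, not the crate's actual code.

```rust
/// Splits `lines` into a leading YAML frontmatter prefix and the remaining body.
/// Opener must be exactly `---`; closer is `---` or `...` once trailing
/// whitespace is trimmed. Leading whitespace is never trimmed, so indented
/// lines inside the block are not mistaken for closers.
fn split_frontmatter(lines: &[String]) -> (&[String], &[String]) {
    if lines.first().map(String::as_str) != Some("---") {
        return (&[], lines);
    }
    for (idx, line) in lines.iter().enumerate().skip(1) {
        let trimmed = line.trim_end();
        if trimmed == "---" || trimmed == "..." {
            // Split just after the closer so it stays in the prefix.
            return lines.split_at(idx + 1);
        }
    }
    // No closer found: the whole input is ordinary body text.
    (&[], lines)
}

/// Hypothetical body formatter (an assumption for illustration only):
/// strips trailing whitespace from every body line.
fn format_body(body: &[String]) -> Vec<String> {
    body.iter().map(|line| line.trim_end().to_string()).collect()
}

/// Formats only the body and prepends the untouched frontmatter prefix.
fn process_preserving_frontmatter(lines: &[String]) -> Vec<String> {
    let (prefix, body) = split_frontmatter(lines);
    let mut result = prefix.to_vec();
    result.extend(format_body(body));
    result
}
```

Because the prefix is copied verbatim, even a closer with trailing whitespace survives unchanged, whereas body lines pass through the formatter. An unclosed opener yields an empty prefix, so the whole input is treated as body.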