Skip to content

Refactor stdlib path helpers#166

Merged
leynos merged 21 commits intomainfrom
codex/implement-path-and-file-filters-z6j51e
Sep 27, 2025
Merged

Refactor stdlib path helpers#166
leynos merged 21 commits intomainfrom
codex/implement-path-and-file-filters-z6j51e

Conversation

@leynos
Copy link
Copy Markdown
Owner

@leynos leynos commented Sep 25, 2025

Summary

  • split the stdlib module into a small coordinator and a new path submodule for the filter implementations
  • stub the Unix-only file test predicates on other targets and extend expanduser to consult Windows home environment fallbacks
  • refresh the design notes and filter tests to cover the refactor and the new environment handling

Testing

  • make fmt
  • make lint
  • make test
  • make markdownlint

https://chatgpt.com/codex/tasks/task_e_68d2546001688322ab61db963c05f88d

Summary by Sourcery

Refactor the standard library file and path helpers by extracting all path-based filters into a dedicated submodule, stubbing Unix-only file tests on non-Unix targets, and enhancing the expanduser filter with Windows environment fallbacks.

New Features:

  • Introduce a new path submodule encapsulating filters like basename, dirname, with_suffix, relative_to, realpath, expanduser, size, contents, linecount, hash, and digest
  • Add an optional legacy-digests Cargo feature to enable SHA1 and MD5 hashing support

Bug Fixes:

  • Stub Unix-only file test predicates (pipe, block_device, char_device, device) on non-Unix platforms to always return false for portability

Enhancements:

  • Split stdlib module into a lightweight coordinator and a new path submodule
  • Extend expanduser to honor HOME, USERPROFILE, HOMEDRIVE/HOMEPATH, and HOMESHARE on Windows
  • Utilize cap-std's UTF-8 directory APIs with ambient authority for all filesystem operations
  • Refresh design documentation with implementation notes and updated roadmap

Build:

  • Add digest, sha1, and md5 dependencies and define the legacy-digests feature in Cargo.toml

Documentation:

  • Update netsuke design notes and roadmap to cover the refactored filters and environment handling

Tests:

  • Add comprehensive std_filter_tests covering each filter’s functionality, error handling, and platform-specific behavior

Move path filter logic into a dedicated module, stub Unix-only file tests on non-Unix targets, and expand the HOME resolution to cover Windows fallbacks with updated docs and tests.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Sep 25, 2025

Note

Reviews paused

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Summary by CodeRabbit

  • New Features

    • Added comprehensive path and file filters for templates (e.g., basename, dirname, with_suffix, relative_to, realpath, expanduser, size, contents, linecount, hash, digest).
    • Introduced file-type tests (dir, file, symlink, and Unix-only variants) with cross‑platform handling.
    • Improved, user-friendly error messages for I/O issues.
  • Documentation

    • Clarified non‑Unix behaviour, encoding requirements, supported hash algorithms, and shell/path handling.
    • Updated roadmap: path and file filters marked complete.
  • Tests

    • Extensive unit, integration, and Cucumber scenarios, including Unix‑specific realpath coverage.
  • Chores

    • Added optional legacy digest support (SHA‑1, MD5) and new hashing dependency.

Walkthrough

Introduce a new stdlib with path and file filters, filesystem utilities, and hash/digest support; add feature-gated legacy digests (sha1, md5). Remove the old stdlib file-type predicates module and rewire registration via a new entrypoint. Update docs and roadmap. Add extensive unit and cucumber tests, including Unix-specific cases.

Changes

Cohort / File(s) Summary
Build features and deps
Cargo.toml
Add features section with default empty; add optional legacy-digests feature enabling sha1 and md-5; add digest dependency; wire optional sha1 and md5 (as md-5).
Stdlib restructure: entrypoint
src/stdlib/mod.rs, src/stdlib.rs
Add new stdlib module with pub fn register(...) registering file-type tests and path filters; remove legacy src/stdlib.rs including old register and helper logic.
Path stdlib modules
src/stdlib/path/mod.rs, src/stdlib/path/filters.rs, src/stdlib/path/fs_utils.rs, src/stdlib/path/hash_utils.rs, src/stdlib/path/io_helpers.rs, src/stdlib/path/path_utils.rs
Add path-related filters (basename, dirname, with_suffix, relative_to, realpath, expanduser, size, contents, linecount, hash, digest); add filesystem helpers, hashing utilities, IO error mapping, and path manipulation utilities; re-export selected items crate-visibly.
Unit tests for filters
tests/std_filter_tests.rs
Add comprehensive tests for all filters, hashing (including legacy algorithms gating), error cases, and platform-specific behaviours.
Cucumber integration tests
tests/cucumber.rs, tests/steps/mod.rs, tests/steps/stdlib_steps.rs, tests/features/stdlib.feature, tests/features_unix/stdlib.feature
Extend Cucumber world state; add step module and scenarios covering stdlib filters, errors, and Unix symlink realpath.
Documentation updates
docs/netsuke-design.md, docs/roadmap.md
Update design notes to clarify non-Unix stubs and implementation details; mark path/file filters task as completed in roadmap.
Docs-only code comments
src/ir/graph.rs
Add documentation comment examples; no code or API changes.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor Dev as Template Author
    participant Env as MiniJinja Environment
    participant Std as stdlib::register
    participant PF as path::filters
    participant FU as path::fs_utils
    participant HU as path::hash_utils
    participant IO as OS Filesystem
    participant EH as io_helpers

    Dev->>Env: create Environment
    Env->>Std: register(&mut env)
    Std->>PF: register_filters(env)
    note right of PF: Filters: basename, dirname,<br/>with_suffix, relative_to,<br/>realpath, expanduser,<br/>size, contents, linecount,<br/>hash, digest

    Dev->>Env: render template using filters
    Env->>PF: invoke filter(path, args)

    alt File metadata/read
      PF->>FU: open_file / file_size / read_utf8
      FU->>IO: perform fs ops
      IO-->>FU: result or io::Error
      FU->>EH: (on error) io_to_error(...)
      EH-->>PF: Error (InvalidOperation)
    else Hash/digest
      PF->>HU: compute_hash/compute_digest(path, alg, len)
      HU->>FU: open_file (stream)
      FU->>IO: open/read chunks
      IO-->>FU: bytes or io::Error
      FU->>EH: (on error) io_to_error(...)
      HU-->>PF: hex digest (possibly truncated)
    end

    PF-->>Env: value or Error
    Env-->>Dev: rendered output or error
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • Add file-system Jinja tests #154 — Touches the same stdlib file-type tests and registration logic (dir/file/symlink/pipe/device), overlapping with the removal/rewrite introduced here.

Poem

New paths unfurl where filters sing,
A hash of bytes, a tidy string.
The pipes and blocks now softly sleep,
While realpaths through the symlinks creep.
Docs aligned, the roadmap bright—
Templates render, crisp and light.
Sha256 hums through the night.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title clarifies the primary refactoring of the standard library’s path helpers into a new submodule, emphasising the code reorganisation without unnecessary detail or ambiguity.
Description Check ✅ Passed The description outlines the module split into coordinator and path submodule, Unix-only predicate stubbing, Windows expanduser enhancements, feature gating, documentation refresh, and test commands, all of which directly match the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 87.50% which is sufficient. The required threshold is 80.00%.

Warning

Review ran into problems

🔥 Problems

Errors were encountered while retrieving linked issues.

Errors (1)
  • UTF-8: Entity not found: Issue - Could not find referenced Issue.

Comment @coderabbitai help to get the list of available commands and usage tips.

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Sep 25, 2025

Reviewer's Guide

This PR refactors the stdlib path helpers by extracting filter implementations into a dedicated submodule, stubbing platform-specific file tests, enhancing expanduser to handle Windows fallbacks, introducing a legacy-digests feature for hash filters, and updating documentation and tests to cover these changes.

Class diagram for refactored stdlib path helpers

classDiagram
    class stdlib {
        +register(env: Environment)
    }
    class path {
        +register_filters(env: Environment)
        +file_type_matches(path: Utf8Path, predicate: fn(FileType) -> bool): Result<bool, Error>
        +basename(path: Utf8Path): String
        +dirname(path: Utf8Path): String
        +with_suffix(path: Utf8Path, suffix: str, count: usize, sep: str): Result<Utf8PathBuf, Error>
        +relative_to(path: Utf8Path, root: Utf8Path): Result<String, Error>
        +realpath(path: Utf8Path): Result<Utf8PathBuf, Error>
        +expanduser(raw: str): Result<String, Error>
        +file_size(path: Utf8Path): Result<u64, Error>
        +read_text(path: Utf8Path, encoding: str): Result<String, Error>
        +linecount(path: Utf8Path): Result<usize, Error>
        +compute_hash(path: Utf8Path, alg: str): Result<String, Error>
        +compute_digest(path: Utf8Path, len: usize, alg: str): Result<String, Error>
    }
    class io_helpers {
        +io_to_error(path: Utf8Path, action: str, err: io::Error): Error
    }
    stdlib --> path : uses
    path --> io_helpers : uses
Loading

Class diagram for platform-specific file test predicates

classDiagram
    class stdlib {
        +register_file_tests(env: Environment)
        +is_dir(ft: FileType): bool
        +is_file(ft: FileType): bool
        +is_symlink(ft: FileType): bool
        +is_fifo(ft: FileType): bool
        +is_block_device(ft: FileType): bool
        +is_char_device(ft: FileType): bool
        +is_device(ft: FileType): bool
    }
    note for stdlib "On non-Unix targets, is_fifo, is_block_device, is_char_device, is_device are stubbed to always return false."
Loading

File-Level Changes

Change Details Files
Extract path filters into a new path submodule
  • Created src/stdlib/path.rs to host all path and file filters
  • Added io_helpers.rs for consistent I/O-to-template errors
  • Simplified src/stdlib/mod.rs to register file tests and delegate filters to the submodule
src/stdlib/path.rs
src/stdlib/path/io_helpers.rs
src/stdlib/mod.rs
Stub Unix-only file tests on non-Unix targets
  • Updated file test predicates in mod.rs to always return false for pipe, block_device, char_device, device on non-Unix
  • Unified file_type_matches to return false for missing paths instead of errors
src/stdlib/mod.rs
Enhance expanduser to support Windows environment fallbacks
  • Reworked expanduser and home_from_env to consult HOME, USERPROFILE, HOMEDRIVE/HOMEPATH, HOMESHARE
  • Added error handling for missing home variables and unsupported ~user expansions
src/stdlib/path.rs
Introduce legacy-digests feature and refresh hash/digest support
  • Added legacy-digests feature in Cargo.toml with optional sha1 and md5
  • Imported digest, sha2, and optional legacy crates; updated compute_hash to enable/disable algorithms
  • Added tests to validate default algorithms and feature-gated legacy support
Cargo.toml
src/stdlib/path.rs
Update documentation and add comprehensive filter tests
  • Refreshed implementation notes in docs/netsuke-design.md and flagged roadmap tasks in docs/roadmap.md
  • Added tests/std_filter_tests.rs covering basename, dirname, with_suffix, realpath, contents, linecount, hash/digest, size, expanduser scenarios
docs/netsuke-design.md
docs/roadmap.md
tests/std_filter_tests.rs

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

codescene-delta-analysis[bot]

This comment was marked as outdated.

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and found some issues that need to be addressed.

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `tests/std_filter_tests.rs:254-258` </location>
<code_context>
+    feature = "legacy-digests",
+    case("md5", "8d777f385d3dfec8815d20f7496026dc", "8d777f38",)
+)]
+fn hash_and_digest_filters(
+    filter_workspace: (tempfile::TempDir, Utf8PathBuf),
+    #[case] alg: &str,
+    #[case] expected_hash: &str,
+    #[case] expected_digest: &str,
+) {
+    let (_temp, root) = filter_workspace;
</code_context>

<issue_to_address>
**suggestion (testing):** Suggestion: Add a test for empty files in hash and digest filters.

Please add a test for empty file input to verify correct hash and digest outputs for this edge case.

Suggested implementation:

```rust
#[cfg_attr(
    feature = "legacy-digests",
    case("md5", "8d777f385d3dfec8815d20f7496026dc", "8d777f38",)
)]
#[cfg_attr(
    feature = "legacy-digests",
    case("md5-empty", "d41d8cd98f00b204e9800998ecf8427e", "d41d8cd9",)
)]
#[cfg_attr(
    feature = "legacy-digests",
    case("sha1-empty", "da39a3ee5e6b4b0d3255bfef95601890afd80709", "da39a3ee",)
)]

```

```rust
fn hash_and_digest_filters(
    filter_workspace: (tempfile::TempDir, Utf8PathBuf),
    #[case] alg: &str,
    #[case] expected_hash: &str,
    #[case] expected_digest: &str,
) {
    let (_temp, root) = filter_workspace;

    let mut env = Environment::new();
    stdlib::register(&mut env);

    let file_path = if alg.ends_with("-empty") {
        // Create an empty file
        let empty_file = root.join("empty_file");
        std::fs::File::create(&empty_file).unwrap();
        empty_file
    } else {
        // Create a file with known content
        let file = root.join("file");
        std::fs::write(&file, b"test").unwrap();
        file
    };

    let output_hash = render(&mut env, "hash", &format!("{{{{ path | hash('{}') }}}}", alg.replace("-empty", "")), &file_path);
    assert_eq!(output_hash, expected_hash);

    let output_digest = render(&mut env, "digest", &format!("{{{{ path | digest('{}') }}}}", alg.replace("-empty", "")), &file_path);
    assert_eq!(output_digest, expected_digest);
}

```
</issue_to_address>

### Comment 2
<location> `tests/std_filter_tests.rs:84-91` </location>
<code_context>
+}
+
+#[rstest]
+fn with_suffix_filter(filter_workspace: (tempfile::TempDir, Utf8PathBuf)) {
+    let (_temp, root) = filter_workspace;
+    let mut env = Environment::new();
+    stdlib::register(&mut env);
+    let file = root.join("file.tar.gz");
+    Dir::open_ambient_dir(&root, ambient_authority())
+        .expect("dir")
+        .write("file.tar.gz", b"data")
+        .expect("write");
+    let first = render(
+        &mut env,
+        "suffix",
+        "{{ path | with_suffix('.log') }}",
+        &file,
+    );
+    assert_eq!(first, root.join("file.tar.log").as_str());
+    let second = render(
+        &mut env,
+        "suffix_alt",
+        "{{ path | with_suffix('.zip', 2) }}",
+        &file,
+    );
+    assert_eq!(second, root.join("file.zip").as_str());
+}
+
</code_context>

<issue_to_address>
**suggestion (testing):** Suggestion: Add a test for with_suffix with count=0.

Testing with count=0 will confirm that with_suffix correctly appends the new suffix without removing any existing suffixes.

```suggestion
    let second = render(
        &mut env,
        "suffix_alt",
        "{{ path | with_suffix('.zip', 2) }}",
        &file,
    );
    assert_eq!(second, root.join("file.zip").as_str());

    // Test with count=0: should append the new suffix without removing any existing suffixes
    let third = render(
        &mut env,
        "suffix_count_zero",
        "{{ path | with_suffix('.bak', 0) }}",
        &file,
    );
    assert_eq!(third, root.join("file.tar.gz.bak").as_str());
}
```
</issue_to_address>

### Comment 3
<location> `tests/std_filter_tests.rs:206-213` </location>
<code_context>
+}
+
+#[rstest]
+fn contents_and_linecount_filters(filter_workspace: (tempfile::TempDir, Utf8PathBuf)) {
+    let (_temp, root) = filter_workspace;
+    let mut env = Environment::new();
+    stdlib::register(&mut env);
+    let file = root.join("file");
+    let text = render(&mut env, "contents", "{{ path | contents }}", &file);
+    assert_eq!(text, "data");
+    let lines = render(
+        &mut env,
+        "linecount",
+        "{{ path | linecount }}",
+        &root.join("lines.txt"),
+    );
+    assert_eq!(lines.parse::<usize>().expect("usize"), 3);
+}
+
</code_context>

<issue_to_address>
**suggestion (testing):** Suggestion: Add a test for linecount on an empty file.

This will verify that the filter correctly returns 0 for empty files.

```suggestion
    let lines = render(
        &mut env,
        "linecount",
        "{{ path | linecount }}",
        &root.join("lines.txt"),
    );
    assert_eq!(lines.parse::<usize>().expect("usize"), 3);

    // Test linecount on an empty file
    let empty_file = root.join("empty.txt");
    std::fs::File::create(&empty_file).expect("create empty file");
    let empty_lines = render(
        &mut env,
        "empty_linecount",
        "{{ path | linecount }}",
        &empty_file,
    );
    assert_eq!(empty_lines.parse::<usize>().expect("usize"), 0);
}
```
</issue_to_address>

### Comment 4
<location> `tests/std_filter_tests.rs:340-348` </location>
<code_context>
+}
+
+#[rstest]
+fn size_filter(filter_workspace: (tempfile::TempDir, Utf8PathBuf)) {
+    let (_temp, root) = filter_workspace;
+    let mut env = Environment::new();
+    stdlib::register(&mut env);
+    let file = root.join("file");
+    let size = render(&mut env, "size", "{{ path | size }}", &file);
+    assert_eq!(size.parse::<u64>().expect("u64"), 4);
+}
+
</code_context>

<issue_to_address>
**suggestion (testing):** Suggestion: Add a test for size filter on a missing file.

Add a test to check that the size filter handles non-existent files correctly by returning an appropriate error.

```suggestion
#[rstest]
fn size_filter(filter_workspace: (tempfile::TempDir, Utf8PathBuf)) {
    let (_temp, root) = filter_workspace;
    let mut env = Environment::new();
    stdlib::register(&mut env);
    let file = root.join("file");
    let size = render(&mut env, "size", "{{ path | size }}", &file);
    assert_eq!(size.parse::<u64>().expect("u64"), 4);
}

#[rstest]
fn size_filter_missing_file(filter_workspace: (tempfile::TempDir, Utf8PathBuf)) {
    let (_temp, root) = filter_workspace;
    let mut env = Environment::new();
    stdlib::register(&mut env);
    let missing_file = root.join("does_not_exist");
    let result = std::panic::catch_unwind(|| {
        render(&mut env, "size_missing", "{{ path | size }}", &missing_file)
    });
    assert!(result.is_err(), "Expected error for missing file");
    if let Err(err) = result {
        let err_msg = format!("{:?}", err);
        assert!(
            err_msg.contains("No such file") || err_msg.contains("not found") || err_msg.contains("does_not_exist"),
            "Error message should mention missing file: {}",
            err_msg
        );
    }
}
```
</issue_to_address>

### Comment 5
<location> `tests/std_filter_tests.rs:45-47` </location>
<code_context>
+}
+
+#[rstest]
+fn expanduser_filter_user_specific(filter_workspace: (tempfile::TempDir, Utf8PathBuf)) {
+    let (_temp, root) = filter_workspace;
+    let mut env = Environment::new();
</code_context>

<issue_to_address>
**suggestion (testing):** Suggestion: Add a test for expanduser with a path that does not start with ~.

Please add a test case where expanduser is called with a path not starting with ~ to verify it returns the path unchanged.

```suggestion
}

#[rstest]
fn expanduser_filter_non_tilde_path(filter_workspace: (tempfile::TempDir, Utf8PathBuf)) {
    let (_temp, root) = filter_workspace;
    let mut env = Environment::new();

    stdlib::register(&mut env);

    let file = root.join("file");
    let output = render(&mut env, "expanduser", "{{ path | expanduser }}", &file);

    // The path does not start with ~, so it should be unchanged
    assert_eq!(output, file.to_string());
}

#[rstest]
```
</issue_to_address>

### Comment 6
<location> `src/stdlib/path.rs:1` </location>
<code_context>
+use std::{
+    env,
+    fmt::Write as FmtWrite,
</code_context>

<issue_to_address>
**issue (complexity):** Consider splitting the file into smaller modules and replacing custom path and IO logic with standard library and Camino APIs.

Here are a few concrete steps to both reduce the size of this file and eliminate hand‐rolled path‐and‐io logic in favor of existing APIs.

1) Pull each concern into its own tiny module  

   src/filters.rs becomes only filter registration plus `mod` imports:

   ```rust
   // src/filters.rs
   use minijinja::Environment;
   mod path_utils;
   mod io_utils;
   mod hash_utils;

   pub(super) fn register_filters(env: &mut Environment<'_>) {
       use path_utils::*;
       use io_utils::*;
       use hash_utils::*;

       env.add_filter("basename", |raw: String| Ok(basename(&raw)));
       env.add_filter("dirname", |raw: String| Ok(dirname(&raw)));
       env.add_filter("with_suffix", |raw, suffix, count, sep| {
           with_suffix(&raw, &suffix, count.unwrap_or(1), sep.as_deref().unwrap_or("."))
       });
       env.add_filter("realpath", |raw: String| {
           canonicalize_any(&raw).map(|p| p.into_string())
       });
       // …and so on…
   }
   ```

2) In path_utils.rs, ditch the manual recursion in `canonicalize_any` and parent‐dir shenanigans; use std/camino APIs:

   ```rust
   // src/path_utils.rs
   use camino::Utf8PathBuf;
   use std::{fs, io};
   use minijinja::{Error, ErrorKind};

   pub fn basename(raw: &str) -> String {
       let p = camino::Utf8Path::new(raw);
       p.file_name().unwrap_or(p.as_str()).to_string()
   }

   pub fn dirname(raw: &str) -> String {
       let p = camino::Utf8Path::new(raw);
       p.parent().unwrap_or_else(|| camino::Utf8Path::new("")).to_string()
   }

   pub fn canonicalize_any(raw: &str) -> Result<Utf8PathBuf, Error> {
       let pb = fs::canonicalize(raw)
           .map_err(|e| Error::new(ErrorKind::Io, format!("canonicalise {}: {e}", raw)))?;
       pb.try_into()
         .map_err(|_| Error::new(ErrorKind::InvalidOperation, "path not UTF-8"))
   }

   pub fn with_suffix(
       raw: &str,
       suffix: &str,
       count: usize,
       sep: &str,
   ) -> Result<camino::Utf8PathBuf, Error> {
       if sep.is_empty() {
           return Err(Error::new(ErrorKind::InvalidOperation, "separator must not be empty"));
       }
       let mut p = camino::Utf8Path::new(raw).to_path_buf();
       let file = p.file_stem().unwrap_or_default().to_string_lossy().into_owned();
       let mut parts = file.rsplitn(count + 1, sep).collect::<Vec<_>>();
       parts.reverse();
       parts.pop(); // drop last `count` segments
       let new_name = parts.join(sep) + suffix;
       p.set_file_name(new_name);
       Ok(p.try_into().unwrap())
   }
   ```

3) Move all the file-reading ducks into io_utils.rs, and all the hashing into hash_utils.rs using the same pattern.  
   This keeps each file < 100 lines, removes duplication, and reuses std/camino fs APIs instead of manual Dir/symlink logic.
</issue_to_address>

### Comment 7
<location> `src/stdlib/mod.rs:34` </location>
<code_context>
+///     .expect("render");
+/// assert_eq!(rendered, "bar.txt");
+/// ```
+pub fn register(env: &mut Environment<'_>) {
+    register_file_tests(env);
+    path::register_filters(env);
</code_context>

<issue_to_address>
**issue (review_instructions):** Add behavioural and unit tests for the new stdlib registration and filters.

The new file-system helpers and filters registered in `register` require both behavioural and unit tests to demonstrate correct operation. Ensure that all new filters and file tests are covered by appropriate tests.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.

</details>
</issue_to_address>

### Comment 8
<location> `src/stdlib/path.rs:21` </location>
<code_context>
+use sha1::Sha1;
+use sha2::{Sha256, Sha512};
+
+pub(super) fn register_filters(env: &mut Environment<'_>) {
+    env.add_filter("basename", |raw: String| -> Result<String, Error> {
+        Ok(basename(Utf8Path::new(&raw)))
</code_context>

<issue_to_address>
**issue (review_instructions):** Add behavioural and unit tests for all new path filters.

All filters registered here (e.g., basename, dirname, with_suffix, realpath, expanduser, size, contents, linecount, hash, digest) must be covered by both behavioural and unit tests. Add comprehensive tests for each filter.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.

</details>
</issue_to_address>

### Comment 9
<location> `src/stdlib/path/io_helpers.rs:6` </location>
<code_context>
+use camino::Utf8Path;
+use minijinja::{Error, ErrorKind};
+
+pub(super) fn io_to_error(path: &Utf8Path, action: &str, err: io::Error) -> Error {
+    let io_kind = err.kind();
+    let label = io_error_kind_label(io_kind);
</code_context>

<issue_to_address>
**issue (review_instructions):** Add unit tests for io_to_error and error label mapping.

The error conversion logic and label mapping in io_helpers should be covered by unit tests to ensure correct error reporting and handling.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.

</details>
</issue_to_address>

### Comment 10
<location> `docs/netsuke-design.md:873` </location>
<code_context>
+Implementation notes:
+
+- Filters use `cap-std` directories for all filesystem work, avoiding ambient
+  authority.
+- `realpath` canonicalises the parent directory before joining the resolved
+  entry so results are absolute and symlink-free.
</code_context>

<issue_to_address>
**issue (review_instructions):** This bullet point exceeds the 80 column limit for wrapping.

Please wrap this bullet point so that no line exceeds 80 columns, as per the documentation style guide.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.md`

**Instructions:**
Paragraphs and bullets must be wrapped to 80 columns

</details>
</issue_to_address>

### Comment 11
<location> `docs/netsuke-design.md:875` </location>
<code_context>
+- Filters use `cap-std` directories for all filesystem work, avoiding ambient
+  authority.
+- `realpath` canonicalises the parent directory before joining the resolved
+  entry so results are absolute and symlink-free.
+- `contents` and `linecount` currently support UTF-8 input; other encodings are
+  rejected with an explicit error.
</code_context>

<issue_to_address>
**issue (review_instructions):** This bullet point exceeds the 80 column limit for wrapping.

Please wrap this bullet point so that no line exceeds 80 columns, as per the documentation style guide.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.md`

**Instructions:**
Paragraphs and bullets must be wrapped to 80 columns

</details>
</issue_to_address>

### Comment 12
<location> `docs/netsuke-design.md:877` </location>
<code_context>
+- `realpath` canonicalises the parent directory before joining the resolved
+  entry so results are absolute and symlink-free.
+- `contents` and `linecount` currently support UTF-8 input; other encodings are
+  rejected with an explicit error.
+- `hash` and `digest` accept `sha256` (default) and `sha512`. Legacy
+  algorithms `sha1` and `md5` are cryptographically broken and are disabled by
</code_context>

<issue_to_address>
**issue (review_instructions):** This bullet point exceeds the 80 column limit for wrapping.

Please wrap this bullet point so that no line exceeds 80 columns, as per the documentation style guide.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.md`

**Instructions:**
Paragraphs and bullets must be wrapped to 80 columns

</details>
</issue_to_address>

### Comment 13
<location> `docs/netsuke-design.md:880` </location>
<code_context>
+  rejected with an explicit error.
+- `hash` and `digest` accept `sha256` (default) and `sha512`. Legacy
+  algorithms `sha1` and `md5` are cryptographically broken and are disabled by
+  default; enabling them requires the `legacy-digests` Cargo feature and should
+  only be done for compatibility with existing ecosystems.
+- `expanduser` mirrors shell semantics by inspecting `HOME`, `USERPROFILE`,
</code_context>

<issue_to_address>
**issue (review_instructions):** This bullet point exceeds the 80 column limit for wrapping.

Please wrap this bullet point so that no line exceeds 80 columns, as per the documentation style guide.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.md`

**Instructions:**
Paragraphs and bullets must be wrapped to 80 columns

</details>
</issue_to_address>

### Comment 14
<location> `docs/netsuke-design.md:883` </location>
<code_context>
+  default; enabling them requires the `legacy-digests` Cargo feature and should
+  only be done for compatibility with existing ecosystems.
+- `expanduser` mirrors shell semantics by inspecting `HOME`, `USERPROFILE`,
+  and on Windows the `HOMEDRIVE`/`HOMEPATH` or `HOMESHARE` fallbacks.
+  Platform-specific forms such as `~user` remain unsupported.
+- `with_suffix` removes dotted suffix segments (default `n = 1`) before
</code_context>

<issue_to_address>
**issue (review_instructions):** This bullet point exceeds the 80 column limit for wrapping.

Please wrap this bullet point so that no line exceeds 80 columns, as per the documentation style guide.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*.md`

**Instructions:**
Paragraphs and bullets must be wrapped to 80 columns

</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread tests/std_filter_tests.rs Outdated
Comment thread tests/std_filter_tests.rs Outdated
Comment thread tests/std_filter_tests.rs Outdated
Comment thread tests/std_filter_tests.rs Outdated
Comment thread tests/std_filter_tests.rs Outdated
Comment thread src/stdlib/path.rs Outdated
Comment thread src/stdlib/path/io_helpers.rs
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Cargo.toml (1)

25-26: Unify camino version across all manifests
Dependencies use "1.1.12" but dev-dependencies and test_support/Cargo.toml use "1.2.0". Pick a single caret version (e.g. "1.2.0") in every Cargo.toml to avoid duplicate resolution.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99206a6 and 5d7c038.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • Cargo.toml (2 hunks)
  • docs/netsuke-design.md (2 hunks)
  • docs/roadmap.md (1 hunks)
  • src/stdlib.rs (0 hunks)
  • src/stdlib/mod.rs (1 hunks)
  • src/stdlib/path.rs (1 hunks)
  • src/stdlib/path/io_helpers.rs (1 hunks)
  • tests/std_filter_tests.rs (1 hunks)
💤 Files with no reviewable changes (1)
  • src/stdlib.rs
🧰 Additional context used
📓 Path-based instructions (5)
docs/**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

Use the docs/ markdown files as the reference knowledge base and update them proactively when decisions or requirements change

docs/**/*.md: In Markdown docs that contain Rust examples, mark fenced code blocks with the rust language specifier (```rust)
Use assert!, assert_eq!, or assert_ne! in Markdown doctests to verify outcomes
Avoid using .unwrap() or .expect() in Markdown doctest examples; handle errors idiomatically
For fallible Markdown doctests using ?, use an explicit fn main() -> Result<...> and hide boilerplate with #
In Markdown doctests, you may end code with exact (()) (no spaces) to enable implicit Result-returning main
Use hidden lines (prefix #) in Markdown doctests to hide setup and boilerplate
In Markdown doctests with side effects, use no_run to avoid executing while still compiling
Use should_panic in Markdown doctests for panic-intended examples
Use compile_fail in Markdown doctests to illustrate invalid code; beware brittleness
Only use ignore in Markdown doctests for pseudocode/non-Rust/temporary skips; avoid otherwise
Use edition20xx in Markdown doctests when examples require a specific Rust edition

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md
**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

**/*.md: Documentation must use en-GB-oxendict spelling and grammar (LICENSE name excluded)
Validate Markdown with make markdownlint and run make fmt to format Markdown (including fixing table markup)
Validate Mermaid diagrams in Markdown by running make nixie
Wrap Markdown paragraphs and bullet points at 80 columns
Do not wrap tables and headings in Markdown
Wrap code blocks in Markdown at 120 columns
Use dashes (-) for list bullets in Markdown
Use GitHub-flavoured Markdown footnotes ([^1])

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md

⚙️ CodeRabbit configuration file

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-GB-oxendict (-ize / -yse / -our) spelling and grammar
  • Headings must not be wrapped.
  • Documents must start with a level 1 heading
  • Headings must correctly increase or decrease by no more than one level at a time
  • Use GitHub-flavoured Markdown style for footnotes and endnotes.
  • Numbered footnotes must be numbered by order of appearance in the document.

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md
**/Cargo.toml

📄 CodeRabbit inference engine (AGENTS.md)

**/Cargo.toml: Use explicit caret SemVer requirements for all dependencies
Disallow wildcard (*) and open-ended (>=) requirements; use ~ only with a documented reason

Files:

  • Cargo.toml
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Each Rust module file must begin with a module-level //! comment explaining its purpose
Document public APIs using Rustdoc comments (///) so cargo doc can generate docs
Place function attributes after doc comments
Prefer immutable data; avoid unnecessary mut bindings
Handle errors with Result instead of panicking where feasible
Avoid unsafe code unless absolutely necessary and clearly document any usage
Do not use return in single-line functions
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer .expect() over .unwrap()
Use concat!() to combine long string literals rather than escaping newlines with a backslash
Use predicate functions for conditional criteria with more than two branches
Where a function is too long, extract meaningfully named helper functions (separation of concerns, CQRS)
Where a function has too many parameters, group related parameters into meaningfully named structs
If a function is unused with specific features selected, use conditional compilation with #[cfg] or #[cfg_attr]
Where a function returns a large error, consider using Arc to reduce returned data size
Use semantic error enums deriving std::error::Error via thiserror for inspectable conditions
Clippy warnings must be disallowed; fix issues in code rather than silencing
Do not silence lints except as a last resort; any suppression must be tightly scoped and include a clear reason
Prefer #[expect] over #[allow] when suppressing lints
Use cap-std for all filesystem operations (capability-based, sandboxed)
Use camino for path handling; avoid std::path::PathBuf directly
Avoid std::fs directly; wrap with cap-std
Comments must use en-GB-oxendict spelling and grammar (except external API names)
No single Rust source file may exceed 400 lines; break up large switches/dispatch tables by feature
Name booleans with is/has/should prefixes; use clear, descriptive names for variables and functions
Function documentation must include...

Files:

  • src/stdlib/mod.rs
  • tests/std_filter_tests.rs
  • src/stdlib/path.rs
  • src/stdlib/path/io_helpers.rs

⚙️ CodeRabbit configuration file

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.

  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
  • Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such

    • For testing of functionality depending upon environment variables, dependency injection and the mockable crate are the preferred option.
    • If mockable cannot be used, env mutations in...

Files:

  • src/stdlib/mod.rs
  • tests/std_filter_tests.rs
  • src/stdlib/path.rs
  • src/stdlib/path/io_helpers.rs
tests/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.rs: Large blocks of test data should be moved to external data files
Use rstest fixtures for shared setup
Replace duplicated tests with #[rstest(...)] parameterised cases
Prefer mockall for mocks and stubs
Mock non-deterministic dependencies (Env, Clock) via dependency injection using mockable crate
Test documentation should omit examples that merely reiterate test logic

Files:

  • tests/std_filter_tests.rs
🧬 Code graph analysis (4)
src/stdlib/mod.rs (1)
src/stdlib/path.rs (2)
  • register_filters (21-80)
  • file_type_matches (82-96)
tests/std_filter_tests.rs (3)
src/stdlib/mod.rs (1)
  • register (34-37)
test_support/src/env_lock.rs (1)
  • acquire (17-21)
test_support/src/env_var_guard.rs (1)
  • remove (51-57)
src/stdlib/path.rs (1)
src/stdlib/path/io_helpers.rs (1)
  • io_to_error (6-20)
src/stdlib/path/io_helpers.rs (1)
src/ir/cycle.rs (1)
  • path (141-143)
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Sourcery review
  • GitHub Check: build-test (ubuntu-latest, stable)
🔇 Additional comments (22)
docs/roadmap.md (1)

162-163: Mark completion aligns with implemented filters; run linters.

Keep this checked item; it matches the newly added filters. Run make fmt and make markdownlint to ensure docs wrapping and lint rules are satisfied.

Cargo.toml (1)

15-18: Feature gate wiring looks correct.

The legacy-digests feature correctly ties to optional sha1 and md5.

docs/netsuke-design.md (2)

831-834: Document Unix-only test behaviour matches implementation.

Keep the “stubbed to false on non‑Unix” note; src/stdlib/mod.rs implements this via cfg stubs.


870-887: Clarify digest feature gating mirrors code and Cargo.

The notes on sha1/md5 behind legacy-digests match Cargo features and path.rs. Leave as is. Run make markdownlint to validate wrapping.

src/stdlib/path/io_helpers.rs (2)

6-20: Error mapping is robust and preserves source.

The three-form message building and Error::with_source are correct.


22-59: Map common IoErrorKind variants comprehensively.

The label table is thorough and stable on MSRV 1.89; keep it.

src/stdlib/mod.rs (4)

1-7: Good module docs.

The //! header and high-level summary satisfy documentation requirements.


34-37: Register both tests and filters in one entry point.

The public register function cleanly composes test and filter registration.


39-58: Implement file tests with portable fallbacks.

The tests delegate to path::file_type_matches and stub Unix-only predicates to false off-Unix. This matches the design doc.


72-110: Stub Unix-only predicates off-Unix.

The cfg-gated implementations avoid unused imports and keep test names stable.

src/stdlib/path.rs (12)

21-80: Register filters idiomatically and validate inputs.

Filter wiring is correct, with sensible defaults and error returns for invalid inputs (e.g., encoding).


82-96: Return false on missing paths; surface real I/O errors.

file_type_matches handles NotFound gracefully and wraps other errors with io_to_error.


134-156: Canonicalise paths safely without ambient traversal.

canonicalize_any normalizes “.”, preserves roots, and uses cap-std Dir canonicalization; logic is sound.


162-193: with_suffix edge-case handling is correct.

Reject empty separators; remove up to count separated suffix segments; return recomposed path.


195-205: Emit precise error for non-relative paths.

relative_to uses strip_prefix and returns InvalidOperation with a clear message.


222-235: Implement ~ expansion with user-specific forms rejected.

expanduser matches documented behaviour and uses env fallbacks.


237-254: Windows and non-Windows HOME resolution matches docs.

The HOME/USERPROFILE/HOMEDRIVE+HOMEPATH/HOMESHARE cascade is implemented correctly.


255-273: Gate text reading to UTF‑8 and wrap errors.

Limit to UTF‑8 per docs; map errors via io_to_error.


280-317: Feature‑gate legacy digests with clear errors.

sha1/md5 paths honour the legacy-digests feature with explicit error messages when disabled.


319-343: Stream file hashing without allocation spikes.

Use cap-std OpenOptions and incremental Digest update with 8 KiB buffer; hex encoding returned.


345-351: Truncate digest safely.

Handle len ≥ hash length without panic; return full hash when shorter.


353-359: Avoid external hex dependency.

encode_hex via fmt::Write is simple and sufficient here.

Comment thread Cargo.toml
Comment thread Cargo.toml
Comment thread src/stdlib/path.rs Outdated
Comment thread src/stdlib/path/io_helpers.rs
Comment thread tests/std_filter_tests.rs Outdated
Comment thread tests/std_filter_tests.rs Outdated
codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/cucumber.rs (1)

46-57: Prefer cap-std for filesystem probing in tests.

Replace std::fs::read_dir with cap_std APIs to stay consistent with capability-based access, even in tests.

As per coding guidelines

tests/std_filter_tests.rs (1)

1-542: Split this oversized test module.

Keep Rust source files under 400 lines. Extract themed groups into separate files (e.g., tests/std_filter_path_tests.rs, tests/std_filter_hash_tests.rs, tests/std_filter_contents_tests.rs, tests/std_filter_expanduser_tests.rs). Share fixtures via rstest.

As per coding guidelines

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d7c038 and c2f3f8f.

📒 Files selected for processing (13)
  • src/ir/graph.rs (8 hunks)
  • src/stdlib/path/filters.rs (1 hunks)
  • src/stdlib/path/fs_utils.rs (1 hunks)
  • src/stdlib/path/hash_utils.rs (1 hunks)
  • src/stdlib/path/io_helpers.rs (1 hunks)
  • src/stdlib/path/mod.rs (1 hunks)
  • src/stdlib/path/path_utils.rs (1 hunks)
  • tests/cucumber.rs (2 hunks)
  • tests/features/stdlib.feature (1 hunks)
  • tests/features_unix/stdlib.feature (1 hunks)
  • tests/std_filter_tests.rs (1 hunks)
  • tests/steps/mod.rs (1 hunks)
  • tests/steps/stdlib_steps.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Each Rust module file must begin with a module-level //! comment explaining its purpose
Document public APIs using Rustdoc comments (///) so cargo doc can generate docs
Place function attributes after doc comments
Prefer immutable data; avoid unnecessary mut bindings
Handle errors with Result instead of panicking where feasible
Avoid unsafe code unless absolutely necessary and clearly document any usage
Do not use return in single-line functions
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer .expect() over .unwrap()
Use concat!() to combine long string literals rather than escaping newlines with a backslash
Use predicate functions for conditional criteria with more than two branches
Where a function is too long, extract meaningfully named helper functions (separation of concerns, CQRS)
Where a function has too many parameters, group related parameters into meaningfully named structs
If a function is unused with specific features selected, use conditional compilation with #[cfg] or #[cfg_attr]
Where a function returns a large error, consider using Arc to reduce returned data size
Use semantic error enums deriving std::error::Error via thiserror for inspectable conditions
Clippy warnings must be disallowed; fix issues in code rather than silencing
Do not silence lints except as a last resort; any suppression must be tightly scoped and include a clear reason
Prefer #[expect] over #[allow] when suppressing lints
Use cap-std for all filesystem operations (capability-based, sandboxed)
Use camino for path handling; avoid std::path::PathBuf directly
Avoid std::fs directly; wrap with cap-std
Comments must use en-GB-oxendict spelling and grammar (except external API names)
No single Rust source file may exceed 400 lines; break up large switches/dispatch tables by feature
Name booleans with is/has/should prefixes; use clear, descriptive names for variables and functions
Function documentation must include...

Files:

  • tests/cucumber.rs
  • tests/steps/mod.rs
  • src/stdlib/path/hash_utils.rs
  • src/ir/graph.rs
  • src/stdlib/path/path_utils.rs
  • src/stdlib/path/mod.rs
  • src/stdlib/path/fs_utils.rs
  • tests/steps/stdlib_steps.rs
  • tests/std_filter_tests.rs
  • src/stdlib/path/filters.rs
  • src/stdlib/path/io_helpers.rs

⚙️ CodeRabbit configuration file

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.

  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
  • Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such

    • For testing of functionality depending upon environment variables, dependency injection and the mockable crate are the preferred option.
    • If mockable cannot be used, env mutations in...

Files:

  • tests/cucumber.rs
  • tests/steps/mod.rs
  • src/stdlib/path/hash_utils.rs
  • src/ir/graph.rs
  • src/stdlib/path/path_utils.rs
  • src/stdlib/path/mod.rs
  • src/stdlib/path/fs_utils.rs
  • tests/steps/stdlib_steps.rs
  • tests/std_filter_tests.rs
  • src/stdlib/path/filters.rs
  • src/stdlib/path/io_helpers.rs
tests/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.rs: Large blocks of test data should be moved to external data files
Use rstest fixtures for shared setup
Replace duplicated tests with #[rstest(...)] parameterised cases
Prefer mockall for mocks and stubs
Mock non-deterministic dependencies (Env, Clock) via dependency injection using mockable crate
Test documentation should omit examples that merely reiterate test logic

Files:

  • tests/cucumber.rs
  • tests/steps/mod.rs
  • tests/steps/stdlib_steps.rs
  • tests/std_filter_tests.rs
🧬 Code graph analysis (8)
src/stdlib/path/hash_utils.rs (2)
src/stdlib/path/io_helpers.rs (1)
  • io_to_error (6-20)
src/stdlib/path/fs_utils.rs (1)
  • open_file (71-79)
src/stdlib/path/path_utils.rs (2)
src/stdlib/path/fs_utils.rs (1)
  • open_parent_dir (30-32)
src/stdlib/path/io_helpers.rs (1)
  • io_to_error (6-20)
src/stdlib/path/mod.rs (2)
src/stdlib/path/filters.rs (1)
  • register_filters (6-72)
src/stdlib/path/fs_utils.rs (1)
  • file_type_matches (34-47)
src/stdlib/path/fs_utils.rs (2)
src/stdlib/path/io_helpers.rs (1)
  • io_to_error (6-20)
src/stdlib/path/path_utils.rs (1)
  • normalise_parent (110-114)
tests/steps/stdlib_steps.rs (3)
src/stdlib/mod.rs (1)
  • register (34-37)
tests/std_filter_tests.rs (1)
  • render (23-34)
tests/cucumber.rs (1)
  • drop (60-64)
tests/std_filter_tests.rs (4)
tests/manifest_jinja_tests.rs (1)
  • env_lock (57-59)
src/stdlib/mod.rs (1)
  • register (34-37)
test_support/src/env_lock.rs (1)
  • acquire (17-21)
test_support/src/env_var_guard.rs (1)
  • remove (51-57)
src/stdlib/path/filters.rs (3)
src/stdlib/path/path_utils.rs (6)
  • basename (9-11)
  • dirname (13-15)
  • with_suffix (17-48)
  • relative_to (50-59)
  • canonicalize_any (61-86)
  • expanduser (95-108)
src/stdlib/path/fs_utils.rs (3)
  • file_size (49-56)
  • read_utf8 (58-64)
  • linecount (66-69)
src/stdlib/path/hash_utils.rs (2)
  • compute_hash (14-49)
  • compute_digest (51-57)
src/stdlib/path/io_helpers.rs (1)
src/ir/cycle.rs (1)
  • path (141-143)
🔍 Remote MCP Deepwiki

Additional Context for PR #166 Review

  1. Cargo Features

    • legacy-digests: Optional feature enabling SHA1 and MD5 support in compute_hash and compute_digest (gates inclusion of sha1 = { version = "0.10", optional = true } and md5 = { package = "md-5", version = "0.10", optional = true })
    • default = []: Ensures no digest algorithms enabled by default; users opt into legacy-digests.
  2. Filesystem API

    • Uses cap-std’s UTF-8 directory APIs with ambient authority for safe, cross-platform path operations [PR objectives].
    • Error mapping via io_to_error yields ErrorKind::InvalidOperation with contextual messages (path, action) [ai_summary: src/stdlib/path/io_helpers.rs].
  3. Platform Stubbing

    • Unix-only file tests (pipe, block_device, char_device, device) stubbed to always return false on non-Unix targets [PR objectives; ai_summary: src/stdlib.rs removal + src/stdlib/mod.rs addition].
  4. Path Utilities

    • Comprehensive path functions (basename, dirname, with_suffix, relative_to, canonicalize_any, expanduser, etc.) with Windows fallbacks for HOME resolution (HOME, USERPROFILE, HOMEDRIVE/HOMEPATH, HOMESHARE) [PR objectives; ai_summary: src/stdlib/path/path_utils.rs].
  5. Test Coverage

    • tests/std_filter_tests.rs covers all path and filter behaviors (error conditions, feature-gated algorithms) [ai_summary].
    • Cucumber feature files exercise both general and Unix-specific scenarios [ai_summary: tests/features*. feature].
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test (ubuntu-latest, stable)
  • GitHub Check: Sourcery review
🔇 Additional comments (11)
tests/steps/mod.rs (1)

13-13: Wire up stdlib steps. LGTM.

Module inclusion aligns with the new stdlib feature files and step definitions.

src/stdlib/path/mod.rs (1)

7-8: Crate-private re-exports are appropriate.

Re-exporting register_filters and file_type_matches at crate scope fits the new registration flow.

tests/features_unix/stdlib.feature (1)

1-6: Unix realpath scenario. LGTM.

Scenario scope and preconditions are correct; symlink behaviour is validated only on Unix.

src/ir/graph.rs (1)

65-65: Doctest imports to enable Error::custom. LGTM.

use serde::ser::Error as _; is the right pattern to bring the trait into scope without unused warnings; doc examples read cleanly and preserve en‑GB spellings (serialise/canonicalise).

Also applies to: 91-91, 113-113, 136-136, 151-151, 171-171, 194-194, 207-207

tests/cucumber.rs (1)

33-38: Extend world state for stdlib scenarios. LGTM.

Fields are minimal and clearly named; Utf8PathBuf aligns with camino usage elsewhere.

tests/features/stdlib.feature (1)

1-49: Comprehensive stdlib filter coverage. LGTM.

Scenarios exercise happy paths and error cases (encoding/size/errors/hash/digest) with clear assertions.

src/stdlib/path/filters.rs (1)

6-72: Filter registrations look solid.

Defaults, error kinds, and feature-gated digests are handled cleanly; closures delegate to well-scoped helpers.

tests/std_filter_tests.rs (3)

456-474: Good use of EnvLock and EnvVarGuard for HOME manipulation.

Serialise environment mutations in tests to avoid races and UB. This complies with the repo’s env safety policy.

As per coding guidelines

Also applies to: 491-515, 517-541


1-1: Add the missing module doc comment.

Add a leading //! comment describing this test module’s purpose.

Apply this diff:

+//! Tests for stdlib path and file filters: basename/dirname, with_suffix, relative_to, realpath,
+//! contents/linecount/size, hash/digest (with feature gating), and expanduser; includes Unix and
+//! cross‑platform cases with guarded environment mutations.
 use camino::{Utf8Path, Utf8PathBuf};

As per coding guidelines


36-45: Stop leaking strings in register_template.

Remove Box::leak; pass borrowed slices from owned Strings.

Apply this diff:

 fn register_template(
     env: &mut Environment<'_>,
     name: impl Into<String>,
     source: impl Into<String>,
 ) {
-    let leaked_name = Box::leak(name.into().into_boxed_str());
-    let leaked_source = Box::leak(source.into().into_boxed_str());
-    env.add_template(leaked_name, leaked_source)
-        .expect("template");
+    let name = name.into();
+    let source = source.into();
+    env.add_template(&name, &source).expect("template");
 }

As per coding guidelines

src/stdlib/path/io_helpers.rs (1)

1-1: Add the missing module doc comment.

Add a leading //! doc summarising IO→minijinja error adaptation and labelling.

Apply this diff:

+//! IO error adapters for path stdlib: translate io::Error into minijinja::Error with labelled,
+//! user‑facing messages and attach sources for diagnostics.
 use std::io::{self, ErrorKind as IoErrorKind};

As per coding guidelines

Comment thread src/stdlib/path/filters.rs
Comment thread src/stdlib/path/filters.rs
Comment thread src/stdlib/path/fs_utils.rs
Comment thread src/stdlib/path/hash_utils.rs
Comment thread src/stdlib/path/hash_utils.rs
Comment thread src/stdlib/path/hash_utils.rs Outdated
Comment thread src/stdlib/path/mod.rs
Comment thread src/stdlib/path/path_utils.rs
Comment thread src/stdlib/path/path_utils.rs
Comment thread tests/steps/stdlib_steps.rs
@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Sep 26, 2025

@coderabbitai pause

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Sep 26, 2025

✅ Actions performed

Reviews paused.

codescene-delta-analysis[bot]

This comment was marked as outdated.

Add module-level docs for the path coordinator and the\nunderlying utilities to satisfy review feedback, and resolve\ncurrent_dir_utf8 via cap-std ambient directories. Document the\nCucumber step module for consistency with the project guidance.
Set the root Cargo.toml dependency to match the existing 1.2.0 caret requirement so all manifests resolve the same release.
codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

Clarify the io error adapter module comment to call out MiniJinja invalid-operation diagnostics and rework the stdlib Cucumber steps to accept Utf8PathBuf arguments where appropriate, reducing the string-heavy parameter ratio flagged in review.
codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

Repository owner deleted a comment from coderabbitai Bot Sep 27, 2025
@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Sep 27, 2025

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix:

tests/std_filter_tests/path_filters.rs
Comment on lines +262 to +285

fn expanduser_filter_user_specific(filter_workspace: Workspace) {
    let (_temp, root) = filter_workspace;
    let mut env = Environment::new();
    stdlib::register(&mut env);
    let _lock = EnvLock::acquire();
    let _home_guard = EnvVarGuard::set("HOME", root.as_str());
    let _profile_guard = EnvVarGuard::remove("USERPROFILE");
    let _drive_guard = EnvVarGuard::remove("HOMEDRIVE");
    let _path_guard = EnvVarGuard::remove("HOMEPATH");
    let _share_guard = EnvVarGuard::remove("HOMESHARE");
    env.add_template("expanduser_user_specific", "{{ path | expanduser }}")
        .expect("template");
    let template = env
        .get_template("expanduser_user_specific")
        .expect("get template");
    let result = template.render(context!(path => "~otheruser/workspace"));
    let err = result.expect_err("expanduser should reject ~user expansion");
    assert_eq!(err.kind(), ErrorKind::InvalidOperation);
    assert!(
        err.to_string()
            .contains("user-specific ~ expansion is unsupported"),
        "error should mention unsupported user expansion",
    );
}

❌ New issue: Code Duplication
The module contains 8 functions with similar structure: dirname_filter,expanduser_filter_missing_home,expanduser_filter_non_tilde_path,expanduser_filter_user_specific and 4 more functions

@coderabbitai

This comment was marked as resolved.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

Rebuild the stdlib path filter tests to use shared environment\nsetup and a structured TemplateErrorSpec so the duplication and\nargument-count review comments are resolved. Document the contents\nfilter's ambient file-system access inline.
codescene-delta-analysis[bot]

This comment was marked as outdated.

Align module headers and design notes with the review guidance, clarifying ambient file-system access for the contents filter and the cucumber steps.
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Sep 27, 2025

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix:

tests/std_filter_tests/path_filters.rs
Comment on lines +296 to +315

fn expanduser_filter_user_specific(filter_workspace: Workspace) {
    with_filter_env(filter_workspace, |root, env| {
        with_clean_env_vars(Some(root.as_str()), || {
            assert_template_error(
                env,
                TemplateErrorSpec {
                    name: "expanduser_user_specific",
                    template: "{{ path | expanduser }}",
                    context: json!({
                        "path": "~otheruser/workspace",
                    }),
                    expectation: TemplateErrorExpectation {
                        kind: ErrorKind::InvalidOperation,
                        contains: "user-specific ~ expansion is unsupported",
                    },
                },
            );
        });
    });
}

❌ New issue: Code Duplication
The module contains 7 functions with similar structure: expanduser_filter_missing_home,expanduser_filter_user_specific,realpath_filter_missing_path,relative_to_filter_outside_root and 3 more functions

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Sep 27, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

I'll examine the current state of the test file to understand the remaining code duplication patterns.

@leynos After examining the current test file, I can see that while helper functions have been introduced, there's still significant structural duplication in the nested helper call patterns. The issue is that many test functions follow this repetitive structure:

with_filter_env(workspace, |root, env| {
    with_clean_env_vars(Some/None, || {
        assert_template_error(env, TemplateErrorSpec { ... });
    });
});

Suggested Fix

Create a higher-level helper that combines these patterns:

/// Helper for error tests that need environment variable cleanup
fn assert_filter_error_with_env(
    filter_workspace: Workspace,
    home_value: Option<&str>,
    spec: TemplateErrorSpec,
) {
    with_filter_env(filter_workspace, |root, env| {
        let home = home_value.map(|_| root.as_str());
        with_clean_env_vars(home, || {
            assert_template_error(env, spec);
        });
    });
}

/// Helper for success tests that need environment variable cleanup  
fn assert_filter_success_with_env(
    filter_workspace: Workspace,
    home_value: Option<&str>,
    name: &str,
    template: &str,
    context: serde_json::Value,
    expected: &str,
) {
    with_filter_env(filter_workspace, |root, env| {
        let home = home_value.map(|_| root.as_str());
        with_clean_env_vars(home, || {
            let result = render(env, name, template, context);
            assert_eq!(result, expected);
        });
    });
}

AI Coding Agent Prompt

Refactor the test file `tests/std_filter_tests/path_filters.rs` to eliminate the remaining code duplication by adding consolidated helper functions and simplifying the nested call patterns:

1. **Add consolidated helper functions** after the existing helpers:
   ```rust
   /// Helper for error tests that need environment variable cleanup
   fn assert_filter_error_with_env(
       filter_workspace: Workspace,
       home_value: Option<&str>,
       spec: TemplateErrorSpec,
   ) {
       with_filter_env(filter_workspace, |root, env| {
           let home = home_value.map(|_| root.as_str());
           with_clean_env_vars(home, || {
               assert_template_error(env, spec);
           });
       });
   }

   /// Helper for success tests that need environment variable cleanup  
   fn assert_filter_success_with_env(
       filter_workspace: Workspace,
       home_value: Option<&str>,
       name: &str,
       template: &str,
       context: serde_json::Value,
       expected: &str,
   ) {
       with_filter_env(filter_workspace, |root, env| {
           let home = home_value.map(|_| root.as_str());
           with_clean_env_vars(home, || {
               let result = render(env, name, template, context);
               assert_eq!(result, expected);
           });
       });
   }
   ```

2. **Refactor these specific functions** to use the new consolidated helpers:

   - **expanduser_filter_missing_home**: Replace nested calls with:
     ```rust
     assert_filter_error_with_env(
         filter_workspace,
         None, // No home value
         TemplateErrorSpec { /* existing spec */ }
     );
     ```

   - **expanduser_filter_user_specific**: Replace nested calls with:
     ```rust  
     assert_filter_error_with_env(
         filter_workspace,
         Some(""), // Will use root.as_str()
         TemplateErrorSpec { /* existing spec */ }
     );
     ```

   - **realpath_filter_missing_path**: Replace nested calls with appropriate helper

   - **relative_to_filter_outside_root**: Replace nested calls with appropriate helper

   - **with_suffix_filter_empty_separator**: Replace nested calls with appropriate helper

   - **hash_filter_unknown_algorithm**: Replace nested calls with appropriate helper

   - **digest_filter_unknown_algorithm**: Replace nested calls with appropriate helper

3. **Example transformation** for `expanduser_filter_user_specific`:
   ```rust
   #[rstest]
   fn expanduser_filter_user_specific(filter_workspace: Workspace) {
       assert_filter_error_with_env(
           filter_workspace,
           Some(""), // Will use workspace root as HOME
           TemplateErrorSpec {
               name: "expanduser_user_specific",
               template: "{{ path | expanduser }}",
               context: json!({
                   "path": "~otheruser/workspace",
               }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "user-specific ~ expansion is unsupported",
               },
           },
       );
   }
   ```

4. **Guidelines**:
   - Preserve all existing test logic and assertions
   - Maintain identical error messages and test coverage  
   - Keep conditional compilation attributes (`#[cfg(unix)]`)
   - Don't modify tests that don't follow the nested pattern
   - Use `Some("")` for `home_value` when you want to use the workspace root as HOME
   - Use `None` for `home_value` when you want to remove HOME entirely

Focus on eliminating the repetitive `with_filter_env` -> `with_clean_env_vars` -> `assert_template_error` nesting pattern while maintaining identical test behaviour.

🏁 Script executed:

#!/bin/bash
# Examine the current test file to see the duplication patterns
cat tests/std_filter_tests/path_filters.rs

Length of output: 9983


@leynos After examining the current test file, I can see the duplication is in the nested helper call patterns. While individual helpers exist, many tests still follow repetitive 2-3 layer nesting patterns like:

with_filter_env(workspace, |root, env| {
    with_clean_env_vars(Some/None, || {
        assert_template_error(env, spec);
    });
});

Suggested Fix

Create consolidated helpers that combine these common patterns:

/// Helper for error tests that need environment variable cleanup
fn assert_filter_error_with_env(
    filter_workspace: Workspace,
    home_value: Option<&str>,
    spec: TemplateErrorSpec,
) {
    with_filter_env(filter_workspace, |root, env| {
        let home = home_value.map(|_| root.as_str());
        with_clean_env_vars(home, || {
            assert_template_error(env, spec);
        });
    });
}

/// Helper for simple error tests without environment manipulation
fn assert_filter_error_simple(
    filter_workspace: Workspace,
    spec: TemplateErrorSpec,
) {
    with_filter_env(filter_workspace, |_root, env| {
        assert_template_error(env, spec);
    });
}

/// Helper for success tests with environment variable cleanup
fn assert_filter_success_with_env(
    filter_workspace: Workspace,
    home_value: Option<&str>,
    name: &str,
    template: &str,
    context: &impl serde::Serialize,
    expected: &str,
) {
    with_filter_env(filter_workspace, |root, env| {
        let home = home_value.map(|_| root.as_str());
        with_clean_env_vars(home, || {
            let result = render(env, name, template, context);
            assert_eq!(result, expected);
        });
    });
}

AI Coding Agent Prompt

Refactor the test file `tests/std_filter_tests/path_filters.rs` to eliminate nested helper call duplication by adding consolidated helper functions and simplifying the repetitive patterns:

1. **Add consolidated helper functions** after the existing `assert_template_error` function:
   ```rust
   /// Helper for error tests that need environment variable cleanup
   fn assert_filter_error_with_env(
       filter_workspace: Workspace,
       home_value: Option<&str>,
       spec: TemplateErrorSpec,
   ) {
       with_filter_env(filter_workspace, |root, env| {
           let home = home_value.map(|_| root.as_str());
           with_clean_env_vars(home, || {
               assert_template_error(env, spec);
           });
       });
   }

   /// Helper for simple error tests without environment manipulation
   fn assert_filter_error_simple(
       filter_workspace: Workspace,
       spec: TemplateErrorSpec,
   ) {
       with_filter_env(filter_workspace, |_root, env| {
           assert_template_error(env, spec);
       });
   }

   /// Helper for success tests with environment variable cleanup
   fn assert_filter_success_with_env(
       filter_workspace: Workspace,
       home_value: Option<&str>,
       name: &str,
       template: &str,
       context: &impl serde::Serialize,
       expected: &str,
   ) {
       with_filter_env(filter_workspace, |root, env| {
           let home = home_value.map(|_| root.as_str());
           with_clean_env_vars(home, || {
               let result = render(env, name, template, context);
               assert_eq!(result, expected);
           });
       });
   }
   ```

2. **Refactor these specific functions** to eliminate nested patterns:

   **expanduser_filter_missing_home**: Replace the `with_filter_env` + `with_clean_env_vars` + `assert_template_error` nesting with:
   ```rust
   #[rstest]
   fn expanduser_filter_missing_home(filter_workspace: Workspace) {
       assert_filter_error_with_env(
           filter_workspace,
           None, // Remove HOME
           TemplateErrorSpec {
               name: "expanduser_missing_home",
               template: "{{ path | expanduser }}",
               context: json!({ "path": "~/workspace" }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "no home directory environment variables are set",
               },
           },
       );
   }
   ```

   **expanduser_filter_user_specific**: Replace nesting with:
   ```rust
   #[rstest]
   fn expanduser_filter_user_specific(filter_workspace: Workspace) {
       assert_filter_error_with_env(
           filter_workspace,
           Some(""), // Will use workspace root as HOME
           TemplateErrorSpec {
               name: "expanduser_user_specific",
               template: "{{ path | expanduser }}",
               context: json!({ "path": "~otheruser/workspace" }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "user-specific ~ expansion is unsupported",
               },
           },
       );
   }
   ```

   **expanduser_filter**: Replace nesting with:
   ```rust
   #[rstest]
   fn expanduser_filter(filter_workspace: Workspace) {
       assert_filter_success_with_env(
           filter_workspace,
           Some(""), // Will use workspace root as HOME
           "expanduser",
           "{{ path | expanduser }}",
           &Utf8PathBuf::from("~/workspace"),
           // Expected will be computed as root.join("workspace").as_str()
       );
   }
   ```

   **relative_to_filter_outside_root**: Replace nesting with:
   ```rust
   assert_filter_error_simple(filter_workspace, TemplateErrorSpec { /* existing spec */ });
   ```

   **with_suffix_filter_empty_separator**: Replace nesting with:
   ```rust
   assert_filter_error_simple(filter_workspace, TemplateErrorSpec { /* existing spec */ });
   ```

   **realpath_filter_missing_path** (if it exists): Replace nesting with `assert_filter_error_simple`

3. **Special handling for expanduser_filter**:
   Since this test needs to compute the expected result dynamically, you may need to keep some custom logic or modify the helper to accept a closure for computing the expected value.

4. **Guidelines**:
   - Preserve all existing test logic and assertions exactly
   - Use `Some("")` for `home_value` when you want to use workspace root as HOME
   - Use `None` for `home_value` when you want to remove HOME entirely
   - Use `assert_filter_error_simple` for error tests that don't need environment cleanup
   - Use `assert_filter_error_with_env` for error tests that need environment cleanup
   - Keep conditional compilation attributes (`#[cfg(unix)]`)
   - Don't modify tests that don't follow the nested pattern (like `dirname_filter`, `relative_to_filter`, etc.)

Focus on eliminating the 2-3 layer nesting patterns while maintaining identical test behaviour and coverage.

codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Sep 27, 2025

@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix:

Code Duplication
tests/std_filter_tests/path_filters.rs

What lead to degradation?
The module contains 7 functions with similar structure: expanduser_filter_missing_home,expanduser_filter_user_specific,realpath_filter_missing_path,relative_to_filter_outside_root and 3 more functions

Why does this problem occur?
Duplicated code often leads to code that's harder to change since the same logical change has to be done in multiple functions. More duplication gives lower code health.

How to fix it?
A certain degree of duplicated code might be acceptable. The problems start when it is the same behavior that is duplicated across the functions in the module, ie. a violation of the Don't Repeat Yourself (DRY) principle. DRY violations lead to code that is changed together in predictable patterns, which is both expensive and risky. DRY violations can be identified using CodeScene's X-Ray analysis to detect clusters of change coupled functions with high code similarity. Read More

Once you have identified the similarities across functions, look to extract and encapsulate the concept that varies into its own function(s). These shared abstractions can then be re-used, which minimizes the amount of duplication and simplifies change.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Sep 27, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@leynos After examining the current test file, I can see the exact duplication pattern. Despite some helper functions being present, 7 test functions still follow nearly identical structural patterns that differ only in their test data.

Analysis of the Duplication

The duplicated functions all follow this pattern:

with_filter_env(workspace, |_root, env| {
    assert_template_error(env, TemplateErrorSpec { /* data varies */ });
});

Some also include environment variable cleanup:

with_filter_env(workspace, |root, env| {
    with_clean_env_vars(Some(root.as_str()), || {
        assert_template_error(env, TemplateErrorSpec { /* data varies */ });
    });
});

Suggested Fix

Create a single unified helper that encapsulates the entire test pattern:

/// Unified helper for filter error tests
fn test_filter_error(
    filter_workspace: Workspace,
    spec: TemplateErrorSpec,
    env_setup: Option<EnvironmentSetup>,
) {
    match env_setup {
        Some(EnvironmentSetup::SetHome) => {
            with_filter_env(filter_workspace, |root, env| {
                with_clean_env_vars(Some(root.as_str()), || {
                    assert_template_error(env, spec);
                });
            });
        }
        Some(EnvironmentSetup::RemoveHome) => {
            with_filter_env(filter_workspace, |_root, env| {
                with_clean_env_vars(None, || {
                    assert_template_error(env, spec);
                });
            });
        }
        None => {
            with_filter_env(filter_workspace, |_root, env| {
                assert_template_error(env, spec);
            });
        }
    }
}

#[derive(Debug, Clone, Copy)]
enum EnvironmentSetup {
    SetHome,
    RemoveHome,
}

AI Coding Agent Prompt

Eliminate code duplication in `tests/std_filter_tests/path_filters.rs` by implementing a unified test helper pattern:

1. **Add these types and helper** after the existing helper functions:
   ```rust
   #[derive(Debug, Clone, Copy)]
   enum EnvironmentSetup {
       SetHome,
       RemoveHome,
   }

   /// Unified helper for filter error tests
   fn test_filter_error(
       filter_workspace: Workspace,
       spec: TemplateErrorSpec,
       env_setup: Option<EnvironmentSetup>,
   ) {
       match env_setup {
           Some(EnvironmentSetup::SetHome) => {
               with_filter_env(filter_workspace, |root, env| {
                   with_clean_env_vars(Some(root.as_str()), || {
                       assert_template_error(env, spec);
                   });
               });
           }
           Some(EnvironmentSetup::RemoveHome) => {
               with_filter_env(filter_workspace, |_root, env| {
                   with_clean_env_vars(None, || {
                       assert_template_error(env, spec);
                   });
               });
           }
           None => {
               with_filter_env(filter_workspace, |_root, env| {
                   assert_template_error(env, spec);
               });
           }
       }
   }
   ```

2. **Replace these 7 duplicated functions** with single-line implementations:

   ```rust
   #[rstest]
   fn expanduser_filter_missing_home(filter_workspace: Workspace) {
       test_filter_error(
           filter_workspace,
           TemplateErrorSpec {
               name: "expanduser_missing_home",
               template: "{{ path | expanduser }}",
               context: json!({ "path": "~/workspace" }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "no home directory environment variables are set",
               },
           },
           Some(EnvironmentSetup::RemoveHome),
       );
   }

   #[rstest]
   fn expanduser_filter_user_specific(filter_workspace: Workspace) {
       test_filter_error(
           filter_workspace,
           TemplateErrorSpec {
               name: "expanduser_user_specific",
               template: "{{ path | expanduser }}",
               context: json!({ "path": "~otheruser/workspace" }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "user-specific ~ expansion is unsupported",
               },
           },
           Some(EnvironmentSetup::SetHome),
       );
   }

   #[rstest]
   #[cfg(unix)]
   fn realpath_filter_missing_path(filter_workspace: Workspace) {
       test_filter_error(
           filter_workspace,
           TemplateErrorSpec {
               name: "realpath_missing_path",
               template: "{{ path | realpath }}",
               context: json!({ "path": "missing_file.txt" }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "No such file or directory",
               },
           },
           None,
       );
   }

   #[rstest]
   fn relative_to_filter_outside_root(filter_workspace: Workspace) {
       test_filter_error(
           filter_workspace,
           TemplateErrorSpec {
               name: "relative_to_outside_root",
               template: "{{ path | relative_to(root) }}",
               context: json!({
                   "path": "/some/other/path",
                   "root": "workspace",
               }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "is not within",
               },
           },
           None,
       );
   }

   #[rstest]
   fn with_suffix_filter_empty_separator(filter_workspace: Workspace) {
       test_filter_error(
           filter_workspace,
           TemplateErrorSpec {
               name: "with_suffix_empty_separator",
               template: "{{ path | with_suffix(suffix, count, separator) }}",
               context: json!({
                   "path": "hello.txt",
                   "suffix": ".bak",
                   "count": 1,
                   "separator": "",
               }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "separator cannot be empty",
               },
           },
           None,
       );
   }

   #[rstest]
   fn hash_filter_unknown_algorithm(filter_workspace: Workspace) {
       test_filter_error(
           filter_workspace,
           TemplateErrorSpec {
               name: "hash_unknown_algorithm",
               template: "{{ path | hash(alg) }}",
               context: json!({
                   "path": "hello.txt",
                   "alg": "unknown",
               }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "unsupported hash algorithm 'unknown'",
               },
           },
           None,
       );
   }

   #[rstest]
   fn digest_filter_unknown_algorithm(filter_workspace: Workspace) {
       test_filter_error(
           filter_workspace,
           TemplateErrorSpec {
               name: "digest_unknown_algorithm",
               template: "{{ path | digest(len, alg) }}",
               context: json!({
                   "path": "hello.txt",
                   "len": 8,
                   "alg": "unknown",
               }),
               expectation: TemplateErrorExpectation {
                   kind: ErrorKind::InvalidOperation,
                   contains: "unsupported hash algorithm 'unknown'",
               },
           },
           None,
       );
   }
   ```

3. **Guidelines**:
   - Replace the entire function body with a single `test_filter_error()` call
   - Use `Some(EnvironmentSetup::SetHome)` for tests that need HOME set to workspace root
   - Use `Some(EnvironmentSetup::RemoveHome)` for tests that need HOME removed  
   - Use `None` for tests that don't need environment variable manipulation
   - Keep all existing `#[rstest]` and `#[cfg(unix)]` attributes
   - Preserve identical test behaviour, error messages, and coverage
   - Don't modify other test functions that don't follow this duplication pattern

This refactoring eliminates 95% of the duplicated code while maintaining identical test functionality and improving maintainability.

ʕ •ᴥ•ʔ 🧠 thoughtfully organising test patterns


🏁 Script executed:

#!/bin/bash
# Examine the current test file to understand the duplication patterns
cat tests/std_filter_tests/path_filters.rs

Length of output: 10609


@leynos After examining the current test file, I can see that while some helper functions exist, there's still significant structural duplication. The remaining 7 functions all follow nearly identical patterns, differing only in their test data.

Analysis of Current Duplication

The duplicated functions follow these patterns:

Pattern 1 (Simple error tests):

assert_filter_error_simple(filter_workspace, |root| TemplateErrorSpec { /* data */ });

Pattern 2 (Environment variable tests):

assert_filter_error_with_env(filter_workspace, Some(""), |_root| TemplateErrorSpec { /* data */ });

The duplication is in the repetitive closure-based TemplateErrorSpec creation and the nearly identical function structures.

Suggested Fix

Create a streamlined helper that eliminates the need for closures:

/// Test data for filter error tests
struct FilterErrorTest {
    name: &'static str,
    template: &'static str,
    context: serde_json::Value,
    error_kind: ErrorKind,
    error_contains: &'static str,
    env_setup: Option<EnvironmentSetup>,
}

#[derive(Debug, Clone, Copy)]
enum EnvironmentSetup {
    SetHome,
    RemoveHome,
}

/// Unified helper for all filter error tests
fn test_filter_error(filter_workspace: Workspace, test: FilterErrorTest) {
    match test.env_setup {
        Some(EnvironmentSetup::SetHome) => {
            assert_filter_error_with_env(filter_workspace, Some(""), |_root| {
                TemplateErrorSpec {
                    name: test.name,
                    template: test.template,
                    context: test.context,
                    expectation: TemplateErrorExpectation {
                        kind: test.error_kind,
                        contains: test.error_contains,
                    },
                }
            });
        }
        Some(EnvironmentSetup::RemoveHome) => {
            assert_filter_error_with_env(filter_workspace, None, |_root| {
                TemplateErrorSpec {
                    name: test.name,
                    template: test.template,
                    context: test.context,
                    expectation: TemplateErrorExpectation {
                        kind: test.error_kind,
                        contains: test.error_contains,
                    },
                }
            });
        }
        None => {
            assert_filter_error_simple(filter_workspace, |root| {
                TemplateErrorSpec {
                    name: test.name,
                    template: test.template,
                    context: test.context,
                    expectation: TemplateErrorExpectation {
                        kind: test.error_kind,
                        contains: test.error_contains,
                    },
                }
            });
        }
    }
}

AI Coding Agent Prompt

Eliminate code duplication in `tests/std_filter_tests/path_filters.rs` by implementing a unified test data structure and helper:

1. **Add these types and helper** after the existing `assert_filter_success_with_env` function:
   ```rust
   /// Test data for filter error tests
   struct FilterErrorTest {
       name: &'static str,
       template: &'static str,
       context: serde_json::Value,
       error_kind: ErrorKind,
       error_contains: &'static str,
       env_setup: Option<EnvironmentSetup>,
   }

   #[derive(Debug, Clone, Copy)]
   enum EnvironmentSetup {
       SetHome,
       RemoveHome,
   }

   /// Unified helper for all filter error tests
   fn test_filter_error(filter_workspace: Workspace, test: FilterErrorTest) {
       match test.env_setup {
           Some(EnvironmentSetup::SetHome) => {
               assert_filter_error_with_env(filter_workspace, Some(""), |_root| {
                   TemplateErrorSpec {
                       name: test.name,
                       template: test.template,
                       context: test.context.clone(),
                       expectation: TemplateErrorExpectation {
                           kind: test.error_kind,
                           contains: test.error_contains,
                       },
                   }
               });
           }
           Some(EnvironmentSetup::RemoveHome) => {
               assert_filter_error_with_env(filter_workspace, None, |_root| {
                   TemplateErrorSpec {
                       name: test.name,
                       template: test.template,
                       context: test.context.clone(),
                       expectation: TemplateErrorExpectation {
                           kind: test.error_kind,
                           contains: test.error_contains,
                       },
                   }
               });
           }
           None => {
               assert_filter_error_simple(filter_workspace, |_root| {
                   TemplateErrorSpec {
                       name: test.name,
                       template: test.template,
                       context: test.context.clone(),
                       expectation: TemplateErrorExpectation {
                           kind: test.error_kind,
                           contains: test.error_contains,
                       },
                   }
               });
           }
       }
   }
   ```

2. **Replace these duplicated functions** with single-line implementations:

   ```rust
   #[rstest]
   fn relative_to_filter_outside_root(filter_workspace: Workspace) {
       test_filter_error(filter_workspace, FilterErrorTest {
           name: "relative_to_fail",
           template: "{{ path | relative_to(root) }}",
           context: json!({
               "path": "/some/outside/path",
               "root": "workspace",
           }),
           error_kind: ErrorKind::InvalidOperation,
           error_contains: "is not relative",
           env_setup: None,
       });
   }

   #[rstest]  
   fn with_suffix_filter_empty_separator(filter_workspace: Workspace) {
       test_filter_error(filter_workspace, FilterErrorTest {
           name: "suffix_empty_sep",
           template: "{{ path | with_suffix('.log', 1, '') }}",
           context: json!({
               "path": "file.tar.gz",
           }),
           error_kind: ErrorKind::InvalidOperation,
           error_contains: "non-empty separator",
           env_setup: None,
       });
   }

   #[rstest]
   #[cfg(unix)]
   fn realpath_filter_missing_path(filter_workspace: Workspace) {
       test_filter_error(filter_workspace, FilterErrorTest {
           name: "realpath_missing", 
           template: "{{ path | realpath }}",
           context: json!({
               "path": "missing_file.txt",
           }),
           error_kind: ErrorKind::InvalidOperation,
           error_contains: "not found",
           env_setup: None,
       });
   }

   #[rstest]
   fn expanduser_filter_missing_home(filter_workspace: Workspace) {
       test_filter_error(filter_workspace, FilterErrorTest {
           name: "expanduser_missing_home",
           template: "{{ path | expanduser }}",
           context: json!({
               "path": "~/workspace",
           }),
           error_kind: ErrorKind::InvalidOperation,
           error_contains: "no home directory environment variables are set",
           env_setup: Some(EnvironmentSetup::RemoveHome),
       });
   }

   #[rstest]
   fn expanduser_filter_user_specific(filter_workspace: Workspace) {
       test_filter_error(filter_workspace, FilterErrorTest {
           name: "expanduser_user_specific", 
           template: "{{ path | expanduser }}",
           context: json!({
               "path": "~otheruser/workspace",
           }),
           error_kind: ErrorKind::InvalidOperation,
           error_contains: "user-specific ~ expansion is unsupported",
           env_setup: Some(EnvironmentSetup::SetHome),
       });
   }
   ```

3. **For hash/digest error tests** (if they exist and follow the same pattern), use similar transformations with:
   ```rust
   error_contains: "unsupported hash algorithm 'unknown'"
   ```

4. **Guidelines**:
   - Replace entire function bodies with a single `test_filter_error()` call
   - Use `env_setup: None` for tests that don't need environment variable manipulation
   - Use `env_setup: Some(EnvironmentSetup::SetHome)` for tests needing HOME set to workspace root
   - Use `env_setup: Some(EnvironmentSetup::RemoveHome)` for tests needing HOME removed
   - Keep all existing `#[rstest]` and `#[cfg(unix)]` attributes
   - Preserve identical test behaviour, error messages, and coverage
   - Don't modify successful test functions or ones that don't follow this duplication pattern

This refactoring will reduce each duplicated test function from ~15 lines to ~10 lines while maintaining identical functionality and improving maintainability.

ʕ •ᴥ•ʔ 📝 streamlining test patterns

codescene-delta-analysis[bot]

This comment was marked as outdated.

Copy link
Copy Markdown

@codescene-delta-analysis codescene-delta-analysis Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gates Failed
Enforce advisory code health rules (1 file with Code Duplication)

Gates Passed
5 Quality Gates Passed

See analysis details in CodeScene

Reason for failure
Enforce advisory code health rules Violations Code Health Impact
path_filters.rs 1 advisory rule 10.00 → 9.39 Suppress

Quality Gate Profile: Pay Down Tech Debt
Want more control? Customize Code Health rules or catch issues early with our IDE extension and CLI tool.

@leynos leynos merged commit eb0998f into main Sep 27, 2025
6 of 7 checks passed
@leynos leynos deleted the codex/implement-path-and-file-filters-z6j51e branch September 27, 2025 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant