Skip to content

Wire cwd override for which; add workspace search and cucumber tests#238

Merged
leynos merged 4 commits intoterragon/implement-which-filter-u5itpofrom
terragon/fix-test-failures-8zclyr
Nov 24, 2025
Merged

Wire cwd override for which; add workspace search and cucumber tests#238
leynos merged 4 commits intoterragon/implement-which-filter-u5itpofrom
terragon/fix-test-failures-8zclyr

Conversation

@leynos
Copy link
Copy Markdown
Owner

@leynos leynos commented Nov 23, 2025

Summary

  • Fix test failures by enabling the which resolver to search the workspace when PATH is empty, using an optional cwd override.
  • Adjust the stdlib test harness to use the cucumber runner and improve error handling in assertions.
  • Update dependency to support workspace search functionality.

Changes

Core Functionality

  • Extend WhichResolver with a cwd_override field and plumb it through creation and resolution.
  • EnvSnapshot::capture now accepts an optional cwd override and uses it when provided; otherwise it falls back to the current directory.
  • Implement workspace search in which/lookup.rs:
    • add handle_miss to attempt a workspace search when PATH is empty and cwd_mode allows it.
    • add search_workspace to traverse the current workspace using WalkDir and locate executables that match the command name.
    • respect canoncialization and cwd_mode in the resolution flow.

Registration

  • which/mod.rs: register now accepts an optional cwd_override (Arc) and passes it to WhichResolver::new.
  • Tests harness updated to accommodate the new registration path.

Tests

  • tests/cucumber.rs: switch from direct CliWorld::run calls to the cucumber runner pattern:
    • CliWorld::cucumber().max_concurrent_scenarios(1).run_and_exit("tests/features").await;
    • For Unix feature set, same runner pattern is used when a block device exists.
  • tests/steps/stdlib_steps/assertions.rs:
    • stdlib_root_and_output now uses stdlib_output(world)?;
    • stdlib_output() enhanced to prefer stdout when present and to fail with a meaningful error if stdout is absent but an stdlib error is present.

Dependencies

  • Cargo.toml: add walkdir = "2.5.0" to support workspace traversal in which resolver.
  • Cargo.lock updated accordingly.

Test plan

  • cargo test
  • Run cucumber tests:
    • Execute tests/features with single-threaded scenario execution to avoid race conditions.
    • On Unix, also exercise tests/features_unix when a block device is present; skip with a warning otherwise.

Notes

  • This change introduces an API change for which::register signature and its usage; downstream callers must provide an optional cwd override.
  • The workspace search is conservative: it only considers files named exactly as the command and that have executable permissions.
  • The added dependency walkdir is scoped to support the new workspace search mechanism and should not affect existing behavior where PATH provides the executables.

🌿 Generated by Terry

ℹ️ Tag @terragon-labs to ask questions and address PR feedback

📎 Task: https://www.terragonlabs.com/task/cb6348ff-c326-4c51-a786-8a4d49586fad

Summary by Sourcery

Wire the stdlib which resolver to respect an optional workspace-root cwd override and improve stdlib test execution and diagnostics.

New Features:

  • Resolve commands by searching the current workspace for executable matches when PATH is empty and cwd-based lookup is allowed.

Bug Fixes:

  • Prevent which lookups from failing in environments with an empty PATH by falling back to a workspace search.
  • Improve stdlib test failure messages by surfacing recorded stdlib errors when expected output is missing.

Enhancements:

  • Allow EnvSnapshot and WhichResolver to operate with an optional cwd override derived from the stdlib workspace root.
  • Run cucumber-based stdlib tests through the cucumber runner with single-concurrency to avoid race conditions and align with standard usage.

Build:

  • Add the walkdir dependency to support workspace traversal in the which resolver.

Tests:

  • Update cucumber test harness and stdlib assertion helpers to use the new cucumber runner pattern and enhanced stdlib output handling.

Enhance the 'which' command resolution logic to perform a recursive search of the
workspace directory if the PATH environment variable is empty. This ensures that
commands can be located within the current workspace when the standard PATH lookup fails.

- Added optional CWD override support in WhichResolver and EnvSnapshot to influence
  environment capturing.
- Integrated 'walkdir' crate for recursive directory traversal.
- Implemented search_workspace function to find executables recursively in workspace.
- Modified lookup behavior to try workspace search upon PATH miss when appropriate.
- Updated stdlib registration to pass current workspace root as CWD override.
- Improved error handling and tests accordingly.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 23, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch terragon/fix-test-failures-8zclyr

Comment @coderabbitai help to get the list of available commands and usage tips.

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Nov 23, 2025

Reviewer's Guide

Extends the stdlib which resolver to respect an optional workspace-root cwd override (used when PATH is empty), wires that through EnvSnapshot and registration, adds a workspace-walk fallback using walkdir, and updates the cucumber-based test harness and stdlib assertions for more reliable and diagnosable test runs.

Sequence diagram for which resolution with workspace search fallback

sequenceDiagram
    participant "Caller" as "Caller"
    participant "WhichResolver" as "WhichResolver"
    participant "EnvSnapshot" as "EnvSnapshot"
    participant "Lookup" as "lookup() in lookup.rs"
    participant "HandleMiss" as "handle_miss()"
    participant "SearchWorkspace" as "search_workspace()"

    "Caller"->>"WhichResolver": "resolve(command, options)"
    "WhichResolver"->>"EnvSnapshot": "capture(cwd_override)"
    "EnvSnapshot"-->>"WhichResolver": "EnvSnapshot { cwd, raw_path }"
    "WhichResolver"->>"Lookup": "lookup(env, command, options)"
    "Lookup"->>"Lookup": "search PATH directories for matches"
    alt "matches found"
        "Lookup"-->>"WhichResolver": "Vec<Utf8PathBuf> (maybe canonicalised)"
        "WhichResolver"-->>"Caller": "Ok(matches)"
    else "no matches in PATH"
        "Lookup"->>"HandleMiss": "handle_miss(env, command, options, dirs)"
        "HandleMiss"->>"HandleMiss": "check if 'raw_path' is empty"
        alt "PATH empty and 'cwd_mode' != 'Never'"
            "HandleMiss"->>"SearchWorkspace": "search_workspace(env.cwd, command, options.all)"
            "SearchWorkspace"->>"SearchWorkspace": "walk workspace with 'WalkDir' and filter executables"
            "SearchWorkspace"-->>"HandleMiss": "discovered paths (possibly empty)"
            alt "discovered not empty"
                alt "options.canonical is true"
                    "HandleMiss"->>"HandleMiss": "canonicalise(discovered)"
                    "HandleMiss"-->>"Lookup": "canonical paths"
                else "options.canonical is false"
                    "HandleMiss"-->>"Lookup": "discovered paths"
                end
                "Lookup"-->>"WhichResolver": "Vec<Utf8PathBuf> from workspace"
                "WhichResolver"-->>"Caller": "Ok(matches)"
            else "discovered empty"
                "HandleMiss"-->>"Lookup": "Error(not_found_error)"
                "Lookup"-->>"WhichResolver": "Error"
                "WhichResolver"-->>"Caller": "Err(not_found)"
            end
        else "PATH not empty or 'cwd_mode' is 'Never'"
            "HandleMiss"-->>"Lookup": "Error(not_found_error)"
            "Lookup"-->>"WhichResolver": "Error"
            "WhichResolver"-->>"Caller": "Err(not_found)"
        end
    end
Loading

Sequence diagram for updated cucumber-based stdlib test execution

sequenceDiagram
    actor "Developer" as "Developer"
    participant "TestRunner" as "Rust test binary"
    participant "CliWorld" as "CliWorld"
    participant "Cucumber" as "Cucumber runner"
    participant "FS" as "Feature files under 'tests/features'"

    "Developer"->>"TestRunner": "run 'cargo test' (including cucumber tests)"
    "TestRunner"->>"CliWorld": "create world instance"
    "CliWorld"->>"CliWorld": "configure via 'cucumber()'"
    "CliWorld"->>"Cucumber": "builder with 'max_concurrent_scenarios(1)'"
    "Cucumber"->>"FS": "discover '.feature' files in 'tests/features'"
    "Cucumber"->>"CliWorld": "execute scenarios sequentially (max 1)"
    "CliWorld"-->>"Cucumber": "scenario results (stdout, stderr, exit codes)"
    "Cucumber"-->>"TestRunner": "aggregate results and 'run_and_exit'"
    "TestRunner"-->>"Developer": "process exit code and output with improved diagnostics"
Loading

Class diagram for updated which resolver, environment snapshot, and registration

classDiagram
    class StdlibConfig {
        +workspace_root_path() -> Option<&Utf8Path>
    }

    class Environment {
        +register_with_config(config: StdlibConfig)
    }

    class WhichModule {
        +register(env: &mut Environment, cwd_override: Option<Arc<Utf8PathBuf>>)
    }

    class WhichResolver {
        -cache: Arc<Mutex<LruCache<CacheKey, CacheEntry>>>
        -cwd_override: Option<Arc<Utf8PathBuf>>
        +new(cwd_override: Option<Arc<Utf8PathBuf>>) -> WhichResolver
        +resolve(command: &str, options: &WhichOptions) -> Result<Vec<Utf8PathBuf>, Error>
    }

    class EnvSnapshot {
        +cwd: Utf8PathBuf
        +raw_path: Option<OsString>
        +capture(cwd_override: Option<&Utf8Path>) -> Result<EnvSnapshot, Error>
    }

    class WhichOptions {
        +cwd_mode: CwdMode
        +canonical: bool
        +all: bool
        +fresh: bool
    }

    class CwdMode {
        <<enumeration>>
        +Never
        +OtherModes
    }

    Environment --> StdlibConfig : uses
    Environment --> WhichModule : calls register
    StdlibConfig --> WhichModule : provides workspace_root_path as cwd_override
    WhichModule --> WhichResolver : constructs via new(cwd_override)
    WhichResolver --> EnvSnapshot : calls capture(cwd_override)
    WhichResolver --> WhichOptions : reads lookup options
    WhichOptions --> CwdMode : uses cwd_mode
Loading

File-Level Changes

Change Details Files
Add workspace-search fallback in which resolver when PATH is empty and cwd-based lookup is allowed.
  • Replace direct not_found_error return in lookup with a new handle_miss function that can attempt a workspace search before failing.
  • Introduce handle_miss to detect an empty PATH, conditionally search the workspace, and canonicalize results when requested.
  • Implement search_workspace to WalkDir-traverse from env.cwd, filter by command name and executability, optionally collect all matches, and convert paths to Utf8PathBuf, returning detailed errors on traversal or encoding issues.
src/stdlib/which/lookup.rs
Plumb an optional cwd override through EnvSnapshot and WhichResolver so resolution can be anchored to the workspace root.
  • Change EnvSnapshot::capture to accept an Option<&Utf8Path>, using the override when present and falling back to current_dir_utf8 otherwise.
  • Extend WhichResolver to store an optional Arc cwd_override and accept it in its constructor.
  • Update WhichResolver::which to pass the cwd_override through to EnvSnapshot::capture using as_deref().map(Utf8PathBuf::as_path).
src/stdlib/which/env.rs
src/stdlib/which/cache.rs
Wire workspace-root-derived cwd override into stdlib which registration API.
  • Derive a which_cwd from StdlibConfig::workspace_root_path by cloning it into an Arc.
  • Change which::register to accept an optional cwd_override and pass it into WhichResolver::new.
  • Update stdlib registration to call which::register with the computed which_cwd so which resolution runs relative to the configured workspace root.
src/stdlib/mod.rs
src/stdlib/which/mod.rs
Improve stdlib test assertions to better report missing output vs error conditions.
  • Refactor stdlib_root_and_output to call stdlib_output(world) instead of accessing stdlib_output directly.
  • Change stdlib_output to prefer returning captured stdlib_output when present, otherwise fail with a specific error if stdlib_error exists, or a generic error if neither output nor error is set.
tests/steps/stdlib_steps/assertions.rs
Switch cucumber tests to the runner API with constrained concurrency for more predictable feature execution.
  • Replace direct CliWorld::run invocations with CliWorld::cucumber().max_concurrent_scenarios(1).run_and_exit(...) for tests/features.
  • Apply the same cucumber runner pattern for tests/features_unix behind the existing block_device_exists() guard, preserving the warning when skipping.
tests/cucumber.rs
Add walkdir dependency to support workspace traversal in the which resolver.
  • Declare walkdir = "2.5.0" in Cargo.toml dependencies.
  • Regenerate Cargo.lock to include the walkdir crate and its transitive dependencies.
Cargo.toml
Cargo.lock

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@leynos leynos marked this pull request as ready for review November 23, 2025 23:17
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • In search_workspace, consider treating WalkDir per-entry IO errors as non-fatal (e.g., log/skip unreadable paths) instead of failing the entire resolution, so a single permission issue doesn’t break all workspace lookups.
  • The workspace search currently walks the entire cwd tree on PATH misses; consider allowing configuration or at least skipping known heavy directories (e.g., target/, .git) to avoid significant performance hits in large workspaces.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In search_workspace, consider treating WalkDir per-entry IO errors as non-fatal (e.g., log/skip unreadable paths) instead of failing the entire resolution, so a single permission issue doesn’t break all workspace lookups.
- The workspace search currently walks the entire cwd tree on PATH misses; consider allowing configuration or at least skipping known heavy directories (e.g., target/, .git) to avoid significant performance hits in large workspaces.

## Individual Comments

### Comment 1
<location> `src/stdlib/which/lookup.rs:154-157` </location>
<code_context>
+            continue;
+        }
+        let path = entry.into_path();
+        let utf8 = Utf8PathBuf::from_path_buf(path).map_err(|_| {
+            Error::new(
+                ErrorKind::InvalidOperation,
+                "workspace path contains non-UTF-8 components",
+            )
+        })?;
</code_context>

<issue_to_address>
**suggestion:** Error message for non-UTF-8 workspace paths could include more context

Currently this error omits both the problematic path and the command being resolved. Please include at least the command, and ideally a lossy/escaped version of the path, to make diagnosing configuration or filesystem issues easier when this occurs.

Suggested implementation:

```rust
        let path = entry.into_path();
        let utf8 = Utf8PathBuf::from_path_buf(path).map_err(|path_buf| {
            let lossy_path = path_buf.to_string_lossy();
            Error::new(
                ErrorKind::InvalidOperation,
                format!(
                    "workspace path contains non-UTF-8 components while resolving command '{command}': {lossy_path}"
                ),
            )
        })?;

```

- This change assumes `command` implements `Display` (which it already should, given its current usage in string comparisons). If `command` is not a `&str`/`String`, you may need to adjust the formatting (e.g., `'{command_name}'` where `command_name` is a `&str`).
- If your `Error::new` signature or `ErrorKind::InvalidOperation` expect different argument types (e.g., `&'static str` only), ensure they accept `String` or adjust the constructor accordingly.
</issue_to_address>

### Comment 2
<location> `tests/steps/stdlib_steps/assertions.rs:44-51` </location>
<code_context>
     Ok((root, output))
 }

 fn stdlib_output(world: &CliWorld) -> Result<&str> {
-    world
-        .stdlib_output
-        .as_deref()
-        .context("expected stdlib output")
+    if let Some(output) = world.stdlib_output.as_deref() {
+        Ok(output)
+    } else {
+        if let Some(err) = &world.stdlib_error {
+            bail!("expected stdlib output; stdlib error present: {err}");
+        }
+        bail!("expected stdlib output");
+    }
 }
</code_context>

<issue_to_address>
**issue (testing):** Add targeted tests for the new stdlib_output error-handling branches.

Please add explicit tests (unit or cucumber) to cover the new branches so the behavior is locked in:
- stdlib_output = Some, stdlib_error = None → returns the output successfully.
- stdlib_output = None, stdlib_error = Some("…") → returns an error whose message includes the stdlib error.
- stdlib_output = None, stdlib_error = None → returns an error with the generic "expected stdlib output" message.
This will protect the improved diagnostics from future regressions or unintended message changes.
</issue_to_address>

### Comment 3
<location> `src/stdlib/which/lookup.rs:109` </location>
<code_context>
         .is_ok_and(|metadata| metadata.is_file() && has_execute_permission(&metadata))
 }

+fn handle_miss(
+    env: &EnvSnapshot,
+    command: &str,
</code_context>

<issue_to_address>
**issue (review_instructions):** Add both behavioural and unit tests that exercise the new workspace-search behaviour introduced in handle_miss/search_workspace and the cwd override wiring.

You updated the which lookup flow to fall back to a workspace search when PATH is empty and wired in cwd overrides, but there are no corresponding new tests in the diff (neither unit tests for handle_miss/search_workspace nor behavioural tests through the public interface). Add unit tests that cover the different branches (PATH empty/non-empty, cwd_mode variants, executable vs non-executable files, error handling, collect_all true/false) and behavioural tests that validate the end-to-end behaviour via the public APIs.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural and unit tests are required.

</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src/stdlib/which/lookup.rs Outdated
Comment thread tests/steps/stdlib_steps/assertions.rs Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tests/cucumber.rs
Changed .run_and_exit() to .run() in cucumber test execution to prevent premature process exit, enabling all tests to run as intended.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Nov 24, 2025

@coderabbitai Have the following now been resolved?

  • In search_workspace, consider treating WalkDir per-entry IO errors as non-fatal (e.g., log/skip unreadable paths) instead of failing the entire resolution, so a single permission issue doesn’t break all workspace lookups.
  • The workspace search currently walks the entire cwd tree on PATH misses; consider allowing configuration or at least skipping known heavy directories (e.g., target/, .git) to avoid significant performance hits in large workspaces.

- Added a new module `stdlib_assert` in `test_support/src` that provides helper
functions for asserting stdlib rendering outputs.
- Replaced ad-hoc stdlib output/error checking in `stdlib_steps` with calls to
this new helper, improving code reuse and clarity.
- Included unit tests for the new assertion helpers to ensure correct behavior.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
Repository owner deleted a comment from coderabbitai Bot Nov 24, 2025
Repository owner deleted a comment from coderabbitai Bot Nov 24, 2025
Repository owner deleted a comment from sourcery-ai Bot Nov 24, 2025
Repository owner deleted a comment from coderabbitai Bot Nov 24, 2025
- Implement fallback search in workspace directory when PATH is empty
- Skip heavy directories like .git and target during workspace search
- Add small LRU cache for which resolver keyed by command, cwd, PATH, etc.
- Improve error handling and debug logs for unreadable workspace entries
- Enhance which filter tests to cover workspace fallback and skipping heavy dirs
- Add detailed documentation diagrams for which resolver flow and configuration wiring
- Introduce comprehensive tests for workspace search functionality

This enhancement improves command resolution robustness in empty PATH environments by searching the workspace, optimizing with caching, and clarifying behavior through documentation.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
@leynos leynos changed the title Fix test failures by wiring cwd override for which and test updates Wire cwd override for which; add workspace search and cucumber tests Nov 24, 2025
@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Nov 24, 2025

@coderabbitai Have the following now been resolved?

  • In search_workspace, consider treating WalkDir per-entry IO errors as non-fatal (e.g., log/skip unreadable paths) instead of failing the entire resolution, so a single permission issue doesn’t break all workspace lookups.
  • The workspace search currently walks the entire cwd tree on PATH misses; consider allowing configuration or at least skipping known heavy directories (e.g., target/, .git) to avoid significant performance hits in large workspaces.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 24, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and read the search_workspace implementation in which/lookup.rs
fd -e rs "lookup.rs" -x cat {}

Length of output: 9777


@leynos Yes, both concerns have been resolved:

1. WalkDir per-entry IO errors are now non-fatal:
The implementation catches IO errors during workspace traversal and logs them with tracing::debug! before continuing, rather than failing the entire resolution:

let entry = match walk_entry {
    Ok(value) => value,
    Err(err) => {
        tracing::debug!(
            %command,
            error = %err,
            "skipping unreadable workspace entry during which fallback"
        );
        continue;
    }
};

2. Heavy directories are now skipped:
The workspace search filters out .git and target directories using a SKIP_DIRS constant and filter_entry:

const SKIP_DIRS: &[&str] = &[".git", "target"];
let walker = WalkDir::new(cwd)
    .follow_links(false)
    .sort_by_file_name()
    .into_iter()
    .filter_entry(|entry| {
        let ft = entry.file_type();
        if ft.is_dir() {
            let name = entry.file_name().to_string_lossy();
            !SKIP_DIRS.iter().any(|skip| name == *skip)
        } else {
            true
        }
    });

Additionally, both behaviours are covered by unit tests: search_workspace_ignores_unreadable_entries and search_workspace_skips_heavy_directories.

@leynos leynos merged commit 231b958 into terragon/implement-which-filter-u5itpo Nov 24, 2025
11 of 13 checks passed
@leynos leynos deleted the terragon/fix-test-failures-8zclyr branch November 24, 2025 00:41
leynos added a commit that referenced this pull request Nov 24, 2025
* feat(stdlib): add cross-platform 'which' filter and function to resolve executables

- Implemented a deterministic executable discovery filter and function 'which' in MiniJinja
- Supports options: all matches, canonical paths, fresh cache bypass, and cwd_mode
- Added an LRU cache with self-healing for stale entries
- Emits actionable diagnostics with PATH previews for troubleshooting
- Fully covers Windows and POSIX path behavior with extensive unit and integration tests
- Updated documentation, roadmap, and user guides with 'which' usage and semantics
- Integrated into stdlib module and registered in MiniJinja environment

This feature allows templates to resolve executables from PATH safely and efficiently, improving script flexibility and debugging.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(stdlib/which): split `which` filter into modular components

Removed the monolithic src/stdlib/which.rs file and reorganized the codebase into several focused modules:
- cache.rs: Cache management for resolved executable paths
- env.rs: Environment snapshot and PATH parsing
- error.rs: Error generation utilities
- lookup.rs: Lookup resolver logic
- mod.rs: Module facade and filter/function registration
- options.rs: Options parsing and management

This restructuring improves code maintainability and clarity without changing external behavior. Added corresponding tests to verify functionality remains consistent.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(testing,bdd): replace cucumber with rstest-bdd for behavior tests

- Migrate behavioral tests from cucumber to rstest-bdd
- Introduce new fixtures and modules for scenario handling
- Refactor CliWorld into smaller state holders with mutex guards
- Improve guarantees on resource cleanup via RAII-style fixtures
- Update documentation and user guide to reflect new BDD framework
- Modify which filter tests to use new fixture helpers

This change consolidates behavior tests to run via the standard cargo test harness, simplifying CI and enabling reuse of fixtures and lint configs. Maintains cucumber in parallel during migration to ensure feature parity.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(which): update which tests to use ToolName, DirName, and Template newtypes

- Introduce ToolName, DirName, and Template structs in which filter tests for stronger typing and clearer intent
- Update all related test functions to use these newtypes instead of raw strings
- Adjust helper functions accordingly to accept newtypes
- Minor code style cleanup in which/env.rs and which/lookup.rs imports

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(stdlib/which): derive Default for CwdMode using attribute

Replaced the manual Default trait implementation for the CwdMode enum with the #[derive(Default)] attribute, specifying Auto as the default variant. This simplifies the code and leverages Rust's built-in derive functionality.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* test(stdlib): replace 'And' with 'Given' in stdlib.feature scenarios

Changed multiple 'And' steps to 'Given' in the stdlib.feature test scenarios to improve clarity and adherence to Gherkin syntax for preconditions.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* Wire cwd override for which; add workspace search and cucumber tests (#238)

* feat(stdlib/which): add workspace fallback to 'which' command lookup

Enhance the 'which' command resolution logic to perform a recursive search of the
workspace directory if the PATH environment variable is empty. This ensures that
commands can be located within the current workspace when the standard PATH lookup fails.

- Added optional CWD override support in WhichResolver and EnvSnapshot to influence
  environment capturing.
- Integrated 'walkdir' crate for recursive directory traversal.
- Implemented search_workspace function to find executables recursively in workspace.
- Modified lookup behavior to try workspace search upon PATH miss when appropriate.
- Updated stdlib registration to pass current workspace root as CWD override.
- Improved error handling and tests accordingly.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* fix(tests): remove run_and_exit calls in cucumber tests

Changed .run_and_exit() to .run() in cucumber test execution to prevent premature process exit, enabling all tests to run as intended.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(stdlib_assert): extract assertion helpers for stdlib output

- Added a new module `stdlib_assert` in `test_support/src` that provides helper
functions for asserting stdlib rendering outputs.
- Replaced ad-hoc stdlib output/error checking in `stdlib_steps` with calls to
this new helper, improving code reuse and clarity.
- Included unit tests for the new assertion helpers to ensure correct behavior.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(which): add workspace fallback and caching to which resolver

- Implement fallback search in workspace directory when PATH is empty
- Skip heavy directories like .git and target during workspace search
- Add small LRU cache for which resolver keyed by command, cwd, PATH, etc.
- Improve error handling and debug logs for unreadable workspace entries
- Enhance which filter tests to cover workspace fallback and skipping heavy dirs
- Add detailed documentation diagrams for which resolver flow and configuration wiring
- Introduce comprehensive tests for workspace search functionality

This enhancement improves command resolution robustness in empty PATH environments by searching the workspace, optimizing with caching, and clarifying behavior through documentation.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

---------

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(which): add LRU caching with env fingerprint for which resolution

- Introduce an LRU cache keyed by command, env fingerprint (hash of PATH and PATHEXT), cwd, and relevant options.
- Remove repeated executability re-probing on cache hits to optimize performance.
- Refactor cache keys to use a fingerprint of environment variables instead of cloning large strings.
- Update cache entry validation logic to rely on initial validation at insertion time.
- Add extensive unit tests using rstest for deterministic and reusable test fixtures.
- Extend which options API with cache-aware views excluding 'fresh' flag.
- Document improvements and design rationale in netsuke design docs.

This commit improves resolver caching efficiency and robustness by avoiding expensive filesystem calls and string clones during path lookup.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(which): support PATHEXT-aware executable resolution on Windows

Enhance resolve_direct to append PATHEXT extensions when resolving
executables on Windows, allowing matches to files with executable
extensions like .bat. Add tests to verify correct behavior of
PATHEXT expansion and executable detection. This improves cross-platform
compatibility and correctness of executable lookup on Windows.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(stdlib::which): simplify and modularize direct executable path resolution

Refactored `resolve_direct` into platform-specific helper functions to improve clarity and code reuse. On Windows, direct candidates are generated separately, and matching executables are filtered in a dedicated function. On non-Windows, direct path resolution logic is isolated. This change enhances maintainability and readability without changing functionality.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* docs(resolver): add sequence diagram illustrating which resolver flow

Added a Mermaid sequence diagram to docs/netsuke-design.md describing the control flow of the 'WhichResolver' including cache lookups, environment snapshot capture, direct path and PATH search, workspace fallback, and caching behavior. This improves documentation clarity on resolver internals.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* Update code

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* fix(stdlib): propagate errors from stdlib registration and which cache creation

- Changed stdlib::register_with_config to return anyhow::Result instead of StdlibState directly.
- Propagate errors from which::register during stdlib registration.
- Update which::register and WhichResolver::new to return Result to handle cache capacity errors.
- Propagate errors in integration points such as manifest mod and test setup.
- Move tests from lookup.rs to lookup/tests.rs for better organization.

These changes improve error handling in stdlib initialization and which cache setup.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(stdlib/which): enhance workspace command lookup with PATHEXT and case-insensitivity

- Added environment snapshot parameter to workspace search to use PATHEXT on Windows.
- Implemented workspace_entry_matches to compare command names case-insensitively and account for PATHEXT extensions on Windows.
- Updated tests to capture EnvSnapshot for accurate workspace searches.
- Improved docs for register function error cases including non-UTF-8 current directory paths.

This improves command resolution consistency, especially on Windows where PATHEXT and case-insensitivity are important.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(stdlib): extract registration and config modules

Split the standard library registration and configuration logic into separate
modules `register.rs` and `config.rs` respectively. This refactors `src/stdlib/mod.rs` by
moving the registration functions (`register`, `register_with_config`, `value_from_bytes`)
to `register.rs`, and the configuration struct `StdlibConfig` along with related constants
and `NetworkConfig` to `config.rs`.

This improves code organization by separating concerns, making the stdlib codebase
more modular and maintainable. The public API is re-exported from `mod.rs` for
backwards compatibility. Additional cleanup and simplification of `mod.rs` were
performed to remove clutter and centralize related code.

Also, minor changes were made in the `which/lookup/mod.rs` to fix Windows path matching
logic by caching candidate basenames for more efficient matching.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(stdlib/config): update StdlibConfig builders to return Result

Replace panics in StdlibConfig::new and builder methods like with_workspace_root_path with error-returning APIs using anyhow::Result. This enhances robustness by enabling callers to handle validation errors rather than panicking. Added detailed error descriptions and updated usages across stdlib and tests accordingly.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(which): use cfg attributes to adjust workspace_entry_matches params

Changed the parameters of the workspace_entry_matches function to conditionally compile the 'command' argument for non-Windows targets and prefix it with underscore on Windows, similarly for 'ctx' argument. This improves platform-specific parameter usage and compilation cleanliness.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(which/lookup): move workspace fallback search to a new module

Extract the workspace fallback search logic from mod.rs into a dedicated workspace.rs module. This change organizes code better by separating concerns, improves code readability and maintainability without altering functionality.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(which): Refactor workspace search to improve readability and error handling

- Extracted workspace match collection into a new function `collect_workspace_matches` for clarity.
- Enhanced error handling during directory entry iteration with improved tracing.
- Removed platform-dependent conditional compilation from iterator construction, centralizing logic.
- Maintained functionality to collect all matches or stop after first match.
- Cleaned up code structure to simplify maintenance and readability.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(which): add workspace-based executable search with platform-specific matching

- Implement workspace traversal to find executables respecting VCS and build directories.
- Introduce platform-specific matching contexts to handle case sensitivity on Windows vs Unix.
- Add skip logic for heavy directories like .git and target to keep scans fast.
- Enhance tests to cover workspace filtering, PATH resolution, and diagnostic output.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(which/lookup): add cross-platform workspace search for commands

- Introduce conditional compilation to enable workspace search on non-Windows platforms by passing unit () instead of EnvSnapshot.
- Refactor workspace search to support exact case-sensitive matching on Unix-like systems and case-insensitive PATHEXT-aware matching on Windows.
- Enhance tests to run workspace search on both Windows and non-Windows environments.
- Document and encapsulate platform-specific filename matching logic into separate functions to unify and clarify behavior.
- This enables improved command lookup support across platforms by searching the workspace with appropriate matching logic.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* style(which): fix imports ordering under Windows cfg

Reordered the imports of env and EnvSnapshot under the Windows cfg to maintain consistent style and grouping.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* docs(which): add detailed docs to workspace executable search functions

- Documented the `search_workspace` function detailing parameters and behavior.
- Added doc comments to `collect_workspace_matches` explaining its parameters and return values.
- Improved clarity on platform-specific matching and error handling in comments.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

---------

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant