Skip to content

Enforce workspace-scoped tempfile creation with UTF-8 paths#235

Merged
leynos merged 3 commits intocodex/enforce-command-output-size-limitsfrom
terragon/capability-aware-tempfile-fix-l4nz20
Nov 7, 2025
Merged

Enforce workspace-scoped tempfile creation with UTF-8 paths#235
leynos merged 3 commits intocodex/enforce-command-output-size-limitsfrom
terragon/capability-aware-tempfile-fix-l4nz20

Conversation

@leynos
Copy link
Copy Markdown
Owner

@leynos leynos commented Nov 7, 2025

Summary

  • Reworks fallback tempfile creation to ensure all temporary command files are created and scoped within the configured workspace root.
  • Improves naming safety and collision handling for temporary files.
  • Removes non-UTF-8 path handling for temporary files in the streaming path.

Changes

Core changes

  • Introduced MAX_COMMAND_TEMPFILE_ATTEMPTS and a global atomic counter for generating unique tempfile names.
  • Added next_tempfile_component(label) and sanitize_label(label) to produce safe, unique tempfile names inside the workspace.
  • Replaced NamedTempFile usage with cap_std::fs::Dir and cap_std::fs::File to strictly scope temp files to the workspace.
  • CommandTempFile now stores workspace root, relative/absolute paths, and a persisted flag.
  • CommandTempFile implements the Write trait to enable streaming data directly into the temporary file.
  • On success, the tempfile is marked as persisted; on drop, non-persisted files are cleaned up to prevent leakage.
  • create() now allocates a unique tempfile within the workspace, returning a CommandTempFile that is used for streaming.
  • read_pipe_tempfile() and create_empty_tempfile() write to the CommandTempFile and return an UTF-8 path via into_path().
  • Removed the old StreamPathNotUtf8 path handling from error paths.

API and runtime behavior

  • Temp files are created under the configured workspace root (via workspace_root and relative path), ensuring proper scoping.
  • New path handling returns UTF-8 paths only; non-UTF-8 path errors are removed from the streaming code path.
  • into_path() marks the tempfile as persisted and returns the absolute path to the caller.

Error handling

  • Updated error path for workspace root path validation in StdlibConfig to clearer wording: with_workspace_root_path requires an absolute path, got: {workspace_path}.
  • Removed StreamPathNotUtf8 error variant from CommandFailure and related handling.

Cleanup

  • Removed old tempfile conversion logic (into_file, into_temp_path) in favor of the unified into_path flow.

Why

  • Strengthens security and predictability by ensuring all temporary command files live inside the designated workspace, and avoids leaking or misplacing temp files.
  • Provides deterministic cleanup for aborted temporary files while persisting only when explicitly requested.

Tests plan

  • Build succeeds after refactor.
  • Temporary files are created under the configured workspace.
  • Aborted temporary files are cleaned up on drop.
  • Returned tempfile paths are UTF-8.
  • Error messages reflect updated workspace path expectations.

Notes

  • If using non-ASCII workspace names, they will be sanitized to alphanumeric, '-' and '_' characters; non-allowed characters become '-'.

🌿 Generated by Terry


ℹ️ Tag @terragon-labs to ask questions and address PR feedback

📎 Task: https://www.terragonlabs.com/task/e545d8c3-7fc7-40ae-88a1-e684d4d7991d

Summary by Sourcery

Scope all fallback temporary file creation within the configured workspace root, enforcing unique UTF-8-safe file names and deterministic cleanup.

New Features:

  • Introduce CommandTempFile with Write support and explicit persistence flag
  • Add next_tempfile_component and sanitize_label functions for safe, unique tempfile names

Enhancements:

  • Replace NamedTempFile with cap_std::fs::Dir and cap_std::fs::File to strictly scope temp files to the workspace
  • Limit tempfile name collisions with MAX_COMMAND_TEMPFILE_ATTEMPTS and a global atomic counter
  • Always return UTF-8 paths and remove non-UTF-8 streaming path handling

Chores:

  • Clarify workspace root path validation error message

…mp files

Reworked command temporary file management to use cap-std crate instead of tempfile.
Implemented a custom unique tempfile naming strategy with atomic counter.
Improved tempfile lifecycle handling and removal on drop when not persisted.
Standardized sanitation of tempfile labels for safer filenames.
Added robust error handling for tempfile creation attempts.
Smoothed tempfile usage in pipe reading and empty tempfile creation.
Changed workspace root path validation message for clarity.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Nov 7, 2025

Reviewer's Guide

This PR fully replaces arbitrary NamedTempFile usage with workspace-scoped temp files by leveraging cap_std directories and files, introduces deterministic unique naming via an atomic counter and sanitized labels, implements streaming and cleanup logic in CommandTempFile, and removes non-UTF-8 path handling throughout the temp file API.

Entity relationship diagram for workspace-scoped temp files

erDiagram
    WORKSPACE_ROOT ||--o{ TEMPFILE : contains
    TEMPFILE {
        relative_path Utf8PathBuf
        absolute_path Utf8PathBuf
        persisted bool
    }
    WORKSPACE_ROOT {
        path Utf8PathBuf
    }
Loading

Class diagram for updated CommandTempFile and related temp file management

classDiagram
    class CommandTempDir {
        +create(label: &str) CommandTempFile
        workspace_root: Arc<Dir>
        workspace_root_path: Option<Utf8PathBuf>
        relative: Utf8PathBuf
    }
    class CommandTempFile {
        +new(file: CapFile, workspace_root: Arc<Dir>, relative_path: Utf8PathBuf, absolute_path: Utf8PathBuf) CommandTempFile
        +into_path() Utf8PathBuf
        +file_mut() &mut CapFile
        +write(buf: &[u8]) io::Result<usize>
        +flush() io::Result<()>
        file: Option<CapFile>
        workspace_root: Arc<Dir>
        relative_path: Utf8PathBuf
        absolute_path: Utf8PathBuf
        persisted: bool
    }
    CommandTempDir --> CommandTempFile
    CommandTempFile ..> CapFile
    CommandTempFile ..> Dir
    CommandTempFile ..> Utf8PathBuf
    CommandTempDir ..> Dir
    CommandTempDir ..> Utf8PathBuf

    class StdlibConfig {
        +with_workspace_root_path(path: impl Into<Utf8PathBuf>) &mut Self
        workspace_root_path: Option<Utf8PathBuf>
    }
    StdlibConfig ..> Utf8PathBuf

    class PipeOutcome {
        Tempfile(Utf8PathBuf)
    }
    PipeOutcome ..> Utf8PathBuf
Loading

Flow diagram for workspace-scoped tempfile creation and cleanup

flowchart TD
    A["create(label)"] --> B["Sanitize label"]
    B --> C["Generate unique name with atomic counter"]
    C --> D["Create temp file in workspace root/relative path"]
    D --> E["Return CommandTempFile"]
    E --> F["Write/stream data into file"]
    F --> G["Call into_path() to persist"]
    G --> H["Return UTF-8 absolute path"]
    E --> I["Drop CommandTempFile (if not persisted)"]
    I --> J["Remove file from workspace directory"]
Loading

File-Level Changes

Change Details Files
Introduce deterministic, collision-resistant tempfile naming
  • Define MAX_COMMAND_TEMPFILE_ATTEMPTS and COMMAND_TEMPFILE_COUNTER for allocation retries
  • Implement next_tempfile_component() to combine a sanitized label with a hex counter suffix
  • Add sanitize_label() to normalize input labels to safe filesystem characters
src/stdlib/command.rs
Scope all tempfiles to the configured workspace using cap_std
  • Replace NamedTempFile and tempfile Builder calls with cap_std::fs::Dir open_dir and OpenOptions
  • Ensure create() only allocates files under workspace_root and absolute_path tracking
  • Store workspace root, relative_path, absolute_path, and persisted flag in CommandTempFile
src/stdlib/command.rs
Provide streaming write API and deterministic cleanup for aborted files
  • Implement Write trait on CommandTempFile to stream data directly
  • Add into_path() to mark files as persisted and return UTF-8 absolute paths
  • Implement Drop to remove non-persisted files from the workspace
src/stdlib/command.rs
Remove non-UTF-8 path handling and update error messaging
  • Eliminate StreamPathNotUtf8 variant and related error paths in CommandFailure
  • Update read_pipe_tempfile() and create_empty_tempfile() to use into_path() exclusively
  • Clarify with_workspace_root_path error text for absolute path validation
src/stdlib/command.rs
src/stdlib/mod.rs

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 7, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch terragon/capability-aware-tempfile-fix-l4nz20

Comment @coderabbitai help to get the list of available commands and usage tips.

@leynos leynos marked this pull request as ready for review November 7, 2025 03:01
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • next_tempfile_component currently panics if write_fmt fails—consider returning an error instead of panicking to avoid unexpected crashes.
  • The "unable to allocate unique command tempfile name" error could include the label or attempt count to make debugging name collisions easier.
  • Using a global atomic counter alone may collide with leftover files after a restart—consider adding a randomized or time‐based component to the filename to further reduce collision risk.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- next_tempfile_component currently panics if write_fmt fails—consider returning an error instead of panicking to avoid unexpected crashes.
- The "unable to allocate unique command tempfile name" error could include the label or attempt count to make debugging name collisions easier.
- Using a global atomic counter alone may collide with leftover files after a restart—consider adding a randomized or time‐based component to the filename to further reduce collision risk.

## Individual Comments

### Comment 1
<location> `src/stdlib/command.rs:93` </location>
<code_context>

+const MAX_COMMAND_TEMPFILE_ATTEMPTS: usize = 32;
+
+static COMMAND_TEMPFILE_COUNTER: AtomicU64 = AtomicU64::new(0);
+
 #[derive(Clone)]
</code_context>

<issue_to_address>
**issue (bug_risk):** Atomic counter may not guarantee uniqueness across multiple processes.

Process-local atomic counters can result in filename collisions when multiple processes operate in the same workspace. To mitigate this, add the process ID or a random value to the filename.
</issue_to_address>

### Comment 2
<location> `src/stdlib/command.rs:227` </location>
<code_context>
+        }
+    }
+
+    fn into_path(mut self) -> Utf8PathBuf {
+        self.persisted = true;
+        drop(self.file.take());
</code_context>

<issue_to_address>
**issue (bug_risk):** Dropping the file before returning the path may not guarantee data is flushed.

Ensure the file is explicitly flushed before returning the path, or enforce flushing within this method to guarantee all data is written.
</issue_to_address>

### Comment 3
<location> `src/stdlib/command.rs:137` </location>
<code_context>
     }

     fn create(&self, label: &str) -> io::Result<CommandTempFile> {
-        self.workspace_root.create_dir_all(&self.relative)?;
-        let mut builder = Builder::new();
</code_context>

<issue_to_address>
**issue (complexity):** Consider replacing the manual tempfile management logic with tempfile::Builder to simplify code and leverage automatic cleanup.

```markdown
You can collapse all of the manual open‐loop, counter, Drop impl, etc. into a few lines using `tempfile::Builder`.  That preserves automatic cleanup on Drop and still lets you control directory, prefix and suffix:

```rust
use tempfile::{Builder, NamedTempFile};

impl CommandTempDir {
    fn create(&self, label: &str) -> io::Result<CommandTempFile> {
        // sanitize_label can stay if you want to filter out non-ascii
        let prefix = sanitize_label(label);
        let root = self
            .workspace_root_path
            .as_ref()
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "no workspace root"))?;

        // tempfile::Builder handles unique names, creation and removal on Drop
        let file = Builder::new()
            .prefix(&prefix)
            .suffix(".tmp")
            .tempfile_in(root.as_std_path())?;

        Ok(CommandTempFile { file })
    }
}

struct CommandTempFile {
    file: NamedTempFile,
}

impl CommandTempFile {
    /// stream data into this file
    pub fn write_all(&mut self, buf: &[u8]) -> io::Result<()> {
        self.file.write_all(buf)
    }

    /// finish writing, persist and return the UTF-8 path
    pub fn into_path(mut self) -> io::Result<Utf8PathBuf> {
        let tmp_path = self.file
            .into_temp_path()           // relinquish auto-cleanup
            .keep()?                    // persist into final path
            .into_path_buf();
        Utf8PathBuf::from_path_buf(tmp_path)
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "non-UTF8 tempfile path"))
    }
}
```

Then in your pipe‐reading functions just:

```rust
let mut tf = config.create_tempfile(label)?;
reader.read_to_end(&mut tf.file)?;
let path = tf.into_path()?;
Ok(PipeOutcome::Tempfile(path))
```

This removes:
- the manual `AtomicU64` loop,
- all of the `cap_std::OpenOptions` juggling,
- custom `Drop` cleanup,
- extra `persisted` flag,
- and reuses a well‐tested crate.
```
</issue_to_address>

### Comment 4
<location> `src/stdlib/command.rs:176` </location>
<code_context>
+    }
+}
+
+fn next_tempfile_component(label: &str) -> Utf8PathBuf {
+    let counter = COMMAND_TEMPFILE_COUNTER.fetch_add(1, Ordering::Relaxed);
+    let mut name = sanitize_label(label);
</code_context>

<issue_to_address>
**issue (review_instructions):** Add behavioural and unit tests for the new tempfile naming logic and sanitization.

The new functions `next_tempfile_component` and `sanitize_label` introduce logic for generating and sanitizing tempfile names. You must provide both behavioural and unit tests to verify correct name generation, uniqueness, and label sanitization, including edge cases.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.

</details>
</issue_to_address>

### Comment 5
<location> `src/stdlib/command.rs:150` </location>
<code_context>
+        options.read(true);
+        options.write(true);
+        options.create_new(true);
+        for _ in 0..MAX_COMMAND_TEMPFILE_ATTEMPTS {
+            let file_component = next_tempfile_component(label);
+            match temp_dir.open_with(&file_component, &options) {
</code_context>

<issue_to_address>
**issue (review_instructions):** Add tests to verify unique tempfile creation and error handling when attempts are exhausted.

The loop for creating tempfiles with a maximum number of attempts is new logic. You must add tests to ensure that unique files are created and that the error path (when all attempts fail) is exercised and behaves as expected.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.

</details>
</issue_to_address>

### Comment 6
<location> `src/stdlib/command.rs:251` </location>
<code_context>
+    }
+}
+
+impl Drop for CommandTempFile {
+    fn drop(&mut self) {
+        if self.persisted {
</code_context>

<issue_to_address>
**issue (review_instructions):** Add tests to verify correct cleanup of tempfiles on drop and when persisted.

The new `Drop` implementation for `CommandTempFile` must be covered by tests to ensure that tempfiles are properly cleaned up when not persisted, and not removed when persisted. Test both normal and error paths (e.g., removal failure).

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.

</details>
</issue_to_address>

### Comment 7
<location> `src/stdlib/command.rs:241` </location>
<code_context>
+    }
+}
+
+impl Write for CommandTempFile {
+    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
+        self.file_mut().write(buf)
</code_context>

<issue_to_address>
**issue (review_instructions):** Add tests for the new `Write` implementation for `CommandTempFile`.

The implementation of the `Write` trait for `CommandTempFile` is new and must be covered by unit tests to verify correct writing and flushing behavior, including error handling.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.

</details>
</issue_to_address>

### Comment 8
<location> `src/stdlib/command.rs:227` </location>
<code_context>
+        }
+    }
+
+    fn into_path(mut self) -> Utf8PathBuf {
+        self.persisted = true;
+        drop(self.file.take());
</code_context>

<issue_to_address>
**issue (review_instructions):** Add tests for `into_path` to verify correct path return and state changes.

The new `into_path` method should be tested to ensure it returns the correct path, updates internal state as expected, and interacts correctly with the Drop logic.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.

</details>
</issue_to_address>

### Comment 9
<location> `src/stdlib/command.rs:1175` </location>
<code_context>
+    Ok(PipeOutcome::Tempfile(path))
 }

 fn create_empty_tempfile(
</code_context>

<issue_to_address>
**issue (review_instructions):** Add tests for the refactored `create_empty_tempfile` logic.

The logic for creating empty tempfiles has changed. You must add tests to verify correct file creation, path handling, and error cases.

<details>
<summary>Review instructions:</summary>

**Path patterns:** `**/*`

**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.

</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src/stdlib/command.rs Outdated
Comment thread src/stdlib/command.rs Outdated
Comment thread src/stdlib/command.rs
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/stdlib/command.rs Outdated
Replaced the custom logic for creating temporary files within the command module with the tempfile crate's Builder API. This simplifies tempfile creation, removes counters and manual uniqueness handling, and improves error handling. Dropped the custom CommandTempFile's complex management for a simpler wrapper around NamedTempFile, ensuring proper cleanup and persistence semantics. Added tests for label sanitization and tempfile lifecycle behavior.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
@leynos leynos changed the title Enforce workspace-scoped fallback tempfile creation Enforce workspace-scoped tempfile creation with UTF-8 paths Nov 7, 2025
@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Nov 7, 2025

@coderabbitai Have the following now been resolved?

  • next_tempfile_component currently panics if write_fmt fails—consider returning an error instead of panicking to avoid unexpected crashes.
  • The "unable to allocate unique command tempfile name" error could include the label or attempt count to make debugging name collisions easier.
  • Using a global atomic counter alone may collide with leftover files after a restart—consider adding a randomized or time‐based component to the filename to further reduce collision risk.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Nov 7, 2025

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the implementation of next_tempfile_component function
ast-grep --pattern 'fn next_tempfile_component($$$) {
  $$$
}'

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for the error message about tempfile allocation
rg -n "unable to allocate unique command tempfile name" -A 2 -B 2

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Find the atomic counter and filename generation logic
rg -n "COMMAND_TEMPFILE_COUNTER\|next_tempfile_component\|sanitize_label" -A 3 -B 1

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for any random or time-based components in tempfile generation
rg -n "SystemTime\|rand\|Instant\|uuid\|random" --type rust

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Find files that likely contain the tempfile implementation
fd -e rs | xargs rg -l "CommandTempFile\|command tempfile" -i

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Search for tempfile-related structures and implementations
rg -n "struct CommandTempFile" -A 10

Length of output: 553


🏁 Script executed:

#!/bin/bash
# Look for the create function and tempfile allocation logic
rg -n "fn create\(" --type rust -A 5 | head -100

Length of output: 465


🏁 Script executed:

#!/bin/bash
# Search for MAX_COMMAND_TEMPFILE_ATTEMPTS mentioned in PR objectives
rg -n "MAX_COMMAND_TEMPFILE_ATTEMPTS" -A 3 -B 1

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Read the relevant sections of command.rs to understand the current implementation
cat src/stdlib/command.rs | head -200

Length of output: 6833


🏁 Script executed:

#!/bin/bash
# Look for next_tempfile_component and sanitize_label functions
rg -n "next_tempfile_component\|sanitize_label" --type rust -B 2 -A 15

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check the full CommandTempFile implementation
rg -n "impl CommandTempFile" -A 50 src/stdlib/command.rs

Length of output: 1460


🏁 Script executed:

#!/bin/bash
# Check for atomic counter or any tempfile naming logic
rg -n "AtomicU\|static.*COUNTER" --type rust -B 1 -A 3

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Check recent commits in the current branch
git log --oneline -20

Length of output: 1200


🏁 Script executed:

#!/bin/bash
# Check what branch we're on
git branch --show-current

Length of output: 14


🏁 Script executed:

#!/bin/bash
# Look at the diff for the PR to see what actually changed
git diff origin/main...HEAD -- src/stdlib/command.rs | head -300

Length of output: 8156


🏁 Script executed:

#!/bin/bash
# Check if there are any staged but uncommitted changes
git status

Length of output: 62


🏁 Script executed:

#!/bin/bash
# Look at the cap-std implementation from commit d6fdea2
git show d6fdea2:src/stdlib/command.rs | head -300

Length of output: 9735


🏁 Script executed:

#!/bin/bash
# Search for next_tempfile_component in the cap-std commit
git show d6fdea2:src/stdlib/command.rs | grep -n "next_tempfile_component" -A 20 -B 2

Length of output: 1907


🏁 Script executed:

#!/bin/bash
# Look for the atomic counter and MAX_COMMAND_TEMPFILE_ATTEMPTS in that commit
git show d6fdea2:src/stdlib/command.rs | grep -n "COMMAND_TEMPFILE_COUNTER\|MAX_COMMAND_TEMPFILE_ATTEMPTS\|AtomicU64" -A 5 -B 2

Length of output: 1292


🏁 Script executed:

#!/bin/bash
# Check the commit message for bd9d75f to understand why it was reverted
git show bd9d75f --stat

Length of output: 888


@leynos, yes, all three concerns have been fully resolved! 🎉

The cap-std implementation in commit d6fdea2 did have all three issues you mentioned:

  1. Panic on write_fmt failure: Line 182-183 showed panic!("failed to format command tempfile suffix: {err}")
  2. Sparse error message: Line 171 had "unable to allocate unique command tempfile name" without label or attempt count
  3. Collision risk: Line 177 used only an atomic counter without random/time components

However, commit bd9d75f completely replaced that approach with the tempfile crate's Builder API. The current implementation delegates all tempfile creation to the battle-tested tempfile crate, which:

  • Handles all formatting errors gracefully without panicking
  • Provides detailed error messages through its own error handling
  • Uses a robust naming strategy that includes randomised components to prevent collisions, even across restarts

The tempfile crate has been handling these edge cases reliably for years across the Rust ecosystem, so the concerns are now moot. The current code (lines 159-168 in src/stdlib/command.rs) simply uses Builder::new().prefix(&prefix).suffix(".tmp").tempfile_in(), which handles all the complexity internally.

@leynos leynos merged commit 7b2e32d into codex/enforce-command-output-size-limits Nov 7, 2025
13 checks passed
@leynos leynos deleted the terragon/capability-aware-tempfile-fix-l4nz20 branch November 7, 2025 12:02
leynos added a commit that referenced this pull request Nov 8, 2025
* Enforce bounded command output and stream large results

* Refactor read_pipe and workspace-scoped tempfile plumbing (#232)

* refactor(stdlib): split read_pipe into separate capture and tempfile functions

Refactored the read_pipe function in src/stdlib/command.rs by splitting its logic based on OutputMode into two distinct functions, read_pipe_capture and read_pipe_tempfile. This improves code clarity and separation of concerns while maintaining existing behavior.

Additionally, refactored template rendering test helpers in tests/steps/stdlib_steps/rendering.rs to consolidate context construction into a single helper function render_with_single_context, reducing code duplication in tests.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(stdlib/command): enforce byte limits and use workspace-rooted tempfiles for command output

- Introduce distinct configurable byte limits for stdout capture and streaming in CommandConfig.
- Implement workspace-rooted temporary file creation for streaming command outputs, replacing system temp directory usage.
- Modify child process pipe readers to enforce configured limits and produce bytes or workspace-based tempfiles accordingly.
- Propagate workspace root path into stdlib configuration, enabling temp file placement inside project directories.
- Add tests validating capturing and streaming output within limits and error handling on limit exceedance.
- Enhance documentation with a Mermaid class diagram illustrating pipe reading abstractions.
- Update stdlib configuration defaults to include workspace root path for consistent temp file handling.
- Adjust tests and feature files to verify new output limit enforcement and streaming behavior.

This improves resource control and workspace hygiene for command output handling in Netsuke standard library.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(stdlib/command): use references for CommandContext to avoid cloning

Changed functions to accept `&CommandContext` instead of consuming `CommandContext` by value to improve efficiency and avoid unnecessary cloning. Made `CommandContext::new` a `const fn`. Updated related pipe reader functions to accept config by reference rather than Arc cloning. Fixed a slice indexing panic by validating bounds before write.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

---------

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* Refactor rendering tests with macro and absolute workspace roots (#234)

* refactor(stdlib,manifest): improve workspace root handling and render helpers

- Resolve workspace root as absolute path in manifest loading, supporting relative paths by joining with current working directory.
- Enforce absolute workspace root path in StdlibConfig with panic on invalid input.
- Return error if workspace root not configured when creating command temp files.
- Replace unsafe slice indexing with safe slice usage in command file writes.
- Introduce macro in stdlib_steps for DRY rendering with world string fields, simplifying template rendering functions.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* feat(manifest): add workspace root path handling and related tests

- Handle resolution of relative and absolute workspace root paths with UTF-8 validation
- Expose workspace_root_path() getter in StdlibConfig
- Add tests for resolving workspace roots and rejecting non-UTF-8 paths
- Add test ensuring command tempdir requires workspace root path
- Add manifest subcommand test for relative manifest paths

These changes improve manifest workspace handling by enforcing UTF-8 constraints and providing better API visibility and test coverage.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(manifest): extract absolute workspace root resolution to helper

Refactored workspace root resolution logic in stdlib_config_for_manifest into a
new function resolve_absolute_workspace_root for clarity and reuse. Updated
related tests to combine relative and absolute workspace root checks into a
single parameterized test for better coverage and reduced duplication.

Additionally, introduced assert_output_limit_error helper in stdlib/command tests
to consolidate output limit error assertions in multiple test cases.

This improves code maintainability and test clarity.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

---------

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* Enforce workspace-scoped tempfile creation with UTF-8 paths (#235)

* refactor(command): replace tempfile crate with cap-std for command temp files

Reworked command temporary file management to use cap-std crate instead of tempfile.
Implemented a custom unique tempfile naming strategy with atomic counter.
Improved tempfile lifecycle handling and removal on drop when not persisted.
Standardized sanitation of tempfile labels for safer filenames.
Added robust error handling for tempfile creation attempts.
Smoothed tempfile usage in pipe reading and empty tempfile creation.
Changed workspace root path validation message for clarity.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* refactor(command): replace manual tempfile creation with tempfile crate

Replaced the custom logic for creating temporary files within the command module with the tempfile crate's Builder API. This simplifies tempfile creation, removes counters and manual uniqueness handling, and improves error handling. Dropped the custom CommandTempFile's complex management for a simpler wrapper around NamedTempFile, ensuring proper cleanup and persistence semantics. Added tests for label sanitization and tempfile lifecycle behavior.

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

* style(stdlib/command): simplify flush call by chaining method calls

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

---------

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>

---------

Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant