Skip to content

Escape interpolated commands with shell-quote#142

Merged
leynos merged 10 commits intomainfrom
codex/integrate-shell-quote-crate-for-security
Aug 29, 2025
Merged

Escape interpolated commands with shell-quote#142
leynos merged 10 commits intomainfrom
codex/integrate-shell-quote-crate-for-security

Conversation

@leynos
Copy link
Copy Markdown
Owner

@leynos leynos commented Aug 27, 2025

Summary

  • escape $in/$out expansions with shell-quote
  • validate interpolated commands using shlex
  • cover quoting with unit and behavioural tests

Testing

  • make fmt
  • make lint
  • make test
  • make markdownlint

https://chatgpt.com/codex/tasks/task_e_68af7a8b60788322821665ad2804c2af

Summary by Sourcery

Escape interpolated $in/$out placeholders using the shell-quote crate, validate the resulting commands with shlex, update IR and Ninja generators to handle quoting and errors, and adjust documentation and tests accordingly

New Features:

  • Shell-quote $in/$out placeholders in command recipes and validate interpolated commands with shlex
  • Introduce an InvalidCommand error for unparsable shell commands

Enhancements:

  • Store Rule objects in the IR rule map instead of action hashes and simplify rule resolution
  • Embed input/output quoting and validation assertions in Ninja file generation

Build:

  • Add shell-quote and shlex crates to Cargo.toml

Documentation:

  • Update design and roadmap documentation to describe shell-quote integration and validation

Tests:

  • Add unit and feature tests for quoted $in/$out interpolation and invalid command error cases
  • Update existing ninja and IR tests to expect quoted commands and revised action counts

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Aug 27, 2025

Reviewer's Guide

This PR integrates shell-quote and shlex to escape and validate $in/$out placeholders in commands, refactors IR generation to propagate Rule data for quoting, adds run-time checks in ninja code, updates docs, dependencies, and AST, and expands test coverage for secure command interpolation.

Entity relationship diagram for Rule and Recipe propagation in IR generation

erDiagram
    NETSUKE_MANIFEST ||--o{ RULE : contains
    RULE ||--|{ RECIPE : has
    BUILD_GRAPH ||--o{ RULE : uses
    BUILD_GRAPH ||--o{ ACTION : creates
    ACTION ||--|{ RECIPE : uses
Loading

Class diagram for updated Rule and BuildGraph processing

classDiagram
    class BuildGraph {
        +process_rules(manifest, rule_map)
        +process_targets(manifest, actions, targets, rule_map)
        +process_defaults(manifest, default_targets)
    }
    class Rule {
        +name: String
        +recipe: Recipe
        +description: Option<String>
    }
    class Recipe {
        <<enum>>
        Command
        Script
        Rule
    }
    BuildGraph --> "*" Rule : uses
    BuildGraph --> "*" Recipe : uses
    Rule --> Recipe
Loading

Flow diagram for command interpolation and validation

flowchart TD
    A[Recipe::Command with $in/$out] --> B[Escape $in/$out with shell-quote]
    B --> C[Interpolate command string]
    C --> D[Validate with shlex::split]
    D -- valid --> E[Register Action]
    D -- invalid --> F[Raise IrGenError::InvalidCommand]
Loading

File-Level Changes

Change Details Files
Escape and validate interpolated commands
  • Quote input/output paths with shell-quote before interpolation
  • Replace $in/$out placeholders in command strings with quoted paths
  • Validate final command via shlex::split and return InvalidCommand on failure
src/ir.rs
Refactor IR generation to carry Rule objects
  • Change process_rules to map names to Rule clones
  • Adjust process_targets to pass inputs/outputs into register_action
  • Update resolve_rule to return &Rule instead of action hashes
src/ir.rs
Add shell command validation in Ninja generation
  • Assert shlex::split succeeds for Recipe::Command entries
  • Wrap escaped scripts in a printf pipeline and validate with shlex
src/ninja_gen.rs
Update documentation for quoting and validation
  • Describe shell-quote integration and replacement of ins/outs
  • Mark roadmap items for shell-quote and shlex use as completed
docs/netsuke-design.md
docs/roadmap.md
Extend AST and bump dependencies
  • Add Clone derive to Rule struct
  • Add shell-quote and shlex crates in Cargo.toml
src/ast.rs
Cargo.toml
Expand and adjust tests for quoting behavior
  • Add command_escaping_tests for quoted paths and invalid commands
  • Adjust ninja_gen_tests to expect quoted commands
  • Update feature tests for IR/Ninja to reflect new action counts
tests/command_escaping_tests.rs
tests/ninja_gen_tests.rs
tests/features/ninja.feature
tests/features/ir.feature
tests/features/ir_generation.feature

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Aug 27, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Note

Reviews paused

Use the following commands to manage reviews:

  • @coderabbit resume to resume automatic reviews.
  • @coderabbit review to trigger a single review.

Summary by CodeRabbit

  • New Features

    • Automatic POSIX shell-quoting of input/output paths in generated commands; placeholders are expanded to concrete, quoted paths.
    • Command strings are validated; invalid shell syntax now fails with a clear error message.
    • Early circular dependency detection with deterministic error messages.
  • Refactor

    • Identical rules now produce distinct actions when their file paths differ.
  • Documentation

    • Updated design and roadmap to reflect the new quoting model, validation, and action-identity semantics.
  • Tests

    • Added and updated tests to verify shell-quoting, command validation, and revised action expectations.

Walkthrough

Add shell-quote and shlex dependencies; document POSIX quoting and shlex validation; derive Clone for Rule; refactor IR to store Arc, compute inputs/outputs, interpolate and validate command templates (introduce IrGenError::InvalidCommand); validate commands during Ninja emission; add and update tests and test data; invoke cycle detection during graph construction.

Changes

Cohort / File(s) Summary
Dependencies
Cargo.toml
Add shell-quote = { version = "0.7.2", default-features = false, features = ["sh"] } and shlex = "1.3.0" under [dependencies].
Documentation
docs/netsuke-design.md, docs/roadmap.md
Document POSIX/sh quoting for $in/$out via shell-quote (Sh mode), mandate shlex validation of final command, update roadmap statuses.
AST
src/ast.rs
Derive Clone for Rule (#[derive(Debug, Clone, Deserialize, Serialize)]).
IR refactor & validation
src/ir.rs
Store Rule as Arc<Rule> in rule_map; compute inputs/outputs once per target; change register_action to accept explicit inputs/outputs; add interpolate_command(template, inputs, outputs) and substitute(...) using shell_quote; validate interpolated commands with shlex and add IrGenError::InvalidCommand { command, snippet }; call graph.detect_cycles() during BuildGraph construction.
Ninja generation validation
src/ninja_gen.rs
Validate command strings with shlex::split for Recipe::Command and Recipe::Script before emitting; construct Script commands and assert parsability first.
Unit tests & helpers
tests/command_escaping_tests.rs, tests/ir_from_manifest_tests.rs, tests/ninja_gen_tests.rs, tests/steps/ninja_steps.rs
Add tests verifying shell-quoting and invalid-command errors; update expectations where identical rules now emit distinct actions; add test step helpers that split emitted commands with shlex.
Feature tests
tests/features/ir.feature, tests/features/ir_generation.feature, tests/features/ninja.feature
Rename scenarios to reflect distinct-action behaviour; update expected action counts; add scenarios asserting shell-quoted command tokens in generated Ninja file.
Test data
tests/data/quote.yml
Add manifest examples with spaced and edge-case paths to exercise quoting.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Manifest as Manifest Parser
  participant IR as IR Builder
  participant SQ as shell-quote (Sh)
  participant Shlex as shlex
  participant Graph as BuildGraph

  User->>Manifest: Provide manifest (targets, rules)
  Manifest-->>IR: Parsed targets & rules
  IR->>IR: process_rules() -> store Rule objects (Arc<Rule>)
  IR->>IR: process_targets() -> compute inputs, outputs
  alt Recipe::Command
    IR->>SQ: Quote paths (inputs/outputs)
    SQ-->>IR: Quoted tokens
    IR->>IR: Substitute $in/$out into template
    IR->>Shlex: Validate interpolated command (POSIX)
    alt Parsable
      Shlex-->>IR: Ok
      IR->>Graph: register_action(inputs, outputs, Command)
    else Not parsable
      Shlex-->>IR: Err
      IR-->>User: IrGenError::InvalidCommand
    end
  else Recipe::Script / Rule-based
    IR->>IR: Prepare command/script or use rule recipe
    IR->>Graph: register_action(inputs, outputs, recipe)
  end
Loading
sequenceDiagram
  autonumber
  participant Ninja as Ninja Generator
  participant Shlex as shlex
  participant Writer as Output Writer

  Ninja->>Shlex: shlex::split(command_str)
  alt Parsable
    Shlex-->>Ninja: Ok
    Ninja->>Writer: Emit "command = ..." line
  else Not parsable
    Shlex-->>Ninja: Err
    Ninja-->>Ninja: Propagate/abort with error
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • codescene-delta-analysis
Prompt for the AI agent: Review the IR changes for correct lifetimes when returning &Rule from resolve_rule; ensure substitute() avoids accidental replacement inside identifiers; run test suite focusing on command_escaping_tests and ninja feature scenarios.

Poem

Quote the paths, let spaces sing,
$in and $out now safely cling.
Rules duplicate, still march two-by-two,
Shlex nods, shell-quote wraps them true.
Ninja writes commands, tidy and new.

✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch codex/integrate-shell-quote-crate-for-security

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbit in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbit in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbit gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbit read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbit help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbit ignore or @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbit summary or @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbit or @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

codescene-delta-analysis[bot]

This comment was marked as outdated.

sourcery-ai[bot]

This comment was marked as resolved.

chatgpt-codex-connector[bot]

This comment was marked as resolved.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
tests/features/ninja.feature (1)

15-19: Ensure POSIX-compliant quoting in feature expectations

Pinpoint the Bash-only ANSI-C quoting and replace it with standard single quotes:

• File: tests/features/ninja.feature
• Lines: 15–19

-    Then the ninja file contains "command = cat $'in file' > $'out file'"
+    Then the ninja file contains "command = cat 'in file' > 'out file'"

Add a CI check to reject Bash-style $'…' in /bin/sh:

#!/usr/bin/env bash
set -euo pipefail

if sh -c "echo \$'ok'" >/dev/null 2>&1; then
  echo "ERROR: /bin/sh accepts ANSI-C quoting. Use POSIX single quotes." >&2
  exit 1
fi
tests/ninja_gen_tests.rs (1)

74-98: Reassess per-edge command specialisation; it inflates rule count and harms reuse.

Embedding concrete inputs/outputs into rule command reduces deduplication and can bloat build.ninja for many edges. Prefer:

  • Keeping a reusable rule with $in/$out and sh-compatible, per-arg quoting via rspfile, or
  • Generating per-edge rspfiles to avoid shell-quoting complexities.

Audit current usage of Bash-only quoting in tests and sources:

#!/bin/bash
set -euo pipefail
rg -n --hidden --glob '!*target*' --glob '!**/snapshots/**' "\$'" -C0 || true
echo "If matches appear above, replace ANSI-C $'…' with POSIX-safe quoting or enforce bash."
src/ninja_gen.rs (1)

87-103: Document printf %b portability and escape set

The transform relies on printf %b semantics for dropping backslashes on non-escape sequences (e.g., $ -> $). Document this explicitly and cite shells that conform (dash, bash, busybox ash). Add a unit test exercising $, %, \r, and non-ASCII.

Do POSIX /bin/sh implementations (dash, busybox ash) treat backslash before a non-special character in `printf %b` as a literal (i.e., `\$` -> `$`)?
src/ir.rs (1)

199-234: Use sh-compatible quoting, and avoid partial macro replacement ($in inside $input)

Two issues:

  • Using Bash quoting (Bash::quote_vec) can emit $'…' which is not supported by /bin/sh (dash). Ninja defaults to /bin/sh, and Script path forces /bin/sh. Quote for sh, not bash.
  • String::replace() rewrites $in inside longer identifiers (e.g., $input), breaking variables.

Quote for sh and perform boundary-aware macro interpolation.

-        Recipe::Command { command } => {
-            use shell_quote::Bash;
+        Recipe::Command { command } => {
+            use shell_quote::Sh;
             let quote_paths = |paths: &[PathBuf]| -> String {
-                paths
+                paths
                     .iter()
                     .map(|p| {
                         let raw = p.display().to_string();
-                        String::from_utf8(Bash::quote_vec(&raw))
+                        String::from_utf8(Sh::quote_vec(&raw))
                             .expect("quoted path is valid UTF-8")
                     })
                     .collect::<Vec<String>>()
                     .join(" ")
             };
             let ins = quote_paths(inputs);
             let outs = quote_paths(outputs);
-            let interpolated = command.replace("$in", &ins).replace("$out", &outs);
+            let interpolated = interpolate_macros(&command, &ins, &outs);
             if shlex::split(&interpolated).is_none() {
                 return Err(IrGenError::InvalidCommand {
-                    command: interpolated,
+                    command: interpolated,
                 });
             }
             Recipe::Command {
                 command: interpolated,
             }
         }

Add this helper within the module:

fn interpolate_macros(cmd: &str, ins: &str, outs: &str) -> String {
    let bytes = cmd.as_bytes();
    let mut out = String::with_capacity(cmd.len() + ins.len() + outs.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'$' {
            // Try $in
            if bytes.get(i + 1) == Some(&b'i') && bytes.get(i + 2) == Some(&b'n') {
                if matches!(bytes.get(i + 3), None | Some(b) if !b.is_ascii_alphanumeric() && *b != b'_') {
                    out.push_str(ins);
                    i += 3;
                    continue;
                }
            }
            // Try $out
            if bytes.get(i + 1) == Some(&b'o')
                && bytes.get(i + 2) == Some(&b'u')
                && bytes.get(i + 3) == Some(&b't')
            {
                if matches!(bytes.get(i + 4), None | Some(b) if !b.is_ascii_alphanumeric() && *b != b'_') {
                    out.push_str(outs);
                    i += 4;
                    continue;
                }
            }
        }
        out.push(bytes[i] as char);
        i += 1;
    }
    out
}

Add tests for:

  • Command containing $input and $outfile (no substitution).
  • Empty inputs/outputs (no stray punctuation).
  • Paths with quotes and non-ASCII.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • Jira integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f162b07 and 6d45855.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • tests/snapshots/ninja/ninja_snapshot_tests__touch_manifest_ninja.snap is excluded by !**/*.snap
📒 Files selected for processing (13)
  • Cargo.toml (1 hunks)
  • docs/netsuke-design.md (1 hunks)
  • docs/roadmap.md (1 hunks)
  • src/ast.rs (1 hunks)
  • src/ir.rs (7 hunks)
  • src/ninja_gen.rs (1 hunks)
  • tests/command_escaping_tests.rs (1 hunks)
  • tests/data/quote.yml (1 hunks)
  • tests/features/ir.feature (1 hunks)
  • tests/features/ir_generation.feature (1 hunks)
  • tests/features/ninja.feature (1 hunks)
  • tests/ir_from_manifest_tests.rs (1 hunks)
  • tests/ninja_gen_tests.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Comment why, not what; document assumptions, edge cases, trade-offs, and complexity without restating obvious code
Extract reusable logic into functions; prefer composition and declarative constructs when readable
Keep functions small, single-responsibility, and observe command/query separation
Use precise, descriptive names; boolean names should start with is/has/should
Use en-GB-oxendict spelling/grammar in comments (except external API references)
Function docs must include clear usage examples; avoid redundant examples in test docs
No Rust source file may exceed 400 lines; split long switches/tables; move large test data to external files
Fix test warnings in code; do not silence them
Extract helpers when functions are too long; group many parameters into well-named structs
Consider using Arc when returning large errors to reduce data movement
Document public APIs using Rustdoc (///) so cargo doc can generate documentation
Prefer immutable data; avoid unnecessary mut bindings
Prefer Result-based error handling over panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate helper functions when conditionals have more than two branches
Do not silence lints except as a last resort; suppressions must be tightly scoped and include a clear reason
Prefer #[expect(...)] over #[allow(...)] for lint management
Prefer .expect() over .unwrap()
Use conditional compilation (#[cfg]/#[cfg_attr]) for functions unused under specific feature sets
Use concat!() to join long string literals rather than using backslash-newline escapes
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer semantic error enums deriving std::error::Error via thiserror for caller-inspectable conditions

Files:

  • tests/ir_from_manifest_tests.rs
  • src/ast.rs
  • src/ninja_gen.rs
  • tests/command_escaping_tests.rs
  • tests/ninja_gen_tests.rs
  • src/ir.rs

⚙️ CodeRabbit configuration file

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.

  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
  • Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such

    • For testing of functionality depending upon environment variables, dependency injection and the mockable crate are the preferred option.
    • If mockable cannot be used, env mutations in tests ...

Files:

  • tests/ir_from_manifest_tests.rs
  • src/ast.rs
  • src/ninja_gen.rs
  • tests/command_escaping_tests.rs
  • tests/ninja_gen_tests.rs
  • src/ir.rs
tests/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.rs: Use rstest fixtures for shared setup and replace duplicated tests with #[rstest(...)] parameterised cases
Prefer mockall for mocks/stubs
Mock non-deterministic dependencies via dependency injection (e.g., Env/Clock traits) using mockable crate; follow docs/reliable-testing-in-rust-via-dependency-injection.md

Files:

  • tests/ir_from_manifest_tests.rs
  • tests/command_escaping_tests.rs
  • tests/ninja_gen_tests.rs
src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Every module must begin with a //! module-level comment explaining its purpose and utility
Never export eyre::Report from library code; convert to domain error enums at API boundaries

Files:

  • src/ast.rs
  • src/ninja_gen.rs
  • src/ir.rs
Cargo.toml

📄 CodeRabbit inference engine (AGENTS.md)

Cargo.toml: Use explicit version ranges and keep dependencies up to date
Mandate SemVer caret requirements for all dependencies (e.g., "1.2.3")
Prohibit wildcard (*) and open-ended (>=) version specifiers; use ~ only with a specific, documented reason

Files:

  • Cargo.toml
docs/**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

docs/**/*.md: Treat docs/ markdown as the reference source of truth for requirements and decisions; keep updated with changes
Use en-GB-oxendict spelling/grammar in documentation (LICENSE name unchanged)

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md
**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

**/*.md: Validate Markdown with make markdownlint
Wrap Markdown paragraphs and bullet points at 80 columns; do not wrap tables or headings
Wrap code blocks in Markdown at 120 columns
Use dashes (-) for list bullets
Use GitHub-flavoured Markdown footnotes ([^1]) for references
Validate Mermaid diagrams in Markdown using make nixie

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md

⚙️ CodeRabbit configuration file

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-GB-oxendict (-ize / -our) spelling and grammar
  • Headings must not be wrapped.
  • Documents must start with a level 1 heading
  • Headings must correctly increase or decrease by no more than one level at a time
  • Use GitHub-flavoured Markdown style for footnotes and endnotes.
  • Numbered footnotes must be numbered by order of appearance in the document.

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md
🧬 Code graph analysis (1)
tests/command_escaping_tests.rs (2)
src/manifest.rs (1)
  • from_str (314-316)
src/ir.rs (1)
  • from_manifest (120-131)
🔍 Remote MCP

Here are some additional facts about the two newly added crates that may inform your review:

• shell-quote 0.7.2
– Implements safe quoting for GNU Bash, Z Shell, fish, and /bin/sh-like shells (including Dash), handling spaces, metacharacters, and multibyte inputs via extension traits for strings and byte buffers.
– Default features enabled: bstr, bash, fish, and sh, but these can be disabled to narrow support and reduce footprint. (docs.rs)
– Licensed under Apache 2.0; last built on docs.rs Feb 25 2025; maintained by user “allenap” on crates.io. (docs.rs)

• shlex 1.3.0
– Provides POSIX-compatible shell-word splitting (à la Python’s shlex), with support for both Rust strings and byte slices and a no_std mode.
– Tested against Bash, Z Shell, Dash, BusyBox ash, mksh, and fish to ensure robust parsing. (docs.rs)
– Dual-licensed MIT OR Apache 2.0; version 1.3.0 released Jan 22 2024. (docs.rs)

• Security notes
– The popular Node.js “shell-quote” package suffered critical command-injection vulnerabilities (e.g. CVE-2016-10541), but this Rust crate is a separate implementation and, as of 0.7.2, has no known Rust-crate CVEs. (nvd.nist.gov, docs.rs)

These crates are well-maintained, permissively licensed, and address the PR’s goals of secure quoting and validation of interpolated commands.

🔇 Additional comments (7)
src/ast.rs (1)

92-105: Deriving Clone for Rule is appropriate.

Enabling cheap cloning for Rule aligns with the IR changes and introduces no API risk.

tests/ninja_gen_tests.rs (1)

76-76: Quote with POSIX single quotes in emitted rule text.

This expectation uses portable quoting and aligns with sh semantics.

Also applies to: 95-95

tests/command_escaping_tests.rs (1)

1-4: LGTM: Module scaffolding and imports

Imports are minimal and focused. Keep.

src/ir.rs (4)

77-77: LGTM: Broadened imports

Importing Rule enables rule_map to store Rule directly. Keep.


124-127: LGTM: Rule/targets/defaults pipeline

The staged processing is clean and readable.


167-168: LGTM: Inputs are now recorded on edges

Carrying inputs from sources into BuildEdge.inputs removes re-derivation later. Good.


274-301: LGTM: Rule resolution errors are precise

Error variants and sorting for determinism read well. Keep.

Comment thread Cargo.toml Outdated
Comment thread docs/netsuke-design.md Outdated
Comment thread docs/roadmap.md
Comment thread src/ir.rs Outdated
Comment thread src/ir.rs Outdated
Comment thread tests/command_escaping_tests.rs
Comment thread tests/data/quote.yml
Comment thread tests/features/ir_generation.feature
Comment thread tests/features/ir.feature
Comment thread tests/ir_from_manifest_tests.rs
codescene-delta-analysis[bot]

This comment was marked as outdated.

@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Aug 27, 2025

@coderabbitai have the following now been resolved?

  • The new quoting logic emits bash-style $'…' literals which aren’t POSIX and will break under many /bin/sh implementations (e.g. dash); consider using a POSIX-safe quoting strategy or explicitly invoking bash.
  • Changing process_rules/resolve_rule now causes duplicate rules to produce separate actions instead of deduplicating them—verify this semantic shift is intentional or restore the original deduplication behavior.
  • Using naive .replace("$in", …).replace("$out", …) can accidentally rewrite other parts of the command; consider a more structured templating approach (e.g. indexed placeholders) to avoid unintended substitutions.

@coderabbitai

This comment was marked as resolved.

codescene-delta-analysis[bot]

This comment was marked as outdated.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/steps/ninja_steps.rs (2)

16-20: Replace forbidden #[allow] with narrowly scoped #[expect] and keep the reason.

Conform to lint policy: do not silence; expect with rationale.

-#[allow(
-    clippy::needless_pass_by_value,
-    reason = "Cucumber requires owned String arguments"
-)]
+#[expect(
+    clippy::needless_pass_by_value,
+    reason = "Cucumber requires owned String arguments"
+)]

29-61: Use #[expect] instead of #[allow] for the new step functions.

Keep suppressions tight and justified.

-#[allow(
-    clippy::needless_pass_by_value,
-    reason = "Cucumber requires owned String arguments"
-)]
+#[expect(
+    clippy::needless_pass_by_value,
+    reason = "Cucumber requires owned String arguments"
+)]
 #[then(expr = "shlex splitting command {int} yields {string}")]
 fn ninja_command_tokens(world: &mut CliWorld, index: usize, expected: String) {
     …
 }
 
-#[allow(
-    clippy::needless_pass_by_value,
-    reason = "Cucumber requires owned String arguments"
-)]
+#[expect(
+    clippy::needless_pass_by_value,
+    reason = "Cucumber requires owned String arguments"
+)]
 #[then(expr = "shlex splitting the command yields {string}")]
 fn ninja_first_command_tokens(world: &mut CliWorld, expected: String) {
     …
 }
♻️ Duplicate comments (10)
tests/features/ir.feature (1)

15-17: Adopt 2-action expectation and keep the doc link valid

Accept the behaviour change and the scenario rename. Verify the anchor exists and purge stale “1 actions” expectations across features.

#!/bin/bash
set -euo pipefail
# 1) Anchor check (ensure a matching heading or HTML anchor exists)
rg -nP '^\s*#{1,6}\s+.*design decisions|name="55-design-decisions"|id="55-design-decisions"' docs/netsuke-design.md || true

# 2) Stale expectations check
rg -nP -C2 '\bgraph has\s+1 actions\b' tests/features || true
tests/features/ir_generation.feature (1)

19-23: Synchronise dedup semantics in IR-generation tests

The rename and “2 actions” assertion align with the new rule-mapping semantics. Confirm no remaining IR-generation specs still expect 1 action.

#!/bin/bash
set -euo pipefail
rg -nP -C2 '\bgraph has\s+1 actions\b' tests/features/ir_generation.feature || true
tests/data/quote.yml (1)

1-8: Add hardening fixtures: embedded quote + newline and dash-prefixed sources

Expand coverage for tricky tokens to prove quoting and shlex validation across edge cases.

Append this additional target:

  - name: "o'ut\nfile"
    sources: "-in file"
    command: "printf %s $in > $out"

Optional: also add a multi-line YAML scalar if preferred:

  - name: |-
      o'ut
      file
    sources: "-in file"
    command: "printf %s $in > $out"
tests/ir_from_manifest_tests.rs (1)

19-23: Test rename and expectation now match the new dedup semantics.

Name and assertion are aligned with the 2-action outcome.

src/ninja_gen.rs (2)

114-120: Replace assert! with debug_assert! in Command path — good.

Avoid release panics while retaining a guard in debug builds.


126-129: Replace assert! with debug_assert! in Script path — good.

Mirror the Command path; no release-time panics.

tests/command_escaping_tests.rs (2)

28-29: Assert lexical arguments instead of relying on Bash-specific quoting

The current assertion on line 29 validates that shlex::split produces the expected tokens, which is good. However, the implementation still uses Bash-specific $'...' quoting that won't work with /bin/sh. Update the IR quoting to use POSIX-compatible quoting or explicitly invoke bash in Ninja execution.


80-92: Expand parameterised tests with additional boundary cases

Add test cases for unterminated single quotes and backticks to ensure comprehensive validation of invalid commands.

 #[rstest]
 #[case("echo \"unterminated")]
 #[case("echo 'unterminated")]
 #[case("echo `unterminated")]
+#[case("echo 'single unterminated")]
+#[case("echo `backtick unterminated")]
 fn invalid_command_errors(#[case] cmd: &str) {
src/ir.rs (2)

134-137: Use Arc<Rule> to avoid expensive cloning

Store rules as Arc<Rule> in the map to reduce memory allocation and copying costs for large Rule structs.

+use std::sync::Arc;
 
-fn process_rules(manifest: &NetsukeManifest, rule_map: &mut HashMap<String, Rule>) {
+fn process_rules(manifest: &NetsukeManifest, rule_map: &mut HashMap<String, Arc<Rule>>) {
     for rule in &manifest.rules {
-        rule_map.insert(rule.name.clone(), rule.clone());
+        rule_map.insert(rule.name.clone(), Arc::new(rule.clone()));
     }
 }

Then update the rule_map parameter type in process_targets and resolve_rule accordingly.


151-160: Document rule precedence for future schema changes

Add a comment clarifying that Rule recipe/description take precedence over any potential target-level values in future schema versions.

             let target_name = get_target_display_name(&outputs);
             let action_id = match &target.recipe {
                 Recipe::Rule { rule } => {
+                    // Rule recipe/description take precedence; adjust if schema changes merge semantics
                     let tmpl = resolve_rule(rule, rule_map, &target_name)?;
                     register_action(
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • Jira integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6d45855 and 7f5fe55.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • Cargo.toml (1 hunks)
  • docs/netsuke-design.md (2 hunks)
  • docs/roadmap.md (1 hunks)
  • src/ir.rs (8 hunks)
  • src/ninja_gen.rs (1 hunks)
  • tests/command_escaping_tests.rs (1 hunks)
  • tests/data/quote.yml (1 hunks)
  • tests/features/ir.feature (1 hunks)
  • tests/features/ir_generation.feature (1 hunks)
  • tests/features/ninja.feature (1 hunks)
  • tests/ir_from_manifest_tests.rs (1 hunks)
  • tests/steps/ninja_steps.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Comment why, not what; document assumptions, edge cases, trade-offs, and complexity without restating obvious code
Extract reusable logic into functions; prefer composition and declarative constructs when readable
Keep functions small, single-responsibility, and observe command/query separation
Use precise, descriptive names; boolean names should start with is/has/should
Use en-GB-oxendict spelling/grammar in comments (except external API references)
Function docs must include clear usage examples; avoid redundant examples in test docs
No Rust source file may exceed 400 lines; split long switches/tables; move large test data to external files
Fix test warnings in code; do not silence them
Extract helpers when functions are too long; group many parameters into well-named structs
Consider using Arc when returning large errors to reduce data movement
Document public APIs using Rustdoc (///) so cargo doc can generate documentation
Prefer immutable data; avoid unnecessary mut bindings
Prefer Result-based error handling over panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate helper functions when conditionals have more than two branches
Do not silence lints except as a last resort; suppressions must be tightly scoped and include a clear reason
Prefer #[expect(...)] over #[allow(...)] for lint management
Prefer .expect() over .unwrap()
Use conditional compilation (#[cfg]/#[cfg_attr]) for functions unused under specific feature sets
Use concat!() to join long string literals rather than using backslash-newline escapes
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer semantic error enums deriving std::error::Error via thiserror for caller-inspectable conditions

Files:

  • tests/ir_from_manifest_tests.rs
  • src/ninja_gen.rs
  • tests/steps/ninja_steps.rs
  • tests/command_escaping_tests.rs
  • src/ir.rs

⚙️ CodeRabbit configuration file

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.

  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
  • Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such

    • For testing of functionality depending upon environment variables, dependency injection and the mockable crate are the preferred option.
    • If mockable cannot be used, env mutations in tests ...

Files:

  • tests/ir_from_manifest_tests.rs
  • src/ninja_gen.rs
  • tests/steps/ninja_steps.rs
  • tests/command_escaping_tests.rs
  • src/ir.rs
tests/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.rs: Use rstest fixtures for shared setup and replace duplicated tests with #[rstest(...)] parameterised cases
Prefer mockall for mocks/stubs
Mock non-deterministic dependencies via dependency injection (e.g., Env/Clock traits) using mockable crate; follow docs/reliable-testing-in-rust-via-dependency-injection.md

Files:

  • tests/ir_from_manifest_tests.rs
  • tests/steps/ninja_steps.rs
  • tests/command_escaping_tests.rs
src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Every module must begin with a //! module-level comment explaining its purpose and utility
Never export eyre::Report from library code; convert to domain error enums at API boundaries

Files:

  • src/ninja_gen.rs
  • src/ir.rs
Cargo.toml

📄 CodeRabbit inference engine (AGENTS.md)

Cargo.toml: Use explicit version ranges and keep dependencies up to date
Mandate SemVer caret requirements for all dependencies (e.g., "1.2.3")
Prohibit wildcard (*) and open-ended (>=) version specifiers; use ~ only with a specific, documented reason

Files:

  • Cargo.toml
docs/**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

docs/**/*.md: Treat docs/ markdown as the reference source of truth for requirements and decisions; keep updated with changes
Use en-GB-oxendict spelling/grammar in documentation (LICENSE name unchanged)

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md
**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

**/*.md: Validate Markdown with make markdownlint
Wrap Markdown paragraphs and bullet points at 80 columns; do not wrap tables or headings
Wrap code blocks in Markdown at 120 columns
Use dashes (-) for list bullets
Use GitHub-flavoured Markdown footnotes ([^1]) for references
Validate Mermaid diagrams in Markdown using make nixie

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md

⚙️ CodeRabbit configuration file

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-GB-oxendict (-ize / -our) spelling and grammar
  • Headings must not be wrapped.
  • Documents must start with a level 1 heading
  • Headings must correctly increase or decrease by no more than one level at a time
  • Use GitHub-flavoured Markdown style for footnotes and endnotes.
  • Numbered footnotes must be numbered by order of appearance in the document.

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md
🧬 Code graph analysis (2)
tests/ir_from_manifest_tests.rs (2)
src/manifest.rs (1)
  • from_path (512-517)
src/ir.rs (1)
  • from_manifest (121-132)
tests/command_escaping_tests.rs (2)
src/manifest.rs (1)
  • from_str (314-316)
src/ir.rs (1)
  • from_manifest (121-132)
🪛 LanguageTool
docs/roadmap.md

[uncategorized] ~132-~132: Possible missing comma found.
Context: ... interpolations are validated with shlex but are not shell-quoted. - [x] Emit POS...

(AI_HYDRA_LEO_MISSING_COMMA)

🔍 Remote MCP

Here are some additional facts about the two newly added crates that may inform your review:

• shell-quote 0.7.2
– Implements safe quoting for GNU Bash, Z Shell, fish, and /bin/sh-like shells (including Dash), handling spaces, metacharacters, and multibyte inputs via extension traits for strings and byte buffers.
– Default features enabled: bstr, bash, fish, and sh, but these can be disabled to narrow support and reduce footprint. (docs.rs)
– Licensed under Apache 2.0; last built on docs.rs Feb 25 2025; maintained by user “allenap” on crates.io. (docs.rs)

• shlex 1.3.0
– Provides POSIX-compatible shell-word splitting (à la Python’s shlex), with support for both Rust strings and byte slices and a no_std mode.
– Tested against Bash, Z Shell, Dash, BusyBox ash, mksh, and fish to ensure robust parsing. (docs.rs)
– Dual-licensed MIT OR Apache 2.0; version 1.3.0 released Jan 22 2024. (docs.rs)

• Security notes
– The popular Node.js “shell-quote” package suffered critical command-injection vulnerabilities (e.g. CVE-2016-10541), but this Rust crate is a separate implementation and, as of 0.7.2, has no known Rust-crate CVEs. (nvd.nist.gov, docs.rs)

These crates are well-maintained, permissively licensed, and address the PR’s goals of secure quoting and validation of interpolated commands.

🔇 Additional comments (6)
Cargo.toml (1)

26-27: Constrain quoting to POSIX sh and pin versions; now remove Bash-only usage

Good: shell-quote is pinned and restricted to feature "sh". Sweep the codebase for Bash-only quoting ($'…') and Bash types to avoid runtime breakage under /bin/sh.

#!/bin/bash
set -euo pipefail
# Confirm cargo feature set
cargo tree -e features -i shell-quote | sed -n '1,120p'

# Find Bash-specific usages in source and tests
rg -nP -C2 '\bshell_quote::Bash\b|\bBash::\w+\b|quoted\s*\(\s*Bash\s*\)|\$\'' src tests || true
docs/netsuke-design.md (2)

242-249: Make the POSIX /bin/sh claim verifiable and enforced end-to-end.

Good clarification. Verify the implementation actually uses Sh mode exclusively and that no Bash-only $'…' quoting leaks into generated commands or tests.

#!/bin/bash
set -euo pipefail

echo "Cargo features for shell-quote:"
rg -nP '^\s*shell-quote\s*=\s*\{[^}]*features\s*=\s*\[\s*"sh"\s*\]' Cargo.toml || true

echo "Search for selection of Sh mode in code:"
rg -nP '\bSh\b' -n src || true
rg -nP 'shell_?quote|QuoteExt|push_quoted' -n src || true

echo "Ensure no Bash $\'...\' style appears in code or tests:"
rg -nP "\$'[^']+'" -n --iglob '!**/target/**' . || true

echo "Confirm /bin/sh invocation in ninja generator:"
rg -nP '/bin/sh\s+-e\s+-c' -n src/ninja_gen.rs || true

1236-1244: Validate that the implementation uses a proper tokeniser instead of naive string replacements:

• Inspect the code that handles $in/$out interpolation.
• Ensure there are no calls like .replace("$in", …) or .replace('$out', …).
• Confirm the code parses the command template into tokens and only substitutes standalone $in/$out tokens.
• Review the scanner/parser logic in the module that emits the final shell command.

Verify these points manually against the source to guarantee token-aware interpolation and prevent unintended replacements.

tests/features/ninja.feature (1)

15-24: Add shlex-token assertions for quoting — good coverage.

Validate tokenisation of redirection and embedded quotes via shlex-based steps.

src/ir.rs (2)

110-111: Good security improvement for error messages

The error variant now includes a truncated snippet field to avoid leaking full commands with potentially sensitive paths, whilst retaining the full command for programmatic access.


234-242: Remove review comment about quoting mismatch

Verified that shell-quote is configured with features = ["sh"] in Cargo.toml for POSIX-compatible quoting and that no tests use Bash-specific $'…' syntax. Approve the code as-is.

Comment thread docs/roadmap.md
Comment thread src/ir.rs Outdated
Comment thread src/ir.rs Outdated
Comment thread tests/command_escaping_tests.rs
codescene-delta-analysis[bot]

This comment was marked as outdated.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (6)
src/ir.rs (6)

339-366: Update resolve_rule to return &Arc<Rule> when adopting shared map type.

Keep reference semantics; avoid extra clones.

-fn resolve_rule<'a>(
+fn resolve_rule<'a>(
     rule: &StringOrList,
-    rule_map: &'a HashMap<String, Rule>,
+    rule_map: &'a HashMap<String, Arc<Rule>>,
     target_name: &str,
-) -> Result<&'a Rule, IrGenError> {
+) -> Result<&'a Arc<Rule>, IrGenError> {
     …
 }

134-144: Avoid cloning large Rule values; store by Arc in rule_map.

Stop copying Rule structs into the map. Use shared ownership to reduce moves/copies. Propagate Arc through call sites.

Apply:

-    fn process_rules(manifest: &NetsukeManifest, rule_map: &mut HashMap<String, Rule>) {
+    use std::sync::Arc;
+    fn process_rules(manifest: &NetsukeManifest, rule_map: &mut HashMap<String, Arc<Rule>>) {
         for rule in &manifest.rules {
-            rule_map.insert(rule.name.clone(), rule.clone());
+            rule_map.insert(rule.name.clone(), Arc::new(rule.clone()));
         }
     }

Outside selected lines, update type imports and callers:

Rust

use std::sync::Arc;

150-151: Align rule_map type with shared ownership.

Update parameter type to match Arc<Rule> to prevent accidental clones down the line.

-        rule_map: &HashMap<String, Rule>,
+        rule_map: &HashMap<String, Arc<Rule>>,

157-166: Document precedence where Rule vs target-level recipe/description could conflict.

Add a brief comment stating that a resolved Rule’s recipe/description take precedence over any future target-level fields.

                 Recipe::Rule { rule } => {
-                    let tmpl = resolve_rule(rule, rule_map, &target_name)?;
+                    // Precedence: resolved Rule recipe/description take priority over any
+                    // future target-level equivalents. Update if schema introduces overrides.
+                    let tmpl = resolve_rule(rule, rule_map, &target_name)?;

243-271: Extract backtick check and pre-join arguments to cut allocations.

Improve readability and avoid repeated joins inside substitute.

Apply:

 fn interpolate_command(
     template: &str,
     inputs: &[PathBuf],
     outputs: &[PathBuf],
 ) -> Result<String, IrGenError> {
+    fn has_unmatched_backticks(s: &str) -> bool {
+        s.chars().filter(|&c| c == '`').count() % 2 != 0
+    }
     fn quote_paths(paths: &[PathBuf]) -> Vec<String> {
         paths
             .iter()
             .map(|p| {
                 let raw = p.display().to_string();
                 String::from_utf8(raw.quoted(Sh)).expect("quoted path is valid UTF-8")
             })
             .collect()
     }
 
     let ins = quote_paths(inputs);
     let outs = quote_paths(outputs);
-    let interpolated = substitute(template, &ins, &outs);
-    let unmatched_backticks = interpolated.chars().filter(|&c| c == '`').count() & 1 != 0;
-    if unmatched_backticks || shlex::split(&interpolated).is_none() {
+    let ins_joined = ins.join(" ");
+    let outs_joined = outs.join(" ");
+    let interpolated = substitute(template, &ins_joined, &outs_joined);
+    if has_unmatched_backticks(&interpolated) || shlex::split(&interpolated).is_none() {
         let snippet = interpolated.chars().take(160).collect();
         return Err(IrGenError::InvalidCommand {
             command: interpolated,
             snippet,
         });
     }
     Ok(interpolated)
 }

Outside selected lines, adjust substitute signature to take &str joined replacements (see next comment).


272-310: Reduce cyclomatic complexity in substitute and avoid repeated joins.

Extract matchers and operate on pre-joined replacements; keep token-boundary guarantees.

-fn substitute(template: &str, ins: &[String], outs: &[String]) -> String {
+fn substitute(template: &str, ins_joined: &str, outs_joined: &str) -> String {
     fn is_ident(ch: char) -> bool {
         ch.is_ascii_alphanumeric() || ch == '_'
     }
 
+    fn try_match_in(chars: &[char], i: usize) -> Option<usize> {
+        if matches!(chars.get(i + 1), Some('i')) && matches!(chars.get(i + 2), Some('n')) {
+            let prev_ok = chars.get(i.wrapping_sub(1)).is_none_or(|c| !is_ident(*c));
+            let next_ok = chars.get(i + 3).is_none_or(|c| !is_ident(*c));
+            if prev_ok && next_ok {
+                return Some(3);
+            }
+        }
+        None
+    }
+
+    fn try_match_out(chars: &[char], i: usize) -> Option<usize> {
+        if matches!(chars.get(i + 1), Some('o'))
+            && matches!(chars.get(i + 2), Some('u'))
+            && matches!(chars.get(i + 3), Some('t'))
+        {
+            let prev_ok = chars.get(i.wrapping_sub(1)).is_none_or(|c| !is_ident(*c));
+            let next_ok = chars.get(i + 4).is_none_or(|c| !is_ident(*c));
+            if prev_ok && next_ok {
+                return Some(4);
+            }
+        }
+        None
+    }
+
     let chars: Vec<char> = template.chars().collect();
     let mut out = String::with_capacity(template.len());
     let mut i = 0;
     while let Some(&ch) = chars.get(i) {
         if ch == '$' {
-            // Attempt to match $in
-            if matches!(chars.get(i + 1), Some('i')) && matches!(chars.get(i + 2), Some('n')) {
-                let prev_ok = chars.get(i.wrapping_sub(1)).is_none_or(|c| !is_ident(*c));
-                let next_ok = chars.get(i + 3).is_none_or(|c| !is_ident(*c));
-                if prev_ok && next_ok {
-                    out.push_str(&ins.join(" "));
-                    i += 3;
-                    continue;
-                }
-            }
-            // Attempt to match $out
-            if matches!(chars.get(i + 1), Some('o'))
-                && matches!(chars.get(i + 2), Some('u'))
-                && matches!(chars.get(i + 3), Some('t'))
-            {
-                let prev_ok = chars.get(i.wrapping_sub(1)).is_none_or(|c| !is_ident(*c));
-                let next_ok = chars.get(i + 4).is_none_or(|c| !is_ident(*c));
-                if prev_ok && next_ok {
-                    out.push_str(&outs.join(" "));
-                    i += 4;
-                    continue;
-                }
-            }
+            if let Some(skip) = try_match_in(&chars, i) {
+                out.push_str(ins_joined);
+                i += skip;
+                continue;
+            }
+            if let Some(skip) = try_match_out(&chars, i) {
+                out.push_str(outs_joined);
+                i += skip;
+                continue;
+            }
         }
         out.push(ch);
         i += 1;
     }
     out
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • Jira integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7f5fe55 and 4b33719.

📒 Files selected for processing (2)
  • docs/netsuke-design.md (4 hunks)
  • src/ir.rs (8 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Comment why, not what; document assumptions, edge cases, trade-offs, and complexity without restating obvious code
Extract reusable logic into functions; prefer composition and declarative constructs when readable
Keep functions small, single-responsibility, and observe command/query separation
Use precise, descriptive names; boolean names should start with is/has/should
Use en-GB-oxendict spelling/grammar in comments (except external API references)
Function docs must include clear usage examples; avoid redundant examples in test docs
No Rust source file may exceed 400 lines; split long switches/tables; move large test data to external files
Fix test warnings in code; do not silence them
Extract helpers when functions are too long; group many parameters into well-named structs
Consider using Arc when returning large errors to reduce data movement
Document public APIs using Rustdoc (///) so cargo doc can generate documentation
Prefer immutable data; avoid unnecessary mut bindings
Prefer Result-based error handling over panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate helper functions when conditionals have more than two branches
Do not silence lints except as a last resort; suppressions must be tightly scoped and include a clear reason
Prefer #[expect(...)] over #[allow(...)] for lint management
Prefer .expect() over .unwrap()
Use conditional compilation (#[cfg]/#[cfg_attr]) for functions unused under specific feature sets
Use concat!() to join long string literals rather than using backslash-newline escapes
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer semantic error enums deriving std::error::Error via thiserror for caller-inspectable conditions

Files:

  • src/ir.rs

⚙️ CodeRabbit configuration file

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.

  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
  • Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such

    • For testing of functionality depending upon environment variables, dependency injection and the mockable crate are the preferred option.
    • If mockable cannot be used, env mutations in tests ...

Files:

  • src/ir.rs
src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Every module must begin with a //! module-level comment explaining its purpose and utility
Never export eyre::Report from library code; convert to domain error enums at API boundaries

Files:

  • src/ir.rs
docs/**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

docs/**/*.md: Treat docs/ markdown as the reference source of truth for requirements and decisions; keep updated with changes
Use en-GB-oxendict spelling/grammar in documentation (LICENSE name unchanged)

Files:

  • docs/netsuke-design.md
**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

**/*.md: Validate Markdown with make markdownlint
Wrap Markdown paragraphs and bullet points at 80 columns; do not wrap tables or headings
Wrap code blocks in Markdown at 120 columns
Use dashes (-) for list bullets
Use GitHub-flavoured Markdown footnotes ([^1]) for references
Validate Mermaid diagrams in Markdown using make nixie

Files:

  • docs/netsuke-design.md

⚙️ CodeRabbit configuration file

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-GB-oxendict (-ize / -our) spelling and grammar
  • Headings must not be wrapped.
  • Documents must start with a level 1 heading
  • Headings must correctly increase or decrease by no more than one level at a time
  • Use GitHub-flavoured Markdown style for footnotes and endnotes.
  • Numbered footnotes must be numbered by order of appearance in the document.

Files:

  • docs/netsuke-design.md
🔍 Remote MCP

I attempted to run external documentation searches but they failed (status 402), so no additional documents were fetched.

Concise actionable review facts (from provided PR context):

  • New crates added: shell-quote = "0.7.2" (default-features = false, features = ["sh"]) and shlex = "1.3.0". Impact: quoting + POSIX parsing/validation added to interpolation and ninja emission.
  • Behavior change: $in/$out expansions are shell-quoted and commands are validated via shlex; invalid interpolations now produce IrGenError::InvalidCommand.
  • Quoting correctness concern: implementation uses shell_quote::QuoteExt::quote() which emits bash-style $'…' sequences — not guaranteed POSIX /bin/sh-safe. Reviewer recommended using sh_quote() or otherwise ensuring POSIX-compatible quoting or requiring bash.
  • Replacement fragility: current interpolation uses naive .replace("$in", …).replace("$out", …), which can accidentally rewrite unrelated identifiers (e.g., "$input"). Tests include a case to confirm variable_name_overlap_not_rewritten, but reviewer flagged this as fragile and suggested a structured templating approach.
  • Deduplication semantics changed: rules are stored as Rule objects keyed by name (no action-hash dedup at process_rules), causing previously-duplicate rules to emit distinct actions. Tests and features were updated to expect the increased action counts; this is an intentional architectural shift but should be documented and validated against project semantics.
  • Tests added/updated: unit tests (tests/command_escaping_tests.rs), integration/feature tests (ninja.feature, ir_generation.feature), and existing tests updated to expect quoted commands and adjusted action counts.
  • Ninja generation: now validates commands with shlex::split before emitting; emission logic asserts split-ability for both Command and Script paths.

Tag:

🔇 Additional comments (7)
src/ir.rs (6)

77-79: Adopt POSIX /bin/sh quoting (good).

Importing Sh and using shell_quote in Sh mode addresses the prior Bash-only quoting issue.


110-112: Limit error message PII leakage (good).

Displaying only snippet in the InvalidCommand variant is appropriate; keep the full command for programmatic access.


206-214: Clarify dedup boundary in docstring (good).

The comment correctly states dedup is by fully interpolated command + file sets.


218-229: Interpolate only Command recipes (good).

Restricting validation/quoting to Command here is correct; scripts will be handled by their interpreter path.


243-271: Verify MSRV for Option::is_none_or and replace if too new.

is_none_or requires a recent Rust. Confirm MSRV or gate with a small helper to avoid bumping MSRV unintentionally.

Run:

#!/bin/bash
set -euo pipefail
# Detect declared MSRV
rg -n 'rust-version\s*=\s*"[0-9]+' Cargo.toml || true
rg -n 'channel\s*=\s*"(stable|[0-9.]+)"' rust-toolchain* 2>/dev/null || true
# Report usage sites of is_none_or
rg -n --type=rust '\.is_none_or\('
echo "If MSRV < 1.70, replace is_none_or with explicit matches."

Also applies to: 272-310


243-271: P1: Ensure tests no longer expect Bash $'…' strings.

Sh-mode quoting should not emit $'…'. Purge stale expectations.

Run:

#!/bin/bash
set -euo pipefail
rg -nP "\$\'.*\'" tests || true
rg -nP "cat\s+\$\'.*\'\s*>\s*\$\'.*\'" -S tests || true
echo "Expect no matches; update snapshots if present."
docs/netsuke-design.md (1)

242-249: POSIX /bin/sh wording (good).

The text now explicitly states Sh-mode quoting and POSIX shlex validation.

Comment thread docs/netsuke-design.md Outdated
Comment thread docs/netsuke-design.md
@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Aug 28, 2025

@coderabbitai Has the following now been resolved?

Rule Deduplication Behavior ❌ Intentional Change But Should Be Documented

The rule deduplication behavior has indeed changed. Looking at src/ir.rs, the process_rules function now stores actual Rule objects in rule_map instead of computing action hashes immediately:

fn process_rules(rules: &[Rule]) -> Result<HashMap<String, Rule>, IrGenError> {
let mut rule_map = HashMap::new();
for rule in rules {
rule_map.insert(rule.name.clone(), rule.clone());
}
Ok(rule_map)
}

This means identical rule definitions now create separate actions for each target that uses them, rather than deduplicating at the rule level. The test changes confirm this - duplicate_rules_are_deduped now expects 2 actions instead of 1.

This appears to be an intentional architectural change to move from immediate action creation to a rule-template approach, but it fundamentally changes the deduplication semantics.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Aug 28, 2025

Tip

For best results, initiate chat on the files or code changes.

@leynos Looking at the updated documentation in docs/netsuke-design.md, this issue has been ✅ Resolved.

The design documentation now explicitly describes the architectural change:

  • IR construction workflow redefined: The old "Action Consolidation" step was replaced with a two-part approach - "Rule Collection" (storing rules in a name-keyed map) and "Action Registration and Edge Creation"
  • New deduplication semantics: "Actions are now hashed on the fully resolved command and file-set, so identical rule templates yield distinct actions when file paths differ"
  • Explicit clarification: "action deduplication now considers both the command text and the exact file sets (shell-quoted paths); two targets with the same command but different file sets produce distinct actions"

The documentation now clearly explains that this is an intentional shift from the previous action-hash deduplication to a rule-template approach where deduplication occurs at the action level based on the complete resolved command and file sets, rather than at the rule definition level.

The change is well-documented and the architectural rationale is explained, so the "should be documented" concern has been addressed.

codescene-delta-analysis[bot]

This comment was marked as outdated.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/ir.rs (1)

1-550: Split the module; file exceeds the 400‑line limit.

Extract interpolation/quoting into src/ir/command.rs and cycle detection into src/ir/cycle.rs; keep ir.rs as the public façade.

♻️ Duplicate comments (1)
docs/roadmap.md (1)

130-139: Correct the scope, casing, and duplication in the security checklist.

Limit quoting to $in/$out (as implemented), capitalise “Bash”, and drop the duplicate shlex bullet.

-  - [x] Mandate its use for variable substitutions within command strings
-    during IR generation to prevent command injection, and validate the final
-    command string with shlex.
+  - [x] Mandate its use for `$in`/`$out` substitutions within command strings
+    during IR generation to prevent command injection. Validate the final
+    command string with `shlex`.

-  - [x] Emit POSIX-sh-compatible quoting (portable single-quote style)
-    rather than Bash-only forms. If Bash-specific quoting is required, document
-    and enforce execution under bash.
+  - [x] Emit POSIX sh-compatible quoting (portable single-quote style),
+    rather than Bash-only forms. If Bash-specific quoting is required,
+    document and enforce execution under Bash.
-
-  - [x] After interpolation, validate the final command string is parsable using
-    the shlex crate.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • Jira integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4b33719 and 93f04ef.

📒 Files selected for processing (4)
  • docs/netsuke-design.md (4 hunks)
  • docs/roadmap.md (1 hunks)
  • src/ir.rs (7 hunks)
  • tests/command_escaping_tests.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
docs/**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

docs/**/*.md: Treat docs/ markdown as the reference source of truth for requirements and decisions; keep updated with changes
Use en-GB-oxendict spelling/grammar in documentation (LICENSE name unchanged)

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md
**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

**/*.md: Validate Markdown with make markdownlint
Wrap Markdown paragraphs and bullet points at 80 columns; do not wrap tables or headings
Wrap code blocks in Markdown at 120 columns
Use dashes (-) for list bullets
Use GitHub-flavoured Markdown footnotes ([^1]) for references
Validate Mermaid diagrams in Markdown using make nixie

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md

⚙️ CodeRabbit configuration file

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-GB-oxendict (-ize / -our) spelling and grammar
  • Headings must not be wrapped.
  • Documents must start with a level 1 heading
  • Headings must correctly increase or decrease by no more than one level at a time
  • Use GitHub-flavoured Markdown style for footnotes and endnotes.
  • Numbered footnotes must be numbered by order of appearance in the document.

Files:

  • docs/roadmap.md
  • docs/netsuke-design.md
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Comment why, not what; document assumptions, edge cases, trade-offs, and complexity without restating obvious code
Extract reusable logic into functions; prefer composition and declarative constructs when readable
Keep functions small, single-responsibility, and observe command/query separation
Use precise, descriptive names; boolean names should start with is/has/should
Use en-GB-oxendict spelling/grammar in comments (except external API references)
Function docs must include clear usage examples; avoid redundant examples in test docs
No Rust source file may exceed 400 lines; split long switches/tables; move large test data to external files
Fix test warnings in code; do not silence them
Extract helpers when functions are too long; group many parameters into well-named structs
Consider using Arc when returning large errors to reduce data movement
Document public APIs using Rustdoc (///) so cargo doc can generate documentation
Prefer immutable data; avoid unnecessary mut bindings
Prefer Result-based error handling over panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate helper functions when conditionals have more than two branches
Do not silence lints except as a last resort; suppressions must be tightly scoped and include a clear reason
Prefer #[expect(...)] over #[allow(...)] for lint management
Prefer .expect() over .unwrap()
Use conditional compilation (#[cfg]/#[cfg_attr]) for functions unused under specific feature sets
Use concat!() to join long string literals rather than using backslash-newline escapes
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer semantic error enums deriving std::error::Error via thiserror for caller-inspectable conditions

Files:

  • tests/command_escaping_tests.rs
  • src/ir.rs

⚙️ CodeRabbit configuration file

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.

  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
  • Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such

    • For testing of functionality depending upon environment variables, dependency injection and the mockable crate are the preferred option.
    • If mockable cannot be used, env mutations in tests ...

Files:

  • tests/command_escaping_tests.rs
  • src/ir.rs
tests/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.rs: Use rstest fixtures for shared setup and replace duplicated tests with #[rstest(...)] parameterised cases
Prefer mockall for mocks/stubs
Mock non-deterministic dependencies via dependency injection (e.g., Env/Clock traits) using mockable crate; follow docs/reliable-testing-in-rust-via-dependency-injection.md

Files:

  • tests/command_escaping_tests.rs
src/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.rs: Every module must begin with a //! module-level comment explaining its purpose and utility
Never export eyre::Report from library code; convert to domain error enums at API boundaries

Files:

  • src/ir.rs
🧬 Code graph analysis (1)
tests/command_escaping_tests.rs (2)
src/manifest.rs (1)
  • from_str (314-316)
src/ir.rs (1)
  • from_manifest (122-133)
🔍 Remote MCP

I attempted to run external documentation searches but they failed (status 402), so no additional documents were fetched.

Concise actionable review facts (from provided PR context):

  • New crates added: shell-quote = "0.7.2" (default-features = false, features = ["sh"]) and shlex = "1.3.0". Impact: quoting + POSIX parsing/validation added to interpolation and ninja emission.
  • Behavior change: $in/$out expansions are shell-quoted and commands are validated via shlex; invalid interpolations now produce IrGenError::InvalidCommand.
  • Quoting correctness concern: implementation uses shell_quote::QuoteExt::quote() which emits bash-style $'…' sequences — not guaranteed POSIX /bin/sh-safe. Reviewer recommended using sh_quote() or otherwise ensuring POSIX-compatible quoting or requiring bash.
  • Replacement fragility: current interpolation uses naive .replace("$in", …).replace("$out", …), which can accidentally rewrite unrelated identifiers (e.g., "$input"). Tests include a case to confirm variable_name_overlap_not_rewritten, but reviewer flagged this as fragile and suggested a structured templating approach.
  • Deduplication semantics changed: rules are stored as Rule objects keyed by name (no action-hash dedup at process_rules), causing previously-duplicate rules to emit distinct actions. Tests and features were updated to expect the increased action counts; this is an intentional architectural shift but should be documented and validated against project semantics.
  • Tests added/updated: unit tests (tests/command_escaping_tests.rs), integration/feature tests (ninja.feature, ir_generation.feature), and existing tests updated to expect quoted commands and adjusted action counts.
  • Ninja generation: now validates commands with shlex::split before emitting; emission logic asserts split-ability for both Command and Script paths.

Tag:

🔇 Additional comments (5)
tests/command_escaping_tests.rs (2)

17-31: Approve: assert lexical tokens, not dialect-specific quoting.

The shlex-based assertion robustly validates correct argument boundaries across shells.


95-107: Guard against false positives: backticks inside single quotes are literals.

Add a positive case to prove unmatched-backtick detection does not flag single-quoted content.

@@
 fn invalid_command_errors(#[case] cmd: &str) {
@@
 }
+
+#[rstest]
+fn backticks_inside_single_quotes_are_literal() {
+    let yaml = manifest_yaml(
+        "targets:\n  - name: out\n    sources: in\n    command: \"echo '$in `literal`' > $out\"\n",
+    );
+    let manifest = manifest::from_str(&yaml).expect("parse");
+    let graph = BuildGraph::from_manifest(&manifest).expect("graph");
+    let action = graph.actions.values().next().expect("action");
+    let Recipe::Command { command } = &action.recipe else {
+        panic!(\"expected command\")
+    };
+    let words = shlex::split(command).expect(\"split command into words\");
+    assert_eq!(words, [\"echo\", \"$in `literal`\", \">\", \"out\"]);
+}
src/ir.rs (2)

135-141: Approve: clarify rule-template storage and deferred expansion.

The docstring accurately captures the intentional change in dedup semantics.


210-218: Approve: hash on fully interpolated command and file-set.

The registration path matches the documented identity semantics.

docs/netsuke-design.md (1)

1236-1245: Approve: document IR-time substitution and POSIX Sh quoting.

The implementation notes match interpolate_command and token-boundary substitution.

Comment thread docs/netsuke-design.md
Comment thread src/ir.rs
Comment thread src/ir.rs
Comment thread src/ir.rs Outdated
Comment thread tests/command_escaping_tests.rs
@leynos
Copy link
Copy Markdown
Owner Author

leynos commented Aug 28, 2025

@coderabbitai pause

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Aug 28, 2025

✅ Actions performed

Reviews paused.

codescene-delta-analysis[bot]

This comment was marked as outdated.

Repository owner deleted a comment from coderabbitai Bot Aug 28, 2025
Repository owner deleted a comment from coderabbitai Bot Aug 28, 2025
Repository owner deleted a comment from coderabbitai Bot Aug 28, 2025
codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

codescene-delta-analysis[bot]

This comment was marked as outdated.

Repository owner deleted a comment from coderabbitai Bot Aug 29, 2025
@leynos leynos merged commit 98f17fa into main Aug 29, 2025
4 checks passed
@leynos leynos deleted the codex/integrate-shell-quote-crate-for-security branch August 29, 2025 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant