Escape interpolated commands with shell-quote#142
Conversation
Reviewer's GuideThis PR integrates shell-quote and shlex to escape and validate $in/$out placeholders in commands, refactors IR generation to propagate Rule data for quoting, adds run-time checks in ninja code, updates docs, dependencies, and AST, and expands test coverage for secure command interpolation. Entity relationship diagram for Rule and Recipe propagation in IR generationerDiagram
NETSUKE_MANIFEST ||--o{ RULE : contains
RULE ||--|{ RECIPE : has
BUILD_GRAPH ||--o{ RULE : uses
BUILD_GRAPH ||--o{ ACTION : creates
ACTION ||--|{ RECIPE : uses
Class diagram for updated Rule and BuildGraph processingclassDiagram
class BuildGraph {
+process_rules(manifest, rule_map)
+process_targets(manifest, actions, targets, rule_map)
+process_defaults(manifest, default_targets)
}
class Rule {
+name: String
+recipe: Recipe
+description: Option<String>
}
class Recipe {
<<enum>>
Command
Script
Rule
}
BuildGraph --> "*" Rule : uses
BuildGraph --> "*" Recipe : uses
Rule --> Recipe
Flow diagram for command interpolation and validationflowchart TD
A[Recipe::Command with $in/$out] --> B[Escape $in/$out with shell-quote]
B --> C[Interpolate command string]
C --> D[Validate with shlex::split]
D -- valid --> E[Register Action]
D -- invalid --> F[Raise IrGenError::InvalidCommand]
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Note Reviews pausedUse the following commands to manage reviews:
Summary by CodeRabbit
WalkthroughAdd shell-quote and shlex dependencies; document POSIX quoting and shlex validation; derive Clone for Rule; refactor IR to store Arc, compute inputs/outputs, interpolate and validate command templates (introduce IrGenError::InvalidCommand); validate commands during Ninja emission; add and update tests and test data; invoke cycle detection during graph construction. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant Manifest as Manifest Parser
participant IR as IR Builder
participant SQ as shell-quote (Sh)
participant Shlex as shlex
participant Graph as BuildGraph
User->>Manifest: Provide manifest (targets, rules)
Manifest-->>IR: Parsed targets & rules
IR->>IR: process_rules() -> store Rule objects (Arc<Rule>)
IR->>IR: process_targets() -> compute inputs, outputs
alt Recipe::Command
IR->>SQ: Quote paths (inputs/outputs)
SQ-->>IR: Quoted tokens
IR->>IR: Substitute $in/$out into template
IR->>Shlex: Validate interpolated command (POSIX)
alt Parsable
Shlex-->>IR: Ok
IR->>Graph: register_action(inputs, outputs, Command)
else Not parsable
Shlex-->>IR: Err
IR-->>User: IrGenError::InvalidCommand
end
else Recipe::Script / Rule-based
IR->>IR: Prepare command/script or use rule recipe
IR->>Graph: register_action(inputs, outputs, recipe)
end
sequenceDiagram
autonumber
participant Ninja as Ninja Generator
participant Shlex as shlex
participant Writer as Output Writer
Ninja->>Shlex: shlex::split(command_str)
alt Parsable
Shlex-->>Ninja: Ok
Ninja->>Writer: Emit "command = ..." line
else Not parsable
Shlex-->>Ninja: Err
Ninja-->>Ninja: Propagate/abort with error
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 16
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
tests/features/ninja.feature (1)
15-19: Ensure POSIX-compliant quoting in feature expectationsPinpoint the Bash-only ANSI-C quoting and replace it with standard single quotes:
• File: tests/features/ninja.feature
• Lines: 15–19- Then the ninja file contains "command = cat $'in file' > $'out file'" + Then the ninja file contains "command = cat 'in file' > 'out file'"Add a CI check to reject Bash-style
$'…'in/bin/sh:#!/usr/bin/env bash set -euo pipefail if sh -c "echo \$'ok'" >/dev/null 2>&1; then echo "ERROR: /bin/sh accepts ANSI-C quoting. Use POSIX single quotes." >&2 exit 1 fitests/ninja_gen_tests.rs (1)
74-98: Reassess per-edge command specialisation; it inflates rule count and harms reuse.Embedding concrete inputs/outputs into rule
commandreduces deduplication and can bloat build.ninja for many edges. Prefer:
- Keeping a reusable rule with
$in/$outand sh-compatible, per-arg quoting via rspfile, or- Generating per-edge rspfiles to avoid shell-quoting complexities.
Audit current usage of Bash-only quoting in tests and sources:
#!/bin/bash set -euo pipefail rg -n --hidden --glob '!*target*' --glob '!**/snapshots/**' "\$'" -C0 || true echo "If matches appear above, replace ANSI-C $'…' with POSIX-safe quoting or enforce bash."src/ninja_gen.rs (1)
87-103: Document printf %b portability and escape setThe transform relies on printf %b semantics for dropping backslashes on non-escape sequences (e.g., $ ->
$). Document this explicitly and cite shells that conform (dash, bash, busybox ash). Add a unit test exercising $ , %, \r, and non-ASCII.Do POSIX /bin/sh implementations (dash, busybox ash) treat backslash before a non-special character in `printf %b` as a literal (i.e., `\$` -> `$`)?src/ir.rs (1)
199-234: Use sh-compatible quoting, and avoid partial macro replacement ($in inside $input)Two issues:
- Using Bash quoting (Bash::quote_vec) can emit $'…' which is not supported by /bin/sh (dash). Ninja defaults to /bin/sh, and Script path forces /bin/sh. Quote for sh, not bash.
- String::replace() rewrites $in inside longer identifiers (e.g., $input), breaking variables.
Quote for sh and perform boundary-aware macro interpolation.
- Recipe::Command { command } => { - use shell_quote::Bash; + Recipe::Command { command } => { + use shell_quote::Sh; let quote_paths = |paths: &[PathBuf]| -> String { - paths + paths .iter() .map(|p| { let raw = p.display().to_string(); - String::from_utf8(Bash::quote_vec(&raw)) + String::from_utf8(Sh::quote_vec(&raw)) .expect("quoted path is valid UTF-8") }) .collect::<Vec<String>>() .join(" ") }; let ins = quote_paths(inputs); let outs = quote_paths(outputs); - let interpolated = command.replace("$in", &ins).replace("$out", &outs); + let interpolated = interpolate_macros(&command, &ins, &outs); if shlex::split(&interpolated).is_none() { return Err(IrGenError::InvalidCommand { - command: interpolated, + command: interpolated, }); } Recipe::Command { command: interpolated, } }Add this helper within the module:
fn interpolate_macros(cmd: &str, ins: &str, outs: &str) -> String { let bytes = cmd.as_bytes(); let mut out = String::with_capacity(cmd.len() + ins.len() + outs.len()); let mut i = 0; while i < bytes.len() { if bytes[i] == b'$' { // Try $in if bytes.get(i + 1) == Some(&b'i') && bytes.get(i + 2) == Some(&b'n') { if matches!(bytes.get(i + 3), None | Some(b) if !b.is_ascii_alphanumeric() && *b != b'_') { out.push_str(ins); i += 3; continue; } } // Try $out if bytes.get(i + 1) == Some(&b'o') && bytes.get(i + 2) == Some(&b'u') && bytes.get(i + 3) == Some(&b't') { if matches!(bytes.get(i + 4), None | Some(b) if !b.is_ascii_alphanumeric() && *b != b'_') { out.push_str(outs); i += 4; continue; } } } out.push(bytes[i] as char); i += 1; } out }Add tests for:
- Command containing $input and $outfile (no substitution).
- Empty inputs/outputs (no stray punctuation).
- Paths with quotes and non-ASCII.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- Jira integration is disabled
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (2)
Cargo.lockis excluded by!**/*.locktests/snapshots/ninja/ninja_snapshot_tests__touch_manifest_ninja.snapis excluded by!**/*.snap
📒 Files selected for processing (13)
Cargo.toml(1 hunks)docs/netsuke-design.md(1 hunks)docs/roadmap.md(1 hunks)src/ast.rs(1 hunks)src/ir.rs(7 hunks)src/ninja_gen.rs(1 hunks)tests/command_escaping_tests.rs(1 hunks)tests/data/quote.yml(1 hunks)tests/features/ir.feature(1 hunks)tests/features/ir_generation.feature(1 hunks)tests/features/ninja.feature(1 hunks)tests/ir_from_manifest_tests.rs(1 hunks)tests/ninja_gen_tests.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Comment why, not what; document assumptions, edge cases, trade-offs, and complexity without restating obvious code
Extract reusable logic into functions; prefer composition and declarative constructs when readable
Keep functions small, single-responsibility, and observe command/query separation
Use precise, descriptive names; boolean names should start with is/has/should
Use en-GB-oxendict spelling/grammar in comments (except external API references)
Function docs must include clear usage examples; avoid redundant examples in test docs
No Rust source file may exceed 400 lines; split long switches/tables; move large test data to external files
Fix test warnings in code; do not silence them
Extract helpers when functions are too long; group many parameters into well-named structs
Consider using Arc when returning large errors to reduce data movement
Document public APIs using Rustdoc (///) so cargo doc can generate documentation
Prefer immutable data; avoid unnecessary mut bindings
Prefer Result-based error handling over panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate helper functions when conditionals have more than two branches
Do not silence lints except as a last resort; suppressions must be tightly scoped and include a clear reason
Prefer #[expect(...)] over #[allow(...)] for lint management
Prefer .expect() over .unwrap()
Use conditional compilation (#[cfg]/#[cfg_attr]) for functions unused under specific feature sets
Use concat!() to join long string literals rather than using backslash-newline escapes
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer semantic error enums deriving std::error::Error via thiserror for caller-inspectable conditions
Files:
tests/ir_from_manifest_tests.rssrc/ast.rssrc/ninja_gen.rstests/command_escaping_tests.rstests/ninja_gen_tests.rssrc/ir.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and the
mockablecrate are the preferred option.- If mockable cannot be used, env mutations in tests ...
Files:
tests/ir_from_manifest_tests.rssrc/ast.rssrc/ninja_gen.rstests/command_escaping_tests.rstests/ninja_gen_tests.rssrc/ir.rs
tests/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.rs: Use rstest fixtures for shared setup and replace duplicated tests with #[rstest(...)] parameterised cases
Prefer mockall for mocks/stubs
Mock non-deterministic dependencies via dependency injection (e.g., Env/Clock traits) using mockable crate; follow docs/reliable-testing-in-rust-via-dependency-injection.md
Files:
tests/ir_from_manifest_tests.rstests/command_escaping_tests.rstests/ninja_gen_tests.rs
src/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.rs: Every module must begin with a //! module-level comment explaining its purpose and utility
Never export eyre::Report from library code; convert to domain error enums at API boundaries
Files:
src/ast.rssrc/ninja_gen.rssrc/ir.rs
Cargo.toml
📄 CodeRabbit inference engine (AGENTS.md)
Cargo.toml: Use explicit version ranges and keep dependencies up to date
Mandate SemVer caret requirements for all dependencies (e.g., "1.2.3")
Prohibit wildcard (*) and open-ended (>=) version specifiers; use ~ only with a specific, documented reason
Files:
Cargo.toml
docs/**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
docs/**/*.md: Treat docs/ markdown as the reference source of truth for requirements and decisions; keep updated with changes
Use en-GB-oxendict spelling/grammar in documentation (LICENSE name unchanged)
Files:
docs/roadmap.mddocs/netsuke-design.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
**/*.md: Validate Markdown with make markdownlint
Wrap Markdown paragraphs and bullet points at 80 columns; do not wrap tables or headings
Wrap code blocks in Markdown at 120 columns
Use dashes (-) for list bullets
Use GitHub-flavoured Markdown footnotes ([^1]) for references
Validate Mermaid diagrams in Markdown using make nixie
Files:
docs/roadmap.mddocs/netsuke-design.md
⚙️ CodeRabbit configuration file
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/roadmap.mddocs/netsuke-design.md
🧬 Code graph analysis (1)
tests/command_escaping_tests.rs (2)
src/manifest.rs (1)
from_str(314-316)src/ir.rs (1)
from_manifest(120-131)
🔍 Remote MCP
Here are some additional facts about the two newly added crates that may inform your review:
• shell-quote 0.7.2
– Implements safe quoting for GNU Bash, Z Shell, fish, and /bin/sh-like shells (including Dash), handling spaces, metacharacters, and multibyte inputs via extension traits for strings and byte buffers.
– Default features enabled: bstr, bash, fish, and sh, but these can be disabled to narrow support and reduce footprint. (docs.rs)
– Licensed under Apache 2.0; last built on docs.rs Feb 25 2025; maintained by user “allenap” on crates.io. (docs.rs)
• shlex 1.3.0
– Provides POSIX-compatible shell-word splitting (à la Python’s shlex), with support for both Rust strings and byte slices and a no_std mode.
– Tested against Bash, Z Shell, Dash, BusyBox ash, mksh, and fish to ensure robust parsing. (docs.rs)
– Dual-licensed MIT OR Apache 2.0; version 1.3.0 released Jan 22 2024. (docs.rs)
• Security notes
– The popular Node.js “shell-quote” package suffered critical command-injection vulnerabilities (e.g. CVE-2016-10541), but this Rust crate is a separate implementation and, as of 0.7.2, has no known Rust-crate CVEs. (nvd.nist.gov, docs.rs)
These crates are well-maintained, permissively licensed, and address the PR’s goals of secure quoting and validation of interpolated commands.
🔇 Additional comments (7)
src/ast.rs (1)
92-105: Deriving Clone for Rule is appropriate.Enabling cheap cloning for Rule aligns with the IR changes and introduces no API risk.
tests/ninja_gen_tests.rs (1)
76-76: Quote with POSIX single quotes in emitted rule text.This expectation uses portable quoting and aligns with sh semantics.
Also applies to: 95-95
tests/command_escaping_tests.rs (1)
1-4: LGTM: Module scaffolding and importsImports are minimal and focused. Keep.
src/ir.rs (4)
77-77: LGTM: Broadened importsImporting Rule enables rule_map to store Rule directly. Keep.
124-127: LGTM: Rule/targets/defaults pipelineThe staged processing is clean and readable.
167-168: LGTM: Inputs are now recorded on edgesCarrying inputs from sources into BuildEdge.inputs removes re-derivation later. Good.
274-301: LGTM: Rule resolution errors are preciseError variants and sorting for determinism read well. Keep.
|
@coderabbitai have the following now been resolved?
|
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/steps/ninja_steps.rs (2)
16-20: Replace forbidden #[allow] with narrowly scoped #[expect] and keep the reason.Conform to lint policy: do not silence; expect with rationale.
-#[allow( - clippy::needless_pass_by_value, - reason = "Cucumber requires owned String arguments" -)] +#[expect( + clippy::needless_pass_by_value, + reason = "Cucumber requires owned String arguments" +)]
29-61: Use #[expect] instead of #[allow] for the new step functions.Keep suppressions tight and justified.
-#[allow( - clippy::needless_pass_by_value, - reason = "Cucumber requires owned String arguments" -)] +#[expect( + clippy::needless_pass_by_value, + reason = "Cucumber requires owned String arguments" +)] #[then(expr = "shlex splitting command {int} yields {string}")] fn ninja_command_tokens(world: &mut CliWorld, index: usize, expected: String) { … } -#[allow( - clippy::needless_pass_by_value, - reason = "Cucumber requires owned String arguments" -)] +#[expect( + clippy::needless_pass_by_value, + reason = "Cucumber requires owned String arguments" +)] #[then(expr = "shlex splitting the command yields {string}")] fn ninja_first_command_tokens(world: &mut CliWorld, expected: String) { … }
♻️ Duplicate comments (10)
tests/features/ir.feature (1)
15-17: Adopt 2-action expectation and keep the doc link validAccept the behaviour change and the scenario rename. Verify the anchor exists and purge stale “1 actions” expectations across features.
#!/bin/bash set -euo pipefail # 1) Anchor check (ensure a matching heading or HTML anchor exists) rg -nP '^\s*#{1,6}\s+.*design decisions|name="55-design-decisions"|id="55-design-decisions"' docs/netsuke-design.md || true # 2) Stale expectations check rg -nP -C2 '\bgraph has\s+1 actions\b' tests/features || truetests/features/ir_generation.feature (1)
19-23: Synchronise dedup semantics in IR-generation testsThe rename and “2 actions” assertion align with the new rule-mapping semantics. Confirm no remaining IR-generation specs still expect 1 action.
#!/bin/bash set -euo pipefail rg -nP -C2 '\bgraph has\s+1 actions\b' tests/features/ir_generation.feature || truetests/data/quote.yml (1)
1-8: Add hardening fixtures: embedded quote + newline and dash-prefixed sourcesExpand coverage for tricky tokens to prove quoting and shlex validation across edge cases.
Append this additional target:
- name: "o'ut\nfile" sources: "-in file" command: "printf %s $in > $out"Optional: also add a multi-line YAML scalar if preferred:
- name: |- o'ut file sources: "-in file" command: "printf %s $in > $out"tests/ir_from_manifest_tests.rs (1)
19-23: Test rename and expectation now match the new dedup semantics.Name and assertion are aligned with the 2-action outcome.
src/ninja_gen.rs (2)
114-120: Replace assert! with debug_assert! in Command path — good.Avoid release panics while retaining a guard in debug builds.
126-129: Replace assert! with debug_assert! in Script path — good.Mirror the Command path; no release-time panics.
tests/command_escaping_tests.rs (2)
28-29: Assert lexical arguments instead of relying on Bash-specific quotingThe current assertion on line 29 validates that
shlex::splitproduces the expected tokens, which is good. However, the implementation still uses Bash-specific $'...' quoting that won't work with/bin/sh. Update the IR quoting to use POSIX-compatible quoting or explicitly invoke bash in Ninja execution.
80-92: Expand parameterised tests with additional boundary casesAdd test cases for unterminated single quotes and backticks to ensure comprehensive validation of invalid commands.
#[rstest] #[case("echo \"unterminated")] #[case("echo 'unterminated")] #[case("echo `unterminated")] +#[case("echo 'single unterminated")] +#[case("echo `backtick unterminated")] fn invalid_command_errors(#[case] cmd: &str) {src/ir.rs (2)
134-137: UseArc<Rule>to avoid expensive cloningStore rules as
Arc<Rule>in the map to reduce memory allocation and copying costs for large Rule structs.+use std::sync::Arc; -fn process_rules(manifest: &NetsukeManifest, rule_map: &mut HashMap<String, Rule>) { +fn process_rules(manifest: &NetsukeManifest, rule_map: &mut HashMap<String, Arc<Rule>>) { for rule in &manifest.rules { - rule_map.insert(rule.name.clone(), rule.clone()); + rule_map.insert(rule.name.clone(), Arc::new(rule.clone())); } }Then update the
rule_mapparameter type inprocess_targetsandresolve_ruleaccordingly.
151-160: Document rule precedence for future schema changesAdd a comment clarifying that Rule recipe/description take precedence over any potential target-level values in future schema versions.
let target_name = get_target_display_name(&outputs); let action_id = match &target.recipe { Recipe::Rule { rule } => { + // Rule recipe/description take precedence; adjust if schema changes merge semantics let tmpl = resolve_rule(rule, rule_map, &target_name)?; register_action(
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- Jira integration is disabled
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (12)
Cargo.toml(1 hunks)docs/netsuke-design.md(2 hunks)docs/roadmap.md(1 hunks)src/ir.rs(8 hunks)src/ninja_gen.rs(1 hunks)tests/command_escaping_tests.rs(1 hunks)tests/data/quote.yml(1 hunks)tests/features/ir.feature(1 hunks)tests/features/ir_generation.feature(1 hunks)tests/features/ninja.feature(1 hunks)tests/ir_from_manifest_tests.rs(1 hunks)tests/steps/ninja_steps.rs(2 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Comment why, not what; document assumptions, edge cases, trade-offs, and complexity without restating obvious code
Extract reusable logic into functions; prefer composition and declarative constructs when readable
Keep functions small, single-responsibility, and observe command/query separation
Use precise, descriptive names; boolean names should start with is/has/should
Use en-GB-oxendict spelling/grammar in comments (except external API references)
Function docs must include clear usage examples; avoid redundant examples in test docs
No Rust source file may exceed 400 lines; split long switches/tables; move large test data to external files
Fix test warnings in code; do not silence them
Extract helpers when functions are too long; group many parameters into well-named structs
Consider using Arc when returning large errors to reduce data movement
Document public APIs using Rustdoc (///) so cargo doc can generate documentation
Prefer immutable data; avoid unnecessary mut bindings
Prefer Result-based error handling over panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate helper functions when conditionals have more than two branches
Do not silence lints except as a last resort; suppressions must be tightly scoped and include a clear reason
Prefer #[expect(...)] over #[allow(...)] for lint management
Prefer .expect() over .unwrap()
Use conditional compilation (#[cfg]/#[cfg_attr]) for functions unused under specific feature sets
Use concat!() to join long string literals rather than using backslash-newline escapes
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer semantic error enums deriving std::error::Error via thiserror for caller-inspectable conditions
Files:
tests/ir_from_manifest_tests.rssrc/ninja_gen.rstests/steps/ninja_steps.rstests/command_escaping_tests.rssrc/ir.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and the
mockablecrate are the preferred option.- If mockable cannot be used, env mutations in tests ...
Files:
tests/ir_from_manifest_tests.rssrc/ninja_gen.rstests/steps/ninja_steps.rstests/command_escaping_tests.rssrc/ir.rs
tests/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.rs: Use rstest fixtures for shared setup and replace duplicated tests with #[rstest(...)] parameterised cases
Prefer mockall for mocks/stubs
Mock non-deterministic dependencies via dependency injection (e.g., Env/Clock traits) using mockable crate; follow docs/reliable-testing-in-rust-via-dependency-injection.md
Files:
tests/ir_from_manifest_tests.rstests/steps/ninja_steps.rstests/command_escaping_tests.rs
src/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.rs: Every module must begin with a //! module-level comment explaining its purpose and utility
Never export eyre::Report from library code; convert to domain error enums at API boundaries
Files:
src/ninja_gen.rssrc/ir.rs
Cargo.toml
📄 CodeRabbit inference engine (AGENTS.md)
Cargo.toml: Use explicit version ranges and keep dependencies up to date
Mandate SemVer caret requirements for all dependencies (e.g., "1.2.3")
Prohibit wildcard (*) and open-ended (>=) version specifiers; use ~ only with a specific, documented reason
Files:
Cargo.toml
docs/**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
docs/**/*.md: Treat docs/ markdown as the reference source of truth for requirements and decisions; keep updated with changes
Use en-GB-oxendict spelling/grammar in documentation (LICENSE name unchanged)
Files:
docs/roadmap.mddocs/netsuke-design.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
**/*.md: Validate Markdown with make markdownlint
Wrap Markdown paragraphs and bullet points at 80 columns; do not wrap tables or headings
Wrap code blocks in Markdown at 120 columns
Use dashes (-) for list bullets
Use GitHub-flavoured Markdown footnotes ([^1]) for references
Validate Mermaid diagrams in Markdown using make nixie
Files:
docs/roadmap.mddocs/netsuke-design.md
⚙️ CodeRabbit configuration file
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/roadmap.mddocs/netsuke-design.md
🧬 Code graph analysis (2)
tests/ir_from_manifest_tests.rs (2)
src/manifest.rs (1)
from_path(512-517)src/ir.rs (1)
from_manifest(121-132)
tests/command_escaping_tests.rs (2)
src/manifest.rs (1)
from_str(314-316)src/ir.rs (1)
from_manifest(121-132)
🪛 LanguageTool
docs/roadmap.md
[uncategorized] ~132-~132: Possible missing comma found.
Context: ... interpolations are validated with shlex but are not shell-quoted. - [x] Emit POS...
(AI_HYDRA_LEO_MISSING_COMMA)
🔍 Remote MCP
Here are some additional facts about the two newly added crates that may inform your review:
• shell-quote 0.7.2
– Implements safe quoting for GNU Bash, Z Shell, fish, and /bin/sh-like shells (including Dash), handling spaces, metacharacters, and multibyte inputs via extension traits for strings and byte buffers.
– Default features enabled: bstr, bash, fish, and sh, but these can be disabled to narrow support and reduce footprint. (docs.rs)
– Licensed under Apache 2.0; last built on docs.rs Feb 25 2025; maintained by user “allenap” on crates.io. (docs.rs)
• shlex 1.3.0
– Provides POSIX-compatible shell-word splitting (à la Python’s shlex), with support for both Rust strings and byte slices and a no_std mode.
– Tested against Bash, Z Shell, Dash, BusyBox ash, mksh, and fish to ensure robust parsing. (docs.rs)
– Dual-licensed MIT OR Apache 2.0; version 1.3.0 released Jan 22 2024. (docs.rs)
• Security notes
– The popular Node.js “shell-quote” package suffered critical command-injection vulnerabilities (e.g. CVE-2016-10541), but this Rust crate is a separate implementation and, as of 0.7.2, has no known Rust-crate CVEs. (nvd.nist.gov, docs.rs)
These crates are well-maintained, permissively licensed, and address the PR’s goals of secure quoting and validation of interpolated commands.
🔇 Additional comments (6)
Cargo.toml (1)
26-27: Constrain quoting to POSIX sh and pin versions; now remove Bash-only usageGood: shell-quote is pinned and restricted to feature "sh". Sweep the codebase for Bash-only quoting ($'…') and Bash types to avoid runtime breakage under /bin/sh.
#!/bin/bash set -euo pipefail # Confirm cargo feature set cargo tree -e features -i shell-quote | sed -n '1,120p' # Find Bash-specific usages in source and tests rg -nP -C2 '\bshell_quote::Bash\b|\bBash::\w+\b|quoted\s*\(\s*Bash\s*\)|\$\'' src tests || truedocs/netsuke-design.md (2)
242-249: Make the POSIX/bin/shclaim verifiable and enforced end-to-end.Good clarification. Verify the implementation actually uses Sh mode exclusively and that no Bash-only
$'…'quoting leaks into generated commands or tests.#!/bin/bash set -euo pipefail echo "Cargo features for shell-quote:" rg -nP '^\s*shell-quote\s*=\s*\{[^}]*features\s*=\s*\[\s*"sh"\s*\]' Cargo.toml || true echo "Search for selection of Sh mode in code:" rg -nP '\bSh\b' -n src || true rg -nP 'shell_?quote|QuoteExt|push_quoted' -n src || true echo "Ensure no Bash $\'...\' style appears in code or tests:" rg -nP "\$'[^']+'" -n --iglob '!**/target/**' . || true echo "Confirm /bin/sh invocation in ninja generator:" rg -nP '/bin/sh\s+-e\s+-c' -n src/ninja_gen.rs || true
1236-1244: Validate that the implementation uses a proper tokeniser instead of naive string replacements:• Inspect the code that handles
$in/$outinterpolation.
• Ensure there are no calls like.replace("$in", …)or.replace('$out', …).
• Confirm the code parses the command template into tokens and only substitutes standalone$in/$outtokens.
• Review the scanner/parser logic in the module that emits the final shell command.Verify these points manually against the source to guarantee token-aware interpolation and prevent unintended replacements.
tests/features/ninja.feature (1)
15-24: Add shlex-token assertions for quoting — good coverage.Validate tokenisation of redirection and embedded quotes via shlex-based steps.
src/ir.rs (2)
110-111: Good security improvement for error messagesThe error variant now includes a truncated
snippetfield to avoid leaking full commands with potentially sensitive paths, whilst retaining the fullcommandfor programmatic access.
234-242: Remove review comment about quoting mismatchVerified that
shell-quoteis configured withfeatures = ["sh"]in Cargo.toml for POSIX-compatible quoting and that no tests use Bash-specific$'…'syntax. Approve the code as-is.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (6)
src/ir.rs (6)
339-366: Updateresolve_ruleto return&Arc<Rule>when adopting shared map type.Keep reference semantics; avoid extra clones.
-fn resolve_rule<'a>( +fn resolve_rule<'a>( rule: &StringOrList, - rule_map: &'a HashMap<String, Rule>, + rule_map: &'a HashMap<String, Arc<Rule>>, target_name: &str, -) -> Result<&'a Rule, IrGenError> { +) -> Result<&'a Arc<Rule>, IrGenError> { … }
134-144: Avoid cloning largeRulevalues; store byArcinrule_map.Stop copying
Rulestructs into the map. Use shared ownership to reduce moves/copies. PropagateArcthrough call sites.Apply:
- fn process_rules(manifest: &NetsukeManifest, rule_map: &mut HashMap<String, Rule>) { + use std::sync::Arc; + fn process_rules(manifest: &NetsukeManifest, rule_map: &mut HashMap<String, Arc<Rule>>) { for rule in &manifest.rules { - rule_map.insert(rule.name.clone(), rule.clone()); + rule_map.insert(rule.name.clone(), Arc::new(rule.clone())); } }Outside selected lines, update type imports and callers:
Rust
use std::sync::Arc;
150-151: Alignrule_maptype with shared ownership.Update parameter type to match
Arc<Rule>to prevent accidental clones down the line.- rule_map: &HashMap<String, Rule>, + rule_map: &HashMap<String, Arc<Rule>>,
157-166: Document precedence where Rule vs target-level recipe/description could conflict.Add a brief comment stating that a resolved Rule’s
recipe/descriptiontake precedence over any future target-level fields.Recipe::Rule { rule } => { - let tmpl = resolve_rule(rule, rule_map, &target_name)?; + // Precedence: resolved Rule recipe/description take priority over any + // future target-level equivalents. Update if schema introduces overrides. + let tmpl = resolve_rule(rule, rule_map, &target_name)?;
243-271: Extract backtick check and pre-join arguments to cut allocations.Improve readability and avoid repeated joins inside
substitute.Apply:
fn interpolate_command( template: &str, inputs: &[PathBuf], outputs: &[PathBuf], ) -> Result<String, IrGenError> { + fn has_unmatched_backticks(s: &str) -> bool { + s.chars().filter(|&c| c == '`').count() % 2 != 0 + } fn quote_paths(paths: &[PathBuf]) -> Vec<String> { paths .iter() .map(|p| { let raw = p.display().to_string(); String::from_utf8(raw.quoted(Sh)).expect("quoted path is valid UTF-8") }) .collect() } let ins = quote_paths(inputs); let outs = quote_paths(outputs); - let interpolated = substitute(template, &ins, &outs); - let unmatched_backticks = interpolated.chars().filter(|&c| c == '`').count() & 1 != 0; - if unmatched_backticks || shlex::split(&interpolated).is_none() { + let ins_joined = ins.join(" "); + let outs_joined = outs.join(" "); + let interpolated = substitute(template, &ins_joined, &outs_joined); + if has_unmatched_backticks(&interpolated) || shlex::split(&interpolated).is_none() { let snippet = interpolated.chars().take(160).collect(); return Err(IrGenError::InvalidCommand { command: interpolated, snippet, }); } Ok(interpolated) }Outside selected lines, adjust
substitutesignature to take&strjoined replacements (see next comment).
272-310: Reduce cyclomatic complexity insubstituteand avoid repeated joins.Extract matchers and operate on pre-joined replacements; keep token-boundary guarantees.
-fn substitute(template: &str, ins: &[String], outs: &[String]) -> String { +fn substitute(template: &str, ins_joined: &str, outs_joined: &str) -> String { fn is_ident(ch: char) -> bool { ch.is_ascii_alphanumeric() || ch == '_' } + fn try_match_in(chars: &[char], i: usize) -> Option<usize> { + if matches!(chars.get(i + 1), Some('i')) && matches!(chars.get(i + 2), Some('n')) { + let prev_ok = chars.get(i.wrapping_sub(1)).is_none_or(|c| !is_ident(*c)); + let next_ok = chars.get(i + 3).is_none_or(|c| !is_ident(*c)); + if prev_ok && next_ok { + return Some(3); + } + } + None + } + + fn try_match_out(chars: &[char], i: usize) -> Option<usize> { + if matches!(chars.get(i + 1), Some('o')) + && matches!(chars.get(i + 2), Some('u')) + && matches!(chars.get(i + 3), Some('t')) + { + let prev_ok = chars.get(i.wrapping_sub(1)).is_none_or(|c| !is_ident(*c)); + let next_ok = chars.get(i + 4).is_none_or(|c| !is_ident(*c)); + if prev_ok && next_ok { + return Some(4); + } + } + None + } + let chars: Vec<char> = template.chars().collect(); let mut out = String::with_capacity(template.len()); let mut i = 0; while let Some(&ch) = chars.get(i) { if ch == '$' { - // Attempt to match $in - if matches!(chars.get(i + 1), Some('i')) && matches!(chars.get(i + 2), Some('n')) { - let prev_ok = chars.get(i.wrapping_sub(1)).is_none_or(|c| !is_ident(*c)); - let next_ok = chars.get(i + 3).is_none_or(|c| !is_ident(*c)); - if prev_ok && next_ok { - out.push_str(&ins.join(" ")); - i += 3; - continue; - } - } - // Attempt to match $out - if matches!(chars.get(i + 1), Some('o')) - && matches!(chars.get(i + 2), Some('u')) - && matches!(chars.get(i + 3), Some('t')) - { - let prev_ok = chars.get(i.wrapping_sub(1)).is_none_or(|c| !is_ident(*c)); - let next_ok = chars.get(i + 4).is_none_or(|c| !is_ident(*c)); - if prev_ok && next_ok { - out.push_str(&outs.join(" ")); - i += 4; - continue; - } - } + if let Some(skip) = try_match_in(&chars, i) { + out.push_str(ins_joined); + i += skip; + continue; + } + if let Some(skip) = try_match_out(&chars, i) { + out.push_str(outs_joined); + i += skip; + continue; + } } out.push(ch); i += 1; } out }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- Jira integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
docs/netsuke-design.md(4 hunks)src/ir.rs(8 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Comment why, not what; document assumptions, edge cases, trade-offs, and complexity without restating obvious code
Extract reusable logic into functions; prefer composition and declarative constructs when readable
Keep functions small, single-responsibility, and observe command/query separation
Use precise, descriptive names; boolean names should start with is/has/should
Use en-GB-oxendict spelling/grammar in comments (except external API references)
Function docs must include clear usage examples; avoid redundant examples in test docs
No Rust source file may exceed 400 lines; split long switches/tables; move large test data to external files
Fix test warnings in code; do not silence them
Extract helpers when functions are too long; group many parameters into well-named structs
Consider using Arc when returning large errors to reduce data movement
Document public APIs using Rustdoc (///) so cargo doc can generate documentation
Prefer immutable data; avoid unnecessary mut bindings
Prefer Result-based error handling over panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate helper functions when conditionals have more than two branches
Do not silence lints except as a last resort; suppressions must be tightly scoped and include a clear reason
Prefer #[expect(...)] over #[allow(...)] for lint management
Prefer .expect() over .unwrap()
Use conditional compilation (#[cfg]/#[cfg_attr]) for functions unused under specific feature sets
Use concat!() to join long string literals rather than using backslash-newline escapes
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer semantic error enums deriving std::error::Error via thiserror for caller-inspectable conditions
Files:
src/ir.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and the
mockablecrate are the preferred option.- If mockable cannot be used, env mutations in tests ...
Files:
src/ir.rs
src/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.rs: Every module must begin with a //! module-level comment explaining its purpose and utility
Never export eyre::Report from library code; convert to domain error enums at API boundaries
Files:
src/ir.rs
docs/**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
docs/**/*.md: Treat docs/ markdown as the reference source of truth for requirements and decisions; keep updated with changes
Use en-GB-oxendict spelling/grammar in documentation (LICENSE name unchanged)
Files:
docs/netsuke-design.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
**/*.md: Validate Markdown with make markdownlint
Wrap Markdown paragraphs and bullet points at 80 columns; do not wrap tables or headings
Wrap code blocks in Markdown at 120 columns
Use dashes (-) for list bullets
Use GitHub-flavoured Markdown footnotes ([^1]) for references
Validate Mermaid diagrams in Markdown using make nixie
Files:
docs/netsuke-design.md
⚙️ CodeRabbit configuration file
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/netsuke-design.md
🔍 Remote MCP
I attempted to run external documentation searches but they failed (status 402), so no additional documents were fetched.
Concise actionable review facts (from provided PR context):
- New crates added: shell-quote = "0.7.2" (default-features = false, features = ["sh"]) and shlex = "1.3.0". Impact: quoting + POSIX parsing/validation added to interpolation and ninja emission.
- Behavior change: $in/$out expansions are shell-quoted and commands are validated via shlex; invalid interpolations now produce IrGenError::InvalidCommand.
- Quoting correctness concern: implementation uses shell_quote::QuoteExt::quote() which emits bash-style $'…' sequences — not guaranteed POSIX /bin/sh-safe. Reviewer recommended using sh_quote() or otherwise ensuring POSIX-compatible quoting or requiring bash.
- Replacement fragility: current interpolation uses naive .replace("$in", …).replace("$out", …), which can accidentally rewrite unrelated identifiers (e.g., "$input"). Tests include a case to confirm variable_name_overlap_not_rewritten, but reviewer flagged this as fragile and suggested a structured templating approach.
- Deduplication semantics changed: rules are stored as Rule objects keyed by name (no action-hash dedup at process_rules), causing previously-duplicate rules to emit distinct actions. Tests and features were updated to expect the increased action counts; this is an intentional architectural shift but should be documented and validated against project semantics.
- Tests added/updated: unit tests (tests/command_escaping_tests.rs), integration/feature tests (ninja.feature, ir_generation.feature), and existing tests updated to expect quoted commands and adjusted action counts.
- Ninja generation: now validates commands with shlex::split before emitting; emission logic asserts split-ability for both Command and Script paths.
Tag:
🔇 Additional comments (7)
src/ir.rs (6)
77-79: Adopt POSIX/bin/shquoting (good).Importing
Shand usingshell_quotein Sh mode addresses the prior Bash-only quoting issue.
110-112: Limit error message PII leakage (good).Displaying only
snippetin theInvalidCommandvariant is appropriate; keep the fullcommandfor programmatic access.
206-214: Clarify dedup boundary in docstring (good).The comment correctly states dedup is by fully interpolated command + file sets.
218-229: Interpolate onlyCommandrecipes (good).Restricting validation/quoting to
Commandhere is correct; scripts will be handled by their interpreter path.
243-271: Verify MSRV forOption::is_none_orand replace if too new.
is_none_orrequires a recent Rust. Confirm MSRV or gate with a small helper to avoid bumping MSRV unintentionally.Run:
#!/bin/bash set -euo pipefail # Detect declared MSRV rg -n 'rust-version\s*=\s*"[0-9]+' Cargo.toml || true rg -n 'channel\s*=\s*"(stable|[0-9.]+)"' rust-toolchain* 2>/dev/null || true # Report usage sites of is_none_or rg -n --type=rust '\.is_none_or\(' echo "If MSRV < 1.70, replace is_none_or with explicit matches."Also applies to: 272-310
243-271: P1: Ensure tests no longer expect Bash $'…' strings.Sh-mode quoting should not emit
$'…'. Purge stale expectations.Run:
#!/bin/bash set -euo pipefail rg -nP "\$\'.*\'" tests || true rg -nP "cat\s+\$\'.*\'\s*>\s*\$\'.*\'" -S tests || true echo "Expect no matches; update snapshots if present."docs/netsuke-design.md (1)
242-249: POSIX/bin/shwording (good).The text now explicitly states Sh-mode quoting and POSIX shlex validation.
|
@coderabbitai Has the following now been resolved? Rule Deduplication Behavior ❌ Intentional Change But Should Be Documented The rule deduplication behavior has indeed changed. Looking at src/ir.rs, the process_rules function now stores actual Rule objects in rule_map instead of computing action hashes immediately: fn process_rules(rules: &[Rule]) -> Result<HashMap<String, Rule>, IrGenError> { This means identical rule definitions now create separate actions for each target that uses them, rather than deduplicating at the rule level. The test changes confirm this - duplicate_rules_are_deduped now expects 2 actions instead of 1. This appears to be an intentional architectural change to move from immediate action creation to a rule-template approach, but it fundamentally changes the deduplication semantics. |
|
Tip For best results, initiate chat on the files or code changes. @leynos Looking at the updated documentation in The design documentation now explicitly describes the architectural change:
The documentation now clearly explains that this is an intentional shift from the previous action-hash deduplication to a rule-template approach where deduplication occurs at the action level based on the complete resolved command and file sets, rather than at the rule definition level. The change is well-documented and the architectural rationale is explained, so the "should be documented" concern has been addressed. |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/ir.rs (1)
1-550: Split the module; file exceeds the 400‑line limit.Extract interpolation/quoting into
src/ir/command.rsand cycle detection intosrc/ir/cycle.rs; keepir.rsas the public façade.
♻️ Duplicate comments (1)
docs/roadmap.md (1)
130-139: Correct the scope, casing, and duplication in the security checklist.Limit quoting to $in/$out (as implemented), capitalise “Bash”, and drop the duplicate shlex bullet.
- - [x] Mandate its use for variable substitutions within command strings - during IR generation to prevent command injection, and validate the final - command string with shlex. + - [x] Mandate its use for `$in`/`$out` substitutions within command strings + during IR generation to prevent command injection. Validate the final + command string with `shlex`. - - [x] Emit POSIX-sh-compatible quoting (portable single-quote style) - rather than Bash-only forms. If Bash-specific quoting is required, document - and enforce execution under bash. + - [x] Emit POSIX sh-compatible quoting (portable single-quote style), + rather than Bash-only forms. If Bash-specific quoting is required, + document and enforce execution under Bash. - - - [x] After interpolation, validate the final command string is parsable using - the shlex crate.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- Jira integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
docs/netsuke-design.md(4 hunks)docs/roadmap.md(1 hunks)src/ir.rs(7 hunks)tests/command_escaping_tests.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
docs/**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
docs/**/*.md: Treat docs/ markdown as the reference source of truth for requirements and decisions; keep updated with changes
Use en-GB-oxendict spelling/grammar in documentation (LICENSE name unchanged)
Files:
docs/roadmap.mddocs/netsuke-design.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
**/*.md: Validate Markdown with make markdownlint
Wrap Markdown paragraphs and bullet points at 80 columns; do not wrap tables or headings
Wrap code blocks in Markdown at 120 columns
Use dashes (-) for list bullets
Use GitHub-flavoured Markdown footnotes ([^1]) for references
Validate Mermaid diagrams in Markdown using make nixie
Files:
docs/roadmap.mddocs/netsuke-design.md
⚙️ CodeRabbit configuration file
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -our) spelling and grammar
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/roadmap.mddocs/netsuke-design.md
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Comment why, not what; document assumptions, edge cases, trade-offs, and complexity without restating obvious code
Extract reusable logic into functions; prefer composition and declarative constructs when readable
Keep functions small, single-responsibility, and observe command/query separation
Use precise, descriptive names; boolean names should start with is/has/should
Use en-GB-oxendict spelling/grammar in comments (except external API references)
Function docs must include clear usage examples; avoid redundant examples in test docs
No Rust source file may exceed 400 lines; split long switches/tables; move large test data to external files
Fix test warnings in code; do not silence them
Extract helpers when functions are too long; group many parameters into well-named structs
Consider using Arc when returning large errors to reduce data movement
Document public APIs using Rustdoc (///) so cargo doc can generate documentation
Prefer immutable data; avoid unnecessary mut bindings
Prefer Result-based error handling over panicking where feasible
Avoid unsafe code unless absolutely necessary and document any usage clearly
Place function attributes after doc comments
Do not use return in single-line functions
Use predicate helper functions when conditionals have more than two branches
Do not silence lints except as a last resort; suppressions must be tightly scoped and include a clear reason
Prefer #[expect(...)] over #[allow(...)] for lint management
Prefer .expect() over .unwrap()
Use conditional compilation (#[cfg]/#[cfg_attr]) for functions unused under specific feature sets
Use concat!() to join long string literals rather than using backslash-newline escapes
Prefer single-line function bodies where appropriate (e.g., pub fn new(id: u64) -> Self { Self(id) })
Prefer semantic error enums deriving std::error::Error via thiserror for caller-inspectable conditions
Files:
tests/command_escaping_tests.rssrc/ir.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.
Adhere to single responsibility and CQRS
Place function attributes after doc comments.
Do not use
returnin single-line functions.Move conditionals with >2 branches into a predicate function.
Avoid
unsafeunless absolutely necessary.Every module must begin with a
//!doc comment that explains the module's purpose and utility.Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar
Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
Use
rstestfixtures for shared setup and to avoid repetition between tests.Replace duplicated tests with
#[rstest(...)]parameterised cases.Prefer
mockallfor mocks/stubs.Prefer
.expect()over.unwrap()Ensure that any API or behavioural changes are reflected in the documentation in
docs/Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and the
mockablecrate are the preferred option.- If mockable cannot be used, env mutations in tests ...
Files:
tests/command_escaping_tests.rssrc/ir.rs
tests/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.rs: Use rstest fixtures for shared setup and replace duplicated tests with #[rstest(...)] parameterised cases
Prefer mockall for mocks/stubs
Mock non-deterministic dependencies via dependency injection (e.g., Env/Clock traits) using mockable crate; follow docs/reliable-testing-in-rust-via-dependency-injection.md
Files:
tests/command_escaping_tests.rs
src/**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.rs: Every module must begin with a //! module-level comment explaining its purpose and utility
Never export eyre::Report from library code; convert to domain error enums at API boundaries
Files:
src/ir.rs
🧬 Code graph analysis (1)
tests/command_escaping_tests.rs (2)
src/manifest.rs (1)
from_str(314-316)src/ir.rs (1)
from_manifest(122-133)
🔍 Remote MCP
I attempted to run external documentation searches but they failed (status 402), so no additional documents were fetched.
Concise actionable review facts (from provided PR context):
- New crates added: shell-quote = "0.7.2" (default-features = false, features = ["sh"]) and shlex = "1.3.0". Impact: quoting + POSIX parsing/validation added to interpolation and ninja emission.
- Behavior change: $in/$out expansions are shell-quoted and commands are validated via shlex; invalid interpolations now produce IrGenError::InvalidCommand.
- Quoting correctness concern: implementation uses shell_quote::QuoteExt::quote() which emits bash-style $'…' sequences — not guaranteed POSIX /bin/sh-safe. Reviewer recommended using sh_quote() or otherwise ensuring POSIX-compatible quoting or requiring bash.
- Replacement fragility: current interpolation uses naive .replace("$in", …).replace("$out", …), which can accidentally rewrite unrelated identifiers (e.g., "$input"). Tests include a case to confirm variable_name_overlap_not_rewritten, but reviewer flagged this as fragile and suggested a structured templating approach.
- Deduplication semantics changed: rules are stored as Rule objects keyed by name (no action-hash dedup at process_rules), causing previously-duplicate rules to emit distinct actions. Tests and features were updated to expect the increased action counts; this is an intentional architectural shift but should be documented and validated against project semantics.
- Tests added/updated: unit tests (tests/command_escaping_tests.rs), integration/feature tests (ninja.feature, ir_generation.feature), and existing tests updated to expect quoted commands and adjusted action counts.
- Ninja generation: now validates commands with shlex::split before emitting; emission logic asserts split-ability for both Command and Script paths.
Tag:
🔇 Additional comments (5)
tests/command_escaping_tests.rs (2)
17-31: Approve: assert lexical tokens, not dialect-specific quoting.The shlex-based assertion robustly validates correct argument boundaries across shells.
95-107: Guard against false positives: backticks inside single quotes are literals.Add a positive case to prove unmatched-backtick detection does not flag single-quoted content.
@@ fn invalid_command_errors(#[case] cmd: &str) { @@ } + +#[rstest] +fn backticks_inside_single_quotes_are_literal() { + let yaml = manifest_yaml( + "targets:\n - name: out\n sources: in\n command: \"echo '$in `literal`' > $out\"\n", + ); + let manifest = manifest::from_str(&yaml).expect("parse"); + let graph = BuildGraph::from_manifest(&manifest).expect("graph"); + let action = graph.actions.values().next().expect("action"); + let Recipe::Command { command } = &action.recipe else { + panic!(\"expected command\") + }; + let words = shlex::split(command).expect(\"split command into words\"); + assert_eq!(words, [\"echo\", \"$in `literal`\", \">\", \"out\"]); +}src/ir.rs (2)
135-141: Approve: clarify rule-template storage and deferred expansion.The docstring accurately captures the intentional change in dedup semantics.
210-218: Approve: hash on fully interpolated command and file-set.The registration path matches the documented identity semantics.
docs/netsuke-design.md (1)
1236-1245: Approve: document IR-time substitution and POSIX Sh quoting.The implementation notes match
interpolate_commandand token-boundary substitution.
|
@coderabbitai pause |
✅ Actions performedReviews paused. |
Summary
Testing
make fmtmake lintmake testmake markdownlinthttps://chatgpt.com/codex/tasks/task_e_68af7a8b60788322821665ad2804c2af
Summary by Sourcery
Escape interpolated $in/$out placeholders using the shell-quote crate, validate the resulting commands with shlex, update IR and Ninja generators to handle quoting and errors, and adjust documentation and tests accordingly
New Features:
Enhancements:
Build:
Documentation:
Tests: