diff --git a/.claude/skills/address-pr-comments/SKILL.md b/.claude/skills/address-pr-comments/SKILL.md
index 9a86f80b..7ecf7d25 100644
--- a/.claude/skills/address-pr-comments/SKILL.md
+++ b/.claude/skills/address-pr-comments/SKILL.md
@@ -79,6 +79,21 @@ gh api graphql -f query='
 
 Only process **unresolved** threads with actionable comments.
 
+### 2b. Read the PR specs
+
+Before evaluating any comment, read the PR description to check for a **SPECS** section:
+
+```bash
+gh pr view $ARGUMENTS --json body --jq '.body'
+```
+
+If a SPECS section is present, **it defines the authoritative requirements for this PR**. Specs override:
+- Your assumptions about backward compatibility or design intent
+- Inline code comments
+- Conventions from other parts of the codebase
+
+Store the specs for use in step 4 (validity evaluation). If a reviewer comment aligns with a spec, the comment is **valid by definition** — even if you think the current implementation is reasonable.
+
 ### 3. Understand each comment
 
 For each unresolved review comment:
@@ -99,36 +114,45 @@ For each unresolved review comment:
 | **Nitpick** | Minor optional suggestion | Evaluate — fix if trivial, otherwise reply explaining the tradeoff |
 | **Invalid/outdated** | Comment doesn't apply or is based on a misunderstanding | Reply politely explaining why |
 
-### 4. Evaluate validity — bash behavior is the source of truth
+### 4. Evaluate validity — specs and bash behavior are the sources of truth
+
+There are two sources of truth, checked in this order:
 
-**The shell must match bash behavior unless it intentionally diverges** (e.g., sandbox restrictions, blocked commands, readonly enforcement). This principle overrides reviewer suggestions.
+1. **PR specs** (from step 2b) — if present, specs are the highest authority for what this PR should do
+2. **Bash behavior** — the shell must match bash unless it intentionally diverges (sandbox restrictions, blocked commands, readonly enforcement)
+
+**CRITICAL: Never invent justifications for dismissing a comment.** If a reviewer says "the spec requires X" and the spec does require X, the comment is valid — even if you think the current implementation is a reasonable alternative. Do not fabricate reasons like "backward compatibility" or "design intent" unless those reasons are explicitly stated in the specs or CLAUDE.md.
 
 For each comment, determine if it is **valid and actionable**:
 
-1. **Verify against bash** — always check what bash actually does:
+1. **Check against PR specs first** — if a SPECS section exists and the comment aligns with a spec, the comment is **valid by definition**. Do not dismiss it.
+2. **Verify against bash** — for comments about shell behavior, check what bash actually does:
    ```bash
    docker run --rm debian:bookworm-slim bash -c '<relevant script>'
    ```
-2. **Read the relevant code** in full — not just the diff, but the surrounding implementation
-3. **Check project conventions** in `CLAUDE.md` and `AGENTS.md`
-4. **Consider side effects** — will the change break other tests or behaviors?
-5. **Check for duplicates** — is the same issue raised in multiple comments? Group them
+3. **Read the relevant code** in full — not just the diff, but the surrounding implementation
+4. **Check project conventions** in `CLAUDE.md` and `AGENTS.md`
+5. **Consider side effects** — will the change break other tests or behaviors?
+6. **Check for duplicates** — is the same issue raised in multiple comments? Group them
 
 Decision matrix:
 
-| Reviewer says | Bash does | Shell intentionally diverges? | Action |
-|--------------|-----------|-------------------------------|--------|
-| "This is wrong" | Reviewer is right | No | **Fix the implementation** to match bash |
-| "This is wrong" | Current code matches bash | No | **Reply** explaining it matches bash, with proof |
-| "This is wrong" | N/A | Yes (sandbox/security) | **Reply** explaining the intentional divergence |
-| "Do it differently" | Suggestion matches bash better | No | **Fix the implementation** to match bash |
-| "Do it differently" | Current code already matches bash | No | **Reply** — bash compatibility takes priority |
+| Reviewer says | Spec says | Bash does | Action |
+|--------------|-----------|-----------|--------|
+| "Spec requires X" | Spec does require X | N/A | **Fix the implementation** to match the spec |
+| "Spec requires X" | No such spec exists | N/A | **Reply** noting the spec doesn't mention this |
+| "This is wrong" | No spec relevant | Reviewer is right | **Fix the implementation** to match bash |
+| "This is wrong" | No spec relevant | Current code matches bash | **Reply** explaining it matches bash, with proof |
+| "This is wrong" | No spec relevant | N/A (sandbox/security) | **Reply** explaining the intentional divergence |
+| "Do it differently" | No spec relevant | Suggestion matches bash better | **Fix the implementation** to match bash |
+| "Do it differently" | No spec relevant | Current code already matches bash | **Reply** — bash compatibility takes priority |
 
 If a comment is **not valid**:
 - Prepare a polite reply with proof (e.g., "This matches bash behavior — verified with `docker run --rm debian:bookworm-slim bash -c '...'`")
 - If the divergence is intentional, explain why (sandbox restriction, security, etc.)
+- **Never claim "backward compatibility" or "design intent" unless you can point to a specific line in the specs or CLAUDE.md that says so**
 
-If a comment is **valid** (i.e., fixing it brings the shell closer to bash, or addresses a real bug):
+If a comment is **valid** (i.e., it aligns with a spec, brings the shell closer to bash, or addresses a real bug):
 - Proceed to step 5
 
 ### 5. Implement fixes
@@ -223,6 +247,8 @@ For each comment that was addressed:
 
 For comments that were **not valid** or were **questions**, reply (prefixed with `[<MODELL NAME - VERSION>]`) with an explanation but do NOT resolve — let the reviewer decide.
 
+**IMPORTANT: Never resolve a thread where the reviewer's comment aligns with a PR spec but the implementation doesn't match.** These are valid spec violations — fix the code instead. If you cannot fix it, leave the thread unresolved and explain the blocker.
+
 ### 8. Summary
 
 Provide a final summary:
diff --git a/.claude/skills/code-review/SKILL.md b/.claude/skills/code-review/SKILL.md
index 84d52ae5..d24f6778 100644
--- a/.claude/skills/code-review/SKILL.md
+++ b/.claude/skills/code-review/SKILL.md
@@ -26,7 +26,36 @@ git diff main...HEAD
 
 If no changes are found, inform the user and stop.
 
-### 2. Read and understand all changed code
+### 2. Verify specs implementation
+
+Read the PR description and look for a **SPECS** section:
+
+```bash
+gh pr view $ARGUMENTS --json body --jq '.body'
+```
+
+If a SPECS section is present, it defines the requirements that this PR MUST implement. **Every single spec must be verified against the diff.**
+The specs override other instructions (code, inline comments in code, etc). ALL specs MUST be implemented.
+
+For each spec:
+1. **Find the code** that implements the spec
+2. **Verify correctness** — does the implementation fully satisfy the spec?
+3. **Check for missing specs** — is any spec not implemented at all?
+
+Flag any unimplemented or partially implemented spec as a **P1 finding** (missing functionality that was explicitly required).
+
+Include a spec coverage table in the review output:
+
+```markdown
+| Spec | Implemented | Location | Notes |
+|------|:-----------:|----------|-------|
+| Must support `--flag` option | Yes | `interp/api.go:42` | Fully implemented |
+| Must return exit code 2 on error | **No** | — | Not found in diff |
+```
+
+If no SPECS section is found in the PR description, skip this step.
+
+### 3. Read and understand all changed code
 
 For each changed file:
 
diff --git a/.claude/skills/review-fix-loop/SKILL.md b/.claude/skills/review-fix-loop/SKILL.md
index 6a403dfb..d9f1bbe5 100644
--- a/.claude/skills/review-fix-loop/SKILL.md
+++ b/.claude/skills/review-fix-loop/SKILL.md
@@ -17,7 +17,7 @@ You MUST follow this execution protocol. Skipping steps or running them out of o
 Your very first action — before reading ANY files, before running ANY commands — is to call TaskCreate exactly 11 times, once for each step/sub-step below. Use these exact subjects:
 
 1. "Step 1: Identify the PR"
-2. "Step 2: Run the review-fix loop"
+2. "Step 2: Run the review-fix loop" ← **Update subject with iteration number each loop** (e.g. "Step 2: Run the review-fix loop (iteration 1)")
 3. "Step 2A1: Self-review (code-review)" ← **parallel with 2A2**
 4. "Step 2A2: Request external reviews (@codex)" ← **parallel with 2A1**
 5. "Step 2B: Address PR comments (address-pr-comments)"
@@ -91,7 +91,9 @@ Store the owner and repo name.
 
 **GATE CHECK**: Call TaskList. Step 1 must be `completed`. Set Step 2 to `in_progress`.
 
-Set `iteration = 1`. Maximum iterations: **10**. Repeat sub-steps A through E while `iteration <= 10`:
+Set `iteration = 1`. Maximum iterations: **30**. Repeat sub-steps A through E while `iteration <= 30`.
+
+**At the start of each iteration**, update the Step 2 task subject to include the current iteration number using TaskUpdate, e.g. `"Step 2: Run the review-fix loop (iteration 3)"`.
 
 ---
 
@@ -107,7 +109,11 @@ This analyzes the full diff against main, posts findings as a GitHub PR review w
 
 Post a comment to trigger @codex reviews:
 ```bash
-gh pr comment <pr-number> --body "@codex review"
+gh pr comment <pr-number> --body "@codex review this PR
+
+Important: Read the SPECS section of the PR description. If SPECS are present: **make sure the implementation matches ALL the specs**.
+The specs override other instructions (code, inline comments in code, etc). ALL specs MUST be implemented.
+"
 ```
 The external reviews arrive asynchronously — their comments will be picked up by **address-pr-comments** in Sub-step 2B1.
 
@@ -240,7 +246,7 @@ Check **all three** review sources for remaining issues:
 | Any findings | Any | Any | **Continue** → go back to Sub-step 2A1 ∥ 2A2 |
 | APPROVE | Unresolved threads | Any | **Continue** → go back to Sub-step 2A1 ∥ 2A2 (address-pr-comments will handle them) |
 | APPROVE | None unresolved | Failing | **Continue** → go back to Sub-step 2A1 ∥ 2A2 (fix-ci-tests will handle it) |
-| — | — | — | If `iteration > 10` → **STOP — iteration limit reached** |
+| — | — | — | If `iteration > 30` → **STOP — iteration limit reached** |
 
 Log the iteration result before continuing or stopping:
 - Iteration number
@@ -331,9 +337,13 @@ Run a final verification regardless of how the loop exited:
 
 Record the final state of each dimension (self-review, external reviews, CI, Codex response).
 
-**If any verification fails** (CI failing, unresolved threads remain, unpushed commits that can't be pushed, or Codex hasn't responded to the latest review request), reset Step 2 and all its sub-steps to `pending`, and go back to **Step 2: Run the review-fix loop** for another iteration. Only proceed to Step 4 when all verifications pass.
+Track how many times Step 3 has **succeeded** (all four verifications passed) across the entire run.
+
+**If any verification fails** (CI failing, unresolved threads remain, unpushed commits that can't be pushed, or Codex hasn't responded to the latest review request), reset the success counter to 0, reset Step 2 and all its sub-steps to `pending`, and go back to **Step 2: Run the review-fix loop** for another iteration.
+
+**If all verifications pass**, increment the success counter. If this is the **5th consecutive success** of Step 3 → proceed to **Step 4**. Otherwise → reset Step 2 and all its sub-steps to `pending`, and go back to **Step 2: Run the review-fix loop** for another iteration to re-confirm stability.
 
-**Completion check:** All four verifications passed. Mark Step 3 as `completed`.
+**Completion check:** Step 3 has succeeded 5 consecutive times. Mark Step 3 as `completed`.
 
 ---
 
@@ -385,5 +395,5 @@ gh pr comment <pr-number> --body "<the summary markdown above>"
 - **Run address-pr-comments before fix-ci-tests** — 2B then 2C, sequentially, so CI fixes run on code that already incorporates review feedback.
 - **Pull before fixing** — always `git pull --rebase` before launching fix agents to avoid working on stale code.
 - **Stop early on APPROVE + CI green + no unresolved threads** — don't waste iterations if the PR is already clean.
-- **Respect the iteration limit** — hard stop at 10 to prevent infinite loops. If issues persist after 10 iterations, report what's left for the user to handle.
+- **Respect the iteration limit** — hard stop at 30 to prevent infinite loops. If issues persist after 30 iterations, report what's left for the user to handle.
 - **Use gate checks** — always call TaskList and verify prerequisites before starting a step. This prevents out-of-order execution.
diff --git a/.github/workflows/fuzz.yml b/.github/workflows/fuzz.yml
index c001e6a2..f6d94ce4 100644
--- a/.github/workflows/fuzz.yml
+++ b/.github/workflows/fuzz.yml
@@ -2,8 +2,9 @@ name: Fuzz Tests
 
 on:
   push:
-    branches: ['**']
+    branches: [main]
   pull_request:
+    branches: [main]
 
 permissions:
   contents: read
@@ -75,7 +76,7 @@ jobs:
           fi
           for FUNC in $FUZZ_FUNCS; do
             echo "Fuzzing $FUNC..."
-            go test -fuzz="^${FUNC}$" -fuzztime=30s ${{ matrix.pkg }} -timeout 300s
+            go test -fuzz="^${FUNC}$" -fuzztime=60s ${{ matrix.pkg }} -timeout 300s
           done
 
       # Save corpus
diff --git a/README.md b/README.md
index cc2df95a..88501e29 100644
--- a/README.md
+++ b/README.md
@@ -45,11 +45,14 @@ Every access path is default-deny:
 
 | Resource             | Default                             | Opt-in                                       |
 |----------------------|-------------------------------------|----------------------------------------------|
+| Commands (builtins)  | Blocked (exit code 1)               | Allow with `AllowedCommands` list or `AllowAllCommands` |
 | External commands    | Blocked (exit code 127)             | Provide an `ExecHandler`                     |
 | Filesystem access    | Blocked                             | Configure `AllowedPaths` with directory list |
 | Environment variables| Empty (no host env inherited)       | Pass variables via the `Env` option          |
 | Output redirections  | Only `/dev/null` allowed (exit code 2 for other targets) | `>/dev/null`, `2>/dev/null`, `&>/dev/null`, `2>&1` |
 
+**AllowedCommands** restricts which commands (builtins and external) may execute. When set, only listed commands are allowed; disallowed commands return exit code 1 with `<name>: command not allowed` (distinct from exit code 127 used for unknown commands). Shell keywords and control flow (if/else, for, pipes, `&&`/`||`, variable assignment) are unaffected. Use the CLI flag `--allow-all-commands` to permit all commands, or `--allowed-command echo,cat,...` to allow specific commands.
+
 **AllowedPaths** restricts all file operations to specified directories using Go's `os.Root` API (`openat` syscalls), making it immune to symlink traversal, TOCTOU races, and `..` escape attacks.
 
 ## Shell Features
diff --git a/SHELL_FEATURES.md b/SHELL_FEATURES.md
index f63ae606..aebd1784 100644
--- a/SHELL_FEATURES.md
+++ b/SHELL_FEATURES.md
@@ -89,6 +89,7 @@ Blocked features are rejected before execution with exit code 2.
 
 ## Execution
 
+- ✅ AllowedCommands command restriction — by default all commands are blocked; use AllowedCommands to permit specific commands or AllowAllCommands to permit everything; disallowed commands return exit code 1 with `<name>: command not allowed` (distinct from exit code 127 used for unknown commands)
 - ✅ AllowedPaths filesystem sandboxing — restricts all file access to specified directories
 - ❌ External commands — blocked by default; requires an ExecHandler to be configured and the binary to be within AllowedPaths
 - ❌ Background execution: `cmd &`
diff --git a/allowedpaths/portable.go b/allowedpaths/portable.go
index 9b757172..984048e7 100644
--- a/allowedpaths/portable.go
+++ b/allowedpaths/portable.go
@@ -18,6 +18,14 @@ func PortableErrMsg(err error) string {
 	if err == nil {
 		return ""
 	}
+	// Unwrap *os.PathError so we normalise the inner Err rather than
+	// returning the full "op path: msg" string. This handles the case
+	// where the error has already been wrapped by PortablePathError
+	// (which replaces the inner Err with a plain errors.New string).
+	var pe *os.PathError
+	if errors.As(err, &pe) {
+		return PortableErrMsg(pe.Err)
+	}
 	switch {
 	case errors.Is(err, fs.ErrNotExist):
 		return "no such file or directory"
diff --git a/builtins/ls/ls.go b/builtins/ls/ls.go
index e5d72ea3..fbbe4dea 100644
--- a/builtins/ls/ls.go
+++ b/builtins/ls/ls.go
@@ -384,6 +384,23 @@ func listDir(ctx context.Context, callCtx *builtins.CallContext, dir string, opt
 	// so no post-sort slicing is needed. The effectiveLimit is already
 	// enforced by readDir's maxRead parameter.
 
+	// Print total line for long format (matching GNU ls behaviour).
+	// GNU ls prints "total <blocks>" where blocks are 1024-byte (1K) blocks
+	// by default. Since syscall.Stat_t is not available (import restriction),
+	// we approximate using file sizes: blocks = ceil(size / 1024) per file,
+	// with directories counting as 0 blocks (matching GNU ls on most FSes
+	// where directory size varies).
+	if opts.longFmt {
+		var totalBlocks int64
+		for _, ei := range infoEntries {
+			if !ei.info.IsDir() {
+				sz := ei.info.Size()
+				totalBlocks += (sz + 1023) / 1024
+			}
+		}
+		callCtx.Outf("total %d\n", totalBlocks)
+	}
+
 	// Print.
 	for _, ei := range infoEntries {
 		if ctx.Err() != nil {
diff --git a/builtins/tests/cat/helpers_test.go b/builtins/tests/cat/helpers_test.go
index 791f3267..0023ae7f 100644
--- a/builtins/tests/cat/helpers_test.go
+++ b/builtins/tests/cat/helpers_test.go
@@ -25,7 +25,7 @@ func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ..
 		t.Fatal(err)
 	}
 	var outBuf, errBuf bytes.Buffer
-	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...)
+	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf), interp.AllowAllCommands()}, opts...)
 	runner, err := interp.New(allOpts...)
 	if err != nil {
 		t.Fatal(err)
diff --git a/builtins/tests/cut/cut_gnu_compat_test.go b/builtins/tests/cut/cut_gnu_compat_test.go
index b9e19d3e..a54aa0ff 100644
--- a/builtins/tests/cut/cut_gnu_compat_test.go
+++ b/builtins/tests/cut/cut_gnu_compat_test.go
@@ -31,6 +31,7 @@ func cutRun(t *testing.T, script, dir string) (string, string, int) {
 	opts := []interp.RunnerOption{
 		interp.StdIO(nil, &outBuf, &errBuf),
 		interp.AllowedPaths([]string{dir}),
+		interp.AllowAllCommands(),
 	}
 
 	runner, err := interp.New(opts...)
diff --git a/builtins/tests/cut/cut_pentest_test.go b/builtins/tests/cut/cut_pentest_test.go
index e05c7b11..5e8ca8b3 100644
--- a/builtins/tests/cut/cut_pentest_test.go
+++ b/builtins/tests/cut/cut_pentest_test.go
@@ -37,6 +37,7 @@ func cutPentestRunCtx(ctx context.Context, t *testing.T, script, dir string) (st
 	opts := []interp.RunnerOption{
 		interp.StdIO(nil, &outBuf, &errBuf),
 		interp.AllowedPaths([]string{dir}),
+		interp.AllowAllCommands(),
 	}
 
 	runner, err := interp.New(opts...)
diff --git a/builtins/tests/cut/cut_test.go b/builtins/tests/cut/cut_test.go
index e37b1376..c26518bc 100644
--- a/builtins/tests/cut/cut_test.go
+++ b/builtins/tests/cut/cut_test.go
@@ -32,7 +32,7 @@ func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ..
 	prog, err := parser.Parse(strings.NewReader(script), "")
 	require.NoError(t, err)
 	var outBuf, errBuf bytes.Buffer
-	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...)
+	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf), interp.AllowAllCommands()}, opts...)
 	runner, err := interp.New(allOpts...)
 	require.NoError(t, err)
 	defer runner.Close()
diff --git a/builtins/tests/head/helpers_test.go b/builtins/tests/head/helpers_test.go
index 95caab35..0050716b 100644
--- a/builtins/tests/head/helpers_test.go
+++ b/builtins/tests/head/helpers_test.go
@@ -25,7 +25,7 @@ func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ..
 		t.Fatal(err)
 	}
 	var outBuf, errBuf bytes.Buffer
-	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...)
+	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf), interp.AllowAllCommands()}, opts...)
 	runner, err := interp.New(allOpts...)
 	if err != nil {
 		t.Fatal(err)
diff --git a/builtins/tests/sed/sed_test.go b/builtins/tests/sed/sed_test.go
index ef4e3d14..4eed69ea 100644
--- a/builtins/tests/sed/sed_test.go
+++ b/builtins/tests/sed/sed_test.go
@@ -31,7 +31,7 @@ func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ..
 	prog, err := parser.Parse(strings.NewReader(script), "")
 	require.NoError(t, err)
 	var outBuf, errBuf bytes.Buffer
-	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...)
+	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf), interp.AllowAllCommands()}, opts...)
 	runner, err := interp.New(allOpts...)
 	require.NoError(t, err)
 	defer runner.Close()
diff --git a/builtins/tests/tail/helpers_test.go b/builtins/tests/tail/helpers_test.go
index b8c88401..43d91b9f 100644
--- a/builtins/tests/tail/helpers_test.go
+++ b/builtins/tests/tail/helpers_test.go
@@ -25,7 +25,7 @@ func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ..
 		t.Fatal(err)
 	}
 	var outBuf, errBuf bytes.Buffer
-	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...)
+	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf), interp.AllowAllCommands()}, opts...)
 	runner, err := interp.New(allOpts...)
 	if err != nil {
 		t.Fatal(err)
diff --git a/builtins/tests/wc/helpers_test.go b/builtins/tests/wc/helpers_test.go
index 954207ee..526f4ff6 100644
--- a/builtins/tests/wc/helpers_test.go
+++ b/builtins/tests/wc/helpers_test.go
@@ -25,7 +25,7 @@ func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ..
 		t.Fatal(err)
 	}
 	var outBuf, errBuf bytes.Buffer
-	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...)
+	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf), interp.AllowAllCommands()}, opts...)
 	runner, err := interp.New(allOpts...)
 	if err != nil {
 		t.Fatal(err)
diff --git a/builtins/tests/wc/wc_pentest_test.go b/builtins/tests/wc/wc_pentest_test.go
index 95cd2db4..13b6d341 100644
--- a/builtins/tests/wc/wc_pentest_test.go
+++ b/builtins/tests/wc/wc_pentest_test.go
@@ -37,6 +37,7 @@ func wcRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, st
 	opts := []interp.RunnerOption{
 		interp.StdIO(nil, &outBuf, &errBuf),
 		interp.AllowedPaths([]string{dir}),
+		interp.AllowAllCommands(),
 	}
 
 	runner, err := interp.New(opts...)
diff --git a/builtins/testutil/testutil.go b/builtins/testutil/testutil.go
index bed6dead..ad8a5208 100644
--- a/builtins/testutil/testutil.go
+++ b/builtins/testutil/testutil.go
@@ -57,7 +57,10 @@ func RunScriptCtx(ctx context.Context, t testing.TB, script, dir string, opts ..
 	require.NoError(t, err)
 
 	var outBuf, errBuf bytes.Buffer
-	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...)
+	allOpts := append([]interp.RunnerOption{
+		interp.StdIO(nil, &outBuf, &errBuf),
+		interp.AllowAllCommands(),
+	}, opts...)
 	runner, err := interp.New(allOpts...)
 	require.NoError(t, err)
 	defer runner.Close()
@@ -102,7 +105,10 @@ func RunScriptDiscardCtx(ctx context.Context, t testing.TB, script, dir string,
 	require.NoError(t, err)
 
 	var errBuf bytes.Buffer
-	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, io.Discard, &errBuf)}, opts...)
+	allOpts := append([]interp.RunnerOption{
+		interp.StdIO(nil, io.Discard, &errBuf),
+		interp.AllowAllCommands(),
+	}, opts...)
 	runner, err := interp.New(allOpts...)
 	require.NoError(t, err)
 	defer runner.Close()
diff --git a/cmd/rshell/main.go b/cmd/rshell/main.go
index 860f65d6..177ecc34 100644
--- a/cmd/rshell/main.go
+++ b/cmd/rshell/main.go
@@ -26,8 +26,10 @@ func main() {
 
 func run(args []string, stdin io.Reader, stdout, stderr io.Writer) int {
 	var (
-		script       string
-		allowedPaths string
+		script          string
+		allowedPaths    string
+		allowedCommands string
+		allowAllCmds    bool
 	)
 
 	cmd := &cobra.Command{
@@ -47,11 +49,24 @@ func run(args []string, stdin io.Reader, stdout, stderr io.Writer) int {
 
 			var paths []string
 			if allowedPaths != "" {
-				paths = strings.Split(allowedPaths, ",")
+				paths = splitAndTrim(allowedPaths)
+			}
+			var cmds []string
+			allowedCommandsSet := cmd.Flags().Changed("allowed-command")
+			if allowedCommands != "" {
+				cmds = splitAndTrim(allowedCommands)
+				if cmds == nil && allowedCommandsSet {
+					// Flag was set but splitAndTrim returned nil (e.g. ", ,").
+					// Treat as explicit deny-all, not "unset".
+					cmds = []string{}
+				}
+			} else if allowedCommandsSet {
+				// Explicitly passing an empty --allowed-command means deny-all.
+				cmds = []string{}
 			}
 
 			if scriptSet {
-				return execute(cmd.Context(), script, "", paths, stdin, stdout, stderr)
+				return execute(cmd.Context(), script, "", paths, cmds, allowAllCmds, stdin, stdout, stderr)
 			}
 
 			// Read stdin once so each execute() call gets its own
@@ -75,7 +90,7 @@ func run(args []string, stdin io.Reader, stdout, stderr io.Writer) int {
 					src = string(data)
 					name = file
 				}
-				if err := execute(cmd.Context(), src, name, paths, bytes.NewReader(stdinData), stdout, stderr); err != nil {
+				if err := execute(cmd.Context(), src, name, paths, cmds, allowAllCmds, bytes.NewReader(stdinData), stdout, stderr); err != nil {
 					return err
 				}
 			}
@@ -90,6 +105,8 @@ func run(args []string, stdin io.Reader, stdout, stderr io.Writer) int {
 
 	cmd.Flags().StringVarP(&script, "script", "s", "", "shell script to execute")
 	cmd.Flags().StringVarP(&allowedPaths, "allowed-path", "a", "", "comma-separated list of directories the shell is allowed to access")
+	cmd.Flags().StringVar(&allowedCommands, "allowed-command", "", "comma-separated list of allowed commands (omit to block all; use --allow-all-commands to allow everything)")
+	cmd.Flags().BoolVar(&allowAllCmds, "allow-all-commands", false, "allow all commands (overrides --allowed-command)")
 
 	if err := cmd.Execute(); err != nil {
 		var status interp.ExitStatus
@@ -102,7 +119,7 @@ func run(args []string, stdin io.Reader, stdout, stderr io.Writer) int {
 	return 0
 }
 
-func execute(ctx context.Context, script, name string, allowedPaths []string, stdin io.Reader, stdout, stderr io.Writer) error {
+func execute(ctx context.Context, script, name string, allowedPaths, allowedCommands []string, allowAllCmds bool, stdin io.Reader, stdout, stderr io.Writer) error {
 	// Parse.
 	prog, err := syntax.NewParser().Parse(strings.NewReader(script), name)
 	if err != nil {
@@ -118,6 +135,11 @@ func execute(ctx context.Context, script, name string, allowedPaths []string, st
 	if len(allowedPaths) > 0 {
 		opts = append(opts, interp.AllowedPaths(allowedPaths))
 	}
+	if allowAllCmds {
+		opts = append(opts, interp.AllowAllCommands())
+	} else if allowedCommands != nil {
+		opts = append(opts, interp.AllowedCommands(allowedCommands))
+	}
 
 	runner, err := interp.New(opts...)
 	if err != nil {
@@ -127,3 +149,25 @@ func execute(ctx context.Context, script, name string, allowedPaths []string, st
 
 	return runner.Run(ctx, prog)
 }
+
+// splitAndTrim splits s on commas and trims whitespace from each element.
+// It returns nil (not an empty slice) when the input is empty or contains
+// only whitespace/separators, so callers can distinguish "unset" from
+// "explicitly empty".
+func splitAndTrim(s string) []string {
+	if s == "" {
+		return nil
+	}
+	parts := strings.Split(s, ",")
+	result := make([]string, 0, len(parts))
+	for _, p := range parts {
+		p = strings.TrimSpace(p)
+		if p != "" {
+			result = append(result, p)
+		}
+	}
+	if len(result) == 0 {
+		return nil
+	}
+	return result
+}
diff --git a/cmd/rshell/main_test.go b/cmd/rshell/main_test.go
index 1d8e2c2a..54aa9d0f 100644
--- a/cmd/rshell/main_test.go
+++ b/cmd/rshell/main_test.go
@@ -32,19 +32,19 @@ func runCLIWithStdin(t *testing.T, stdin string, args ...string) (exitCode int,
 }
 
 func TestEcho(t *testing.T) {
-	code, stdout, _ := runCLI(t, "-s", `echo hello world`)
+	code, stdout, _ := runCLI(t, "--allow-all-commands", "-s", `echo hello world`)
 	assert.Equal(t, 0, code)
 	assert.Equal(t, "hello world\n", stdout)
 }
 
 func TestShortFlag(t *testing.T) {
-	code, stdout, _ := runCLI(t, "-s", `echo short`)
+	code, stdout, _ := runCLI(t, "--allow-all-commands", "-s", `echo short`)
 	assert.Equal(t, 0, code)
 	assert.Equal(t, "short\n", stdout)
 }
 
 func TestLongFlag(t *testing.T) {
-	code, stdout, _ := runCLI(t, "--script", `echo long`)
+	code, stdout, _ := runCLI(t, "--allow-all-commands", "--script", `echo long`)
 	assert.Equal(t, 0, code)
 	assert.Equal(t, "long\n", stdout)
 }
@@ -63,7 +63,7 @@ func TestEmptyScript(t *testing.T) {
 }
 
 func TestExitCode(t *testing.T) {
-	code, _, _ := runCLI(t, "-s", `exit 42`)
+	code, _, _ := runCLI(t, "--allow-all-commands", "-s", `exit 42`)
 	assert.Equal(t, 42, code)
 }
 
@@ -99,14 +99,14 @@ func setupTestFile(t *testing.T) (dir, filePath string) {
 
 func TestFileAccessDeniedByDefault(t *testing.T) {
 	_, filePath := setupTestFile(t)
-	code, _, stderr := runCLI(t, "-s", `cat `+filePath)
+	code, _, stderr := runCLI(t, "--allow-all-commands", "-s", `cat `+filePath)
 	assert.NotEqual(t, 0, code)
 	assert.Contains(t, stderr, "permission denied")
 }
 
 func TestAllowedPathGrantsAccess(t *testing.T) {
 	dir, filePath := setupTestFile(t)
-	code, stdout, _ := runCLI(t, "-s", `cat `+filePath, "-a", dir)
+	code, stdout, _ := runCLI(t, "--allow-all-commands", "-s", `cat `+filePath, "--allowed-path", dir)
 	assert.Equal(t, 0, code)
 	assert.Contains(t, stdout, "hello from testfile")
 }
@@ -117,19 +117,19 @@ func TestAllowedPathCommaSeparated(t *testing.T) {
 	if runtime.GOOS == "windows" {
 		extraDir = filepath.ToSlash(extraDir)
 	}
-	code, stdout, _ := runCLI(t, "-s", `cat `+filePath, "--allowed-path", dir+","+extraDir)
+	code, stdout, _ := runCLI(t, "--allow-all-commands", "-s", `cat `+filePath, "--allowed-path", dir+","+extraDir)
 	assert.Equal(t, 0, code)
 	assert.Contains(t, stdout, "hello from testfile")
 }
 
 func TestMultipleStatements(t *testing.T) {
-	code, stdout, _ := runCLI(t, "-s", "echo first\necho second")
+	code, stdout, _ := runCLI(t, "--allow-all-commands", "-s", "echo first\necho second")
 	assert.Equal(t, 0, code)
 	assert.Equal(t, "first\nsecond\n", stdout)
 }
 
 func TestVariableExpansion(t *testing.T) {
-	code, stdout, _ := runCLI(t, "-s", `FOO=bar; echo $FOO`)
+	code, stdout, _ := runCLI(t, "--allow-all-commands", "-s", `FOO=bar; echo $FOO`)
 	assert.Equal(t, 0, code)
 	assert.Equal(t, "bar\n", stdout)
 }
@@ -139,6 +139,8 @@ func TestHelp(t *testing.T) {
 	assert.Equal(t, 0, code)
 	assert.Contains(t, stdout, "--script")
 	assert.Contains(t, stdout, "--allowed-path")
+	assert.Contains(t, stdout, "--allowed-command")
+	assert.Contains(t, stdout, "--allow-all-commands")
 }
 
 func TestFileArg(t *testing.T) {
@@ -146,7 +148,7 @@ func TestFileArg(t *testing.T) {
 	script := filepath.Join(dir, "test.sh")
 	require.NoError(t, os.WriteFile(script, []byte("echo from-file\n"), 0o644))
 
-	code, stdout, _ := runCLI(t, script)
+	code, stdout, _ := runCLI(t, "--allow-all-commands", script)
 	assert.Equal(t, 0, code)
 	assert.Equal(t, "from-file\n", stdout)
 }
@@ -158,13 +160,13 @@ func TestMultipleFileArgs(t *testing.T) {
 	require.NoError(t, os.WriteFile(script1, []byte("echo first\n"), 0o644))
 	require.NoError(t, os.WriteFile(script2, []byte("echo second\n"), 0o644))
 
-	code, stdout, _ := runCLI(t, script1, script2)
+	code, stdout, _ := runCLI(t, "--allow-all-commands", script1, script2)
 	assert.Equal(t, 0, code)
 	assert.Equal(t, "first\nsecond\n", stdout)
 }
 
 func TestStdinDash(t *testing.T) {
-	code, stdout, _ := runCLIWithStdin(t, "echo from-stdin\n", "-")
+	code, stdout, _ := runCLIWithStdin(t, "echo from-stdin\n", "--allow-all-commands", "-")
 	assert.Equal(t, 0, code)
 	assert.Equal(t, "from-stdin\n", stdout)
 }
@@ -185,6 +187,54 @@ func TestFileNotFound(t *testing.T) {
 	assert.Contains(t, stderr, "reading /nonexistent/path/script.sh")
 }
 
+func TestAllowedCommandsRestriction(t *testing.T) {
+	code, _, stderr := runCLI(t, "--allowed-command", "echo", "-s", `cat /dev/null`)
+	assert.Equal(t, 1, code)
+	assert.Contains(t, stderr, "cat: command not allowed")
+}
+
+func TestAllowedCommandsTrimsWhitespace(t *testing.T) {
+	// "echo, true" with spaces around entries should still allow both commands.
+	code, stdout, _ := runCLI(t, "--allowed-command", "echo, true", "-s", `echo hello`)
+	assert.Equal(t, 0, code)
+	assert.Equal(t, "hello\n", stdout)
+}
+
+func TestAllowedCommandsEmpty(t *testing.T) {
+	code, _, stderr := runCLI(t, "--allowed-command", "", "-s", `echo hello`)
+	assert.Equal(t, 1, code)
+	assert.Contains(t, stderr, "echo: command not allowed")
+}
+
+func TestAllowedCommandsSeparatorOnlyDeniesAll(t *testing.T) {
+	// "--allowed-command ', ,'" should deny all commands, not silently allow all.
+	code, _, stderr := runCLI(t, "--allowed-command", ", ,", "-s", `echo hello`)
+	assert.Equal(t, 1, code)
+	assert.Contains(t, stderr, "echo: command not allowed")
+}
+
+func TestDefaultNoFlagBlocksAll(t *testing.T) {
+	// When neither --allowed-command nor --allow-all-commands is set,
+	// the default is deny-all per spec.
+	code, _, stderr := runCLI(t, "-s", `echo hello`)
+	assert.Equal(t, 1, code)
+	assert.Contains(t, stderr, "echo: command not allowed")
+}
+
+func TestAllowAllCommandsFlag(t *testing.T) {
+	// --allow-all-commands should permit everything.
+	code, stdout, _ := runCLI(t, "--allow-all-commands", "-s", `echo hello`)
+	assert.Equal(t, 0, code)
+	assert.Equal(t, "hello\n", stdout)
+}
+
+func TestAllowAllCommandsOverridesAllowedCommands(t *testing.T) {
+	// --allow-all-commands should override --allowed-command.
+	code, stdout, _ := runCLI(t, "--allowed-command", "cat", "--allow-all-commands", "-s", `echo hello`)
+	assert.Equal(t, 0, code)
+	assert.Equal(t, "hello\n", stdout)
+}
+
 func TestFileArgWithAllowedPath(t *testing.T) {
 	dir := t.TempDir()
 	dataDir := t.TempDir()
@@ -199,7 +249,7 @@ func TestFileArgWithAllowedPath(t *testing.T) {
 	script := filepath.Join(dir, "test.sh")
 	require.NoError(t, os.WriteFile(script, []byte("cat "+dataFile+"\n"), 0o644))
 
-	code, stdout, _ := runCLI(t, "-a", dataDir, script)
+	code, stdout, _ := runCLI(t, "--allow-all-commands", "--allowed-path", dataDir, script)
 	assert.Equal(t, 0, code)
 	assert.Contains(t, stdout, "secret data")
 }
diff --git a/interp/allowed_paths_internal_test.go b/interp/allowed_paths_internal_test.go
index e39a3bb0..01266b48 100644
--- a/interp/allowed_paths_internal_test.go
+++ b/interp/allowed_paths_internal_test.go
@@ -39,12 +39,12 @@ func runScriptInternal(t *testing.T, script, dir string, opts ...RunnerOption) (
 	var outBuf, errBuf bytes.Buffer
 	allOpts := append([]RunnerOption{
 		StdIO(nil, &outBuf, &errBuf),
+		AllowAllCommands(),
 	}, opts...)
 
 	runner, err := New(allOpts...)
 	require.NoError(t, err)
 	defer runner.Close()
-
 	if dir != "" {
 		runner.Dir = dir
 	}
@@ -126,7 +126,7 @@ func TestAllowedPathsExecSymlinkEscape(t *testing.T) {
 
 func TestRunRecoversPanic(t *testing.T) {
 	var outBuf, errBuf bytes.Buffer
-	runner, err := New(StdIO(nil, &outBuf, &errBuf))
+	runner, err := New(StdIO(nil, &outBuf, &errBuf), AllowAllCommands())
 	require.NoError(t, err)
 	defer runner.Close()
 
@@ -168,3 +168,66 @@ func TestAllowedPathsExecDefaultBlocksAll(t *testing.T) {
 	assert.Equal(t, 127, exitCode)
 	assert.Contains(t, stderr, "command not found")
 }
+
+func TestAllowedCommandsWithExecHandler(t *testing.T) {
+	// Test that AllowedCommands gates execution before the ExecHandler is reached.
+	t.Run("allowed external command reaches ExecHandler", func(t *testing.T) {
+		var outBuf, errBuf bytes.Buffer
+		runner, err := New(
+			StdIO(nil, &outBuf, &errBuf),
+			AllowedCommands([]string{"echo", "myextcmd"}),
+		)
+		require.NoError(t, err)
+		defer runner.Close()
+
+		// Trigger initial reset so we can override the exec handler.
+		runner.Reset()
+
+		var execHandlerCalled bool
+		runner.execHandler = func(ctx context.Context, args []string) error {
+			execHandlerCalled = true
+			hc := HandlerCtx(ctx)
+			_, _ = hc.Stdout.Write([]byte("exec:" + args[0] + "\n"))
+			return nil
+		}
+
+		parser := syntax.NewParser()
+		prog, err := parser.Parse(strings.NewReader("myextcmd"), "")
+		require.NoError(t, err)
+
+		err = runner.Run(context.Background(), prog)
+		require.NoError(t, err)
+		assert.True(t, execHandlerCalled, "ExecHandler should be called for allowed external command")
+		assert.Equal(t, "exec:myextcmd\n", outBuf.String())
+	})
+
+	t.Run("disallowed external command blocked before ExecHandler", func(t *testing.T) {
+		var outBuf, errBuf bytes.Buffer
+		runner, err := New(
+			StdIO(nil, &outBuf, &errBuf),
+			AllowedCommands([]string{"echo"}),
+		)
+		require.NoError(t, err)
+		defer runner.Close()
+
+		// Trigger initial reset so we can override the exec handler.
+		runner.Reset()
+
+		var execHandlerCalled bool
+		runner.execHandler = func(ctx context.Context, args []string) error {
+			execHandlerCalled = true
+			return nil
+		}
+
+		parser := syntax.NewParser()
+		prog, err := parser.Parse(strings.NewReader("blockedcmd"), "")
+		require.NoError(t, err)
+
+		err = runner.Run(context.Background(), prog)
+		var es ExitStatus
+		require.True(t, errors.As(err, &es))
+		assert.Equal(t, 1, int(es))
+		assert.False(t, execHandlerCalled, "ExecHandler should NOT be called for disallowed command")
+		assert.Contains(t, errBuf.String(), "command not allowed")
+	})
+}
diff --git a/interp/allowed_paths_test.go b/interp/allowed_paths_test.go
index c11c8405..25f998fb 100644
--- a/interp/allowed_paths_test.go
+++ b/interp/allowed_paths_test.go
@@ -31,6 +31,7 @@ func runScript(t *testing.T, script, dir string, opts ...interp.RunnerOption) (s
 	var outBuf, errBuf bytes.Buffer
 	allOpts := append([]interp.RunnerOption{
 		interp.StdIO(nil, &outBuf, &errBuf),
+		interp.AllowAllCommands(),
 	}, opts...)
 
 	runner, err := interp.New(allOpts...)
@@ -200,6 +201,7 @@ func TestAllowedPathsPinsRootBeforeRun(t *testing.T) {
 	runner, err := interp.New(
 		interp.StdIO(nil, &outBuf, &errBuf),
 		interp.AllowedPaths([]string{allowed}),
+		interp.AllowAllCommands(),
 	)
 	require.NoError(t, err)
 	defer runner.Close()
@@ -239,6 +241,7 @@ func TestAllowedPathsClose(t *testing.T) {
 	dir := t.TempDir()
 	runner, err := interp.New(
 		interp.AllowedPaths([]string{dir}),
+		interp.AllowAllCommands(),
 	)
 	require.NoError(t, err)
 
@@ -251,3 +254,40 @@ func TestAllowedPathsClose(t *testing.T) {
 	require.NoError(t, runner.Close())
 	require.NoError(t, runner.Close())
 }
+
+func TestAllowAllCommandsPermitsBuiltins(t *testing.T) {
+	// AllowAllCommands should allow any registered builtin to run.
+	stdout, _, exitCode := runScript(t, `echo hello; printf "world\n"`, "")
+	assert.Equal(t, 0, exitCode)
+	assert.Equal(t, "hello\nworld\n", stdout)
+}
+
+func TestDefaultBlocksAllCommands(t *testing.T) {
+	// When no AllowedCommands or AllowAllCommands option is set,
+	// the default is to block all commands.
+	var outBuf, errBuf bytes.Buffer
+	runner, err := interp.New(
+		interp.StdIO(nil, &outBuf, &errBuf),
+	)
+	require.NoError(t, err)
+	defer runner.Close()
+
+	parser := syntax.NewParser()
+	prog, err := parser.Parse(strings.NewReader("echo hello"), "")
+	require.NoError(t, err)
+
+	err = runner.Run(context.Background(), prog)
+	var es interp.ExitStatus
+	require.True(t, errors.As(err, &es))
+	assert.Equal(t, 1, int(es))
+	assert.Contains(t, errBuf.String(), "echo: command not allowed")
+}
+
+func TestAllowedCommandsDuplicatesIgnored(t *testing.T) {
+	// Passing duplicate command names should work fine (map deduplicates).
+	stdout, _, exitCode := runScript(t, `echo hello`, "",
+		interp.AllowedCommands([]string{"echo", "echo"}),
+	)
+	assert.Equal(t, 0, exitCode)
+	assert.Equal(t, "hello\n", stdout)
+}
diff --git a/interp/api.go b/interp/api.go
index 894f095c..f8daef84 100644
--- a/interp/api.go
+++ b/interp/api.go
@@ -47,6 +47,13 @@ type runnerConfig struct {
 	// nil (default) blocks all file access; populate via AllowedPaths option.
 	sandbox *allowedpaths.Sandbox
 
+	// allowedCommands restricts which commands (builtins and external) may execute.
+	// nil (default) blocks all commands; populate via AllowedCommands to restrict
+	// to specific commands, or set allowAllCommands to permit everything.
+	// When allowAllCommands is true, the map is ignored and all commands are permitted.
+	allowedCommands  map[string]struct{}
+	allowAllCommands bool
+
 	// usedNew is set by New() and checked in Reset() to ensure a Runner
 	// was properly constructed rather than zero-initialized.
 	usedNew bool
@@ -382,6 +389,38 @@ func (r *Runner) Close() error {
 	return r.sandbox.Close()
 }
 
+// AllowedCommands restricts which commands (builtins and external) may execute.
+// Only commands in the provided list are allowed to run; passing an empty slice
+// blocks all commands. When this option is not used (default), all commands are
+// blocked — use [AllowAllCommands] to permit everything.
+// Shell keywords and control flow (if/else, for, pipes, &&/||, variable
+// assignment) are unaffected.
+//
+// If [AllowAllCommands] is also set, it takes precedence regardless of option
+// ordering — the allowlist is stored but ignored at runtime.
+//
+// Duplicate command names in the list are silently deduplicated (the map
+// insertion is idempotent), so callers do not need to pre-filter.
+func AllowedCommands(cmds []string) RunnerOption {
+	return func(r *Runner) error {
+		r.allowedCommands = make(map[string]struct{}, len(cmds))
+		for _, cmd := range cmds {
+			r.allowedCommands[cmd] = struct{}{}
+		}
+		return nil
+	}
+}
+
+// AllowAllCommands disables command filtering entirely, permitting any command
+// (builtin or external) to execute. This overrides any [AllowedCommands] list.
+func AllowAllCommands() RunnerOption {
+	return func(r *Runner) error {
+		r.allowAllCommands = true
+		r.allowedCommands = nil
+		return nil
+	}
+}
+
 // AllowedPaths restricts file and directory access to the specified directories.
 // Paths must be absolute directories that exist. When set, only files within
 // these directories can be opened, read, or executed.
diff --git a/interp/readonly_test.go b/interp/readonly_test.go
index 536b9827..898072ec 100644
--- a/interp/readonly_test.go
+++ b/interp/readonly_test.go
@@ -22,6 +22,7 @@ func TestReadonlyVariableBlocksReassignment(t *testing.T) {
 	r, err := New(
 		StdIO(nil, &stdout, &stderr),
 		Env("RO_VAR=original"),
+		AllowAllCommands(),
 	)
 	require.NoError(t, err)
 	t.Cleanup(func() { r.Close() })
diff --git a/interp/runner_exec.go b/interp/runner_exec.go
index 686dcb8e..7023983e 100644
--- a/interp/runner_exec.go
+++ b/interp/runner_exec.go
@@ -242,6 +242,21 @@ func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) {
 		return
 	}
 	name := args[0]
+	if !r.allowAllCommands {
+		if r.allowedCommands == nil {
+			// No allowedCommands set and allowAllCommands is false: deny all.
+			// Exit code 1 (not 127) to distinguish "restricted" from "not found".
+			fmt.Fprintf(r.stderr, "%s: command not allowed\n", name)
+			r.exit.code = 1
+			return
+		}
+		if _, ok := r.allowedCommands[name]; !ok {
+			// Exit code 1 (not 127) to distinguish "restricted" from "not found".
+			fmt.Fprintf(r.stderr, "%s: command not allowed\n", name)
+			r.exit.code = 1
+			return
+		}
+	}
 	if fn, ok := builtins.Lookup(name); ok {
 		call := &builtins.CallContext{
 			Stdout:       r.stdout,
diff --git a/interp/tests/if_clause_pentest_test.go b/interp/tests/if_clause_pentest_test.go
index 0d6d0529..d1bb8a21 100644
--- a/interp/tests/if_clause_pentest_test.go
+++ b/interp/tests/if_clause_pentest_test.go
@@ -33,7 +33,7 @@ func ifRunCtx(ctx context.Context, t *testing.T, script string) (string, string,
 	require.NoError(t, err)
 
 	var outBuf, errBuf bytes.Buffer
-	runner, err := interp.New(interp.StdIO(nil, &outBuf, &errBuf))
+	runner, err := interp.New(interp.StdIO(nil, &outBuf, &errBuf), interp.AllowAllCommands())
 	require.NoError(t, err)
 	defer runner.Close()
 
diff --git a/interp/tests/redir_devnull_pentest_test.go b/interp/tests/redir_devnull_pentest_test.go
index 2c4a76e3..62f4c206 100644
--- a/interp/tests/redir_devnull_pentest_test.go
+++ b/interp/tests/redir_devnull_pentest_test.go
@@ -39,6 +39,7 @@ func pentestRedirRunCtx(ctx context.Context, t *testing.T, script, dir string) (
 	var outBuf, errBuf bytes.Buffer
 	opts := []interp.RunnerOption{
 		interp.StdIO(nil, &outBuf, &errBuf),
+		interp.AllowAllCommands(),
 	}
 	if dir != "" {
 		opts = append(opts, interp.AllowedPaths([]string{dir}))
diff --git a/interp/tests/redir_devnull_test.go b/interp/tests/redir_devnull_test.go
index e4670682..054a9550 100644
--- a/interp/tests/redir_devnull_test.go
+++ b/interp/tests/redir_devnull_test.go
@@ -40,7 +40,7 @@ func redirRunWithOpts(t *testing.T, script, dir string, opts ...interp.RunnerOpt
 	require.NoError(t, err)
 
 	var outBuf, errBuf bytes.Buffer
-	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...)
+	allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf), interp.AllowAllCommands()}, opts...)
 
 	runner, err := interp.New(allOpts...)
 	require.NoError(t, err)
diff --git a/tests/scenarios/cmd/ls/long_format/directory.yaml b/tests/scenarios/cmd/ls/long_format/directory.yaml
new file mode 100644
index 00000000..094aa34e
--- /dev/null
+++ b/tests/scenarios/cmd/ls/long_format/directory.yaml
@@ -0,0 +1,21 @@
+description: ls -l on a directory shows its contents in long format.
+skip_assert_against_bash: true
+setup:
+  files:
+    - path: subdir/file1.txt
+      content: "hello"
+      chmod: 0644
+    - path: subdir/file2.txt
+      content: "world"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    ls -l subdir
+expect:
+  stdout_contains:
+    - "file1.txt"
+    - "file2.txt"
+    - "5"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/ls/long_format/empty_dir.yaml b/tests/scenarios/cmd/ls/long_format/empty_dir.yaml
new file mode 100644
index 00000000..d60ece41
--- /dev/null
+++ b/tests/scenarios/cmd/ls/long_format/empty_dir.yaml
@@ -0,0 +1,12 @@
+description: ls -l on an empty directory shows only the total line.
+skip_assert_against_bash: true  # total value may differ from GNU ls (approximated without syscall)
+setup:
+  files: []
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    ls -l
+expect:
+  stdout: "total 0\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/ls/long_format/multiple_files.yaml b/tests/scenarios/cmd/ls/long_format/multiple_files.yaml
new file mode 100644
index 00000000..94495727
--- /dev/null
+++ b/tests/scenarios/cmd/ls/long_format/multiple_files.yaml
@@ -0,0 +1,27 @@
+description: ls -l on multiple files shows each file in long format sorted alphabetically.
+skip_assert_against_bash: true
+setup:
+  files:
+    - path: bravo.txt
+      content: "bravo"
+      chmod: 0644
+    - path: alpha.txt
+      content: "alpha content"
+      chmod: 0644
+    - path: charlie.txt
+      content: "c"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    ls -l
+expect:
+  stdout_contains:
+    - "alpha.txt"
+    - "bravo.txt"
+    - "charlie.txt"
+    - "13"
+    - "5"
+    - "1"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/ls/long_format/nonexistent.yaml b/tests/scenarios/cmd/ls/long_format/nonexistent.yaml
new file mode 100644
index 00000000..8a8d8464
--- /dev/null
+++ b/tests/scenarios/cmd/ls/long_format/nonexistent.yaml
@@ -0,0 +1,14 @@
+description: ls -l on a nonexistent file produces an error.
+skip_assert_against_bash: true  # stderr format differs from GNU ls (PortableErr normalization)
+setup:
+  files: []
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    ls -l nosuchfile
+expect:
+  stdout: ""
+  stderr_contains:
+    - "nosuchfile"
+    - "no such file or directory"
+  exit_code: 1
diff --git a/tests/scenarios/cmd/ls/long_format/permissions.yaml b/tests/scenarios/cmd/ls/long_format/permissions.yaml
new file mode 100644
index 00000000..058f93a1
--- /dev/null
+++ b/tests/scenarios/cmd/ls/long_format/permissions.yaml
@@ -0,0 +1,35 @@
+description: ls -l shows correct permission bits for different file modes.
+skip_assert_against_bash: true
+setup:
+  files:
+    - path: readonly.txt
+      content: "read only"
+      chmod: 0444
+    - path: executable.sh
+      content: "#!/bin/sh"
+      chmod: 0755
+    - path: normal.txt
+      content: "normal"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    ls -l readonly.txt
+    ls -l executable.sh
+    ls -l normal.txt
+expect:
+  stdout_contains:
+    - "-r--r--r--"
+    - "-rwxr-xr-x"
+    - "-rw-r--r--"
+    - "readonly.txt"
+    - "executable.sh"
+    - "normal.txt"
+  stdout_contains_windows:
+    - "-r--r--r--"
+    - "-rw-rw-rw-"
+    - "readonly.txt"
+    - "executable.sh"
+    - "normal.txt"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/ls/long_format/total_line.yaml b/tests/scenarios/cmd/ls/long_format/total_line.yaml
new file mode 100644
index 00000000..192e05d4
--- /dev/null
+++ b/tests/scenarios/cmd/ls/long_format/total_line.yaml
@@ -0,0 +1,21 @@
+description: ls -l on a directory shows a total line at the top.
+skip_assert_against_bash: true
+setup:
+  files:
+    - path: a.txt
+      content: "aaa"
+      chmod: 0644
+    - path: b.txt
+      content: "bbb"
+      chmod: 0644
+input:
+  allowed_paths: ["$DIR"]
+  script: |+
+    ls -l
+expect:
+  stdout_contains:
+    - "total "
+    - "a.txt"
+    - "b.txt"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/cmd/ls/sandbox/outside_allowed_paths.yaml b/tests/scenarios/cmd/ls/sandbox/outside_allowed_paths.yaml
index bc70f890..87ee437e 100644
--- a/tests/scenarios/cmd/ls/sandbox/outside_allowed_paths.yaml
+++ b/tests/scenarios/cmd/ls/sandbox/outside_allowed_paths.yaml
@@ -11,5 +11,5 @@ input:
 expect:
   stdout: ""
   stderr: "ls: cannot access '/etc': permission denied\n"
-  stderr_windows: "ls: cannot access '/etc': statat etc: no such file or directory\n"
+  stderr_windows: "ls: cannot access '/etc': no such file or directory\n"
   exit_code: 1
diff --git a/tests/scenarios/cmd/unknown_cmd/basic/after_echo.yaml b/tests/scenarios/cmd/unknown_cmd/basic/after_echo.yaml
index 5cd6c98d..6f6bcddf 100644
--- a/tests/scenarios/cmd/unknown_cmd/basic/after_echo.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/basic/after_echo.yaml
@@ -1,10 +1,12 @@
 description: An unknown command after a successful echo still fails.
 input:
+  allowed_commands: ["echo", "foo"]
   script: |+
     echo hello
     foo
 expect:
   stdout: |+
     hello
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/basic/before_echo.yaml b/tests/scenarios/cmd/unknown_cmd/basic/before_echo.yaml
index 89e417cb..0566028f 100644
--- a/tests/scenarios/cmd/unknown_cmd/basic/before_echo.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/basic/before_echo.yaml
@@ -1,10 +1,12 @@
 description: An unknown command before echo on the next line does not stop execution.
 input:
+  allowed_commands: ["echo", "foo"]
   script: |+
     foo
     echo hello
 expect:
   stdout: |+
     hello
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 0
diff --git a/tests/scenarios/cmd/unknown_cmd/basic/multiple_consecutive.yaml b/tests/scenarios/cmd/unknown_cmd/basic/multiple_consecutive.yaml
index 25fb86e4..af9d388d 100644
--- a/tests/scenarios/cmd/unknown_cmd/basic/multiple_consecutive.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/basic/multiple_consecutive.yaml
@@ -1,10 +1,14 @@
 description: Multiple unknown commands on separate lines all execute and report errors.
 input:
+  allowed_commands: ["foo", "bar", "baz"]
   script: |+
     foo
     bar
     baz
 expect:
   stdout: ""
-  stderr_contains: ["foo", "bar", "baz", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
+    - "bar: command not found"
+    - "baz: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/basic/multiword_name.yaml b/tests/scenarios/cmd/unknown_cmd/basic/multiword_name.yaml
index 9984d195..a4bfee42 100644
--- a/tests/scenarios/cmd/unknown_cmd/basic/multiword_name.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/basic/multiword_name.yaml
@@ -1,8 +1,10 @@
 description: An unknown command with a multi-word-like name (with hyphens) prints "command not found".
 input:
+  allowed_commands: ["my-unknown-command"]
   script: |+
     my-unknown-command
 expect:
   stdout: ""
-  stderr_contains: ["my-unknown-command", "command not found"]
+  stderr_contains:
+    - "my-unknown-command: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/basic/simple.yaml b/tests/scenarios/cmd/unknown_cmd/basic/simple.yaml
index a0b55b9f..3ca9577c 100644
--- a/tests/scenarios/cmd/unknown_cmd/basic/simple.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/basic/simple.yaml
@@ -1,8 +1,10 @@
 description: A simple unknown command prints "command not found" to stderr.
 input:
+  allowed_commands: ["foo"]
   script: |+
     foo
 expect:
   stdout: ""
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/basic/stderr_separate_from_stdout.yaml b/tests/scenarios/cmd/unknown_cmd/basic/stderr_separate_from_stdout.yaml
index d49e7bef..8e908f3d 100644
--- a/tests/scenarios/cmd/unknown_cmd/basic/stderr_separate_from_stdout.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/basic/stderr_separate_from_stdout.yaml
@@ -1,5 +1,6 @@
 description: Unknown command error goes to stderr while stdout remains clean.
 input:
+  allowed_commands: ["echo", "nonexistent_cmd"]
   script: |+
     echo before
     nonexistent_cmd
@@ -9,6 +10,5 @@ expect:
     before
     after
   stderr_contains:
-    - "nonexistent_cmd"
-    - "command not found"
+    - "nonexistent_cmd: command not found"
   exit_code: 0
diff --git a/tests/scenarios/cmd/unknown_cmd/basic/underscore_name.yaml b/tests/scenarios/cmd/unknown_cmd/basic/underscore_name.yaml
index 389fada0..f8448f58 100644
--- a/tests/scenarios/cmd/unknown_cmd/basic/underscore_name.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/basic/underscore_name.yaml
@@ -1,8 +1,10 @@
 description: An unknown command with underscores in the name prints "command not found".
 input:
+  allowed_commands: ["my_unknown_command"]
   script: |+
     my_unknown_command
 expect:
   stdout: ""
-  stderr_contains: ["my_unknown_command", "command not found"]
+  stderr_contains:
+    - "my_unknown_command: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/common_progs/bash.yaml b/tests/scenarios/cmd/unknown_cmd/common_progs/bash.yaml
index d9f94e68..c5aafc0c 100644
--- a/tests/scenarios/cmd/unknown_cmd/common_progs/bash.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/common_progs/bash.yaml
@@ -1,10 +1,10 @@
 skip_assert_against_bash: true
-description: The bash command is not a builtin and is rejected as unknown.
+description: The bash command is not a builtin and is rejected as not found.
 input:
+  allowed_commands: ["bash"]
   script: |+
     bash -c "echo hello"
 expect:
   stdout: ""
-  stderr: |+
-    bash: command not found
+  stderr: "bash: command not found\n"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/common_progs/chmod.yaml b/tests/scenarios/cmd/unknown_cmd/common_progs/chmod.yaml
index d72f136f..1b463e8f 100644
--- a/tests/scenarios/cmd/unknown_cmd/common_progs/chmod.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/common_progs/chmod.yaml
@@ -1,10 +1,10 @@
 skip_assert_against_bash: true
-description: The chmod command is not a builtin and is rejected as unknown.
+description: The chmod command is not a builtin and is rejected as not found.
 input:
+  allowed_commands: ["chmod"]
   script: |+
     chmod 755 file.txt
 expect:
   stdout: ""
-  stderr: |+
-    chmod: command not found
+  stderr: "chmod: command not found\n"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/common_progs/cp.yaml b/tests/scenarios/cmd/unknown_cmd/common_progs/cp.yaml
index a97d09ce..abce13ac 100644
--- a/tests/scenarios/cmd/unknown_cmd/common_progs/cp.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/common_progs/cp.yaml
@@ -1,10 +1,10 @@
 skip_assert_against_bash: true
-description: The cp command is not a builtin and is rejected as unknown.
+description: The cp command is not a builtin and is rejected as not found.
 input:
+  allowed_commands: ["cp"]
   script: |+
     cp source.txt dest.txt
 expect:
   stdout: ""
-  stderr: |+
-    cp: command not found
+  stderr: "cp: command not found\n"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/common_progs/curl.yaml b/tests/scenarios/cmd/unknown_cmd/common_progs/curl.yaml
index 279be7a8..cc87eb51 100644
--- a/tests/scenarios/cmd/unknown_cmd/common_progs/curl.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/common_progs/curl.yaml
@@ -1,10 +1,10 @@
 skip_assert_against_bash: true
-description: The curl command is not a builtin and is rejected as unknown.
+description: The curl command is not a builtin and is rejected as not found.
 input:
+  allowed_commands: ["curl"]
   script: |+
     curl http://example.com
 expect:
   stdout: ""
-  stderr: |+
-    curl: command not found
+  stderr_contains: ["curl: command not found"]
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/common_progs/mv.yaml b/tests/scenarios/cmd/unknown_cmd/common_progs/mv.yaml
index 44a18b31..15401baf 100644
--- a/tests/scenarios/cmd/unknown_cmd/common_progs/mv.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/common_progs/mv.yaml
@@ -1,10 +1,10 @@
 skip_assert_against_bash: true
-description: The mv command is not a builtin and is rejected as unknown.
+description: The mv command is not a builtin and is rejected as not found.
 input:
+  allowed_commands: ["mv"]
   script: |+
     mv old.txt new.txt
 expect:
   stdout: ""
-  stderr: |+
-    mv: command not found
+  stderr: "mv: command not found\n"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/common_progs/python.yaml b/tests/scenarios/cmd/unknown_cmd/common_progs/python.yaml
index 27a9f647..f96a832f 100644
--- a/tests/scenarios/cmd/unknown_cmd/common_progs/python.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/common_progs/python.yaml
@@ -1,10 +1,10 @@
 skip_assert_against_bash: true
-description: The python command is not a builtin and is rejected as unknown.
+description: The python command is not a builtin and is rejected as not found.
 input:
+  allowed_commands: ["python"]
   script: |+
     python -c "print('hello')"
 expect:
   stdout: ""
-  stderr: |+
-    python: command not found
+  stderr_contains: ["python: command not found"]
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/common_progs/rm.yaml b/tests/scenarios/cmd/unknown_cmd/common_progs/rm.yaml
index a65d58f2..5a849caf 100644
--- a/tests/scenarios/cmd/unknown_cmd/common_progs/rm.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/common_progs/rm.yaml
@@ -1,10 +1,10 @@
 skip_assert_against_bash: true
-description: The rm command is not a builtin and is rejected as unknown.
+description: The rm command is not a builtin and is rejected as not found.
 input:
+  allowed_commands: ["rm"]
   script: |+
     rm file.txt
 expect:
   stdout: ""
-  stderr: |+
-    rm: command not found
+  stderr: "rm: command not found\n"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/common_progs/sh.yaml b/tests/scenarios/cmd/unknown_cmd/common_progs/sh.yaml
index d9cb1edd..f72bcd78 100644
--- a/tests/scenarios/cmd/unknown_cmd/common_progs/sh.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/common_progs/sh.yaml
@@ -1,10 +1,10 @@
 skip_assert_against_bash: true
-description: The sh command is not a builtin and is rejected as unknown.
+description: The sh command is not a builtin and is rejected as not found.
 input:
+  allowed_commands: ["sh"]
   script: |+
     sh -c "echo hello"
 expect:
   stdout: ""
-  stderr: |+
-    sh: command not found
+  stderr: "sh: command not found\n"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/common_progs/wget.yaml b/tests/scenarios/cmd/unknown_cmd/common_progs/wget.yaml
index 924a8c5c..69cffef9 100644
--- a/tests/scenarios/cmd/unknown_cmd/common_progs/wget.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/common_progs/wget.yaml
@@ -1,10 +1,10 @@
 skip_assert_against_bash: true
-description: The wget command is not a builtin and is rejected as unknown.
+description: The wget command is not a builtin and is rejected as not found.
 input:
+  allowed_commands: ["wget"]
   script: |+
     wget http://example.com
 expect:
   stdout: ""
-  stderr: |+
-    wget: command not found
+  stderr_contains: ["wget: command not found"]
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/control_flow/or_chain_with_unknown.yaml b/tests/scenarios/cmd/unknown_cmd/control_flow/or_chain_with_unknown.yaml
index 7f9ec22d..54e81275 100644
--- a/tests/scenarios/cmd/unknown_cmd/control_flow/or_chain_with_unknown.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/control_flow/or_chain_with_unknown.yaml
@@ -1,5 +1,6 @@
 description: Unknown command in logical OR chain falls through to the next command.
 input:
+  allowed_commands: ["echo", "nonexistent_cmd"]
   script: |+
     nonexistent_cmd || echo "recovered"
     echo $?
@@ -8,6 +9,5 @@ expect:
     recovered
     0
   stderr_contains:
-    - "nonexistent_cmd"
-    - "command not found"
+    - "nonexistent_cmd: command not found"
   exit_code: 0
diff --git a/tests/scenarios/cmd/unknown_cmd/exit_code/and_operator_skips.yaml b/tests/scenarios/cmd/unknown_cmd/exit_code/and_operator_skips.yaml
index 3f6fcdf4..62f68005 100644
--- a/tests/scenarios/cmd/unknown_cmd/exit_code/and_operator_skips.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/exit_code/and_operator_skips.yaml
@@ -1,8 +1,10 @@
 description: The && operator skips the next command after an unknown command.
 input:
+  allowed_commands: ["echo", "foo"]
   script: |+
     foo && echo should_not_run
 expect:
   stdout: ""
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/exit_code/and_or_chain.yaml b/tests/scenarios/cmd/unknown_cmd/exit_code/and_or_chain.yaml
index fccadcd9..94abc363 100644
--- a/tests/scenarios/cmd/unknown_cmd/exit_code/and_or_chain.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/exit_code/and_or_chain.yaml
@@ -1,9 +1,11 @@
 description: Mixed && and || operators work correctly with unknown commands.
 input:
+  allowed_commands: ["echo", "foo"]
   script: |+
     foo && echo no || echo yes
 expect:
   stdout: |+
     yes
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 0
diff --git a/tests/scenarios/cmd/unknown_cmd/exit_code/exit_code_127.yaml b/tests/scenarios/cmd/unknown_cmd/exit_code/exit_code_127.yaml
index 9f8a58e2..868d84c4 100644
--- a/tests/scenarios/cmd/unknown_cmd/exit_code/exit_code_127.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/exit_code/exit_code_127.yaml
@@ -1,8 +1,10 @@
 description: An unknown command sets exit code 127.
 input:
+  allowed_commands: ["nonexistent"]
   script: |+
     nonexistent
 expect:
   stdout: ""
-  stderr_contains: ["nonexistent", "command not found"]
+  stderr_contains:
+    - "nonexistent: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/exit_code/exit_code_captured.yaml b/tests/scenarios/cmd/unknown_cmd/exit_code/exit_code_captured.yaml
index f340604d..7887450d 100644
--- a/tests/scenarios/cmd/unknown_cmd/exit_code/exit_code_captured.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/exit_code/exit_code_captured.yaml
@@ -1,9 +1,11 @@
 description: The exit code of an unknown command is captured by $? when using semicolons.
 input:
+  allowed_commands: ["echo", "nonexistent"]
   script: |+
     nonexistent; echo $?
 expect:
   stdout: |+
     127
-  stderr_contains: ["nonexistent", "command not found"]
+  stderr_contains:
+    - "nonexistent: command not found"
   exit_code: 0
diff --git a/tests/scenarios/cmd/unknown_cmd/exit_code/or_chain.yaml b/tests/scenarios/cmd/unknown_cmd/exit_code/or_chain.yaml
index d45688f6..34438037 100644
--- a/tests/scenarios/cmd/unknown_cmd/exit_code/or_chain.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/exit_code/or_chain.yaml
@@ -1,9 +1,12 @@
 description: A chain of unknown commands with || reaches the first valid command.
 input:
+  allowed_commands: ["echo", "foo", "bar"]
   script: |+
     foo || bar || echo reached
 expect:
   stdout: |+
     reached
-  stderr_contains: ["foo", "bar", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
+    - "bar: command not found"
   exit_code: 0
diff --git a/tests/scenarios/cmd/unknown_cmd/exit_code/or_operator_continues.yaml b/tests/scenarios/cmd/unknown_cmd/exit_code/or_operator_continues.yaml
index 846e8581..ff3aba8f 100644
--- a/tests/scenarios/cmd/unknown_cmd/exit_code/or_operator_continues.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/exit_code/or_operator_continues.yaml
@@ -1,9 +1,11 @@
 description: The || operator runs the next command after an unknown command.
 input:
+  allowed_commands: ["echo", "foo"]
   script: |+
     foo || echo fallback
 expect:
   stdout: |+
     fallback
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 0
diff --git a/tests/scenarios/cmd/unknown_cmd/exit_code/semicolon_continues.yaml b/tests/scenarios/cmd/unknown_cmd/exit_code/semicolon_continues.yaml
index 331449a2..1dfeeca0 100644
--- a/tests/scenarios/cmd/unknown_cmd/exit_code/semicolon_continues.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/exit_code/semicolon_continues.yaml
@@ -1,9 +1,11 @@
 description: A semicolon separator allows execution to continue after an unknown command.
 input:
+  allowed_commands: ["echo", "foo"]
   script: |+
     foo; echo after
 expect:
   stdout: |+
     after
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 0
diff --git a/tests/scenarios/cmd/unknown_cmd/with_args/args.yaml b/tests/scenarios/cmd/unknown_cmd/with_args/args.yaml
index 41acf64e..fb49f02a 100644
--- a/tests/scenarios/cmd/unknown_cmd/with_args/args.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/with_args/args.yaml
@@ -1,8 +1,10 @@
 description: An unknown command with arguments still prints "command not found" for the command name.
 input:
+  allowed_commands: ["foo"]
   script: |+
     foo arg1 arg2 arg3
 expect:
   stdout: ""
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/with_args/flags.yaml b/tests/scenarios/cmd/unknown_cmd/with_args/flags.yaml
index b0320d12..5aabf677 100644
--- a/tests/scenarios/cmd/unknown_cmd/with_args/flags.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/with_args/flags.yaml
@@ -1,8 +1,10 @@
 description: An unknown command with flag-like arguments still prints "command not found".
 input:
+  allowed_commands: ["foo"]
   script: |+
     foo --verbose -n 5
 expect:
   stdout: ""
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/with_args/quoted_args.yaml b/tests/scenarios/cmd/unknown_cmd/with_args/quoted_args.yaml
index bba97c3f..422ad221 100644
--- a/tests/scenarios/cmd/unknown_cmd/with_args/quoted_args.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/with_args/quoted_args.yaml
@@ -1,8 +1,10 @@
 description: An unknown command with quoted arguments still prints "command not found".
 input:
+  allowed_commands: ["foo"]
   script: |+
     foo "hello world" 'single quoted'
 expect:
   stdout: ""
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 127
diff --git a/tests/scenarios/cmd/unknown_cmd/with_args/variable_arg.yaml b/tests/scenarios/cmd/unknown_cmd/with_args/variable_arg.yaml
index b6c45c13..a7222c51 100644
--- a/tests/scenarios/cmd/unknown_cmd/with_args/variable_arg.yaml
+++ b/tests/scenarios/cmd/unknown_cmd/with_args/variable_arg.yaml
@@ -1,9 +1,11 @@
 description: An unknown command with a variable-expanded argument still prints "command not found".
 input:
+  allowed_commands: ["foo"]
   script: |+
     MY_VAR=hello
     foo $MY_VAR
 expect:
   stdout: ""
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 127
diff --git a/tests/scenarios/shell/allowed_commands/allowed_runs.yaml b/tests/scenarios/shell/allowed_commands/allowed_runs.yaml
new file mode 100644
index 00000000..a4132062
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/allowed_runs.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: Command in allowlist runs successfully
+input:
+  allowed_commands: ["echo"]
+  script: |+
+    echo hello
+expect:
+  stdout: "hello\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/shell/allowed_commands/case_sensitive.yaml b/tests/scenarios/shell/allowed_commands/case_sensitive.yaml
new file mode 100644
index 00000000..09c4e0a6
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/case_sensitive.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: Command names in allowlist are case-sensitive
+input:
+  allowed_commands: ["echo"]
+  script: |+
+    Echo hello
+expect:
+  stdout: ""
+  stderr: "Echo: command not allowed\n"
+  exit_code: 1
diff --git a/tests/scenarios/shell/allowed_commands/command_substitution_blocked.yaml b/tests/scenarios/shell/allowed_commands/command_substitution_blocked.yaml
new file mode 100644
index 00000000..973dc39d
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/command_substitution_blocked.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: Command substitution is blocked by the shell itself, not by AllowedCommands
+input:
+  allowed_commands: ["echo"]
+  script: |+
+    echo $(cat /dev/null)
+expect:
+  stdout: ""
+  stderr: "command substitution is not supported\n"
+  exit_code: 2
diff --git a/tests/scenarios/shell/allowed_commands/default_blocks_all.yaml b/tests/scenarios/shell/allowed_commands/default_blocks_all.yaml
new file mode 100644
index 00000000..8173af41
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/default_blocks_all.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: Empty AllowedCommands list blocks all commands
+input:
+  allowed_commands: []
+  script: |+
+    echo hello
+expect:
+  stdout: ""
+  stderr: "echo: command not allowed\n"
+  exit_code: 1
diff --git a/tests/scenarios/shell/allowed_commands/disallowed_blocked.yaml b/tests/scenarios/shell/allowed_commands/disallowed_blocked.yaml
new file mode 100644
index 00000000..3e739a08
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/disallowed_blocked.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: Command not in allowlist returns exit 1 with error message
+input:
+  allowed_commands: ["echo"]
+  script: |+
+    cat foo
+expect:
+  stdout: ""
+  stderr: "cat: command not allowed\n"
+  exit_code: 1
diff --git a/tests/scenarios/shell/allowed_commands/disallowed_in_for_loop.yaml b/tests/scenarios/shell/allowed_commands/disallowed_in_for_loop.yaml
new file mode 100644
index 00000000..f2394476
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/disallowed_in_for_loop.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: Disallowed command in for-loop body is blocked
+input:
+  allowed_commands: ["echo"]
+  script: |+
+    for i in a b; do cat /dev/null; done
+expect:
+  stdout: ""
+  stderr: "cat: command not allowed\ncat: command not allowed\n"
+  exit_code: 1
diff --git a/tests/scenarios/shell/allowed_commands/disallowed_in_pipeline.yaml b/tests/scenarios/shell/allowed_commands/disallowed_in_pipeline.yaml
new file mode 100644
index 00000000..8cfc632d
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/disallowed_in_pipeline.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: Disallowed command in left side of pipe is blocked
+input:
+  allowed_commands: ["cat"]
+  script: |+
+    echo hello | cat
+expect:
+  stdout: ""
+  stderr: "echo: command not allowed\n"
+  exit_code: 0
diff --git a/tests/scenarios/shell/allowed_commands/disallowed_in_pipeline_right.yaml b/tests/scenarios/shell/allowed_commands/disallowed_in_pipeline_right.yaml
new file mode 100644
index 00000000..c3df3478
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/disallowed_in_pipeline_right.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: Disallowed command in right side of pipe is blocked
+input:
+  allowed_commands: ["echo"]
+  script: |+
+    echo hello | cat
+expect:
+  stdout: ""
+  stderr: "cat: command not allowed\n"
+  exit_code: 1
diff --git a/tests/scenarios/shell/allowed_commands/external_no_handler.yaml b/tests/scenarios/shell/allowed_commands/external_no_handler.yaml
new file mode 100644
index 00000000..5e284f9f
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/external_no_handler.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: External command in allowlist with no ExecHandler produces command not found exit 127
+input:
+  allowed_commands: ["echo", "nonexistent_external_cmd"]
+  script: |+
+    nonexistent_external_cmd
+expect:
+  stdout: ""
+  stderr: "nonexistent_external_cmd: command not found\n"
+  exit_code: 127
diff --git a/tests/scenarios/shell/allowed_commands/keywords_still_work.yaml b/tests/scenarios/shell/allowed_commands/keywords_still_work.yaml
new file mode 100644
index 00000000..f3e3d9e1
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/keywords_still_work.yaml
@@ -0,0 +1,13 @@
+skip_assert_against_bash: true
+description: Shell keywords (if/else, pipes, &&/||) still work with restricted commands
+input:
+  allowed_commands: ["echo", "true", "false", "cat"]
+  script: |+
+    if true; then echo yes; else echo no; fi
+    false || echo fallback
+    true && echo chained
+    echo piped | cat
+expect:
+  stdout: "yes\nfallback\nchained\npiped\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/shell/allowed_commands/multiple_allowed.yaml b/tests/scenarios/shell/allowed_commands/multiple_allowed.yaml
new file mode 100644
index 00000000..b4848f21
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/multiple_allowed.yaml
@@ -0,0 +1,12 @@
+skip_assert_against_bash: true
+description: Multiple allowed commands work
+input:
+  allowed_commands: ["echo", "true"]
+  script: |+
+    echo hello
+    true
+    echo world
+expect:
+  stdout: "hello\nworld\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/shell/allowed_commands/subshell_blocked.yaml b/tests/scenarios/shell/allowed_commands/subshell_blocked.yaml
new file mode 100644
index 00000000..a6495e8b
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/subshell_blocked.yaml
@@ -0,0 +1,10 @@
+skip_assert_against_bash: true
+description: Subshell is blocked by the shell itself, not by AllowedCommands
+input:
+  allowed_commands: ["echo"]
+  script: |+
+    (cat /dev/null)
+expect:
+  stdout: ""
+  stderr: "subshells are not supported\n"
+  exit_code: 2
diff --git a/tests/scenarios/shell/allowed_commands/variable_assignment_works.yaml b/tests/scenarios/shell/allowed_commands/variable_assignment_works.yaml
new file mode 100644
index 00000000..89decb7d
--- /dev/null
+++ b/tests/scenarios/shell/allowed_commands/variable_assignment_works.yaml
@@ -0,0 +1,11 @@
+skip_assert_against_bash: true
+description: Variable assignment works without needing to be in allowed commands
+input:
+  allowed_commands: ["echo"]
+  script: |+
+    FOO=bar
+    echo $FOO
+expect:
+  stdout: "bar\n"
+  stderr: ""
+  exit_code: 0
diff --git a/tests/scenarios/shell/allowed_paths/symlink_chain_escape.yaml b/tests/scenarios/shell/allowed_paths/symlink_chain_escape.yaml
index 0e8039f6..4684a532 100644
--- a/tests/scenarios/shell/allowed_paths/symlink_chain_escape.yaml
+++ b/tests/scenarios/shell/allowed_paths/symlink_chain_escape.yaml
@@ -14,5 +14,5 @@ input:
     cat allowed/start
 expect:
   stdout: ""
-  stderr: "cat: allowed/start: openat start: path escapes from parent\n"
+  stderr: "cat: allowed/start: path escapes from parent\n"
   exit_code: 1
diff --git a/tests/scenarios/shell/allowed_paths/symlink_escape_to_dir.yaml b/tests/scenarios/shell/allowed_paths/symlink_escape_to_dir.yaml
index 89b37aa3..e4f5c1c9 100644
--- a/tests/scenarios/shell/allowed_paths/symlink_escape_to_dir.yaml
+++ b/tests/scenarios/shell/allowed_paths/symlink_escape_to_dir.yaml
@@ -12,6 +12,5 @@ input:
     cat allowed/escape_dir/data.txt
 expect:
   stdout: ""
-  stderr: "cat: allowed/escape_dir/data.txt: openat escape_dir/data.txt: path escapes from parent\n"
-  stderr_windows: "cat: allowed/escape_dir/data.txt: openat escape_dir\\data.txt: path escapes from parent\n"
+  stderr: "cat: allowed/escape_dir/data.txt: path escapes from parent\n"
   exit_code: 1
diff --git a/tests/scenarios/shell/allowed_paths/symlink_escape_to_file.yaml b/tests/scenarios/shell/allowed_paths/symlink_escape_to_file.yaml
index 240e9d8a..ee63064b 100644
--- a/tests/scenarios/shell/allowed_paths/symlink_escape_to_file.yaml
+++ b/tests/scenarios/shell/allowed_paths/symlink_escape_to_file.yaml
@@ -12,5 +12,5 @@ input:
     cat allowed/escape_link
 expect:
   stdout: ""
-  stderr: "cat: allowed/escape_link: openat escape_link: path escapes from parent\n"
+  stderr: "cat: allowed/escape_link: path escapes from parent\n"
   exit_code: 1
diff --git a/tests/scenarios/shell/blocked_commands/blocked_eval.yaml b/tests/scenarios/shell/blocked_commands/blocked_eval.yaml
index 6c134cd5..5c07bb3c 100644
--- a/tests/scenarios/shell/blocked_commands/blocked_eval.yaml
+++ b/tests/scenarios/shell/blocked_commands/blocked_eval.yaml
@@ -2,10 +2,10 @@
 skip_assert_against_bash: true
 description: Eval command is not available.
 input:
+  allowed_commands: ["echo", "eval"]
   script: |+
     eval echo hello
 expect:
   stdout: ""
-  stderr: |+
-    eval: command not found
+  stderr: "eval: command not found\n"
   exit_code: 127
diff --git a/tests/scenarios/shell/cmd_separator/exit_code/unknown_command_continues.yaml b/tests/scenarios/shell/cmd_separator/exit_code/unknown_command_continues.yaml
index 0633fe60..f0f91b87 100644
--- a/tests/scenarios/shell/cmd_separator/exit_code/unknown_command_continues.yaml
+++ b/tests/scenarios/shell/cmd_separator/exit_code/unknown_command_continues.yaml
@@ -1,9 +1,11 @@
 description: An unknown command does not prevent subsequent commands from running.
 input:
+  allowed_commands: ["echo", "nonexistent_cmd"]
   script: |+
     nonexistent_cmd; echo after
 expect:
   stdout: |+
     after
-  stderr_contains: ["nonexistent_cmd", "command not found"]
+  stderr_contains:
+    - "nonexistent_cmd: command not found"
   exit_code: 0
diff --git a/tests/scenarios/shell/errors/command_not_found.yaml b/tests/scenarios/shell/errors/command_not_found.yaml
index e509f86e..0b7bb8b4 100644
--- a/tests/scenarios/shell/errors/command_not_found.yaml
+++ b/tests/scenarios/shell/errors/command_not_found.yaml
@@ -1,10 +1,11 @@
-skip_assert_against_bash: true
+# stderr_contains (not stderr) because bash prefixes errors with "bash: line N:"
+# while rshell uses a simpler format. This enables bash comparison testing.
 description: Unknown command returns exit code 127.
 input:
+  allowed_commands: ["no_such_command_xyz"]
   script: |+
     no_such_command_xyz
 expect:
   stdout: ""
-  stderr: |+
-    no_such_command_xyz: command not found
+  stderr_contains: ["no_such_command_xyz: command not found"]
   exit_code: 127
diff --git a/tests/scenarios/shell/errors/command_not_found_exit_code.yaml b/tests/scenarios/shell/errors/command_not_found_exit_code.yaml
index 9b0d65f4..b4f88fbc 100644
--- a/tests/scenarios/shell/errors/command_not_found_exit_code.yaml
+++ b/tests/scenarios/shell/errors/command_not_found_exit_code.yaml
@@ -1,13 +1,14 @@
-# skip: command-not-found error format differs from bash (no script:line prefix)
-skip_assert_against_bash: true
+# stderr_contains (not stderr) because bash prefixes errors with "bash: line N:"
+# while rshell uses a simpler format. This enables bash comparison testing.
 description: Multiple unknown commands each produce errors, last exit code wins.
 input:
+  allowed_commands: ["unknown_cmd_1", "unknown_cmd_2"]
   script: |+
     unknown_cmd_1
     unknown_cmd_2
 expect:
   stdout: ""
-  stderr: |+
-    unknown_cmd_1: command not found
-    unknown_cmd_2: command not found
+  stderr_contains:
+    - "unknown_cmd_1: command not found"
+    - "unknown_cmd_2: command not found"
   exit_code: 127
diff --git a/tests/scenarios/shell/errors/command_not_found_in_if.yaml b/tests/scenarios/shell/errors/command_not_found_in_if.yaml
index 3950e283..333407b0 100644
--- a/tests/scenarios/shell/errors/command_not_found_in_if.yaml
+++ b/tests/scenarios/shell/errors/command_not_found_in_if.yaml
@@ -1,10 +1,11 @@
-# skip: error message format differs from bash
-skip_assert_against_bash: true
+# stderr_contains (not stderr) because bash prefixes errors with "bash: line N:"
+# while rshell uses a simpler format. This enables bash comparison testing.
 description: Command not found in if condition causes else branch to execute.
 input:
+  allowed_commands: ["echo", "notacmd"]
   script: |+
     if notacmd; then echo yes; else echo no; fi
 expect:
   stdout: "no\n"
-  stderr: "notacmd: command not found\n"
+  stderr_contains: ["notacmd: command not found"]
   exit_code: 0
diff --git a/tests/scenarios/shell/errors/command_not_found_in_pipeline.yaml b/tests/scenarios/shell/errors/command_not_found_in_pipeline.yaml
index f1e2c1e6..e2c5de81 100644
--- a/tests/scenarios/shell/errors/command_not_found_in_pipeline.yaml
+++ b/tests/scenarios/shell/errors/command_not_found_in_pipeline.yaml
@@ -1,7 +1,8 @@
-# skip: command-not-found error format differs from bash (no script:line prefix)
-skip_assert_against_bash: true
+# stderr_contains (not stderr) because bash prefixes errors with "bash: line N:"
+# while rshell uses a simpler format. This enables bash comparison testing.
 description: Unknown command in a pipeline produces error but pipeline continues.
 input:
+  allowed_commands: ["echo", "unknown_filter_cmd"]
   script: |+
     echo hello | unknown_filter_cmd
 expect:
diff --git a/tests/scenarios/shell/errors/error_exit_code_propagation.yaml b/tests/scenarios/shell/errors/error_exit_code_propagation.yaml
index 6adc5012..7dfa990f 100644
--- a/tests/scenarios/shell/errors/error_exit_code_propagation.yaml
+++ b/tests/scenarios/shell/errors/error_exit_code_propagation.yaml
@@ -1,13 +1,13 @@
-# skip: command-not-found error format differs from bash (no script:line prefix)
-skip_assert_against_bash: true
+# stderr_contains (not stderr) because bash prefixes errors with "bash: line N:"
+# while rshell uses a simpler format. This enables bash comparison testing.
 description: Exit code from failed command is available in $?.
 input:
+  allowed_commands: ["echo", "nonexistent_cmd_xyz"]
   script: |+
     nonexistent_cmd_xyz
     echo "$?"
 expect:
   stdout: |+
     127
-  stderr: |+
-    nonexistent_cmd_xyz: command not found
+  stderr_contains: ["nonexistent_cmd_xyz: command not found"]
   exit_code: 0
diff --git a/tests/scenarios/shell/errors/multiple_command_not_found.yaml b/tests/scenarios/shell/errors/multiple_command_not_found.yaml
index 91270ccd..9c161876 100644
--- a/tests/scenarios/shell/errors/multiple_command_not_found.yaml
+++ b/tests/scenarios/shell/errors/multiple_command_not_found.yaml
@@ -1,15 +1,16 @@
-# skip: error message format differs from bash
-skip_assert_against_bash: true
+# stderr_contains (not stderr) because bash prefixes errors with "bash: line N:"
+# while rshell uses a simpler format. This enables bash comparison testing.
 description: Multiple unknown commands each produce command-not-found errors.
 input:
+  allowed_commands: ["notacmd1", "notacmd2", "notacmd3"]
   script: |+
     notacmd1
     notacmd2
     notacmd3
 expect:
   stdout: ""
-  stderr: |+
-    notacmd1: command not found
-    notacmd2: command not found
-    notacmd3: command not found
+  stderr_contains:
+    - "notacmd1: command not found"
+    - "notacmd2: command not found"
+    - "notacmd3: command not found"
   exit_code: 127
diff --git a/tests/scenarios/shell/errors/syntax_error_kills_shell.yaml b/tests/scenarios/shell/errors/syntax_error_kills_shell.yaml
index 2a8532e8..36bc1769 100644
--- a/tests/scenarios/shell/errors/syntax_error_kills_shell.yaml
+++ b/tests/scenarios/shell/errors/syntax_error_kills_shell.yaml
@@ -1,6 +1,8 @@
-skip_assert_against_bash: true
+# stderr_contains (not stderr) because bash prefixes errors with "bash: line N:"
+# while rshell uses a simpler format. This enables bash comparison testing.
 description: Unknown command after valid command still produces error.
 input:
+  allowed_commands: ["echo", "no_such_cmd_abc"]
   script: |+
     echo before
     no_such_cmd_abc
@@ -9,6 +11,5 @@ expect:
   stdout: |+
     before
     after
-  stderr: |+
-    no_such_cmd_abc: command not found
+  stderr_contains: ["no_such_cmd_abc: command not found"]
   exit_code: 0
diff --git a/tests/scenarios/shell/errors/valid_after_error.yaml b/tests/scenarios/shell/errors/valid_after_error.yaml
index b9ca56a8..aa366372 100644
--- a/tests/scenarios/shell/errors/valid_after_error.yaml
+++ b/tests/scenarios/shell/errors/valid_after_error.yaml
@@ -1,7 +1,8 @@
-# skip: command-not-found error format differs from bash (no script:line prefix)
-skip_assert_against_bash: true
+# stderr_contains (not stderr) because bash prefixes errors with "bash: line N:"
+# while rshell uses a simpler format. This enables bash comparison testing.
 description: Valid commands execute normally after a command-not-found error.
 input:
+  allowed_commands: ["echo", "nonexistent_cmd"]
   script: |+
     echo first
     nonexistent_cmd
@@ -10,6 +11,5 @@ expect:
   stdout: |+
     first
     third
-  stderr: |+
-    nonexistent_cmd: command not found
+  stderr_contains: ["nonexistent_cmd: command not found"]
   exit_code: 0
diff --git a/tests/scenarios/shell/for_clause/exit_code/unknown_cmd_in_body.yaml b/tests/scenarios/shell/for_clause/exit_code/unknown_cmd_in_body.yaml
index cb29f080..0c8efcde 100644
--- a/tests/scenarios/shell/for_clause/exit_code/unknown_cmd_in_body.yaml
+++ b/tests/scenarios/shell/for_clause/exit_code/unknown_cmd_in_body.yaml
@@ -1,8 +1,10 @@
 description: Unknown command in for loop body produces error on stderr.
 input:
+  allowed_commands: ["nonexistent_cmd"]
   script: |+
     for i in a; do nonexistent_cmd; done
 expect:
   stdout: ""
-  stderr_contains: ["nonexistent_cmd", "command not found"]
+  stderr_contains:
+    - "nonexistent_cmd: command not found"
   exit_code: 127
diff --git a/tests/scenarios/shell/negation/basic/negate_unknown_cmd.yaml b/tests/scenarios/shell/negation/basic/negate_unknown_cmd.yaml
index dae3b22a..13c421ff 100644
--- a/tests/scenarios/shell/negation/basic/negate_unknown_cmd.yaml
+++ b/tests/scenarios/shell/negation/basic/negate_unknown_cmd.yaml
@@ -1,5 +1,6 @@
 description: Negating a command-not-found (exit 127) produces exit code 0.
 input:
+  allowed_commands: ["nonexistent_command_xyz"]
   script: |+
     ! nonexistent_command_xyz
 expect:
diff --git a/tests/scenarios/shell/pipe/errors/left.yaml b/tests/scenarios/shell/pipe/errors/left.yaml
index 6671cd5a..9b1e6f12 100644
--- a/tests/scenarios/shell/pipe/errors/left.yaml
+++ b/tests/scenarios/shell/pipe/errors/left.yaml
@@ -1,9 +1,11 @@
 description: Unknown command on left side of pipe sends error to stderr; right side still runs.
 input:
+  allowed_commands: ["echo", "foo"]
   script: |+
     foo | echo ok
 expect:
   stdout: |+
     ok
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 0
diff --git a/tests/scenarios/shell/pipe/errors/right.yaml b/tests/scenarios/shell/pipe/errors/right.yaml
index 64faa8da..07c680d3 100644
--- a/tests/scenarios/shell/pipe/errors/right.yaml
+++ b/tests/scenarios/shell/pipe/errors/right.yaml
@@ -1,8 +1,10 @@
 description: Unknown command on right side of pipe sets exit code 127.
 input:
+  allowed_commands: ["echo", "foo"]
   script: |+
     echo ok | foo
 expect:
   stdout: ""
-  stderr_contains: ["foo", "command not found"]
+  stderr_contains:
+    - "foo: command not found"
   exit_code: 127
diff --git a/tests/scenarios/shell/var_expand/special_variables/status_after_unknown_cmd.yaml b/tests/scenarios/shell/var_expand/special_variables/status_after_unknown_cmd.yaml
index fac0e872..3fb939a9 100644
--- a/tests/scenarios/shell/var_expand/special_variables/status_after_unknown_cmd.yaml
+++ b/tests/scenarios/shell/var_expand/special_variables/status_after_unknown_cmd.yaml
@@ -1,5 +1,6 @@
 description: The $? variable is 127 after an unknown command.
 input:
+  allowed_commands: ["echo", "nonexistent_cmd"]
   script: |+
     nonexistent_cmd
     echo $?
@@ -7,5 +8,5 @@ expect:
   stdout: |+
     127
   stderr_contains:
-    - "not found"
+    - "command not found"
   exit_code: 0
diff --git a/tests/scenarios_test.go b/tests/scenarios_test.go
index 3f421904..04019d8a 100644
--- a/tests/scenarios_test.go
+++ b/tests/scenarios_test.go
@@ -59,9 +59,10 @@ type input struct {
 	Envs map[string]string `yaml:"envs"`
 	// InterpreterEnv sets initial environment variables for the restricted
 	// interpreter via the Env RunnerOption. These are passed as "KEY=value" pairs.
-	InterpreterEnv map[string]string `yaml:"interpreter_env"`
-	Script         string            `yaml:"script"`
-	AllowedPaths   []string          `yaml:"allowed_paths"` // relative to test temp dir; "$DIR" resolves to temp dir itself
+	InterpreterEnv  map[string]string `yaml:"interpreter_env"`
+	Script          string            `yaml:"script"`
+	AllowedPaths    []string          `yaml:"allowed_paths"`    // relative to test temp dir; "$DIR" resolves to temp dir itself
+	AllowedCommands []string          `yaml:"allowed_commands"` // restrict which commands may execute
 }
 
 // expected holds the expected output for a scenario.
@@ -159,6 +160,14 @@ func runScenario(t *testing.T, sc scenario) {
 		}
 		opts = append(opts, interp.Env(pairs...))
 	}
+	if sc.Input.AllowedCommands != nil {
+		opts = append(opts, interp.AllowedCommands(sc.Input.AllowedCommands))
+	} else {
+		// Default: allow all commands for test convenience so that
+		// existing scenarios (without allowed_commands) keep working.
+		// Note: the production default (no option) blocks all commands.
+		opts = append(opts, interp.AllowAllCommands())
+	}
 	if sc.Input.AllowedPaths != nil {
 		resolved := make([]string, len(sc.Input.AllowedPaths))
 		for i, p := range sc.Input.AllowedPaths {