DataDog · thieman · Mar 13, 2026 · Mar 12, 2026
@@ -72,6 +72,7 @@ This repo has the following CI jobs (defined in `.github/workflows/`):
 | `test.yml` | `Test (windows-latest)` | `go test -race -v ./...` on Windows |
 | `test.yml` | `Test against Bash (Docker)` | `RSHELL_BASH_TEST=1 go test -v -run TestShellScenariosAgainstBash ./tests/` |
 | `compliance.yml` | `compliance` | `RSHELL_COMPLIANCE_TEST=1 go test -v -run TestCompliance ./tests/` |
+| `fuzz.yml` | `Fuzz (<name>)` | Runs the seed corpus, then fuzzes each non-differential `Fuzz*` function for 30 s; one matrix job per builtin package listed in `fuzz.yml` |
 
 Classify each failure:
 
@@ -83,6 +84,7 @@ Classify each failure:
 | **Bash comparison failure** | YAML scenario output differs from bash | Use the `fix-tests` skill workflow (determine what bash does, then fix) |
 | **Compliance failure** | Compliance check fails | Read the compliance test to understand the rule, then fix the violation |
 | **Platform-specific failure** | Passes on some OSes but not others | Check for platform-dependent behavior (path separators, line endings, etc.) |
+| **Fuzz failure** | A `Fuzz*` test found an input that caused an unexpected exit code or error | See fuzz fix workflow below |
 
 ### 4. Reproduce failures locally
 
@@ -146,6 +148,23 @@ For each failure, apply the appropriate fix:
 2. Use `stdout_windows`/`stderr_windows` fields in YAML scenarios for Windows-specific output
 3. Use build tags (`//go:build unix` / `//go:build windows`) for platform-specific test files
 
+**Fuzz failures:**
+
+The CI logs will contain the failing input inline, e.g.:
+```
+--- FAIL: FuzzGrepFixedStrings
+    grep_fuzz_test.go:240: grep -F unexpected exit code 2
+    Failing input written to testdata/fuzz/FuzzGrepFixedStrings/abc123
+    To re-run: go test -run=FuzzGrepFixedStrings/abc123
+```
+
+1. Read the failing input from the log (it is printed as a `go test fuzz v1` file)
+2. Create the corpus file manually at `interp/builtins/tests/<pkg>/testdata/fuzz/<FuzzFuncName>/<hash>` with that content
+3. Reproduce locally: `go test -run=FuzzFuncName/hash ./interp/builtins/tests/<pkg>/`
+4. Fix the bug in the implementation (never weaken the fuzz filter to hide the bug)
+5. Verify the corpus entry now passes: `go test -run=FuzzFuncName/hash ./interp/builtins/tests/<pkg>/`
+6. **Commit the corpus file** — it becomes a permanent regression test
+
 ### 7. Verify all fixes
 
 Run the full test suite locally:

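As an illustration of steps 1 to 3 of the fuzz fix workflow above, the corpus path and re-run target can be pulled out of a CI log mechanically. This is a minimal sketch, assuming the log contains a `Failing input written to testdata/fuzz/<FuzzFuncName>/<hash>` line shaped like the excerpt in the diff; `parseFailingInput` is a hypothetical helper, not something in the repo.

```go
package main

import (
	"fmt"
	"regexp"
)

// failingInputRe matches the "Failing input written to ..." line shown in
// the log excerpt. The exact log shape is an assumption of this sketch.
var failingInputRe = regexp.MustCompile(`Failing input written to (testdata/fuzz/([^/\s]+)/(\S+))`)

// parseFailingInput (hypothetical helper) returns the corpus path, the fuzz
// function name, and the corpus entry name found in a CI log.
func parseFailingInput(log string) (path, fuzzFunc, entry string, ok bool) {
	m := failingInputRe.FindStringSubmatch(log)
	if m == nil {
		return "", "", "", false
	}
	return m[1], m[2], m[3], true
}

func main() {
	log := "--- FAIL: FuzzGrepFixedStrings\n" +
		"    Failing input written to testdata/fuzz/FuzzGrepFixedStrings/abc123\n"
	if path, fn, entry, ok := parseFailingInput(log); ok {
		fmt.Println("corpus file:", path)
		// <pkg> is left as a placeholder, as in the workflow text.
		fmt.Printf("re-run: go test -run=%s/%s ./interp/builtins/tests/<pkg>/\n", fn, entry)
	}
}
```

The path in the log is relative to the package directory, so it still has to be joined with `interp/builtins/tests/<pkg>/` when creating the file.
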
@@ -90,7 +90,24 @@ For failures where the test expectation is wrong (not matching bash):
    RSHELL_BASH_TEST=1 go test ./tests/ -run TestShellScenariosAgainstBash/<scenario> -timeout 120s -v
    ```
 
-### 6. Verify all fixes
+### 6. Fix fuzz failures
+
+If a `Fuzz*` test is failing (either a fuzzer-discovered corpus entry or a seed):
+
+1. Run it to see the error: `go test -v -run FuzzFuncName/corpushash ./interp/builtins/tests/<pkg>/`
+2. Fix the **implementation**. Do not weaken the fuzz input filter just to hide a bug
+3. Change the input filter only when the input is legitimately unsupported, and explain the reason in a comment next to the filter
+4. **Always commit the failing corpus file** at `testdata/fuzz/<FuzzFuncName>/<hash>` — it becomes a permanent regression test
+
+To reproduce a fuzzer-found crash from a log message, create the corpus file manually:
+```
+go test fuzz v1
+[]byte("...")
+string("...")
+```
+Place it at `interp/builtins/tests/<pkg>/testdata/fuzz/<FuzzFuncName>/<hash>` and re-run.
+
+### 7. Verify all fixes
 
 After all fixes are applied, run the full test suite:
 

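The corpus file layout shown above (a `go test fuzz v1` header followed by one typed, Go-quoted value per fuzz argument) can be generated rather than typed by hand, which avoids escaping mistakes with null bytes or invalid UTF-8. A minimal sketch follows; `corpusFile` is a hypothetical helper that handles only `[]byte` and `string` arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// corpusFile (hypothetical helper) renders fuzz arguments in the
// "go test fuzz v1" text format: a header line, then one Go-quoted value
// per argument, in the order the fuzz target declares them.
// Only []byte and string are handled in this sketch.
func corpusFile(args ...any) (string, error) {
	var b strings.Builder
	b.WriteString("go test fuzz v1\n")
	for _, a := range args {
		switch v := a.(type) {
		case []byte:
			fmt.Fprintf(&b, "[]byte(%q)\n", v)
		case string:
			fmt.Fprintf(&b, "string(%q)\n", v)
		default:
			return "", fmt.Errorf("unsupported argument type %T", a)
		}
	}
	return b.String(), nil
}

func main() {
	// The argument values here are illustrative only.
	out, err := corpusFile([]byte("a\x00b\n"), "-n 1")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

The argument order and types must match the signature of the `f.Fuzz` callback, otherwise `go test` rejects the entry.
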
@@ -14,7 +14,7 @@ You MUST follow this execution protocol. Skipping steps has caused defects in ev
 
 ### 1. Create the full task list FIRST
 
-Your very first action — before reading ANY files, before writing ANY code — is to call TaskCreate exactly 9 times, once for each step below (Steps 1–9). Use these exact subjects:
+Your very first action — before reading ANY files, before writing ANY code — is to call TaskCreate exactly 10 times, once for each step below (Steps 1–10). Use these exact subjects:
 
 1. "Step 1: Research the command"
 2. "Step 2: User confirms which flags to implement"
@@ -24,7 +24,8 @@ Your very first action — before reading ANY files, before writing ANY code —
 6. "Step 6: Verify and Harden"
 7. "Step 7: Code review"
 8. "Step 8: Exploratory pentest"
-9. "Step 9: Update documentation"
+9. "Step 9: Write fuzz tests"
+10. "Step 10: Update documentation"
 
 ### 2. Execution order and gating
 
@@ -38,7 +39,7 @@ Step 1 → Step 2 → Steps 3 + 4 + 5 (parallel) → Step 6 → Step 7 → Step
 
 **Parallel steps (3, 4, 5):** Once Step 2 is `completed`, set Steps 3, 4, and 5 all to `in_progress` at the same time and work on all three concurrently. The implementation (Step 5) and the tests (Steps 3, 4) are all guided by the approved spec from Step 2 — they do not need to wait for each other.
 
-**Convergence (6 → 7 → 8 → 9):** Before starting Step 6, call TaskList and verify Steps 3, 4, AND 5 are all `completed`. Then proceed sequentially through 6 → 7 → 8 → 9.
+**Convergence (6 → 7 → 8 → 9 → 10):** Before starting Step 6, call TaskList and verify Steps 3, 4, AND 5 are all `completed`. Then proceed sequentially through 6 → 7 → 8 → 9 → 10.
 
 Before marking any step as `completed`:
 - Re-read the step description and verify every sub-bullet is satisfied
@@ -495,10 +496,104 @@ For any case where behaviour differs from expectation, run the equivalent `gtail
 2. **Safer than GNU** — document; generally keep our behaviour
 3. **Worse than GNU** — fix it
 
-## Step 9: Update documentation
+## Step 9: Write fuzz tests
 
 **GATE CHECK**: Call TaskList. Step 8 must be `completed` before starting this step. Set this step to `in_progress` now.
 
+Create `interp/builtins/tests/$ARGUMENTS/$ARGUMENTS_fuzz_test.go` (`package $ARGUMENTS_test`).
+
+Without `-fuzz=`, `go test` runs only the seed corpus entries, as ordinary tests, so they are cheap to run in CI. Their job is to verify that the implementation never panics, crashes, or returns unexpected exit codes across a wide variety of inputs. Exit codes 0 and 1 are always acceptable; exit code 2 (usage error) is acceptable only for commands that use it (e.g. `test`), in which case extend the exit-code check in the pattern below to allow it; any other code or a panic is a failure.
+
+### Structure
+
+Each `Fuzz*` function follows this pattern:
+
+```go
+func FuzzCmdSomething(f *testing.F) {
+    // Seed corpus entries — each f.Add() is a test case run in non-fuzz mode
+    f.Add([]byte("normal input\n"))
+    f.Add([]byte{})
+    // ... more seeds ...
+
+    f.Fuzz(func(t *testing.T, input []byte /* + any extra args */) {
+        if len(input) > 1<<20 {
+            return // cap at 1 MiB
+        }
+        // filter out inputs that would cause shell parse errors
+        dir := t.TempDir() // fresh temp dir per input; write the input file here
+        // run the command with a 5-second timeout
+        ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+        defer cancel()
+        _, _, code := cmdRunCtxFuzz(ctx, t, "...", dir)
+        if code != 0 && code != 1 {
+            t.Errorf("unexpected exit code %d", code)
+        }
+    })
+}
+```
+
+Define `cmdRunCtxFuzz` (not `cmdRunCtx`, to avoid redeclaration conflicts with any existing test file in the package) at the top of the fuzz test file:
+
+```go
+func cmdRunCtxFuzz(ctx context.Context, t *testing.T, script, dir string) (string, string, int) {
+    t.Helper()
+    return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir}))
+}
+```
+
+Write one `Fuzz*` function per distinct mode of the command (e.g. `FuzzCmdLines`, `FuzzCmdBytes`, `FuzzCmdStdin`, `FuzzCmdFlags`). For commands with multiple flags, write one fuzz function per mode rather than jamming all flags into a single function — this keeps the seed corpus focused and makes failures easier to reproduce.
+
+### Seed corpus sources
+
+Build the seed corpus from **all three** of these sources. Do not skip any source — each catches different classes of bugs.
+
+**Source A: Implementation edge cases.** Read `interp/builtins/$ARGUMENTS.go` and identify every named constant, boundary check, special case, and clamp. Each one needs at least one seed:
+- Memory safety constants (e.g. `MaxLineBytes = 1 << 20`, `maxStringLen = 1 << 20`)
+- Counter/allocation clamps (e.g. `MaxCount = 1<<31-1`)
+- Buffer sizes and chunk boundaries (e.g. scanner init=4096, read chunks=32KiB)
+- Input encoding edge cases the implementation handles (CRLF, null bytes, invalid UTF-8, bare CR)
+- Boundary values: exactly at a limit, one below, one above
+- Degenerate inputs: empty, single byte, no trailing newline, all-identical lines, all-unique lines
+
+**Source B: CVE and security history.** Research which CVEs and security issues have affected the GNU implementation of `$ARGUMENTS` (and related tools like binutils for `strings`). For each vulnerability, add a seed that exercises the same class of input — even though our implementation may not share the same code path, these are the inputs real attackers will try:
+- Integer overflow inputs (very large `-n`/`-c` values: `MaxInt32`, `MaxInt64`, `MaxInt64+1`, `UINT64_MAX`)
+- Long-line inputs near and past historical buffer limits (4 KiB, 64 KiB, 1 MiB)
+- Null bytes embedded in content (triggered stack overflows in distro-patched versions of `uniq`, `sort`, `join`)
+- CRLF line endings (many CVEs involve incorrect line-ending handling)
+- Invalid UTF-8 sequences (surrogates, overlong encodings, bare continuation bytes)
+- Binary format magic bytes (ELF `\x7fELF`, PE `MZ`, ZIP `PK\x03\x04`) for commands that process file content
+- ANSI/terminal escape sequences in content (for commands that output filenames or text to a terminal)
+- ReDoS-class regex patterns for `grep` (e.g. `(a+)+`, `a*a*b`, `([a-z]+)*`)
+
+**Source C: Existing test coverage.** Read through `interp/builtins/tests/$ARGUMENTS/$ARGUMENTS_test.go` and `tests/scenarios/cmd/$ARGUMENTS/`. Every distinct input value, file content, or flag combination that appears in those tests should also appear as a seed corpus entry. This ensures that known-good cases are always in the fuzz corpus baseline, and that regressions found by the unit tests cannot escape fuzz coverage.
+
+### Verify
+
+Run all fuzz seed tests before committing:
+
+```bash
+go test ./interp/builtins/tests/$ARGUMENTS/ -run 'Fuzz' -count=1
+```
+
+All seeds must pass. Also run gofmt:
+
+```bash
+gofmt -l interp/builtins/tests/$ARGUMENTS/
+```
+
+No output means clean. Fix any formatting issues with `gofmt -w`.
+
+### CI integration
+
+Add an entry for the new fuzz package to `.github/workflows/fuzz.yml` under the `matrix.include` list so the fuzzer runs in CI:
+
+```yaml
+- pkg: ./interp/builtins/tests/$ARGUMENTS/
+  name: $ARGUMENTS
+```
+
+The workflow discovers every non-differential `Fuzz*` function in the package on its own, so the entry does not name a fuzz function.
+
+## Step 10: Update documentation
+
+**GATE CHECK**: Call TaskList. Step 9 must be `completed` before starting this step. Set this step to `in_progress` now.
+
 Verify that `SHELL_FEATURES.md` in the repository root does not need updates (e.g. if a new category of feature is added).
 
 After updating, verify the file looks correct, then commit everything together if not already committed, or amend/add to the existing commit.
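
Two of the rules in Step 9 above lend themselves to small helpers: the "exactly at a limit, one below, one above" seeds from Source A, and the exit-code policy (0 and 1 always, 2 only for commands that use it). The sketch below shows one way to express both; `boundarySeeds` and `exitCodeOK` are hypothetical names, and the limit passed in `main` is only an example value.

```go
package main

import (
	"bytes"
	"fmt"
)

// boundarySeeds (hypothetical helper) returns single-line inputs whose line
// length is one below, exactly at, and one above limit. Negative lengths
// are skipped.
func boundarySeeds(limit int) [][]byte {
	var seeds [][]byte
	for _, n := range []int{limit - 1, limit, limit + 1} {
		if n < 0 {
			continue
		}
		line := append(bytes.Repeat([]byte{'a'}, n), '\n')
		seeds = append(seeds, line)
	}
	return seeds
}

// exitCodeOK (hypothetical helper) encodes the exit-code policy: 0 and 1 are
// always acceptable, 2 only for commands that use it for usage errors.
func exitCodeOK(code int, usesExit2 bool) bool {
	return code == 0 || code == 1 || (usesExit2 && code == 2)
}

func main() {
	for _, s := range boundarySeeds(4096) {
		fmt.Println(len(s) - 1) // line length without the trailing newline
	}
	fmt.Println(exitCodeOK(2, false), exitCodeOK(2, true))
}
```

Inside a fuzz test, each slice from `boundarySeeds` would be passed to `f.Add`, and `exitCodeOK` would replace the inline `code != 0 && code != 1` check.
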
@@ -0,0 +1,88 @@
+name: Fuzz Tests
+
+on:
+  push:
+    branches: ['**']
+  pull_request:
+
+permissions:
+  contents: read
+
+jobs:
+  fuzz:
+    name: Fuzz (${{ matrix.name }})
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - pkg: ./interp/builtins/tests/head/
+            name: head
+          - pkg: ./interp/builtins/tests/cat/
+            name: cat
+          - pkg: ./interp/builtins/tests/wc/
+            name: wc
+          - pkg: ./interp/builtins/tests/tail/
+            name: tail
+          - pkg: ./interp/builtins/tests/grep/
+            name: grep
+          - pkg: ./interp/builtins/tests/cut/
+            name: cut
+          - pkg: ./interp/builtins/tests/echo/
+            name: echo
+          - pkg: ./interp/builtins/tests/uniq/
+            name: uniq
+          - pkg: ./interp/builtins/tests/strings_cmd/
+            name: strings_cmd
+          - pkg: ./interp/builtins/tests/testcmd/
+            name: testcmd
+          - pkg: ./interp/builtins/tests/ls/
+            name: ls
+    steps:
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+      - uses: actions/setup-go@4b73464bb391d4059bd26b0524d20df3927bd417 # v6.3.0
+        with:
+          go-version-file: .go-version
+
+      # Restore corpus from previous runs
+      - name: Restore fuzz corpus
+        uses: actions/cache/restore@v4
+        with:
+          path: |
+            interp/builtins/tests/${{ matrix.name }}/testdata/fuzz/
+          key: fuzz-corpus-${{ matrix.name }}-${{ github.sha }}
+          restore-keys: |
+            fuzz-corpus-${{ matrix.name }}-
+
+      # Run seed corpus as normal tests (fast, deterministic)
+      - name: Run fuzz seed corpus
+        run: |
+          # Find all Fuzz* functions in the package (excluding differential ones that need RSHELL_BASH_TEST)
+          FUZZ_FUNCS=$(grep -r '^func Fuzz' ${{ matrix.pkg }} 2>/dev/null | grep -v 'Differential' | sed 's/.*func \(Fuzz[^(]*\).*/\1/' | sort -u | tr '\n' '|' | sed 's/|$//')
+          if [ -n "$FUZZ_FUNCS" ]; then
+            go test -run "^(${FUZZ_FUNCS})$" ${{ matrix.pkg }} -timeout 120s
+          else
+            echo "No non-differential fuzz functions found in ${{ matrix.pkg }}, skipping"
+          fi
+
+      # Run actual fuzzing for a short duration
+      - name: Fuzz (${{ matrix.name }})
+        run: |
+          FUZZ_FUNCS=$(grep -r '^func Fuzz' ${{ matrix.pkg }} 2>/dev/null | grep -v 'Differential' | sed 's/.*func \(Fuzz[^(]*\).*/\1/' | sort -u)
+          if [ -z "$FUZZ_FUNCS" ]; then
+            echo "No fuzz targets found in ${{ matrix.pkg }}, skipping"
+            exit 0
+          fi
+          for FUNC in $FUZZ_FUNCS; do
+            echo "Fuzzing $FUNC..."
+            go test -fuzz="^${FUNC}$" -fuzztime=30s ${{ matrix.pkg }} -timeout 300s
+          done
+
+      # Save corpus
+      - name: Save fuzz corpus
+        uses: actions/cache/save@v4
+        if: always()
+        with:
+          path: |
+            interp/builtins/tests/${{ matrix.name }}/testdata/fuzz/
+          key: fuzz-corpus-${{ matrix.name }}-${{ github.sha }}
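
The `grep | sed | sort -u` pipeline in the workflow above is what decides which functions get fuzzed, so it helps to be clear about what it selects. The Go sketch below approximates the same selection on a source string: top-level `func Fuzz...` declarations, minus anything with `Differential` in the name, sorted and de-duplicated. `fuzzTargets` is a hypothetical helper written for illustration; the workflow itself uses the shell pipeline, which filters on the whole matched line rather than on the function name alone.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// fuzzFuncRe matches top-level declarations such as
// "func FuzzCmdLines(f *testing.F) {".
var fuzzFuncRe = regexp.MustCompile(`(?m)^func (Fuzz[^(]*)\(`)

// fuzzTargets (hypothetical helper) returns the sorted, de-duplicated names
// of Fuzz* functions declared in src, skipping any whose name contains
// "Differential".
func fuzzTargets(src string) []string {
	seen := map[string]bool{}
	var names []string
	for _, m := range fuzzFuncRe.FindAllStringSubmatch(src, -1) {
		name := m[1]
		if strings.Contains(name, "Differential") || seen[name] {
			continue
		}
		seen[name] = true
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Function names below are made up for the example.
	src := "func FuzzHeadLines(f *testing.F) {}\n" +
		"func FuzzHeadDifferential(f *testing.F) {}\n" +
		"func FuzzHeadBytes(f *testing.F) {}\n"
	names := fuzzTargets(src)
	// The seed-corpus step joins the names into a single -run pattern.
	fmt.Printf("go test -run '^(%s)$'\n", strings.Join(names, "|"))
}
```
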
@@ -24,6 +24,8 @@ jobs:
           go-version-file: .go-version
       - name: Run tests with race detector
         run: go test -race -v ./...
+      - name: Run fuzz seed corpus (regression test)
+        run: go test -run '^Fuzz' ./interp/builtins/... -timeout 120s
 
   test-against-bash:
     name: Test against Bash (Docker)
@@ -37,3 +39,16 @@ jobs:
         env:
           RSHELL_BASH_TEST: "1"
         run: go test -v -run TestShellScenariosAgainstBash ./tests/
+      - name: Fuzz differential tests against GNU tools
+        env:
+          RSHELL_BASH_TEST: "1"
+        run: |
+          OVERALL_STATUS=0
+          for PKG in ./interp/builtins/tests/cat/ ./interp/builtins/tests/head/ ./interp/builtins/tests/tail/ ./interp/builtins/tests/wc/; do
+            FUZZ_FUNCS=$(grep -r '^func Fuzz.*Differential' $PKG 2>/dev/null | sed 's/.*func \(Fuzz[^(]*\).*/\1/' | sort -u)
+            for FUNC in $FUZZ_FUNCS; do
+              echo "Fuzzing $FUNC in $PKG..."
+              go test -fuzz="^${FUNC}$" -fuzztime=30s $PKG -timeout 300s || OVERALL_STATUS=1
+            done
+          done
+          exit $OVERALL_STATUS
@@ -4,3 +4,7 @@
 /rshell
 
 .DS_Store
+
+# Fuzz corpus: keep checked in for regression testing.
+# Uncomment the line below if corpus grows too large:
+# interp/builtins/tests/*/testdata/fuzz/*/corpus-*