Merged
20 commits
49e1530
Add GitHub Actions workflow for continuous fuzz testing in CI
thieman Mar 12, 2026
a7f417f
Add memory benchmark assertions for streaming builtins (head, cat, wc…
thieman Mar 12, 2026
78475e0
Add differential fuzz tests comparing rshell builtins against GNU cor…
thieman Mar 12, 2026
55f3fb3
Address review comments: fix build failures and CI issues
thieman Mar 12, 2026
7eff229
Add native Go fuzz tests for builtin commands
thieman Mar 12, 2026
c0bd6f0
Address review comments on fuzz tests PR
thieman Mar 12, 2026
9182b2b
Consolidate native fuzz tests from PR #63 into differential fuzz PR
thieman Mar 12, 2026
cff0db4
Add native fuzz tests for all remaining builtins
thieman Mar 12, 2026
0c1e592
Expand fuzz seed corpuses across all builtins with implementation edg…
thieman Mar 12, 2026
d4e43d7
Add CVE-derived fuzz seeds: terminal injection, fixed strings, format…
thieman Mar 12, 2026
eda6f16
Add Step 9: Write fuzz tests to implement-posix-command skill
thieman Mar 12, 2026
9db3241
Remove bench/memory tests — out of scope for this fuzz PR
thieman Mar 12, 2026
a5b687d
Fix fuzz input filters: add UTF-8 and C1 control char guards
thieman Mar 12, 2026
0b598c6
Add fuzz failure handling to fix-ci-tests and fix-local-tests skills
thieman Mar 12, 2026
49d281c
Fix fuzz seed corpus CI step: remove invalid -fuzztime=0s flag
thieman Mar 12, 2026
fef803d
Fix fuzz CI: remove invalid -fuzztime=0s and add differential fuzz to…
thieman Mar 12, 2026
d424fef
Fix wc -w: C0 control chars are transparent in POSIX locale
thieman Mar 12, 2026
450c324
Restrict differential fuzz tests to Linux with LC_ALL=C.UTF-8
thieman Mar 13, 2026
61c44df
Fix wc: invalid UTF-8 bytes don't count as chars or words in C.UTF-8
thieman Mar 13, 2026
9e87279
Fix wc -w for Cn/unassigned and Zs/space-separator codepoints; fuzz a…
thieman Mar 13, 2026
19 changes: 19 additions & 0 deletions .claude/skills/fix-ci-tests/SKILL.md
@@ -72,6 +72,7 @@ This repo has the following CI jobs (defined in `.github/workflows/`):
| `test.yml` | `Test (windows-latest)` | `go test -race -v ./...` on Windows |
| `test.yml` | `Test against Bash (Docker)` | `RSHELL_BASH_TEST=1 go test -v -run TestShellScenariosAgainstBash ./tests/` |
| `compliance.yml` | `compliance` | `RSHELL_COMPLIANCE_TEST=1 go test -v -run TestCompliance ./tests/` |
| `fuzz.yml` | `Fuzz (<name>)` | Runs each `Fuzz*` function for 30 s per function; matrix across all builtin packages |

Classify each failure:

@@ -83,6 +84,7 @@ Classify each failure:
| **Bash comparison failure** | YAML scenario output differs from bash | Use the `fix-tests` skill workflow (determine what bash does, then fix) |
| **Compliance failure** | Compliance check fails | Read the compliance test to understand the rule, then fix the violation |
| **Platform-specific failure** | Passes on some OSes but not others | Check for platform-dependent behavior (path separators, line endings, etc.) |
| **Fuzz failure** | A `Fuzz*` test found an input that caused an unexpected exit code or error | See fuzz fix workflow below |

### 4. Reproduce failures locally

@@ -146,6 +148,23 @@ For each failure, apply the appropriate fix:
2. Use `stdout_windows`/`stderr_windows` fields in YAML scenarios for Windows-specific output
3. Use build tags (`//go:build unix` / `//go:build windows`) for platform-specific test files

**Fuzz failures:**

The CI logs will contain the failing input inline, e.g.:
```
--- FAIL: FuzzGrepFixedStrings
grep_fuzz_test.go:240: grep -F unexpected exit code 2
Failing input written to testdata/fuzz/FuzzGrepFixedStrings/abc123
To re-run: go test -run=FuzzGrepFixedStrings/abc123
```

1. Read the failing input from the log (it is printed as a `go test fuzz v1` file)
2. Create the corpus file manually at `interp/builtins/tests/<pkg>/testdata/fuzz/<FuzzFuncName>/<hash>` with that content
3. Reproduce locally: `go test -run=FuzzFuncName/hash ./interp/builtins/tests/<pkg>/`
4. Fix the bug in the implementation (never weaken the fuzz filter to hide the bug)
5. Verify the corpus entry now passes: `go test -run=FuzzFuncName/hash ./interp/builtins/tests/<pkg>/`
6. **Commit the corpus file** — it becomes a permanent regression test

### 7. Verify all fixes

Run the full test suite locally:
19 changes: 18 additions & 1 deletion .claude/skills/fix-local-tests/SKILL.md
@@ -90,7 +90,24 @@ For failures where the test expectation is wrong (not matching bash):
RSHELL_BASH_TEST=1 go test ./tests/ -run TestShellScenariosAgainstBash/<scenario> -timeout 120s -v
```

### 6. Verify all fixes
### 6. Fix fuzz failures

If a `Fuzz*` test is failing (either a fuzzer-discovered corpus entry or a seed):

1. Run it to see the error: `go test -v -run FuzzFuncName/corpushash ./interp/builtins/tests/<pkg>/`
2. Fix the **implementation**; do not weaken the fuzz input filter just to hide a bug
3. Tightening the input filter is acceptable only when the input is legitimately unsupported, and the reason must be stated in a comment next to the filter
4. **Always commit the failing corpus file** at `testdata/fuzz/<FuzzFuncName>/<hash>` — it becomes a permanent regression test

To reproduce a fuzzer-found crash from a log message, create the corpus file manually:
```
go test fuzz v1
[]byte("...")
string("...")
```
Place it at `interp/builtins/tests/<pkg>/testdata/fuzz/<FuzzFuncName>/<hash>` and re-run.
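A minimal sketch of producing such a corpus file programmatically, assuming a fuzz target that takes a single `[]byte` argument. The helper names `corpusEntry` and `writeCorpusFile`, the target name `FuzzExample`, and the file name `manual-repro` are hypothetical and not part of this repository; the encoding relies on `%q` producing a Go-syntax string literal.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// corpusEntry renders a single []byte argument in the "go test fuzz v1"
// corpus encoding: a version header followed by one Go literal per line.
func corpusEntry(input []byte) string {
	return fmt.Sprintf("go test fuzz v1\n[]byte(%q)\n", input)
}

// writeCorpusFile stores the entry at <pkgDir>/testdata/fuzz/<fuzzFunc>/<name>.
// All three path components are placeholders supplied by the caller.
func writeCorpusFile(pkgDir, fuzzFunc, name string, input []byte) (string, error) {
	dir := filepath.Join(pkgDir, "testdata", "fuzz", fuzzFunc)
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	path := filepath.Join(dir, name)
	return path, os.WriteFile(path, []byte(corpusEntry(input)), 0o644)
}

func main() {
	// Use a throwaway directory so the demo does not touch a real package.
	root, err := os.MkdirTemp("", "corpus-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	path, err := writeCorpusFile(root, "FuzzExample", "manual-repro", []byte("a\x00b\r\n"))
	if err != nil {
		panic(err)
	}
	fmt.Println("wrote", path)
}
```

For a target with several arguments, each argument would go on its own line after the header, in the order the fuzz function declares them.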

### 7. Verify all fixes

After all fixes are applied, run the full test suite:

103 changes: 99 additions & 4 deletions .claude/skills/implement-posix-command/SKILL.md
@@ -14,7 +14,7 @@ You MUST follow this execution protocol. Skipping steps has caused defects in ev

### 1. Create the full task list FIRST

Your very first action — before reading ANY files, before writing ANY code — is to call TaskCreate exactly 9 times, once for each step below (Steps 1–9). Use these exact subjects:
Your very first action — before reading ANY files, before writing ANY code — is to call TaskCreate exactly 10 times, once for each step below (Steps 1–10). Use these exact subjects:

1. "Step 1: Research the command"
2. "Step 2: User confirms which flags to implement"
@@ -24,7 +24,8 @@ Your very first action — before reading ANY files, before writing ANY code —
6. "Step 6: Verify and Harden"
7. "Step 7: Code review"
8. "Step 8: Exploratory pentest"
9. "Step 9: Update documentation"
9. "Step 9: Write fuzz tests"
10. "Step 10: Update documentation"

### 2. Execution order and gating

@@ -38,7 +39,7 @@ Step 1 → Step 2 → Steps 3 + 4 + 5 (parallel) → Step 6 → Step 7 → Step

**Parallel steps (3, 4, 5):** Once Step 2 is `completed`, set Steps 3, 4, and 5 all to `in_progress` at the same time and work on all three concurrently. The implementation (Step 5) and the tests (Steps 3, 4) are all guided by the approved spec from Step 2 — they do not need to wait for each other.

**Convergence (6 → 7 → 8 → 9):** Before starting Step 6, call TaskList and verify Steps 3, 4, AND 5 are all `completed`. Then proceed sequentially through 6 → 7 → 8 → 9.
**Convergence (6 → 7 → 8 → 9 → 10):** Before starting Step 6, call TaskList and verify Steps 3, 4, AND 5 are all `completed`. Then proceed sequentially through 6 → 7 → 8 → 9 → 10.

Before marking any step as `completed`:
- Re-read the step description and verify every sub-bullet is satisfied
@@ -495,10 +496,104 @@ For any case where behaviour differs from expectation, run the equivalent `gtail
2. **Safer than GNU** — document; generally keep our behaviour
3. **Worse than GNU** — fix it

## Step 9: Update documentation
## Step 9: Write fuzz tests

**GATE CHECK**: Call TaskList. Step 8 must be `completed` before starting this step. Set this step to `in_progress` now.

Create `interp/builtins/tests/$ARGUMENTS/$ARGUMENTS_fuzz_test.go` (`package $ARGUMENTS_test`).

Fuzz tests run seed corpus entries as normal tests (without `-fuzz=`), making them free to run in CI. Their job is to verify that the implementation never panics, crashes, or returns unexpected exit codes across a wide variety of inputs. Exit codes 0 and 1 are always acceptable; exit code 2 (usage error) is acceptable for commands that use it (e.g. `test`); any other code or a panic is a failure.
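The exit-code policy described above can be sketched as a small predicate. This is an illustration only: `acceptableExit` and its `allowUsageError` switch are hypothetical names, not helpers that exist in this repository.

```go
package main

import "fmt"

// acceptableExit reports whether an exit code counts as a pass in a fuzz run.
// Codes 0 and 1 always pass; code 2 passes only for commands that use it as
// a usage-error code (allowUsageError is a hypothetical per-command switch).
func acceptableExit(code int, allowUsageError bool) bool {
	switch code {
	case 0, 1:
		return true
	case 2:
		return allowUsageError
	default:
		return false
	}
}

func main() {
	for _, code := range []int{0, 1, 2, 127} {
		fmt.Printf("code=%d strict=%t lenient=%t\n",
			code, acceptableExit(code, false), acceptableExit(code, true))
	}
}
```

Keeping the rule in one place would make it harder for individual fuzz functions to drift toward accepting more codes than intended.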

### Structure

Each `Fuzz*` function follows this pattern:

```go
func FuzzCmdSomething(f *testing.F) {
	// Seed corpus entries: each f.Add() is a test case run in non-fuzz mode
	f.Add([]byte("normal input\n"))
	f.Add([]byte{})
	// ... more seeds ...

	f.Fuzz(func(t *testing.T, input []byte /* + any extra args */) {
		if len(input) > 1<<20 {
			return // cap at 1 MiB
		}
		// filter out inputs that would cause shell parse errors
		// create temp dir, write input file
		// run the command with a 5-second timeout
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		_, _, code := cmdRunCtxFuzz(ctx, t, "...", dir)
		if code != 0 && code != 1 {
			t.Errorf("unexpected exit code %d", code)
		}
	})
}
```
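The size cap and input filter in the pattern above could be factored into one predicate. The sketch below is an assumption about what such a filter might check (size, valid UTF-8, no C1 control characters, loosely following the commit "Fix fuzz input filters: add UTF-8 and C1 control char guards"); `fuzzInputOK` and `maxFuzzInput` are hypothetical names, and the real filters in this PR may differ.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxFuzzInput mirrors the 1 MiB cap used in the fuzz pattern.
const maxFuzzInput = 1 << 20

// fuzzInputOK is a hypothetical pre-filter for inputs that end up in the
// script text. It rejects oversized inputs, invalid UTF-8, and C1 control
// characters (U+0080..U+009F).
func fuzzInputOK(input []byte) bool {
	if len(input) > maxFuzzInput {
		return false
	}
	if !utf8.Valid(input) {
		return false
	}
	for _, r := range string(input) {
		if r >= 0x80 && r <= 0x9f {
			return false
		}
	}
	return true
}

func main() {
	samples := [][]byte{
		[]byte("hello\n"),
		{0xff},           // invalid UTF-8
		[]byte("\u0085"), // NEL, a C1 control character
	}
	for _, s := range samples {
		fmt.Printf("%q -> %t\n", s, fuzzInputOK(s))
	}
}
```

Inside `f.Fuzz`, a call such as `if !fuzzInputOK(input) { return }` would replace the separate cap and filter lines. Any rejection rule of this kind should carry a comment explaining why the input is unsupported.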

Define `cmdRunCtxFuzz` (not `cmdRunCtx`, to avoid redeclaration conflicts with any existing test file in the package) at the top of the fuzz test file:

```go
func cmdRunCtxFuzz(ctx context.Context, t *testing.T, script, dir string) (string, string, int) {
	t.Helper()
	return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir}))
}
```

Write one `Fuzz*` function per distinct mode of the command (e.g. `FuzzCmdLines`, `FuzzCmdBytes`, `FuzzCmdStdin`, `FuzzCmdFlags`). For commands with multiple flags, write one fuzz function per mode rather than jamming all flags into a single function — this keeps the seed corpus focused and makes failures easier to reproduce.

### Seed corpus sources

Build the seed corpus from **all three** of these sources. Do not skip any source — each catches different classes of bugs.

**Source A: Implementation edge cases.** Read `interp/builtins/$ARGUMENTS.go` and identify every named constant, boundary check, special case, and clamp. Each one needs at least one seed:
- Memory safety constants (e.g. `MaxLineBytes = 1 << 20`, `maxStringLen = 1 << 20`)
- Counter/allocation clamps (e.g. `MaxCount = 1<<31-1`)
- Buffer sizes and chunk boundaries (e.g. scanner init=4096, read chunks=32KiB)
- Input encoding edge cases the implementation handles (CRLF, null bytes, invalid UTF-8, bare CR)
- Boundary values: exactly at a limit, one below, one above
- Degenerate inputs: empty, single byte, no trailing newline, all-identical lines, all-unique lines
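The "exactly at a limit, one below, one above" rule can be generated rather than hand-written. A minimal sketch, assuming the limit applies to line length in bytes; `boundarySeeds` is a hypothetical helper and the value 4096 is only a stand-in for a constant read from the implementation.

```go
package main

import (
	"bytes"
	"fmt"
)

// boundarySeeds returns three single-line inputs whose line content is one
// byte below, exactly at, and one byte above limit. Each seed ends with a
// newline, so its total length is the line length plus one.
func boundarySeeds(limit int) [][]byte {
	var seeds [][]byte
	for _, n := range []int{limit - 1, limit, limit + 1} {
		if n < 0 {
			continue // skip impossible lengths for very small limits
		}
		line := bytes.Repeat([]byte{'a'}, n)
		seeds = append(seeds, append(line, '\n'))
	}
	return seeds
}

func main() {
	for _, s := range boundarySeeds(4096) {
		fmt.Println("seed length:", len(s))
	}
}
```

In a fuzz function each result would be registered with `f.Add(s)`. Note that seeds above the 1 MiB input cap are skipped by the fuzz body, so a limit at or near `1 << 20` needs the cap taken into account.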

**Source B: CVE and security history.** Research which CVEs and security issues have affected the GNU implementation of `$ARGUMENTS` (and related tools like binutils for `strings`). For each vulnerability, add a seed that exercises the same class of input — even though our implementation may not share the same code path, these are the inputs real attackers will try:
- Integer overflow inputs (very large `-n`/`-c` values: `MaxInt32`, `MaxInt64`, `MaxInt64+1`, `UINT64_MAX`)
- Long-line inputs near and past historical buffer limits (4KB, 64KB, 1 MiB)
- Null bytes embedded in content (triggered stack overflows in distro-patched versions of `uniq`, `sort`, `join`)
- CRLF line endings (many CVEs involve incorrect line-ending handling)
- Invalid UTF-8 sequences (surrogates, overlong encodings, bare continuation bytes)
- Binary format magic bytes (ELF `\x7fELF`, PE `MZ`, ZIP `PK\x03\x04`) for commands that process file content
- ANSI/terminal escape sequences in content (for commands that output filenames or text to a terminal)
- ReDoS-class regex patterns for `grep` (e.g. `(a+)+`, `a*a*b`, `([a-z]+)*`)
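A few of the input classes listed above, written out as concrete byte sequences. This is a sketch, not the seed list used in this PR: `namedSeed`, `securitySeeds`, and `countInvalidUTF8` are hypothetical names, and the selection covers only the content-based classes (not flag values or regex patterns).

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// namedSeed pairs a seed with a label so failures are easier to read.
type namedSeed struct {
	name string
	data []byte
}

// securitySeeds returns one example per content-based input class.
func securitySeeds() []namedSeed {
	return []namedSeed{
		{"embedded NUL", []byte("a\x00b\n")},
		{"CRLF line endings", []byte("one\r\ntwo\r\n")},
		{"bare continuation byte", []byte{0x80, '\n'}},
		{"overlong encoding of '/'", []byte{0xc0, 0xaf, '\n'}},
		{"surrogate encoded as UTF-8", []byte{0xed, 0xa0, 0x80, '\n'}},
		{"ELF magic", []byte("\x7fELF")},
		{"PE magic", []byte("MZ")},
		{"ZIP magic", []byte("PK\x03\x04")},
		{"ANSI escape sequence", []byte("\x1b[31mred\x1b[0m\n")},
	}
}

// countInvalidUTF8 counts seeds that are not valid UTF-8.
func countInvalidUTF8(seeds []namedSeed) int {
	n := 0
	for _, s := range seeds {
		if !utf8.Valid(s.data) {
			n++
		}
	}
	return n
}

func main() {
	seeds := securitySeeds()
	for _, s := range seeds {
		fmt.Printf("%-28s %q\n", s.name, s.data)
	}
	fmt.Println("invalid UTF-8 seeds:", countInvalidUTF8(seeds))
}
```

If the fuzz body filters out invalid UTF-8, the three invalid sequences here would be skipped rather than exercised, so they belong in fuzz functions whose input is written to a file instead of into the script text.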

**Source C: Existing test coverage.** Read through `interp/builtins/tests/$ARGUMENTS/$ARGUMENTS_test.go` and `tests/scenarios/cmd/$ARGUMENTS/`. Every distinct input value, file content, or flag combination that appears in those tests should also appear as a seed corpus entry. This ensures that known-good cases are always in the fuzz corpus baseline, and that regressions found by the unit tests cannot escape fuzz coverage.

### Verify

Run all fuzz seed tests before committing:

```bash
go test ./interp/builtins/tests/$ARGUMENTS/ -run 'Fuzz' -count=1
```

All seeds must pass. Also run gofmt:

```bash
gofmt -l interp/builtins/tests/$ARGUMENTS/
```

No output means clean. Fix any formatting issues with `gofmt -w`.

### CI integration

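Add an entry for the new fuzz package to `.github/workflows/fuzz.yml` under the `matrix.include` list so the fuzzer runs in CI. The workflow discovers every non-differential `Fuzz*` function in the package on its own, so only the package path and a name are needed. The `name` must match the package directory, because the corpus cache path is built from it:

```yaml
- pkg: ./interp/builtins/tests/$ARGUMENTS/
  name: $ARGUMENTS
```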

## Step 10: Update documentation

**GATE CHECK**: Call TaskList. Step 9 must be `completed` before starting this step. Set this step to `in_progress` now.

Check whether `SHELL_FEATURES.md` in the repository root needs updating (e.g. if a new category of feature was added).

After updating, verify the file looks correct, then commit everything together if not already committed, or amend/add to the existing commit.
88 changes: 88 additions & 0 deletions .github/workflows/fuzz.yml
@@ -0,0 +1,88 @@
name: Fuzz Tests

on:
  push:
    branches: ['**']
  pull_request:

permissions:
  contents: read

jobs:
  fuzz:
    name: Fuzz (${{ matrix.name }})
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - pkg: ./interp/builtins/tests/head/
            name: head
          - pkg: ./interp/builtins/tests/cat/
            name: cat
          - pkg: ./interp/builtins/tests/wc/
            name: wc
          - pkg: ./interp/builtins/tests/tail/
            name: tail
          - pkg: ./interp/builtins/tests/grep/
            name: grep
          - pkg: ./interp/builtins/tests/cut/
            name: cut
          - pkg: ./interp/builtins/tests/echo/
            name: echo
          - pkg: ./interp/builtins/tests/uniq/
            name: uniq
          - pkg: ./interp/builtins/tests/strings_cmd/
            name: strings_cmd
          - pkg: ./interp/builtins/tests/testcmd/
            name: testcmd
          - pkg: ./interp/builtins/tests/ls/
            name: ls
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
      - uses: actions/setup-go@4b73464bb391d4059bd26b0524d20df3927bd417 # v6.3.0
        with:
          go-version-file: .go-version

      # Restore corpus from previous runs
      - name: Restore fuzz corpus
        uses: actions/cache@v4
        with:
          path: |
            interp/builtins/tests/${{ matrix.name }}/testdata/fuzz/
          key: fuzz-corpus-${{ matrix.name }}-${{ github.sha }}
          restore-keys: |
            fuzz-corpus-${{ matrix.name }}-

      # Run seed corpus as normal tests (fast, deterministic)
      - name: Run fuzz seed corpus
        run: |
          # Find all Fuzz* functions in the package (excluding differential ones that need RSHELL_BASH_TEST)
          FUZZ_FUNCS=$(grep -r '^func Fuzz' ${{ matrix.pkg }} 2>/dev/null | grep -v 'Differential' | sed 's/.*func \(Fuzz[^(]*\).*/\1/' | sort -u | tr '\n' '|' | sed 's/|$//')
          if [ -n "$FUZZ_FUNCS" ]; then
            go test -run "^(${FUZZ_FUNCS})$" ${{ matrix.pkg }} -timeout 120s
          else
            echo "No non-differential fuzz functions found in ${{ matrix.pkg }}, skipping"
          fi

      # Run actual fuzzing for a short duration
      - name: Fuzz (${{ matrix.name }})
        run: |
          FUZZ_FUNCS=$(grep -r '^func Fuzz' ${{ matrix.pkg }} 2>/dev/null | grep -v 'Differential' | sed 's/.*func \(Fuzz[^(]*\).*/\1/' | sort -u)
Review comment (P2): Run differential fuzz targets instead of filtering them out

The workflow excludes any fuzz function containing Differential, but every fuzz test added in this commit is named Fuzz*Differential*, so the head/cat/wc/tail matrix entries never execute those tests and always skip with "No fuzz targets found." This leaves the new differential coverage effectively unused in CI.


          if [ -z "$FUZZ_FUNCS" ]; then
            echo "No fuzz targets found in ${{ matrix.pkg }}, skipping"
            exit 0
          fi
          for FUNC in $FUZZ_FUNCS; do
            echo "Fuzzing $FUNC..."
            go test -fuzz="^${FUNC}$" -fuzztime=30s ${{ matrix.pkg }} -timeout 300s
          done

      # Save corpus
      - name: Save fuzz corpus
        uses: actions/cache/save@v4
        if: always()
        with:
          path: |
            interp/builtins/tests/${{ matrix.name }}/testdata/fuzz/
          key: fuzz-corpus-${{ matrix.name }}-${{ github.sha }}
15 changes: 15 additions & 0 deletions .github/workflows/test.yml
@@ -24,6 +24,8 @@ jobs:
          go-version-file: .go-version
      - name: Run tests with race detector
        run: go test -race -v ./...
      - name: Run fuzz seed corpus (regression test)
        run: go test -run '^Fuzz' ./interp/builtins/... -timeout 120s

  test-against-bash:
    name: Test against Bash (Docker)
@@ -37,3 +39,16 @@ jobs:
        env:
          RSHELL_BASH_TEST: "1"
        run: go test -v -run TestShellScenariosAgainstBash ./tests/
      - name: Fuzz differential tests against GNU tools
        env:
          RSHELL_BASH_TEST: "1"
        run: |
          OVERALL_STATUS=0
          for PKG in ./interp/builtins/tests/cat/ ./interp/builtins/tests/head/ ./interp/builtins/tests/tail/ ./interp/builtins/tests/wc/; do
            FUZZ_FUNCS=$(grep -r '^func Fuzz.*Differential' $PKG 2>/dev/null | sed 's/.*func \(Fuzz[^(]*\).*/\1/' | sort -u)
            for FUNC in $FUZZ_FUNCS; do
              echo "Fuzzing $FUNC in $PKG..."
              go test -fuzz="^${FUNC}$" -fuzztime=30s $PKG -timeout 300s || OVERALL_STATUS=1
            done
          done
          exit $OVERALL_STATUS
4 changes: 4 additions & 0 deletions .gitignore
@@ -4,3 +4,7 @@
/rshell

.DS_Store

# Fuzz corpus: keep checked in for regression testing.
# Uncomment the line below if corpus grows too large:
# interp/builtins/tests/*/testdata/fuzz/*/corpus-*