diff --git a/.claude/skills/fix-ci-tests/SKILL.md b/.claude/skills/fix-ci-tests/SKILL.md index fc0f38e2..3af3f7a5 100644 --- a/.claude/skills/fix-ci-tests/SKILL.md +++ b/.claude/skills/fix-ci-tests/SKILL.md @@ -72,6 +72,7 @@ This repo has the following CI jobs (defined in `.github/workflows/`): | `test.yml` | `Test (windows-latest)` | `go test -race -v ./...` on Windows | | `test.yml` | `Test against Bash (Docker)` | `RSHELL_BASH_TEST=1 go test -v -run TestShellScenariosAgainstBash ./tests/` | | `compliance.yml` | `compliance` | `RSHELL_COMPLIANCE_TEST=1 go test -v -run TestCompliance ./tests/` | +| `fuzz.yml` | `Fuzz ()` | Runs each `Fuzz*` function for 30 s per function; matrix across all builtin packages | Classify each failure: @@ -83,6 +84,7 @@ Classify each failure: | **Bash comparison failure** | YAML scenario output differs from bash | Use the `fix-tests` skill workflow (determine what bash does, then fix) | | **Compliance failure** | Compliance check fails | Read the compliance test to understand the rule, then fix the violation | | **Platform-specific failure** | Passes on some OSes but not others | Check for platform-dependent behavior (path separators, line endings, etc.) | +| **Fuzz failure** | A `Fuzz*` test found an input that caused an unexpected exit code or error | See fuzz fix workflow below | ### 4. Reproduce failures locally @@ -146,6 +148,23 @@ For each failure, apply the appropriate fix: 2. Use `stdout_windows`/`stderr_windows` fields in YAML scenarios for Windows-specific output 3. Use build tags (`//go:build unix` / `//go:build windows`) for platform-specific test files +**Fuzz failures:** + +The CI logs will contain the failing input inline, e.g.: +``` +--- FAIL: FuzzGrepFixedStrings + grep_fuzz_test.go:240: grep -F unexpected exit code 2 + Failing input written to testdata/fuzz/FuzzGrepFixedStrings/abc123 + To re-run: go test -run=FuzzGrepFixedStrings/abc123 +``` + +1. 
Read the failing input from the log (it is printed as a `go test fuzz v1` file) +2. Create the corpus file manually at `interp/builtins/tests//testdata/fuzz//` with that content +3. Reproduce locally: `go test -run=FuzzFuncName/hash ./interp/builtins/tests//` +4. Fix the bug in the implementation (never weaken the fuzz filter to hide the bug) +5. Verify the corpus entry now passes: `go test -run=FuzzFuncName/hash ./interp/builtins/tests//` +6. **Commit the corpus file** — it becomes a permanent regression test + ### 7. Verify all fixes Run the full test suite locally: diff --git a/.claude/skills/fix-local-tests/SKILL.md b/.claude/skills/fix-local-tests/SKILL.md index 17d1104b..e3567e91 100644 --- a/.claude/skills/fix-local-tests/SKILL.md +++ b/.claude/skills/fix-local-tests/SKILL.md @@ -90,7 +90,24 @@ For failures where the test expectation is wrong (not matching bash): RSHELL_BASH_TEST=1 go test ./tests/ -run TestShellScenariosAgainstBash/ -timeout 120s -v ``` -### 6. Verify all fixes +### 6. Fix fuzz failures + +If a `Fuzz*` test is failing (either a fuzzer-discovered corpus entry or a seed): + +1. Run it to see the error: `go test -v -run FuzzFuncName/corpushash ./interp/builtins/tests//` +2. Fix the **implementation** — never weaken the fuzz input filter to hide the bug +3. If the fix is to the input filter (e.g. the input is legitimately unsupported), that is also acceptable, but the reason must be clear from a comment +4. **Always commit the failing corpus file** at `testdata/fuzz//` — it becomes a permanent regression test + +To reproduce a fuzzer-found crash from a log message, create the corpus file manually: +``` +go test fuzz v1 +[]byte("...") +string("...") +``` +Place it at `interp/builtins/tests//testdata/fuzz//` and re-run. + +### 7. 
Verify all fixes After all fixes are applied, run the full test suite: diff --git a/.claude/skills/implement-posix-command/SKILL.md b/.claude/skills/implement-posix-command/SKILL.md index 9afff5f9..30bd39be 100644 --- a/.claude/skills/implement-posix-command/SKILL.md +++ b/.claude/skills/implement-posix-command/SKILL.md @@ -14,7 +14,7 @@ You MUST follow this execution protocol. Skipping steps has caused defects in ev ### 1. Create the full task list FIRST -Your very first action — before reading ANY files, before writing ANY code — is to call TaskCreate exactly 9 times, once for each step below (Steps 1–9). Use these exact subjects: +Your very first action — before reading ANY files, before writing ANY code — is to call TaskCreate exactly 10 times, once for each step below (Steps 1–10). Use these exact subjects: 1. "Step 1: Research the command" 2. "Step 2: User confirms which flags to implement" @@ -24,7 +24,8 @@ Your very first action — before reading ANY files, before writing ANY code — 6. "Step 6: Verify and Harden" 7. "Step 7: Code review" 8. "Step 8: Exploratory pentest" -9. "Step 9: Update documentation" +9. "Step 9: Write fuzz tests" +10. "Step 10: Update documentation" ### 2. Execution order and gating @@ -38,7 +39,7 @@ Step 1 → Step 2 → Steps 3 + 4 + 5 (parallel) → Step 6 → Step 7 → Step **Parallel steps (3, 4, 5):** Once Step 2 is `completed`, set Steps 3, 4, and 5 all to `in_progress` at the same time and work on all three concurrently. The implementation (Step 5) and the tests (Steps 3, 4) are all guided by the approved spec from Step 2 — they do not need to wait for each other. -**Convergence (6 → 7 → 8 → 9):** Before starting Step 6, call TaskList and verify Steps 3, 4, AND 5 are all `completed`. Then proceed sequentially through 6 → 7 → 8 → 9. +**Convergence (6 → 7 → 8 → 9 → 10):** Before starting Step 6, call TaskList and verify Steps 3, 4, AND 5 are all `completed`. Then proceed sequentially through 6 → 7 → 8 → 9 → 10. 
Before marking any step as `completed`: - Re-read the step description and verify every sub-bullet is satisfied @@ -495,10 +496,104 @@ For any case where behaviour differs from expectation, run the equivalent `gtail 2. **Safer than GNU** — document; generally keep our behaviour 3. **Worse than GNU** — fix it -## Step 9: Update documentation +## Step 9: Write fuzz tests **GATE CHECK**: Call TaskList. Step 8 must be `completed` before starting this step. Set this step to `in_progress` now. +Create `interp/builtins/tests/$ARGUMENTS/$ARGUMENTS_fuzz_test.go` (`package $ARGUMENTS_test`). + +Fuzz tests run seed corpus entries as normal tests (without `-fuzz=`), making them free to run in CI. Their job is to verify that the implementation never panics, crashes, or returns unexpected exit codes across a wide variety of inputs. Exit codes 0 and 1 are always acceptable; exit code 2 (usage error) is acceptable for commands that use it (e.g. `test`); any other code or a panic is a failure. + +### Structure + +Each `Fuzz*` function follows this pattern: + +```go +func FuzzCmdSomething(f *testing.F) { + // Seed corpus entries — each f.Add() is a test case run in non-fuzz mode + f.Add([]byte("normal input\n")) + f.Add([]byte{}) + // ... more seeds ... 
+ + f.Fuzz(func(t *testing.T, input []byte /* + any extra args */) { + if len(input) > 1<<20 { return } // cap at 1 MiB + // filter out inputs that would cause shell parse errors + // create temp dir, write input file + // run the command with a 5-second timeout + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + _, _, code := cmdRunCtxFuzz(ctx, t, "...", dir) + if code != 0 && code != 1 { + t.Errorf("unexpected exit code %d", code) + } + }) +} +``` + +Define `cmdRunCtxFuzz` (not `cmdRunCtx`, to avoid redeclaration conflicts with any existing test file in the package) at the top of the fuzz test file: + +```go +func cmdRunCtxFuzz(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} +``` + +Write one `Fuzz*` function per distinct mode of the command (e.g. `FuzzCmdLines`, `FuzzCmdBytes`, `FuzzCmdStdin`, `FuzzCmdFlags`). For commands with multiple flags, write one fuzz function per mode rather than jamming all flags into a single function — this keeps the seed corpus focused and makes failures easier to reproduce. + +### Seed corpus sources + +Build the seed corpus from **all three** of these sources. Do not skip any source — each catches different classes of bugs. + +**Source A: Implementation edge cases.** Read `interp/builtins/$ARGUMENTS.go` and identify every named constant, boundary check, special case, and clamp. Each one needs at least one seed: +- Memory safety constants (e.g. `MaxLineBytes = 1 << 20`, `maxStringLen = 1 << 20`) +- Counter/allocation clamps (e.g. `MaxCount = 1<<31-1`) +- Buffer sizes and chunk boundaries (e.g. 
scanner init=4096, read chunks=32KiB) +- Input encoding edge cases the implementation handles (CRLF, null bytes, invalid UTF-8, bare CR) +- Boundary values: exactly at a limit, one below, one above +- Degenerate inputs: empty, single byte, no trailing newline, all-identical lines, all-unique lines + +**Source B: CVE and security history.** Research which CVEs and security issues have affected the GNU implementation of `$ARGUMENTS` (and related tools like binutils for `strings`). For each vulnerability, add a seed that exercises the same class of input — even though our implementation may not share the same code path, these are the inputs real attackers will try: +- Integer overflow inputs (very large `-n`/`-c` values: `MaxInt32`, `MaxInt64`, `MaxInt64+1`, `UINT64_MAX`) +- Long-line inputs near and past historical buffer limits (4KB, 64KB, 1 MiB) +- Null bytes embedded in content (triggered stack overflows in distro-patched versions of `uniq`, `sort`, `join`) +- CRLF line endings (many CVEs involve incorrect line-ending handling) +- Invalid UTF-8 sequences (surrogates, overlong encodings, bare continuation bytes) +- Binary format magic bytes (ELF `\x7fELF`, PE `MZ`, ZIP `PK\x03\x04`) for commands that process file content +- ANSI/terminal escape sequences in content (for commands that output filenames or text to a terminal) +- ReDoS-class regex patterns for `grep` (e.g. `(a+)+`, `a*a*b`, `([a-z]+)*`) + +**Source C: Existing test coverage.** Read through `interp/builtins/tests/$ARGUMENTS/$ARGUMENTS_test.go` and `tests/scenarios/cmd/$ARGUMENTS/`. Every distinct input value, file content, or flag combination that appears in those tests should also appear as a seed corpus entry. This ensures that known-good cases are always in the fuzz corpus baseline, and that regressions found by the unit tests cannot escape fuzz coverage. 
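The "exactly at a limit, one below, one above" rule from Source A can be sketched as a small seed generator. This is a minimal illustration, not code from the repository: `boundarySeeds` is a hypothetical helper, and the limit passed to it (4096 here, taken from the scanner-init example above) is an assumption about whichever constant the implementation under test actually uses.

```go
package main

import (
	"bytes"
	"fmt"
)

// boundarySeeds returns three newline-terminated lines whose content length
// is one below, exactly at, and one above limit. limit is assumed to be >= 1.
// Hypothetical helper for illustration; it does not exist in the repository.
func boundarySeeds(limit int) [][]byte {
	seeds := make([][]byte, 0, 3)
	for _, n := range []int{limit - 1, limit, limit + 1} {
		line := append(bytes.Repeat([]byte("a"), n), '\n')
		seeds = append(seeds, line)
	}
	return seeds
}

func main() {
	// Print the total length (content plus newline) of each seed.
	for _, s := range boundarySeeds(4096) {
		fmt.Println(len(s))
	}
}
```

In a fuzz function each returned slice would be passed to `f.Add`. One caveat worth checking when the limit equals the fuzz body's own size filter: a seed longer than that filter is dropped by the early `return` and never reaches the command, so the "one above" seed only helps if the filter admits it.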
+
+### Verify
+
+Run all fuzz seed tests before committing:
+
+```bash
+go test ./interp/builtins/tests/$ARGUMENTS/ -run 'Fuzz' -count=1
+```
+
+All seeds must pass. Also run gofmt:
+
+```bash
+gofmt -l interp/builtins/tests/$ARGUMENTS/
+```
+
+No output means clean. Fix any formatting issues with `gofmt -w`.
+
+### CI integration
+
+Add an entry for the new fuzz package to `.github/workflows/fuzz.yml` under the `strategy.matrix.include` list so the fuzzer runs in CI:
+
+```yaml
+- pkg: ./interp/builtins/tests/$ARGUMENTS/
+  name: $ARGUMENTS
+```
+
+`name` must match the package directory name, because the workflow builds the corpus cache path from it. The workflow runs every non-differential `Fuzz*` function it finds in the package, so no function name is listed here.
+
+## Step 10: Update documentation
+
+**GATE CHECK**: Call TaskList. Step 9 must be `completed` before starting this step. Set this step to `in_progress` now.
+
 Verify that `SHELL_FEATURES.md` in the repository root does not need updates (e.g. if a new category of feature is added).
 
 After updating, verify the file looks correct, then commit everything together if not already committed, or amend/add to the existing commit.
diff --git a/.github/workflows/fuzz.yml b/.github/workflows/fuzz.yml new file mode 100644 index 00000000..b5d12ac0 --- /dev/null +++ b/.github/workflows/fuzz.yml @@ -0,0 +1,88 @@ +name: Fuzz Tests + +on: + push: + branches: ['**'] + pull_request: + +permissions: + contents: read + +jobs: + fuzz: + name: Fuzz (${{ matrix.name }}) + runs-on: ubuntu-latest + strategy: + fail-fast: false + matrix: + include: + - pkg: ./interp/builtins/tests/head/ + name: head + - pkg: ./interp/builtins/tests/cat/ + name: cat + - pkg: ./interp/builtins/tests/wc/ + name: wc + - pkg: ./interp/builtins/tests/tail/ + name: tail + - pkg: ./interp/builtins/tests/grep/ + name: grep + - pkg: ./interp/builtins/tests/cut/ + name: cut + - pkg: ./interp/builtins/tests/echo/ + name: echo + - pkg: ./interp/builtins/tests/uniq/ + name: uniq + - pkg: ./interp/builtins/tests/strings_cmd/ + name: strings_cmd + - pkg: ./interp/builtins/tests/testcmd/ + name: testcmd + - pkg: ./interp/builtins/tests/ls/ + name: ls + steps: + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 + - uses: actions/setup-go@4b73464bb391d4059bd26b0524d20df3927bd417 # v6.3.0 + with: + go-version-file: .go-version + + # Restore corpus from previous runs + - name: Restore fuzz corpus + uses: actions/cache@v4 + with: + path: | + interp/builtins/tests/${{ matrix.name }}/testdata/fuzz/ + key: fuzz-corpus-${{ matrix.name }}-${{ github.sha }} + restore-keys: | + fuzz-corpus-${{ matrix.name }}- + + # Run seed corpus as normal tests (fast, deterministic) + - name: Run fuzz seed corpus + run: | + # Find all Fuzz* functions in the package (excluding differential ones that need RSHELL_BASH_TEST) + FUZZ_FUNCS=$(grep -r '^func Fuzz' ${{ matrix.pkg }} 2>/dev/null | grep -v 'Differential' | sed 's/.*func \(Fuzz[^(]*\).*/\1/' | sort -u | tr '\n' '|' | sed 's/|$//') + if [ -n "$FUZZ_FUNCS" ]; then + go test -run "^(${FUZZ_FUNCS})$" ${{ matrix.pkg }} -timeout 120s + else + echo "No non-differential fuzz functions found in ${{ 
matrix.pkg }}, skipping" + fi + + # Run actual fuzzing for a short duration + - name: Fuzz (${{ matrix.name }}) + run: | + FUZZ_FUNCS=$(grep -r '^func Fuzz' ${{ matrix.pkg }} 2>/dev/null | grep -v 'Differential' | sed 's/.*func \(Fuzz[^(]*\).*/\1/' | sort -u) + if [ -z "$FUZZ_FUNCS" ]; then + echo "No fuzz targets found in ${{ matrix.pkg }}, skipping" + exit 0 + fi + for FUNC in $FUZZ_FUNCS; do + echo "Fuzzing $FUNC..." + go test -fuzz="^${FUNC}$" -fuzztime=30s ${{ matrix.pkg }} -timeout 300s + done + + # Save corpus + - name: Save fuzz corpus + uses: actions/cache/save@v4 + if: always() + with: + path: | + interp/builtins/tests/${{ matrix.name }}/testdata/fuzz/ + key: fuzz-corpus-${{ matrix.name }}-${{ github.sha }} diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 6ec5b00e..ad073ee4 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -24,6 +24,8 @@ jobs: go-version-file: .go-version - name: Run tests with race detector run: go test -race -v ./... + - name: Run fuzz seed corpus (regression test) + run: go test -run '^Fuzz' ./interp/builtins/... -timeout 120s test-against-bash: name: Test against Bash (Docker) @@ -37,3 +39,16 @@ jobs: env: RSHELL_BASH_TEST: "1" run: go test -v -run TestShellScenariosAgainstBash ./tests/ + - name: Fuzz differential tests against GNU tools + env: + RSHELL_BASH_TEST: "1" + run: | + OVERALL_STATUS=0 + for PKG in ./interp/builtins/tests/cat/ ./interp/builtins/tests/head/ ./interp/builtins/tests/tail/ ./interp/builtins/tests/wc/; do + FUZZ_FUNCS=$(grep -r '^func Fuzz.*Differential' $PKG 2>/dev/null | sed 's/.*func \(Fuzz[^(]*\).*/\1/' | sort -u) + for FUNC in $FUZZ_FUNCS; do + echo "Fuzzing $FUNC in $PKG..." 
+ go test -fuzz="^${FUNC}$" -fuzztime=30s $PKG -timeout 300s || OVERALL_STATUS=1 + done + done + exit $OVERALL_STATUS diff --git a/.gitignore b/.gitignore index 3a8c62d8..b102bc82 100644 --- a/.gitignore +++ b/.gitignore @@ -4,3 +4,7 @@ /rshell .DS_Store + +# Fuzz corpus: keep checked in for regression testing. +# Uncomment the line below if corpus grows too large: +# interp/builtins/tests/*/testdata/fuzz/*/corpus-* diff --git a/interp/builtins/tests/cat/cat_differential_fuzz_test.go b/interp/builtins/tests/cat/cat_differential_fuzz_test.go new file mode 100644 index 00000000..cc7cccdd --- /dev/null +++ b/interp/builtins/tests/cat/cat_differential_fuzz_test.go @@ -0,0 +1,102 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +//go:build linux + +package cat_test + +import ( + "bytes" + "context" + "os" + "os/exec" + "path/filepath" + "strings" + "testing" + "time" +) + +// runGNUInDir runs a GNU command under LC_ALL=C.UTF-8 with its working +// directory set to dir. args[0] is the command name; args[1:] are arguments. +func runGNUInDir(t *testing.T, dir string, args []string) (stdout string, exitCode int) { + t.Helper() + if _, err := exec.LookPath(args[0]); err != nil { + t.Skipf("%s not found: %v", args[0], err) + } + + cmd := exec.Command(args[0], args[1:]...) 
+ cmd.Dir = dir + cmd.Env = append(os.Environ(), "LC_ALL=C.UTF-8") + + var outBuf bytes.Buffer + cmd.Stdout = &outBuf + + err := cmd.Run() + exitCode = 0 + if err != nil { + if exitErr, ok := err.(*exec.ExitError); ok { + exitCode = exitErr.ExitCode() + } else { + t.Logf("gnu exec error: %v", err) + return "", -1 + } + } + return outBuf.String(), exitCode +} + +func isSandboxError(stderr string) bool { + lower := strings.ToLower(stderr) + return strings.Contains(lower, "permission denied") || + strings.Contains(lower, "not allowed") || + strings.Contains(lower, "sandbox") +} + +// FuzzCatDifferential compares rshell cat output against GNU cat. +func FuzzCatDifferential(f *testing.F) { + if os.Getenv("RSHELL_BASH_TEST") == "" { + f.Skip("set RSHELL_BASH_TEST=1 to run differential fuzz tests") + } + + f.Add([]byte("hello\nworld\n")) + f.Add([]byte("")) + f.Add([]byte("no newline")) + f.Add([]byte("a\x00b\n")) + f.Add(bytes.Repeat([]byte("x"), 4097)) + f.Add([]byte("\n\n\n")) + f.Add([]byte{0xff, 0xfe, 0x00, 0x01}) + f.Add([]byte("line1\nline2\nline3\n")) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 64*1024 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + rshellOut, rshellErr, rshellCode := cmdRunCtx(ctx, t, "cat input.txt", dir) + + if isSandboxError(rshellErr) { + t.Skip("skipping: sandbox restriction") + } + + gnuOut, gnuCode := runGNUInDir(t, dir, []string{"cat", "input.txt"}) + if gnuCode == -1 { + return + } + + if rshellOut != gnuOut { + t.Errorf("stdout mismatch:\nrshell: %q\ngnu: %q\ninput: %q", rshellOut, gnuOut, input) + } + if rshellCode != gnuCode { + t.Errorf("exit code mismatch: rshell=%d gnu=%d", rshellCode, gnuCode) + } + }) +} diff --git a/interp/builtins/tests/cat/cat_fuzz_test.go b/interp/builtins/tests/cat/cat_fuzz_test.go new file mode 
100644 index 00000000..f97e3023 --- /dev/null +++ b/interp/builtins/tests/cat/cat_fuzz_test.go @@ -0,0 +1,218 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package cat_test + +import ( + "bytes" + "context" + "os" + "path/filepath" + "testing" + "time" +) + +// FuzzCat fuzzes cat with arbitrary file content and verifies output equals input. +func FuzzCat(f *testing.F) { + // Basic + f.Add([]byte("hello\nworld\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte("\n\n\n")) + // Null bytes — passed through unchanged (binary safety) + f.Add([]byte("a\x00b\n")) + f.Add([]byte{0x00, 0x00, 0x00}) + // High bytes / non-UTF-8 (M- notation only in -v mode; raw pass-through here) + f.Add([]byte{0xff, 0xfe, 0x00, 0x01}) + f.Add([]byte{0x80, 0x9f, 0xa0, 0xfe, 0xff, '\n'}) + // Invalid UTF-8 sequences (CVE-class: must not crash on bad encoding) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}) + f.Add([]byte{0xed, 0xa0, 0x80}) // surrogate half + // CRLF — must be preserved exactly + f.Add([]byte("line1\r\nline2\r\n")) + f.Add([]byte("a\r\nb\n")) + // Near scanner buffer boundaries (init=4096, max=1MiB) + f.Add(bytes.Repeat([]byte("x"), 4095)) + f.Add(bytes.Repeat([]byte("x"), 4096)) + f.Add(bytes.Repeat([]byte("x"), 4097)) + // Lines near the 1 MiB cap + f.Add(append(bytes.Repeat([]byte("a"), 1<<20-1), '\n')) + // DEL and other control chars + f.Add([]byte{0x7f, '\n'}) + f.Add([]byte{0x01, 0x1f, 0x7f, '\n'}) + // Mixed binary and text + f.Add([]byte("text\x00\x01\x02more text\n")) + // ANSI/terminal escape sequences (terminal injection class — cat passes through unchanged) + f.Add([]byte("\x1b[31mRED\x1b[0m\n")) // ANSI color codes + f.Add([]byte("\x1b]2;malicious title\x07\n")) // OSC 2: terminal title injection + f.Add([]byte("\x1b[2J\n")) // 
clear screen + f.Add([]byte("\x1b[9D\n")) // cursor back 9 columns + f.Add([]byte("\x1bP...string...\x1b\\\n")) // DCS device control sequence + f.Add([]byte("\x1b]50;fontname\x07\n")) // OSC 50 font query (xterm CVE class) + // Bare CR (old Mac line endings) + f.Add([]byte("a\rb\rc\r")) + // ELF magic bytes (binary format detection) + f.Add([]byte{0x7f, 'E', 'L', 'F', 0x02, 0x01, 0x01, 0x00}) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + stdout, _, code := cmdRunCtx(ctx, t, "cat input.txt", dir) + if code != 0 && code != 1 { + t.Errorf("unexpected exit code %d", code) + } + + // cat must output exactly the file contents + if code == 0 && stdout != string(input) { + t.Errorf("cat output differs from input: got %d bytes, want %d bytes", len(stdout), len(input)) + } + }) +} + +// FuzzCatNumberLines fuzzes cat -n with arbitrary file content. +// Edge cases: line number formatting at width 6, blank lines, no trailing newline. 
+func FuzzCatNumberLines(f *testing.F) {
+	f.Add([]byte("line1\nline2\n"))
+	f.Add([]byte{})
+	f.Add([]byte("no newline"))
+	f.Add([]byte("a\x00b\nc\n"))
+	f.Add([]byte("\n\n\n"))
+	// Lines at the scanner cap boundary: must not panic
+	f.Add(append(bytes.Repeat([]byte("a"), 1<<20), '\n')) // 1 MiB + newline: exceeds the input filter below, so the fuzz body skips this seed
+	f.Add(append(bytes.Repeat([]byte("b"), 1<<20-1), '\n')) // exactly 1 MiB including the newline: passes the filter
+	// Blank-line interactions
+	f.Add([]byte("a\n\n\nb\n"))
+	// CRLF must be preserved
+	f.Add([]byte("a\r\nb\r\n"))
+	// Null bytes in line
+	f.Add([]byte("x\x00y\nz\n"))
+	// High bytes in line
+	f.Add([]byte{0x80, 0x81, '\n'})
+
+	f.Fuzz(func(t *testing.T, input []byte) {
+		if len(input) > 1<<20 {
+			return
+		}
+
+		dir := t.TempDir()
+		err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644)
+		if err != nil {
+			t.Fatal(err)
+		}
+
+		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+		defer cancel()
+
+		_, _, code := cmdRunCtx(ctx, t, "cat -n input.txt", dir)
+		if code != 0 && code != 1 {
+			t.Errorf("cat -n unexpected exit code %d", code)
+		}
+	})
+}
+
+// FuzzCatDisplayFlags fuzzes cat with display-transformation flags (-v/-E/-T/-A).
+// Edge cases: M- notation for high bytes, ^X notation for controls, CRLF rendering.
+func FuzzCatDisplayFlags(f *testing.F) {
+	// Non-printing chars: must render as ^X
+	f.Add([]byte{0x00, 0x01, 0x1f, '\n'}, true, false, false)
+	// DEL → ^?
+ f.Add([]byte{0x7f, '\n'}, true, false, false) + // High bytes 0x80-0xff → M- notation + f.Add([]byte{0x80, 0x9f, 0xa0, 0xff, '\n'}, true, false, false) + // Tab handling: -v preserves tab, -T converts to ^I + f.Add([]byte("a\tb\n"), true, false, false) + f.Add([]byte("a\tb\n"), false, false, true) + // -E: CRLF → ^M$ before the newline + f.Add([]byte("a\r\nb\n"), false, true, false) + // Combined -v -E: both transformations + f.Add([]byte{0x00, '\r', '\n'}, true, true, false) + // Empty lines with -E → just "$\n" + f.Add([]byte("\n\n\n"), false, true, false) + // Null bytes with -v + f.Add([]byte{0x00, 0x00, '\n'}, true, false, false) + // Surrogate / bad UTF-8 with -v + f.Add([]byte{0xed, 0xa0, 0x80, '\n'}, true, false, false) + + f.Fuzz(func(t *testing.T, input []byte, flagV, flagE, flagT bool) { + if len(input) > 1<<20 { + return + } + if !flagV && !flagE && !flagT { + return // plain cat is covered by FuzzCat + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.bin"), input, 0644); err != nil { + t.Fatal(err) + } + + flags := "" + if flagV { + flags += " -v" + } + if flagE { + flags += " -E" + } + if flagT { + flags += " -T" + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "cat"+flags+" input.bin", dir) + if code != 0 && code != 1 { + t.Errorf("cat%s unexpected exit code %d", flags, code) + } + }) +} + +// FuzzCatStdin fuzzes cat reading from stdin via shell redirection. 
+func FuzzCatStdin(f *testing.F) { + f.Add([]byte("hello\nworld\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte("a\x00b\n")) + f.Add(bytes.Repeat([]byte("x"), 4097)) + f.Add([]byte("\n\n\n")) + f.Add([]byte{0xff, 0xfe, 0x00, 0x01}) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}) + f.Add([]byte("line1\r\nline2\r\n")) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "stdin.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + stdout, _, code := cmdRunCtx(ctx, t, "cat < stdin.txt", dir) + if code != 0 && code != 1 { + t.Errorf("cat stdin unexpected exit code %d", code) + } + + if code == 0 && stdout != string(input) { + t.Errorf("cat stdin output differs from input: got %d bytes, want %d bytes", len(stdout), len(input)) + } + }) +} diff --git a/interp/builtins/tests/cat/helpers_test.go b/interp/builtins/tests/cat/helpers_test.go new file mode 100644 index 00000000..791f3267 --- /dev/null +++ b/interp/builtins/tests/cat/helpers_test.go @@ -0,0 +1,53 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package cat_test + +import ( + "bytes" + "context" + "errors" + "strings" + "testing" + + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ...interp.RunnerOption) (string, string, int) { + t.Helper() + parser := syntax.NewParser() + prog, err := parser.Parse(strings.NewReader(script), "") + if err != nil { + t.Fatal(err) + } + var outBuf, errBuf bytes.Buffer + allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...) 
+ runner, err := interp.New(allOpts...) + if err != nil { + t.Fatal(err) + } + defer runner.Close() + if dir != "" { + runner.Dir = dir + } + runErr := runner.Run(ctx, prog) + exitCode := 0 + if runErr != nil { + var es interp.ExitStatus + if errors.As(runErr, &es) { + exitCode = int(es) + } else if ctx.Err() == nil { + t.Fatalf("unexpected error: %v", runErr) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return runScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} diff --git a/interp/builtins/tests/cat/testdata/fuzz/.gitkeep b/interp/builtins/tests/cat/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/interp/builtins/tests/cut/cut_fuzz_test.go b/interp/builtins/tests/cut/cut_fuzz_test.go new file mode 100644 index 00000000..eb1003f6 --- /dev/null +++ b/interp/builtins/tests/cut/cut_fuzz_test.go @@ -0,0 +1,316 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package cut_test + +import ( + "bytes" + "context" + "fmt" + "os" + "path/filepath" + "testing" + "time" + "unicode/utf8" + + "github.com/DataDog/rshell/interp" + "github.com/DataDog/rshell/interp/builtins/testutil" +) + +// cmdRunCtxFuzz provides the test helper for fuzz tests. +// The cut package already has cmdRunCtx in the existing test file, +// but that uses a different (inline) implementation. We use a +// differently-named function to avoid redeclaration. 
+func cmdRunCtxFuzz(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} + +// FuzzCutFields fuzzes cut -f with arbitrary file content and field specs. +// Edge cases: MaxLineBytes (1 MiB) cap, CRLF (\r preserved as content byte), +// null bytes, empty fields, complement, suppress, no trailing newline. +func FuzzCutFields(f *testing.F) { + f.Add([]byte("a\tb\tc\n"), "1") + f.Add([]byte("a\tb\tc\n"), "1,3") + f.Add([]byte("a\tb\tc\n"), "2-") + f.Add([]byte("a\tb\tc\n"), "-2") + f.Add([]byte("a\tb\tc\n"), "1-3") + f.Add([]byte{}, "1") + f.Add([]byte("no tab\n"), "1") + f.Add([]byte("a\x00b\tc\n"), "2") + f.Add(bytes.Repeat([]byte("x\t"), 100), "1,50,100") + f.Add([]byte("\n\n\n"), "1") + // Open-ended ranges — math.MaxInt32 sentinel in implementation + f.Add([]byte("a\tb\tc\n"), "2-") + f.Add([]byte("a\tb\tc\n"), "-2") + // Empty fields (consecutive delimiters) + f.Add([]byte(":::\n"), "1-3") + f.Add([]byte("\t\t\t\n"), "2") + // CRLF: \r is preserved as content byte, only \n is stripped + f.Add([]byte("a\tb\tc\r\n"), "3") + f.Add([]byte("a\tb\tc\r\n"), "2") + // No trailing newline + f.Add([]byte("a\tb\tc"), "1") + f.Add([]byte("a:1\nb:2"), "1") + // Lines near 1 MiB cap + f.Add(append(bytes.Repeat([]byte("a\t"), (1<<20-1)/2), "b\n"...), "1") + f.Add(append(bytes.Repeat([]byte("x"), 1<<20-1), "\n"...), "1") + // Null bytes in content (treated as regular content bytes) + f.Add([]byte("a\x00b\tc\n"), "1") + // Field at and beyond end + f.Add([]byte("a:b:c\n"), "4") + // Trailing delimiter + f.Add([]byte("a:b:\n"), "3") + // Overlapping ranges + f.Add([]byte("abcdef\n"), "1-3,2-4") + // Multiline input + f.Add([]byte("a\tb\nc\td\n"), "1") + f.Add([]byte("a\tb\nc\td\n"), "2") + + f.Fuzz(func(t *testing.T, input []byte, fieldSpec string) { + if len(input) > 1<<20 { + return + } + if len(fieldSpec) == 0 || len(fieldSpec) > 50 { + return + } + 
if !utf8.ValidString(fieldSpec) { + return + } + // Only allow characters valid in field specs. + for _, c := range fieldSpec { + if !((c >= '0' && c <= '9') || c == ',' || c == '-') { + return + } + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtxFuzz(ctx, t, fmt.Sprintf("cut -f %s input.txt", fieldSpec), dir) + if code != 0 && code != 1 { + t.Errorf("cut -f %s unexpected exit code %d", fieldSpec, code) + } + }) +} + +// FuzzCutBytes fuzzes cut -b with arbitrary file content and byte specs. +// Edge cases: open-ended ranges, complement, output delimiter, +// boundary positions (1st byte, last byte, beyond line), multibyte UTF-8. +func FuzzCutBytes(f *testing.F) { + f.Add([]byte("hello world\n"), "1-5") + f.Add([]byte("hello world\n"), "1,3,5") + f.Add([]byte("hello world\n"), "6-") + f.Add([]byte{}, "1") + f.Add([]byte("a\x00b\nc\n"), "1-3") + f.Add(bytes.Repeat([]byte("x"), 4097), "1-100") + // Open-start range + f.Add([]byte("abcdef\n"), "-3") + // Beyond line end + f.Add([]byte("abc\n"), "4") + f.Add([]byte("abc\n"), "5-") + // CRLF: \r is byte 3 (regular content) + f.Add([]byte("ab\r\n"), "3") + // No trailing newline + f.Add([]byte("abcdef"), "1-3") + // Lines near MaxLineBytes (1 MiB) + f.Add(append(bytes.Repeat([]byte("a"), 1<<20-1), '\n'), "1") + f.Add(append(bytes.Repeat([]byte("a"), 1<<20), '\n'), "1") + // Empty line + f.Add([]byte("\n"), "1") + // Multibyte UTF-8 (treated byte-by-byte) + f.Add([]byte("\xce\xb1\xce\xb2\xce\xb3\n"), "1") // α (first byte only) + f.Add([]byte("\xce\xb1\xce\xb2\xce\xb3\n"), "1-2") // full α character + // Null bytes + f.Add([]byte{0x00, 0x01, 0x02, '\n'}, "1-3") + // Large position well beyond line + f.Add([]byte("abc\n"), "1234567890") + + f.Fuzz(func(t *testing.T, input []byte, byteSpec string) { + if len(input) > 1<<20 { + 
return + } + if len(byteSpec) == 0 || len(byteSpec) > 50 { + return + } + if !utf8.ValidString(byteSpec) { + return + } + for _, c := range byteSpec { + if !((c >= '0' && c <= '9') || c == ',' || c == '-') { + return + } + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtxFuzz(ctx, t, fmt.Sprintf("cut -b %s input.txt", byteSpec), dir) + if code != 0 && code != 1 { + t.Errorf("cut -b %s unexpected exit code %d", byteSpec, code) + } + }) +} + +// FuzzCutDelimiter fuzzes cut -f with a custom delimiter. +// Edge cases: no-delimiter lines (printed as-is or suppressed with -s), +// consecutive delimiters (empty fields), tab delimiter. +func FuzzCutDelimiter(f *testing.F) { + f.Add([]byte("a:b:c\n"), ":", "1,3") + f.Add([]byte("a,b,c\n"), ",", "2") + f.Add([]byte("a|b|c\n"), "|", "1-2") + f.Add([]byte("no delim\n"), ":", "1") + // Empty fields from consecutive delimiters + f.Add([]byte(":::\n"), ":", "1-4") + f.Add([]byte("a::b\n"), ":", "2") + // Trailing delimiter + f.Add([]byte("a:b:\n"), ":", "3") + // CRLF: \r preserved as part of last field + f.Add([]byte("a:b:c\r\n"), ":", "3") + // Null bytes in line + f.Add([]byte("a\x00b:c\n"), ":", "1") + // Single field (no delimiter in line) + f.Add([]byte("abc\n"), ":", "1") + // Space as delimiter + f.Add([]byte("a b c\n"), " ", "2") + + f.Fuzz(func(t *testing.T, input []byte, delim string, fieldSpec string) { + if len(input) > 1<<20 { + return + } + if len(delim) != 1 { + return + } + if len(fieldSpec) == 0 || len(fieldSpec) > 50 { + return + } + if !utf8.ValidString(fieldSpec) || !utf8.ValidString(delim) { + return + } + // Delimiter must be shell-safe. 
+ d := delim[0] + if d == '\'' || d == '\x00' || d == '\n' || d == '\\' || d == '"' || d == '`' || d == '$' { + return + } + for _, c := range fieldSpec { + if !((c >= '0' && c <= '9') || c == ',' || c == '-') { + return + } + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + script := fmt.Sprintf("cut -d '%s' -f %s input.txt", delim, fieldSpec) + _, _, code := cmdRunCtxFuzz(ctx, t, script, dir) + if code != 0 && code != 1 { + t.Errorf("cut -d '%s' -f %s unexpected exit code %d", delim, fieldSpec, code) + } + }) +} + +// FuzzCutComplement fuzzes cut --complement in -b (byte) mode; -f is not exercised here. +// Edge cases: complement of entire range (empty output), complement of nothing +// (full output), non-contiguous complement ranges. +func FuzzCutComplement(f *testing.F) { + f.Add([]byte("abcdef\n"), "3-4") + f.Add([]byte("9_1\n8_2\n"), "2") + // Complement of a single byte + f.Add([]byte("abcdef\n"), "1") + f.Add([]byte("abcdef\n"), "6") + // Complement of entire line (empty output) + f.Add([]byte("abc\n"), "1-") + // Complement with multiple ranges + f.Add([]byte("a:b:c:d\n"), "2,3") + // CRLF + f.Add([]byte("abcdef\r\n"), "3-4") + // No trailing newline + f.Add([]byte("abcdef"), "2") + // Empty input + f.Add([]byte{}, "1") + // Lines at 1 MiB cap + f.Add(append(bytes.Repeat([]byte("a"), 1<<20-1), '\n'), "1") + + f.Fuzz(func(t *testing.T, input []byte, byteSpec string) { + if len(input) > 1<<20 { + return + } + if len(byteSpec) == 0 || len(byteSpec) > 50 { + return + } + if !utf8.ValidString(byteSpec) { + return + } + for _, c := range byteSpec { + if !((c >= '0' && c <= '9') || c == ',' || c == '-') { + return + } + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(),
5*time.Second) + defer cancel() + + _, _, code := cmdRunCtxFuzz(ctx, t, fmt.Sprintf("cut --complement -b %s input.txt", byteSpec), dir) + if code != 0 && code != 1 { + t.Errorf("cut --complement -b %s unexpected exit code %d", byteSpec, code) + } + }) +} + +// FuzzCutStdin fuzzes cut reading from stdin. +func FuzzCutStdin(f *testing.F) { + f.Add([]byte("a\tb\tc\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + // Null bytes + f.Add([]byte("a\x00b\tc\n")) + // CRLF + f.Add([]byte("a\tb\r\n")) + // Invalid UTF-8 + f.Add([]byte{0xfc, 0x80, 0x80, '\t', 0x80, '\n'}) + // Empty fields + f.Add([]byte("\t\t\n")) + // Lines at 1 MiB + f.Add(append(bytes.Repeat([]byte("x"), 1<<20-1), '\n')) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "stdin.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtxFuzz(ctx, t, "cut -f 1 < stdin.txt", dir) + if code != 0 && code != 1 { + t.Errorf("cut stdin unexpected exit code %d", code) + } + }) +} diff --git a/interp/builtins/tests/echo/echo_fuzz_test.go b/interp/builtins/tests/echo/echo_fuzz_test.go new file mode 100644 index 00000000..24c24388 --- /dev/null +++ b/interp/builtins/tests/echo/echo_fuzz_test.go @@ -0,0 +1,170 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package echo_test + +import ( + "context" + "testing" + "time" + "unicode/utf8" + + "github.com/DataDog/rshell/interp" + "github.com/DataDog/rshell/interp/builtins/testutil" +) + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} + +// FuzzEcho fuzzes echo with arbitrary arguments (no escape processing). +func FuzzEcho(f *testing.F) { + f.Add("hello world") + f.Add("") + f.Add("a\tb\tc") + // Backslash passthrough (no -e, so \n is literal) + f.Add("no\\nnewline") + f.Add("back\\\\slash") + // Unicode + f.Add("héllo wörld") + f.Add("日本語") + f.Add("😀 emoji") + // Long argument + f.Add("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa") + + f.Fuzz(func(t *testing.T, arg string) { + if len(arg) > 1000 { + return + } + if !utf8.ValidString(arg) { + return + } + for _, c := range arg { + if c == '\'' || c == '\x00' || c == '\n' { + return + } + } + + dir := t.TempDir() + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "echo '"+arg+"'", dir) + if code != 0 { + t.Errorf("echo unexpected exit code %d", code) + } + }) +} + +// FuzzEchoEscapes fuzzes echo -e with arbitrary escape sequences. +// Edge cases: \c stops output, \0nnn octal, \xHH hex, \uHHHH unicode, +// surrogates replaced with U+FFFD, values >0x10FFFF silently dropped. +func FuzzEchoEscapes(f *testing.F) { + f.Add("hello\\nworld") + f.Add("\\t\\t\\t") + // Hex escapes: \xHH (up to 2 hex digits) + f.Add("\\x41\\x42\\x43") // "ABC" + f.Add("\\x0") // single hex digit: a valid 1-digit escape, not output literally
+ f.Add("\\xgg") // invalid hex digits — outputs \x literally + f.Add("\\x") // no hex digits — outputs \x literally + // Octal: \0nnn (up to 3 octal digits after 0) + f.Add("\\0101") // 'A' + f.Add("\\0377") // 0xff + f.Add("\\0400") // > 0377: takes only 3 digits + f.Add("\\08") // 8 is not octal: the escape ends at \0 and the 8 follows as a literal + // Unicode: \uHHHH (4 hex) and \UHHHHHHHH (8 hex) + f.Add("\\u0041") // 'A' + f.Add("\\u00e9") // 'é' + f.Add("\\u4e2d") // '中' + f.Add("\\uD800") // surrogate — replaced with U+FFFD (intentional divergence from bash) + f.Add("\\uDFFF") // low surrogate — replaced with U+FFFD + f.Add("\\U00010000") // first supplementary plane + f.Add("\\U0010FFFF") // max valid codepoint + f.Add("\\U00110000") // > max — silently dropped + f.Add("\\UFFFFFFFF") // way over max — silently dropped + // \c stops further output (including trailing newline) + f.Add("hello\\cworld") + f.Add("\\c") + // Standard escapes + f.Add("\\a\\b\\e\\E\\f\\r\\v") + f.Add("\\\\") // literal backslash + // Unrecognized escape: output backslash + char literally + f.Add("\\q\\z\\j") + // Mixed + f.Add("tab:\\there\\nnewline:\\nend") + // Long sequence to stress output buffering + f.Add("\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n") + + f.Fuzz(func(t *testing.T, arg string) { + if len(arg) > 1000 { + return + } + if !utf8.ValidString(arg) { + return + } + for _, c := range arg { + if c == '\'' || c == '\x00' || c == '\n' { + return + } + } + + dir := t.TempDir() + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "echo -e '"+arg+"'", dir) + if code != 0 { + t.Errorf("echo -e unexpected exit code %d", code) + } + }) +} + +// FuzzEchoFlagInteraction fuzzes echo with mixed -n/-e/-E flag combinations. +// Edge cases: last flag wins for -e/-E; -n suppresses trailing newline.
+func FuzzEchoFlagInteraction(f *testing.F) { + f.Add("hello", true, false, false) // -n + f.Add("hello\\n", false, true, false) // -e + f.Add("hello\\n", false, false, true) // -E (disables escapes) + f.Add("hi\\n", false, true, true) // -e -E: -E wins (last) + f.Add("hi\\n", true, true, false) // -n -e + + f.Fuzz(func(t *testing.T, arg string, flagN, flagE, flagBigE bool) { + if len(arg) > 500 { + return + } + if !utf8.ValidString(arg) { + return + } + for _, c := range arg { + if c == '\'' || c == '\x00' || c == '\n' { + return + } + } + + flags := "" + if flagN { + flags += " -n" + } + if flagE { + flags += " -e" + } + if flagBigE { + flags += " -E" + } + if flags == "" { + return + } + + dir := t.TempDir() + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "echo"+flags+" '"+arg+"'", dir) + if code != 0 { + t.Errorf("echo%s unexpected exit code %d", flags, code) + } + }) +} diff --git a/interp/builtins/tests/echo/testdata/fuzz/.gitkeep b/interp/builtins/tests/echo/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/interp/builtins/tests/grep/grep_fuzz_test.go b/interp/builtins/tests/grep/grep_fuzz_test.go new file mode 100644 index 00000000..c61d1626 --- /dev/null +++ b/interp/builtins/tests/grep/grep_fuzz_test.go @@ -0,0 +1,316 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package grep_test + +import ( + "bytes" + "context" + "fmt" + "os" + "path/filepath" + "testing" + "time" + "unicode/utf8" + + "github.com/DataDog/rshell/interp" + "github.com/DataDog/rshell/interp/builtins/testutil" +) + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} + +// FuzzGrepFileContent fuzzes grep with a fixed pattern and arbitrary file content. +// Edge cases: binary content, null bytes, lines at 1 MiB cap, invalid UTF-8. +func FuzzGrepFileContent(f *testing.F) { + f.Add([]byte("apple\nbanana\ncherry\n"), "banana") + f.Add([]byte{}, "anything") + f.Add([]byte("no newline"), "new") + f.Add([]byte("a\x00b\nc\n"), "a") + f.Add(bytes.Repeat([]byte("x"), 4097), "x") + f.Add([]byte("\n\n\n"), ".") + f.Add([]byte("hello world\nfoo bar\n"), "foo") + f.Add([]byte{0xff, 0xfe}, "a") + // Lines at/over 1 MiB cap + f.Add(append(bytes.Repeat([]byte("a"), 1<<20-1), '\n'), "a") + f.Add(append(bytes.Repeat([]byte("a"), 1<<20), '\n'), "a") + // CRLF + f.Add([]byte("hello\r\nworld\r\n"), "hello") + // Invalid UTF-8 + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}, "a") + f.Add([]byte{0xed, 0xa0, 0x80, '\n'}, "a") + // Null bytes in content + f.Add([]byte{0x00, 0x00, '\n'}, "a") + // BRE special chars in content being matched + f.Add([]byte("a.b\na*b\na[b\n"), "a.b") + f.Add([]byte("(test)\n[bracket]\n"), "test") + // Word-boundary content + f.Add([]byte("foo foobar barfoo\n"), "foo") + // Multibyte content + f.Add([]byte("héllo\nmünchen\n"), "l") + + f.Fuzz(func(t *testing.T, input []byte, pattern string) { + if len(input) > 1<<20 { + return + } + if !utf8.ValidString(pattern) { + return + } + for _, c := range pattern { + if c == '\'' || c == '\x00' || c == '\n' { + return + } + } + if len(pattern) == 0 { + return + } + if len(pattern) > 100 { + return + } + + dir := t.TempDir() + err := 
os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + script := "grep '" + pattern + "' input.txt" + _, _, code := cmdRunCtx(ctx, t, script, dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("grep unexpected exit code %d", code) + } + }) +} + +// FuzzGrepPatterns fuzzes grep with arbitrary regex patterns on fixed content. +// Edge cases: BRE→ERE conversion, pathological backtracking patterns, empty patterns. +func FuzzGrepPatterns(f *testing.F) { + // BRE special chars + f.Add([]byte("hello world\nfoo bar\n"), "hel+o") + f.Add([]byte("aaa\nbbb\n"), "a*") + f.Add([]byte("test123\n"), "[0-9]+") + f.Add([]byte("(parens)\n"), "[(]") + // Anchors + f.Add([]byte("hello\nworld\n"), "^hello") + f.Add([]byte("hello\nworld\n"), "world$") + f.Add([]byte("hello\n"), "^hello$") + // Pathological backtracking patterns (ReDoS class) + f.Add([]byte("aaaaaaaaaaaaaab\n"), "a*a*b") + f.Add([]byte("aaaaaaaaaaaaaaaa\n"), "(a+)+") + // BRE escaping: \( is group in BRE; ( is literal + f.Add([]byte("(test)\n"), "\\(test\\)") + // Dot matches everything except newline + f.Add([]byte("abc\n"), ".") + f.Add([]byte("\n"), ".") + // Character classes + f.Add([]byte("hello123\n"), "[[:alpha:]]") + f.Add([]byte("hello123\n"), "[[:digit:]]") + f.Add([]byte("HELLO\n"), "[[:upper:]]") + // Empty pattern (currently skipped by the len(pattern) == 0 filter below) + f.Add([]byte("hello\n"), "") + // Interval expression (bounded repetition) + f.Add([]byte("aaaa\n"), "a{1,4}") + + f.Fuzz(func(t *testing.T, input []byte, pattern string) { + if len(input) > 1<<20 { + return + } + if !utf8.ValidString(pattern) { + return + } + for _, c := range pattern { + if c == '\'' || c == '\x00' || c == '\n' { + return + } + } + if len(pattern) > 100 { + return + } + if len(pattern) == 0 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel :=
context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "grep '"+pattern+"' input.txt", dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("grep pattern %q unexpected exit code %d", pattern, code) + } + }) +} + +// FuzzGrepStdin fuzzes grep reading from stdin with arbitrary content. +func FuzzGrepStdin(f *testing.F) { + f.Add([]byte("apple\nbanana\ncherry\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte("a\x00b\nc\n")) + f.Add(bytes.Repeat([]byte("x"), 4097)) + f.Add([]byte("\n\n\n")) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}) + f.Add([]byte("line1\r\nline2\r\n")) + f.Add(append(bytes.Repeat([]byte("a"), 1<<20-1), '\n')) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "stdin.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "grep '.' < stdin.txt", dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("grep stdin unexpected exit code %d", code) + } + }) +} + +// FuzzGrepFixedStrings fuzzes grep -F (fixed string mode) with arbitrary content and patterns. +// CVE-2015-1345 affected bmexec_trans in kwset.c when using -F; out-of-bounds heap read +// triggered by crafted input+pattern combinations in Boyer-Moore-Horspool matching. +// CVE-2012-5667 was an integer overflow triggered by lines >= 2^31 bytes (we cap at 1 MiB). 
+func FuzzGrepFixedStrings(f *testing.F) { + f.Add([]byte("hello world\nfoo bar\n"), "hello") + f.Add([]byte{}, "pattern") + f.Add([]byte("no newline"), "no") + f.Add([]byte("a\x00b\nc\n"), "a") + // Patterns that look like regex metacharacters (treated as literals with -F) + f.Add([]byte("(parens)\n[bracket]\na.b\na*b\n"), "(parens)") + f.Add([]byte("(parens)\n[bracket]\na.b\na*b\n"), "[bracket]") + f.Add([]byte("a.b\naab\n"), "a.b") // dot is literal, not wildcard + f.Add([]byte("a*b\nab\n"), "a*b") // star is literal, not quantifier + f.Add([]byte("a+b\nab\n"), "a+b") // plus is literal + f.Add([]byte("a?b\nab\n"), "a?b") // question mark is literal + f.Add([]byte("^start\n"), "^start") // caret is literal + f.Add([]byte("end$\n"), "end$") // dollar is literal + // Backslash in pattern (treated as literal with -F) + f.Add([]byte("a\\b\nab\n"), "a\\b") + // Empty pattern match + f.Add([]byte("hello\nworld\n"), "") + // Binary content with printable pattern + f.Add([]byte{0xff, 0xfe, 'h', 'i', '\n'}, "hi") + // CRLF + f.Add([]byte("hello\r\nworld\r\n"), "hello") + // Invalid UTF-8 + f.Add([]byte{0xfc, 0x80, 0x80, 'h', 'i', '\n'}, "hi") + // Near 1 MiB line cap (CVE-2012-5667 was 2^31; we test our 1 MiB boundary) + f.Add(append(bytes.Repeat([]byte("a"), 1<<20-1), '\n'), "a") + + f.Fuzz(func(t *testing.T, input []byte, pattern string) { + if len(input) > 1<<20 { + return + } + if !utf8.ValidString(pattern) { + return + } + if len(pattern) > 100 { + return + } + for _, c := range pattern { + if c == '\'' || c == '\x00' || c == '\n' { + return + } + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "grep -F '"+pattern+"' input.txt", dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("grep -F unexpected exit code %d", code) + } + }) +} + +// FuzzGrepFlags 
fuzzes grep with various flag combinations and arbitrary file content. +// Edge cases: context line clamping (MaxContextLines=1000), -q early exit, -o empty match. +func FuzzGrepFlags(f *testing.F) { + f.Add([]byte("Hello\nworld\nHELLO\n"), true, false, false, false, int64(0), int64(0)) + f.Add([]byte("line1\nline2\n"), false, true, false, false, int64(0), int64(0)) + f.Add([]byte{}, true, true, false, false, int64(0), int64(0)) + f.Add([]byte("no newline"), false, false, false, false, int64(0), int64(0)) + f.Add(bytes.Repeat([]byte("abc\n"), 100), true, false, false, false, int64(0), int64(0)) + // Context lines + f.Add([]byte("a\nb\nc\nd\ne\n"), false, false, false, false, int64(2), int64(0)) + f.Add([]byte("a\nb\nc\nd\ne\n"), false, false, false, false, int64(0), int64(2)) + // Context clamping at MaxContextLines=1000 + f.Add([]byte("a\nb\n"), false, false, false, false, int64(1001), int64(0)) + // -c (count) mode + f.Add([]byte("a\na\nb\n"), false, false, true, false, int64(0), int64(0)) + // -q (quiet) mode: exits on first match + f.Add([]byte("a\nb\nc\n"), false, false, false, true, int64(0), int64(0)) + // Binary content + f.Add([]byte{0xff, 0xfe, '\n'}, true, false, false, false, int64(0), int64(0)) + + f.Fuzz(func(t *testing.T, input []byte, caseInsensitive, invertMatch, countOnly, quiet bool, afterCtx, beforeCtx int64) { + if len(input) > 1<<20 { + return + } + if afterCtx < 0 || afterCtx > 100 { + return + } + if beforeCtx < 0 || beforeCtx > 100 { + return + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + flags := "" + if caseInsensitive { + flags += " -i" + } + if invertMatch { + flags += " -v" + } + if countOnly { + flags += " -c" + } + if quiet { + flags += " -q" + } + if afterCtx > 0 { + flags += " -A " + fmt.Sprintf("%d", afterCtx) + } + if beforeCtx > 0 { + flags += " -B " 
+ fmt.Sprintf("%d", beforeCtx) + } + + script := "grep" + flags + " 'a' input.txt" + _, _, code := cmdRunCtx(ctx, t, script, dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("grep%s unexpected exit code %d", flags, code) + } + }) +} diff --git a/interp/builtins/tests/grep/testdata/fuzz/.gitkeep b/interp/builtins/tests/grep/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/interp/builtins/tests/grep/testdata/fuzz/FuzzGrepFixedStrings/4d2cb17569d4b172 b/interp/builtins/tests/grep/testdata/fuzz/FuzzGrepFixedStrings/4d2cb17569d4b172 new file mode 100644 index 00000000..b61ead4a --- /dev/null +++ b/interp/builtins/tests/grep/testdata/fuzz/FuzzGrepFixedStrings/4d2cb17569d4b172 @@ -0,0 +1,3 @@ +go test fuzz v1 +[]byte("0") +string("\x85") diff --git a/interp/builtins/tests/head/head_differential_fuzz_test.go b/interp/builtins/tests/head/head_differential_fuzz_test.go new file mode 100644 index 00000000..724d2ec7 --- /dev/null +++ b/interp/builtins/tests/head/head_differential_fuzz_test.go @@ -0,0 +1,160 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +//go:build linux + +package head_test + +import ( + "bytes" + "context" + "fmt" + "os" + "os/exec" + "path/filepath" + "strings" + "testing" + "time" +) + +// runGNUInDir runs a GNU command under LC_ALL=C.UTF-8 with its working +// directory set to dir. args[0] is the command name; args[1:] are arguments. +func runGNUInDir(t *testing.T, dir string, args []string) (stdout string, exitCode int) { + t.Helper() + if _, err := exec.LookPath(args[0]); err != nil { + t.Skipf("%s not found: %v", args[0], err) + } + + cmd := exec.Command(args[0], args[1:]...) 
+ cmd.Dir = dir + cmd.Env = append(os.Environ(), "LC_ALL=C.UTF-8") + + var outBuf bytes.Buffer + cmd.Stdout = &outBuf + + err := cmd.Run() + exitCode = 0 + if err != nil { + if exitErr, ok := err.(*exec.ExitError); ok { + exitCode = exitErr.ExitCode() + } else { + t.Logf("gnu exec error: %v", err) + return "", -1 + } + } + return outBuf.String(), exitCode +} + +func isSandboxError(stderr string) bool { + lower := strings.ToLower(stderr) + return strings.Contains(lower, "permission denied") || + strings.Contains(lower, "not allowed") || + strings.Contains(lower, "sandbox") +} + +// FuzzHeadDifferentialLines compares rshell head -n N output against GNU head. +func FuzzHeadDifferentialLines(f *testing.F) { + if os.Getenv("RSHELL_BASH_TEST") == "" { + f.Skip("set RSHELL_BASH_TEST=1 to run differential fuzz tests") + } + + f.Add([]byte("line1\nline2\nline3\n"), int64(2)) + f.Add([]byte(""), int64(0)) + f.Add([]byte("no newline"), int64(1)) + f.Add([]byte("a\nb\nc\n"), int64(100)) + f.Add([]byte("\n\n\n"), int64(2)) + f.Add([]byte("a\x00b\nc\n"), int64(2)) + f.Add([]byte("single line\n"), int64(1)) + f.Add([]byte("a\nb\nc\nd\ne\n"), int64(3)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 64*1024 { + return + } + if n < 0 || n > 10000 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + nStr := fmt.Sprintf("%d", n) + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + rshellOut, rshellErr, rshellCode := cmdRunCtx(ctx, t, fmt.Sprintf("head -n %s input.txt", nStr), dir) + + if isSandboxError(rshellErr) { + t.Skip("skipping: sandbox restriction") + } + + gnuOut, gnuCode := runGNUInDir(t, dir, []string{"head", "-n", nStr, "input.txt"}) + if gnuCode == -1 { + return + } + + if rshellOut != gnuOut { + t.Errorf("stdout mismatch for n=%d:\nrshell: %q\ngnu: %q\ninput: %q", n, rshellOut, gnuOut, input) + } + if rshellCode 
!= gnuCode { + t.Errorf("exit code mismatch for n=%d: rshell=%d gnu=%d", n, rshellCode, gnuCode) + } + }) +} + +// FuzzHeadDifferentialBytes compares rshell head -c N output against GNU head. +func FuzzHeadDifferentialBytes(f *testing.F) { + if os.Getenv("RSHELL_BASH_TEST") == "" { + f.Skip("set RSHELL_BASH_TEST=1 to run differential fuzz tests") + } + + f.Add([]byte("line1\nline2\nline3\n"), int64(5)) + f.Add([]byte(""), int64(0)) + f.Add([]byte("no newline"), int64(3)) + f.Add([]byte("a\x00b\nc\n"), int64(4)) + f.Add([]byte("\n\n\n"), int64(2)) + f.Add([]byte("hello world\n"), int64(5)) + f.Add([]byte("abcdef\n"), int64(6)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 64*1024 { + return + } + if n < 0 || n > 10000 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + nStr := fmt.Sprintf("%d", n) + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + rshellOut, rshellErr, rshellCode := cmdRunCtx(ctx, t, fmt.Sprintf("head -c %s input.txt", nStr), dir) + + if isSandboxError(rshellErr) { + t.Skip("skipping: sandbox restriction") + } + + gnuOut, gnuCode := runGNUInDir(t, dir, []string{"head", "-c", nStr, "input.txt"}) + if gnuCode == -1 { + return + } + + if rshellOut != gnuOut { + t.Errorf("stdout mismatch for -c %d:\nrshell: %q\ngnu: %q\ninput: %q", n, rshellOut, gnuOut, input) + } + if rshellCode != gnuCode { + t.Errorf("exit code mismatch for -c %d: rshell=%d gnu=%d", n, rshellCode, gnuCode) + } + }) +} diff --git a/interp/builtins/tests/head/head_fuzz_test.go b/interp/builtins/tests/head/head_fuzz_test.go new file mode 100644 index 00000000..05cf3c1a --- /dev/null +++ b/interp/builtins/tests/head/head_fuzz_test.go @@ -0,0 +1,182 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. 
+// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package head_test + +import ( + "bytes" + "context" + "fmt" + "os" + "path/filepath" + "strings" + "testing" + "time" +) + +// FuzzHeadLines fuzzes head -n N with arbitrary file content. +// Edge cases: MaxCount clamp (2^31-1), line-length cap (1 MiB), no trailing newline. +func FuzzHeadLines(f *testing.F) { + f.Add([]byte("line1\nline2\nline3\n"), int64(2)) + f.Add([]byte{}, int64(0)) + f.Add([]byte("no newline"), int64(1)) + f.Add([]byte("a\x00b\nc\n"), int64(2)) + f.Add(bytes.Repeat([]byte("x"), 4097), int64(1)) + f.Add([]byte("\n\n\n"), int64(5)) + f.Add(bytes.Repeat([]byte("y"), 4096), int64(1)) + f.Add([]byte("hello\nworld\n"), int64(10)) + // MaxCount boundary — must be clamped, not OOM + f.Add([]byte("tiny\n"), int64(1<<31-1)) + f.Add([]byte("tiny\n"), int64(9999999999)) + // n=0 must produce no output + f.Add([]byte("a\nb\nc\n"), int64(0)) + // Exactly at line scanner cap (1 MiB - 1) — should succeed + f.Add(append(bytes.Repeat([]byte("a"), 1<<20-1), '\n'), int64(1)) + // Over line scanner cap — should error, not panic + f.Add(append(bytes.Repeat([]byte("a"), 1<<20), '\n'), int64(1)) + // Binary / null bytes + f.Add([]byte("a\x00b\x00c\n"), int64(1)) + // CRLF — must be preserved + f.Add([]byte("line1\r\nline2\r\nline3\r\n"), int64(2)) + // Invalid UTF-8 (CVE-class: must not panic) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}, int64(1)) + // Leading + sign on count (handled as positive, not error) + // (tested by passing n directly; shell arg would be "+N" which head accepts) + // Multiple blank lines + f.Add([]byte("\n\n\n\n\n"), int64(3)) + // No trailing newline on last output line + f.Add([]byte("line1\nline2"), int64(2)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 1<<20 { + return + } + if n < 0 { + return + } + if n > 10000 { + n = 10000 + } + + dir := t.TempDir() + err := 
os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + stdout, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("head -n %d input.txt", n), dir) + if code != 0 && code != 1 { + t.Errorf("unexpected exit code %d", code) + } + + // If successful, output line count must be <= n + if code == 0 && n >= 0 { + lineCount := strings.Count(stdout, "\n") + if int64(lineCount) > n { + t.Errorf("head -n %d produced %d newlines in output", n, lineCount) + } + } + }) +} + +// FuzzHeadBytes fuzzes head -c N with arbitrary file content. +// Edge cases: MaxCount clamp, 32 KiB chunk boundary, binary content. +func FuzzHeadBytes(f *testing.F) { + f.Add([]byte("line1\nline2\nline3\n"), int64(5)) + f.Add([]byte{}, int64(0)) + f.Add([]byte("no newline"), int64(3)) + f.Add([]byte("a\x00b\nc\n"), int64(4)) + f.Add(bytes.Repeat([]byte("x"), 4097), int64(4096)) + f.Add([]byte("\n\n\n"), int64(2)) + // Chunk boundary (32 KiB) + f.Add(bytes.Repeat([]byte("z"), 32*1024), int64(32*1024)) + f.Add(bytes.Repeat([]byte("z"), 32*1024+1), int64(32*1024)) + // MaxCount boundary + f.Add([]byte("tiny"), int64(1<<31-1)) + f.Add([]byte("tiny"), int64(9999999999)) + // n=0 → no output + f.Add([]byte("abc"), int64(0)) + // Binary content + f.Add([]byte{0x00, 0x01, 0x02, 0x03, 0xff, 0xfe}, int64(4)) + // Invalid UTF-8 + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf}, int64(6)) + // CRLF + f.Add([]byte("a\r\nb\r\n"), int64(3)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 1<<20 { + return + } + if n < 0 { + return + } + if n > 10000 { + n = 10000 + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + stdout, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("head -c %d input.txt", n), dir) + 
if code != 0 && code != 1 { + t.Errorf("unexpected exit code %d", code) + } + + // If successful, output byte count must be <= n + if code == 0 { + outLen := int64(len(stdout)) + if outLen > n { + t.Errorf("head -c %d produced %d bytes of output", n, outLen) + } + } + }) +} + +// FuzzHeadStdin fuzzes head -n N reading from stdin via shell redirection. +func FuzzHeadStdin(f *testing.F) { + f.Add([]byte("line1\nline2\nline3\n"), int64(2)) + f.Add([]byte{}, int64(1)) + f.Add([]byte("no newline"), int64(1)) + f.Add([]byte("a\x00b\nc\n"), int64(2)) + f.Add(bytes.Repeat([]byte("x"), 4097), int64(1)) + f.Add([]byte("\n\n\n"), int64(3)) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}, int64(1)) + f.Add([]byte("line1\r\nline2\r\n"), int64(1)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 1<<20 { + return + } + if n < 0 { + return + } + if n > 10000 { + n = 10000 + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "stdin.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("head -n %d < stdin.txt", n), dir) + if code != 0 && code != 1 { + t.Errorf("unexpected exit code %d (stdin mode)", code) + } + }) +} diff --git a/interp/builtins/tests/head/helpers_test.go b/interp/builtins/tests/head/helpers_test.go new file mode 100644 index 00000000..95caab35 --- /dev/null +++ b/interp/builtins/tests/head/helpers_test.go @@ -0,0 +1,53 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package head_test + +import ( + "bytes" + "context" + "errors" + "strings" + "testing" + + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ...interp.RunnerOption) (string, string, int) { + t.Helper() + parser := syntax.NewParser() + prog, err := parser.Parse(strings.NewReader(script), "") + if err != nil { + t.Fatal(err) + } + var outBuf, errBuf bytes.Buffer + allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...) + runner, err := interp.New(allOpts...) + if err != nil { + t.Fatal(err) + } + defer runner.Close() + if dir != "" { + runner.Dir = dir + } + runErr := runner.Run(ctx, prog) + exitCode := 0 + if runErr != nil { + var es interp.ExitStatus + if errors.As(runErr, &es) { + exitCode = int(es) + } else if ctx.Err() == nil { + t.Fatalf("unexpected error: %v", runErr) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return runScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} diff --git a/interp/builtins/tests/head/testdata/fuzz/.gitkeep b/interp/builtins/tests/head/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/interp/builtins/tests/ls/ls_fuzz_test.go b/interp/builtins/tests/ls/ls_fuzz_test.go new file mode 100644 index 00000000..15c7afea --- /dev/null +++ b/interp/builtins/tests/ls/ls_fuzz_test.go @@ -0,0 +1,252 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package ls_test + +import ( + "context" + "os" + "path/filepath" + "testing" + "time" + "unicode/utf8" + + "github.com/DataDog/rshell/interp" + "github.com/DataDog/rshell/interp/builtins/testutil" +) + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} + +// FuzzLsFlags fuzzes ls with various flag combinations on directories with random filenames. +// Edge cases: hidden files (-a/-A), -d flag, last sort flag wins (-S vs -t), +// -F indicator, -p append-slash, -l long format with -h human-readable. +func FuzzLsFlags(f *testing.F) { + f.Add("file1.txt", true, false, false, false, false) + f.Add(".hidden", false, true, false, false, false) + f.Add("file.txt", false, false, true, false, false) + f.Add("file.txt", false, false, false, true, false) + f.Add("file.txt", false, false, false, false, true) + // Hidden file with -a (shows it) + f.Add(".dotfile", false, true, false, false, false) + // Hidden file without any flag (hidden) + f.Add(".hidden2", false, false, false, false, false) + // File with -F indicator (-F appends * for executables) + f.Add("script.sh", false, false, false, false, true) + // -l long format (this fuzzer does not pass -h) + f.Add("data.bin", true, false, false, false, false) + // -S sort by size + f.Add("small.txt", false, false, false, true, false) + // Unicode filename + f.Add("日本語.txt", false, false, false, false, false) + f.Add("héllo.txt", false, false, false, false, false) + // Various common filenames + f.Add("README.md", false, false, false, false, false) + f.Add("Makefile", false, false, false, false, false) + + f.Fuzz(func(t *testing.T, filename string, flagL, flagA, flagR, flagS, flagF bool) { + if len(filename) == 0 || len(filename) > 100 { + return + } + if !utf8.ValidString(filename) { + return + } + // Skip filenames with characters problematic for shell or filesystem. 
+ for _, c := range filename { + if c == '\'' || c == '\x00' || c == '\n' || c == '/' || c == '\\' || c == '"' || c == '`' || c == '$' { + return + } + } + // Skip filenames starting with - (would be treated as flags). + if filename[0] == '-' { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, filename), []byte("content"), 0644); err != nil { + // Some filenames may be invalid on the OS. + return + } + + flags := "" + if flagL { + flags += " -l" + } + if flagA { + flags += " -a" + } + if flagR { + flags += " -r" + } + if flagS { + flags += " -S" + } + if flagF { + flags += " -F" + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "ls"+flags, dir) + if code != 0 && code != 1 { + t.Errorf("ls%s unexpected exit code %d", flags, code) + } + }) +} + +// FuzzLsRecursive fuzzes ls -R on nested directories. +// Edge cases: maxRecursionDepth=256 (depth 255 is last valid, 256 should error), +// empty subdirectories, hidden subdirectories. 
+func FuzzLsRecursive(f *testing.F) { + f.Add(int64(1)) + f.Add(int64(3)) + f.Add(int64(5)) + // Near recursion depth limit (maxRecursionDepth=256); the depth > 10 guard below currently skips these + f.Add(int64(254)) + f.Add(int64(255)) + f.Add(int64(256)) + f.Add(int64(257)) + // Zero depth (no subdirectories); negative values are rejected by the guard + f.Add(int64(0)) + + f.Fuzz(func(t *testing.T, depth int64) { + if depth < 0 || depth > 10 { + return + } + + dir := t.TempDir() + current := dir + for i := int64(0); i < depth; i++ { + subdir := filepath.Join(current, "sub") + if err := os.Mkdir(subdir, 0755); err != nil { + t.Fatal(err) + } + if err := os.WriteFile(filepath.Join(current, "file.txt"), []byte("x"), 0644); err != nil { + t.Fatal(err) + } + current = subdir + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "ls -R", dir) + if code != 0 && code != 1 { + t.Errorf("ls -R unexpected exit code %d", code) + } + }) +} + +// FuzzLsHumanReadable fuzzes ls -lh (long format with human-readable sizes). +// Edge cases: humanSize thresholds (< 1024 bytes, ~1K, ~1M, ~1G), +// zero-byte files, files at exact power-of-1024 boundaries. +func FuzzLsHumanReadable(f *testing.F) { + // Below 1024 (shown as raw bytes) + f.Add(int64(0)) + f.Add(int64(1)) + f.Add(int64(1023)) + // At 1K boundary + f.Add(int64(1024)) + f.Add(int64(1025)) + // Below 10K (shown as %.1fK format) + f.Add(int64(1024 * 9)) + // At 10K (shown as %.0fK format) + f.Add(int64(1024 * 10)) + f.Add(int64(1024 * 100)) + // At 1M boundary + f.Add(int64(1024 * 1024)) + f.Add(int64(1024*1024 - 1)) + // At 1G boundary (skipped by the 1 MiB guard below) + f.Add(int64(1024 * 1024 * 1024)) + // Mid-range size below 1K (negative sizes are rejected by the guard below) + f.Add(int64(512)) + + f.Fuzz(func(t *testing.T, fileSize int64) { + // Skip negative sizes and sizes above 1 MiB to avoid slow file creation. + if fileSize < 0 || fileSize > 1<<20 { + return + } + + dir := t.TempDir() + // Create a file with the specified size using Truncate. 
+ fpath := filepath.Join(dir, "testfile.bin") + fh, err := os.Create(fpath) + if err != nil { + t.Fatal(err) + } + if fileSize > 0 { + if err := fh.Truncate(fileSize); err != nil { + fh.Close() + t.Fatal(err) + } + } + fh.Close() + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "ls -lh testfile.bin", dir) + if code != 0 && code != 1 { + t.Errorf("ls -lh unexpected exit code %d", code) + } + }) +} + +// FuzzLsMultipleFiles fuzzes ls with multiple files and mixed file types. +// Edge cases: files listed before dirs (GNU ls ordering), -d flag (no dir expansion), +// non-existent targets, sorting with -t (time) and -S (size). +func FuzzLsMultipleFiles(f *testing.F) { + f.Add(true, false, false, false) // -l + f.Add(false, true, false, false) // -a + f.Add(false, false, true, false) // -t sort by time + f.Add(false, false, false, true) // -S sort by size + // Combined flags + f.Add(true, true, false, false) // -la + f.Add(true, false, false, true) // -lS + f.Add(true, false, true, false) // -lt + + f.Fuzz(func(t *testing.T, flagL, flagA, flagT, flagS bool) { + dir := t.TempDir() + + // Create a mix of files and a subdirectory. 
+ files := []struct { + name string + content string + }{ + {"file1.txt", "short"}, + {"file2.txt", "this is longer content"}, + {".hidden", "hidden"}, + } + for _, f := range files { + _ = os.WriteFile(filepath.Join(dir, f.name), []byte(f.content), 0644) + } + _ = os.Mkdir(filepath.Join(dir, "subdir"), 0755) + + flags := "" + if flagL { + flags += " -l" + } + if flagA { + flags += " -a" + } + if flagT { + flags += " -t" + } + if flagS { + flags += " -S" + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "ls"+flags, dir) + if code != 0 && code != 1 { + t.Errorf("ls%s unexpected exit code %d", flags, code) + } + }) +} diff --git a/interp/builtins/tests/ls/testdata/fuzz/.gitkeep b/interp/builtins/tests/ls/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/interp/builtins/tests/strings_cmd/strings_fuzz_test.go b/interp/builtins/tests/strings_cmd/strings_fuzz_test.go new file mode 100644 index 00000000..a4dd17ba --- /dev/null +++ b/interp/builtins/tests/strings_cmd/strings_fuzz_test.go @@ -0,0 +1,221 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package strings_cmd_test + +import ( + "bytes" + "context" + "fmt" + "os" + "path/filepath" + "testing" + "time" + + "github.com/DataDog/rshell/interp" + "github.com/DataDog/rshell/interp/builtins/testutil" +) + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} + +// FuzzStrings fuzzes strings with arbitrary file content. 
+// Edge cases: isPrintable boundary bytes (0x1f not printable, 0x20 yes; +// 0x7e yes, 0x7f not; 0x09 tab yes, 0x0a newline not), defaultMinLen=4, +// maxStringLen=1 MiB cap, chunk boundary at 32 KiB. +func FuzzStrings(f *testing.F) { + f.Add([]byte("hello world\x00\x01\x02binary\x00readable text\n")) + f.Add([]byte{}) + f.Add([]byte{0x00, 0x01, 0x02, 0x03}) + f.Add([]byte("all printable text\n")) + f.Add(bytes.Repeat([]byte{0xff}, 4097)) + f.Add(bytes.Repeat([]byte("abcd"), 1024)) + f.Add([]byte("short\x00ab\x00longer string here\x00")) + // isPrintable boundary: 0x1f (not printable) vs 0x20 (space, printable) + f.Add([]byte{0x1f, 'a', 'b', 'c', 'd', 0x1f}) + f.Add([]byte{0x20, 'a', 'b', 'c', 'd', 0x20}) + // 0x7e (~) is printable, 0x7f (DEL) is not + f.Add([]byte{0x7e, 'a', 'b', 'c', 'd', 0x7e}) + f.Add([]byte{0x7f, 'a', 'b', 'c', 'd', 0x7f}) + // 0x09 (tab) is printable, 0x08 (backspace) is not + f.Add([]byte{'\t', 'a', 'b', 'c', 'd', '\t'}) + f.Add([]byte{0x08, 'a', 'b', 'c', 'd', 0x08}) + // Exactly 4 bytes (default minimum length — boundary) + f.Add([]byte("abcd")) + // Exactly 3 bytes (below minimum — should not print) + f.Add([]byte("abc")) + // maxStringLen: printable run at 1 MiB boundary (capped, then continues) + f.Add(bytes.Repeat([]byte("x"), 1<<20-1)) + f.Add(bytes.Repeat([]byte("x"), 1<<20)) + f.Add(bytes.Repeat([]byte("x"), 1<<20+1)) + // Chunk boundary at 32 KiB: string spanning two chunks + f.Add(append(bytes.Repeat([]byte("a"), 32*1024-2), []byte("bc\x00rest")...)) + // Alternating printable/non-printable + f.Add(bytes.Repeat([]byte{'a', 0x00}, 100)) + // Only tab characters (printable) + f.Add(bytes.Repeat([]byte{'\t'}, 10)) + // High bytes (all non-printable) + f.Add(bytes.Repeat([]byte{0x80}, 100)) + f.Add(bytes.Repeat([]byte{0xff}, 100)) + // Null bytes as non-printable terminators + f.Add([]byte{0x00, 'h', 'e', 'l', 'l', 'o', 0x00}) + // Mixed printable sequences of various lengths + f.Add([]byte("ab\x00abc\x00abcd\x00abcde\x00")) + // ELF 
magic bytes (CVE-2014-8485: crafted ELF triggers libbfd on old binutils; + // our implementation scans raw bytes without libbfd, so no CVE exposure, + // but good to confirm graceful handling of binary format magic numbers) + f.Add([]byte{0x7f, 'E', 'L', 'F', 0x02, 0x01, 0x01, 0x00, 0x00, 0x00}) + // PE/COFF magic (Windows executables) + f.Add([]byte{'M', 'Z', 0x90, 0x00, 0x03, 0x00, 0x00, 0x00}) + // ZIP magic + f.Add([]byte{'P', 'K', 0x03, 0x04}) + // PDF magic with printable sequences inside + f.Add([]byte("%PDF-1.4\x00\x00\x00binary\x00more text here\x00")) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.bin"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "strings input.bin", dir) + if code != 0 && code != 1 { + t.Errorf("strings unexpected exit code %d", code) + } + }) +} + +// FuzzStringsMinLen fuzzes strings -n N with arbitrary file content and min length. +// Edge cases: n=1 (every single printable), n=maxStringLen (1 MiB), +// sequences exactly at boundary, below boundary. 
+func FuzzStringsMinLen(f *testing.F) { + f.Add([]byte("hello world\x00\x01\x02binary\n"), int64(4)) + f.Add([]byte("ab\x00cdef\x00gh\n"), int64(1)) + f.Add([]byte("ab\x00cdef\x00gh\n"), int64(10)) + f.Add([]byte{}, int64(4)) + f.Add(bytes.Repeat([]byte("x"), 100), int64(50)) + // n=1: every printable byte reported individually + f.Add([]byte("a\x00b\x00c\x00"), int64(1)) + // n=3 vs 4 (default): boundary between short/long sequences + f.Add([]byte("abc\x00abcd\x00"), int64(3)) + f.Add([]byte("abc\x00abcd\x00"), int64(4)) + // Sequence exactly at minLen boundary + f.Add([]byte("abcde\x00"), int64(5)) + f.Add([]byte("abcde\x00"), int64(6)) + // Large minLen: only very long sequences match + f.Add(bytes.Repeat([]byte("x"), 1000), int64(999)) + f.Add(bytes.Repeat([]byte("x"), 1000), int64(1000)) + f.Add(bytes.Repeat([]byte("x"), 1000), int64(1001)) + // Tab as printable (contributes to sequence length) + f.Add([]byte("ab\tcd\x00"), int64(4)) + + f.Fuzz(func(t *testing.T, input []byte, minLen int64) { + if len(input) > 1<<20 { + return + } + if minLen < 1 || minLen > 1000 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.bin"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("strings -n %d input.bin", minLen), dir) + if code != 0 && code != 1 { + t.Errorf("strings -n %d unexpected exit code %d", minLen, code) + } + }) +} + +// FuzzStringsRadix fuzzes strings -t with offset radix formatting. +// Edge cases: 7-char field width (offsets > 9999999 overflow), large files, +// offsets at octal/decimal/hex field boundaries. 
+func FuzzStringsRadix(f *testing.F) { + f.Add([]byte("hello\x00world\x00text\n"), "o") + f.Add([]byte("hello\x00world\x00text\n"), "d") + f.Add([]byte("hello\x00world\x00text\n"), "x") + // Large offset: test 7-char field formatting (inputs over 1 MiB are skipped by the guard in the fuzz body) + // At offset >= 8388608 (octal 40000000), octal offset exceeds 7 chars + f.Add(append(bytes.Repeat([]byte{0x00}, 8388608), []byte("hello")...), "o") + // Offset at decimal 9999999 (7 chars), 10000000 (8 chars, overflows field) + f.Add(append(bytes.Repeat([]byte{0x00}, 9999995), []byte("abcde")...), "d") + // Hex offset boundary: 0xfffffff = 268435455 (7 hex chars); seeded with a small offset (0x10) instead + f.Add(append(bytes.Repeat([]byte{0x00}, 16), []byte("hello")...), "x") + // Empty input + f.Add([]byte{}, "d") + // All non-printable (no output) + f.Add(bytes.Repeat([]byte{0x00}, 100), "x") + // Multiple strings with increasing offsets + f.Add([]byte("hello\x00world\x00foo\x00bar\x00"), "d") + + f.Fuzz(func(t *testing.T, input []byte, radix string) { + if len(input) > 1<<20 { + return + } + if radix != "o" && radix != "d" && radix != "x" { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.bin"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("strings -t %s input.bin", radix), dir) + if code != 0 && code != 1 { + t.Errorf("strings -t %s unexpected exit code %d", radix, code) + } + }) +} + +// FuzzStringsStdin fuzzes strings reading from stdin. 
+func FuzzStringsStdin(f *testing.F) { + f.Add([]byte("hello\x00\x01\x02world\n")) + f.Add([]byte{}) + f.Add([]byte{0x00, 0x01, 0x02, 0x03}) + // Printable boundary bytes + f.Add([]byte{0x1f, 'a', 'b', 'c', 'd', 0x20}) + f.Add([]byte{0x7e, 'a', 'b', 'c', 'd', 0x7f}) + // Tab printable + f.Add([]byte{'\t', 'a', 'b', 'c', '\t'}) + // Chunk boundary + f.Add(append(bytes.Repeat([]byte("a"), 32*1024-1), 0x00)) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "stdin.bin"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "strings < stdin.bin", dir) + if code != 0 && code != 1 { + t.Errorf("strings stdin unexpected exit code %d", code) + } + }) +} diff --git a/interp/builtins/tests/strings_cmd/testdata/fuzz/.gitkeep b/interp/builtins/tests/strings_cmd/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/interp/builtins/tests/tail/helpers_test.go b/interp/builtins/tests/tail/helpers_test.go new file mode 100644 index 00000000..b8c88401 --- /dev/null +++ b/interp/builtins/tests/tail/helpers_test.go @@ -0,0 +1,53 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package tail_test + +import ( + "bytes" + "context" + "errors" + "strings" + "testing" + + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ...interp.RunnerOption) (string, string, int) { + t.Helper() + parser := syntax.NewParser() + prog, err := parser.Parse(strings.NewReader(script), "") + if err != nil { + t.Fatal(err) + } + var outBuf, errBuf bytes.Buffer + allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...) + runner, err := interp.New(allOpts...) + if err != nil { + t.Fatal(err) + } + defer runner.Close() + if dir != "" { + runner.Dir = dir + } + runErr := runner.Run(ctx, prog) + exitCode := 0 + if runErr != nil { + var es interp.ExitStatus + if errors.As(runErr, &es) { + exitCode = int(es) + } else if ctx.Err() == nil { + t.Fatalf("unexpected error: %v", runErr) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return runScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} diff --git a/interp/builtins/tests/tail/tail_differential_fuzz_test.go b/interp/builtins/tests/tail/tail_differential_fuzz_test.go new file mode 100644 index 00000000..3d6dd2f9 --- /dev/null +++ b/interp/builtins/tests/tail/tail_differential_fuzz_test.go @@ -0,0 +1,109 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +//go:build linux + +package tail_test + +import ( + "bytes" + "context" + "fmt" + "os" + "os/exec" + "path/filepath" + "strings" + "testing" + "time" +) + +// runGNUInDir runs a GNU command under LC_ALL=C.UTF-8 with its working +// directory set to dir. 
args[0] is the command name; args[1:] are arguments. +func runGNUInDir(t *testing.T, dir string, args []string) (stdout string, exitCode int) { + t.Helper() + if _, err := exec.LookPath(args[0]); err != nil { + t.Skipf("%s not found: %v", args[0], err) + } + + cmd := exec.Command(args[0], args[1:]...) + cmd.Dir = dir + cmd.Env = append(os.Environ(), "LC_ALL=C.UTF-8") + + var outBuf bytes.Buffer + cmd.Stdout = &outBuf + + err := cmd.Run() + exitCode = 0 + if err != nil { + if exitErr, ok := err.(*exec.ExitError); ok { + exitCode = exitErr.ExitCode() + } else { + t.Logf("gnu exec error: %v", err) + return "", -1 + } + } + return outBuf.String(), exitCode +} + +func isSandboxError(stderr string) bool { + lower := strings.ToLower(stderr) + return strings.Contains(lower, "permission denied") || + strings.Contains(lower, "not allowed") || + strings.Contains(lower, "sandbox") +} + +// FuzzTailDifferential compares rshell tail -n N output against GNU tail. +func FuzzTailDifferential(f *testing.F) { + if os.Getenv("RSHELL_BASH_TEST") == "" { + f.Skip("set RSHELL_BASH_TEST=1 to run differential fuzz tests") + } + + f.Add([]byte("line1\nline2\nline3\n"), int64(2)) + f.Add([]byte(""), int64(0)) + f.Add([]byte("no newline"), int64(1)) + f.Add([]byte("a\nb\nc\n"), int64(100)) + f.Add([]byte("\n\n\n"), int64(2)) + f.Add([]byte("a\x00b\nc\n"), int64(2)) + f.Add([]byte("single line\n"), int64(1)) + f.Add([]byte("a\nb\nc\nd\ne\n"), int64(3)) + f.Add(bytes.Repeat([]byte("line\n"), 20), int64(5)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 64*1024 { + return + } + if n < 0 || n > 10000 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + nStr := fmt.Sprintf("%d", n) + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + rshellOut, rshellErr, rshellCode := cmdRunCtx(ctx, t, fmt.Sprintf("tail -n %s input.txt", nStr), dir) + + 
if isSandboxError(rshellErr) { + t.Skip("skipping: sandbox restriction") + } + + gnuOut, gnuCode := runGNUInDir(t, dir, []string{"tail", "-n", nStr, "input.txt"}) + if gnuCode == -1 { + return + } + + if rshellOut != gnuOut { + t.Errorf("tail -n %d stdout mismatch:\nrshell: %q\ngnu: %q\ninput: %q", n, rshellOut, gnuOut, input) + } + if rshellCode != gnuCode { + t.Errorf("tail -n %d exit code mismatch: rshell=%d gnu=%d", n, rshellCode, gnuCode) + } + }) +} diff --git a/interp/builtins/tests/tail/tail_fuzz_test.go b/interp/builtins/tests/tail/tail_fuzz_test.go new file mode 100644 index 00000000..579c5366 --- /dev/null +++ b/interp/builtins/tests/tail/tail_fuzz_test.go @@ -0,0 +1,272 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package tail_test + +import ( + "bytes" + "context" + "fmt" + "os" + "path/filepath" + "strings" + "testing" + "time" +) + +// FuzzTailLines fuzzes tail -n N with arbitrary file content. +// Edge cases: ring buffer limits (100K lines, 64 MiB), MaxCount clamp (2^31-1), +// negative values treated as absolute, no-trailing-newline preservation. 
+func FuzzTailLines(f *testing.F) { + f.Add([]byte("line1\nline2\nline3\n"), int64(2)) + f.Add([]byte{}, int64(0)) + f.Add([]byte("no newline"), int64(1)) + f.Add([]byte("a\x00b\nc\n"), int64(2)) + f.Add(bytes.Repeat([]byte("x"), 4097), int64(1)) + f.Add([]byte("\n\n\n"), int64(5)) + f.Add(bytes.Repeat([]byte("y"), 4096), int64(1)) + f.Add([]byte("hello\nworld\n"), int64(10)) + // MaxCount boundary — clamp prevents allocation + f.Add([]byte("tiny\n"), int64(1<<31-1)) + f.Add([]byte("tiny\n"), int64(9999999999)) + // n=0 → no output + f.Add([]byte("a\nb\nc\n"), int64(0)) + // Binary / null bytes in line + f.Add([]byte("a\x00b\x00c\n"), int64(1)) + // CRLF lines + f.Add([]byte("line1\r\nline2\r\nline3\r\n"), int64(2)) + // Invalid UTF-8 + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}, int64(1)) + // Lines at 1 MiB cap boundary + f.Add(append(bytes.Repeat([]byte("a"), 1<<20-1), '\n'), int64(1)) + f.Add(append(bytes.Repeat([]byte("b"), 1<<20), '\n'), int64(1)) + // Chunk-boundary straddle (ring buffer 32 KiB chunks) + f.Add(bytes.Repeat([]byte("z\n"), 32*1024/2), int64(5)) + // No trailing newline on last line + f.Add([]byte("line1\nline2"), int64(1)) + // Many blank lines (stress ring buffer) + f.Add(bytes.Repeat([]byte("\n"), 1000), int64(5)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 1<<20 { + return + } + if n < 0 { + return + } + if n > 10000 { + n = 10000 + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + stdout, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("tail -n %d input.txt", n), dir) + if code != 0 && code != 1 { + t.Errorf("tail -n %d unexpected exit code %d", n, code) + } + + // If successful, output line count must be <= n + if code == 0 && n >= 0 { + lineCount := strings.Count(stdout, "\n") + if int64(lineCount) > n { + t.Errorf("tail -n %d 
produced %d newlines in output", n, lineCount) + } + } + }) +} + +// FuzzTailBytes fuzzes tail -c N with arbitrary file content. +// Edge cases: circular byte buffer (32 MiB), MaxCount clamp, binary content. +func FuzzTailBytes(f *testing.F) { + f.Add([]byte("line1\nline2\nline3\n"), int64(5)) + f.Add([]byte{}, int64(0)) + f.Add([]byte("no newline"), int64(3)) + f.Add([]byte("a\x00b\nc\n"), int64(4)) + f.Add(bytes.Repeat([]byte("x"), 4097), int64(4096)) + f.Add([]byte("\n\n\n"), int64(2)) + // MaxCount boundary + f.Add([]byte("tiny"), int64(1<<31-1)) + f.Add([]byte("tiny"), int64(9999999999)) + // n=0 → no output + f.Add([]byte("abc"), int64(0)) + // Binary content (null bytes, high bytes) + f.Add([]byte{0x00, 0x01, 0x02, 0x03, 0xff, 0xfe}, int64(4)) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf}, int64(6)) + // CRLF + f.Add([]byte("a\r\nb\r\n"), int64(3)) + // Chunk boundary (32 KiB) + f.Add(bytes.Repeat([]byte("z"), 32*1024+1), int64(1)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 1<<20 { + return + } + if n < 0 { + return + } + if n > 10000 { + n = 10000 + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + stdout, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("tail -c %d input.txt", n), dir) + if code != 0 && code != 1 { + t.Errorf("tail -c %d unexpected exit code %d", n, code) + } + + // If successful, output byte count must be <= n + if code == 0 { + outLen := int64(len(stdout)) + if outLen > n { + t.Errorf("tail -c %d produced %d bytes of output", n, outLen) + } + } + }) +} + +// FuzzTailStdin fuzzes tail -n N reading from stdin via shell redirection. +// Stdin is treated as a non-regular file — MaxTotalReadBytes (256 MiB) applies. 
+func FuzzTailStdin(f *testing.F) { + f.Add([]byte("line1\nline2\nline3\n"), int64(2)) + f.Add([]byte{}, int64(1)) + f.Add([]byte("no newline"), int64(1)) + f.Add([]byte("a\x00b\nc\n"), int64(2)) + f.Add(bytes.Repeat([]byte("x"), 4097), int64(1)) + f.Add([]byte("\n\n\n"), int64(3)) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}, int64(1)) + f.Add([]byte("line1\r\nline2\r\n"), int64(1)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 1<<20 { + return + } + if n < 0 { + return + } + if n > 10000 { + n = 10000 + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "stdin.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("tail -n %d < stdin.txt", n), dir) + if code != 0 && code != 1 { + t.Errorf("tail stdin unexpected exit code %d", code) + } + }) +} + +// FuzzTailLinesOffset fuzzes tail -n +N (skip-first-N-lines offset mode). +// Edge cases: +1 streams entire file, very large +N skips everything. 
+func FuzzTailLinesOffset(f *testing.F) { + f.Add([]byte("line1\nline2\nline3\n"), int64(1)) + f.Add([]byte("line1\nline2\nline3\n"), int64(2)) + f.Add([]byte{}, int64(1)) + f.Add([]byte("no newline"), int64(1)) + f.Add([]byte("a\x00b\nc\n"), int64(2)) + f.Add(bytes.Repeat([]byte("x"), 4097), int64(1)) + f.Add([]byte("\n\n\n"), int64(5)) + f.Add([]byte("hello\nworld\n"), int64(100)) + // +1 streams entire file + f.Add([]byte("a\nb\nc\n"), int64(1)) + // +N > line count → empty output + f.Add([]byte("a\nb\n"), int64(1000)) + // Binary + f.Add([]byte("a\x00b\nc\n"), int64(1)) + // CRLF + f.Add([]byte("a\r\nb\r\nc\r\n"), int64(2)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 1<<20 { + return + } + if n < 1 { + return + } + if n > 10000 { + n = 10000 + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("tail -n +%d input.txt", n), dir) + if code != 0 && code != 1 { + t.Errorf("tail -n +%d unexpected exit code %d", n, code) + } + }) +} + +// FuzzTailBytesOffset fuzzes tail -c +N (skip-first-N-bytes offset mode). 
+func FuzzTailBytesOffset(f *testing.F) { + f.Add([]byte("hello\nworld\n"), int64(1)) + f.Add([]byte("hello\nworld\n"), int64(6)) + f.Add([]byte{}, int64(1)) + f.Add([]byte("no newline"), int64(3)) + f.Add([]byte("a\x00b\nc\n"), int64(2)) + f.Add(bytes.Repeat([]byte("x"), 4097), int64(4096)) + f.Add([]byte("\n\n\n"), int64(2)) + f.Add([]byte("hello\nworld\n"), int64(100)) + // +1 = stream from byte 0 (entire file) + f.Add([]byte("abc"), int64(1)) + // +N > file size → empty + f.Add([]byte("abc"), int64(1000)) + // Binary content + f.Add([]byte{0x00, 0x01, 0x02, 0xff, 0xfe}, int64(2)) + + f.Fuzz(func(t *testing.T, input []byte, n int64) { + if len(input) > 1<<20 { + return + } + if n < 1 { + return + } + if n > 10000 { + n = 10000 + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, fmt.Sprintf("tail -c +%d input.txt", n), dir) + if code != 0 && code != 1 { + t.Errorf("tail -c +%d unexpected exit code %d", n, code) + } + }) +} diff --git a/interp/builtins/tests/tail/testdata/fuzz/.gitkeep b/interp/builtins/tests/tail/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/interp/builtins/tests/testcmd/testcmd_fuzz_test.go b/interp/builtins/tests/testcmd/testcmd_fuzz_test.go new file mode 100644 index 00000000..fde6c456 --- /dev/null +++ b/interp/builtins/tests/testcmd/testcmd_fuzz_test.go @@ -0,0 +1,291 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package testcmd_test + +import ( + "context" + "fmt" + "os" + "path/filepath" + "testing" + "time" + "unicode/utf8" + + "github.com/DataDog/rshell/interp" + "github.com/DataDog/rshell/interp/builtins/testutil" +) + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} + +// FuzzTestStringOps fuzzes test with string comparison operators. +// Edge cases: empty strings, strings that look like operators, +// unicode strings, strings with leading/trailing spaces. +func FuzzTestStringOps(f *testing.F) { + f.Add("hello", "hello", "=") + f.Add("hello", "world", "!=") + f.Add("", "", "=") + f.Add("abc", "def", "=") + f.Add("a", "b", "!=") + // Strings that look like operators (POSIX disambiguation edge cases) + f.Add("-n", "hello", "=") + f.Add("-z", "", "!=") + f.Add("-e", "file", "=") + f.Add("!", "hello", "!=") + // Lexicographic ordering with < and > + f.Add("abc", "abd", "<") + f.Add("z", "a", ">") + f.Add("A", "a", "<") // uppercase sorts before lowercase in ASCII + // Unicode strings + f.Add("héllo", "héllo", "=") + f.Add("日本語", "日本語", "=") + f.Add("😀", "😀", "=") + // Strings with spaces (shell-safe within single quotes) + f.Add("hello world", "hello world", "=") + f.Add("a b", "a c", "!=") + // == operator (same as =) + f.Add("x", "x", "==") + + f.Fuzz(func(t *testing.T, left, right, op string) { + if len(left) > 100 || len(right) > 100 { + return + } + if op != "=" && op != "!=" && op != "==" && op != "<" && op != ">" { + return + } + if !utf8.ValidString(left) || !utf8.ValidString(right) { + return + } + for _, s := range []string{left, right} { + for _, c := range s { + if c == '\'' || c == '\x00' || c == '\n' || c == ']' { + return + } + // C0/DEL/C1 control chars confuse the shell script parser. 
+ if c < 0x20 || c == 0x7f || (c >= 0x80 && c < 0xa0) { + return + } + } + } + // < and > are shell redirection operators, so those seeds are skipped; only =, ==, and != reach the runner. + if op == "<" || op == ">" { + return + } + + dir := t.TempDir() + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + script := fmt.Sprintf("test '%s' %s '%s'", left, op, right) + _, _, code := cmdRunCtx(ctx, t, script, dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("test string op unexpected exit code %d", code) + } + }) +} + +// FuzzTestIntegerOps fuzzes test with integer comparison operators. +// Edge cases: integer overflow (clamped to MaxInt64/MinInt64), +// leading/trailing spaces (trimmed), very large values. +func FuzzTestIntegerOps(f *testing.F) { + f.Add(int64(1), int64(2), "-lt") + f.Add(int64(5), int64(5), "-eq") + f.Add(int64(10), int64(3), "-gt") + f.Add(int64(0), int64(0), "-le") + f.Add(int64(-1), int64(1), "-ne") + // Boundary values + f.Add(int64(0), int64(0), "-eq") + f.Add(int64(-1), int64(0), "-lt") + f.Add(int64(1), int64(0), "-gt") + // int32 boundaries + f.Add(int64(1<<31-1), int64(1<<31-1), "-eq") + f.Add(int64(-(1 << 31)), int64(-(1 << 31)), "-eq") + // Just past the int32 boundaries (-(1<<31 + 1) is rejected by the range guard below) + f.Add(int64(1<<31), int64(1<<31), "-eq") + f.Add(int64(-(1<<31 + 1)), int64(0), "-lt") + // int32 max with -ge (int64 extremes are rejected by the range guard below) + f.Add(int64(1<<31-1), int64(1<<31-1), "-ge") + + f.Fuzz(func(t *testing.T, left, right int64, op string) { + switch op { + case "-eq", "-ne", "-lt", "-le", "-gt", "-ge": + default: + return + } + // Skip values outside a reasonable range. 
+ if left > 1<<31 || left < -(1<<31) || right > 1<<31 || right < -(1<<31) { + return + } + + dir := t.TempDir() + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + script := fmt.Sprintf("test %d %s %d", left, op, right) + _, _, code := cmdRunCtx(ctx, t, script, dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("test %d %s %d unexpected exit code %d", left, op, right, code) + } + }) +} + +// FuzzTestFileOps fuzzes test with unary file operators on a fixed filename that may or may not exist. +// Edge cases: non-existent file, type mismatch (-d on a regular file). +func FuzzTestFileOps(f *testing.F) { + f.Add("-e", true) + f.Add("-f", true) + f.Add("-d", false) + f.Add("-s", true) + f.Add("-r", true) + f.Add("-z", false) + // File does not exist (-s should be false) + f.Add("-s", false) + // Directory test on a file (should be false) + f.Add("-d", true) + // Regular file test on non-existent (should be false) + f.Add("-f", false) + + f.Fuzz(func(t *testing.T, op string, createFile bool) { + switch op { + case "-e", "-f", "-d", "-s", "-r", "-w", "-x", "-h", "-L", "-p": + default: + return + } + + dir := t.TempDir() + target := "testfile.txt" + if createFile { + if err := os.WriteFile(filepath.Join(dir, target), []byte("content"), 0644); err != nil { + t.Fatal(err) + } + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + script := fmt.Sprintf("test %s %s", op, target) + _, _, code := cmdRunCtx(ctx, t, script, dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("test %s unexpected exit code %d", op, code) + } + }) +} + +// FuzzTestStringUnary fuzzes test with -z and -n string tests. +// Edge cases: empty string, single char, strings that look like operators. 
+func FuzzTestStringUnary(f *testing.F) { + f.Add("hello", "-z") + f.Add("", "-z") + f.Add("hello", "-n") + f.Add("", "-n") + // Strings that look like flags (tested as strings here) + f.Add("-e", "-n") + f.Add("-z", "-n") + f.Add("-n", "-n") + f.Add("-f", "-z") + // Single whitespace char + f.Add(" ", "-n") + f.Add(" ", "-z") + // Unicode + f.Add("日本語", "-n") + f.Add("😀", "-n") + + f.Fuzz(func(t *testing.T, arg, op string) { + if len(arg) > 200 { + return + } + if op != "-z" && op != "-n" { + return + } + if !utf8.ValidString(arg) { + return + } + for _, c := range arg { + if c == '\'' || c == '\x00' || c == '\n' || c == ']' { + return + } + // C0/DEL/C1 control chars confuse the shell script parser. + if c < 0x20 || c == 0x7f || (c >= 0x80 && c < 0xa0) { + return + } + } + + dir := t.TempDir() + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + script := fmt.Sprintf("test %s '%s'", op, arg) + _, _, code := cmdRunCtx(ctx, t, script, dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("test %s unexpected exit code %d", op, code) + } + }) +} + +// FuzzTestNesting fuzzes test with logical -a/-o operators and compound expressions. +// Edge cases: short-circuit evaluation, ! as final token (treated as non-empty +// string = true), -o as unary shell option (always false in restricted shell), +// strings that look like operators. +// Note: parentheses are shell metacharacters and cannot be passed unescaped +// here; ( ) grouping is covered by the unit tests. +func FuzzTestNesting(f *testing.F) { + // Simple -a and -o + f.Add("1 -eq 1 -a 2 -eq 2") + f.Add("1 -eq 1 -o 1 -eq 2") + f.Add("1 -eq 2 -a 2 -eq 2") + // ! negation + f.Add("! 1 -eq 2") + f.Add("! -z hello") + // ! 
as final token: treated as non-empty string (always true) + f.Add("!") + // Boolean chains + f.Add("-z '' -a -n hello") + f.Add("-n hello -o -z hello") + // -o as unary shell option: always false in restricted shell + f.Add("-o anyopt") + // String comparison chained + f.Add("abc = abc -a def != xyz") + // Chain of -a + f.Add("1 -eq 1 -a 2 -eq 2 -a 3 -eq 3") + // Chain of -o + f.Add("1 -eq 2 -o 2 -eq 2 -o 3 -eq 4") + // Mixed -a and -o + f.Add("1 -eq 1 -o 1 -eq 2 -a 2 -eq 2") + + f.Fuzz(func(t *testing.T, expr string) { + if len(expr) > 200 { + return + } + if !utf8.ValidString(expr) { + return + } + for _, c := range expr { + // Filter shell metacharacters that would be interpreted by the shell + // parser rather than passed to the test builtin. + if c == '\'' || c == '\x00' || c == '\n' || c == '\\' || + c == '"' || c == '`' || c == '$' || c == '(' || c == ')' || + c == '<' || c == '>' || c == '|' || c == '&' || c == ';' { + return + } + } + + dir := t.TempDir() + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + script := fmt.Sprintf("test %s", expr) + _, _, code := cmdRunCtx(ctx, t, script, dir) + if code != 0 && code != 1 && code != 2 { + t.Errorf("test %q unexpected exit code %d", expr, code) + } + }) +} diff --git a/interp/builtins/tests/testcmd/testdata/fuzz/.gitkeep b/interp/builtins/tests/testcmd/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/interp/builtins/tests/testcmd/testdata/fuzz/FuzzTestStringOps/dd59814d28fa0a6d b/interp/builtins/tests/testcmd/testdata/fuzz/FuzzTestStringOps/dd59814d28fa0a6d new file mode 100644 index 00000000..2244c07d --- /dev/null +++ b/interp/builtins/tests/testcmd/testdata/fuzz/FuzzTestStringOps/dd59814d28fa0a6d @@ -0,0 +1,4 @@ +go test fuzz v1 +string("") +string("\u0080") +string("=") diff --git a/interp/builtins/tests/uniq/testdata/fuzz/.gitkeep b/interp/builtins/tests/uniq/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b 
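The character filter is written out twice above, once in FuzzTestStringOps and once in FuzzTestStringUnary. A minimal sketch of how it could be factored into one helper follows; the name `shellSafe` is hypothetical and not part of this change. It mirrors the checks in the fuzz bodies and shows that the committed corpus entry `string("\u0080")` is rejected by the C1 range check.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// shellSafe is a hypothetical helper (not in the diff). It reports whether s
// can be embedded between single quotes in a generated `test ...` script
// under the same rules the fuzz bodies apply inline.
func shellSafe(s string) bool {
	if !utf8.ValidString(s) {
		return false
	}
	for _, c := range s {
		// A single quote would end the quoted argument; ']' is excluded
		// by the original filter as well.
		if c == '\'' || c == ']' {
			return false
		}
		// C0 (includes NUL and newline), DEL, and C1 control characters.
		if c < 0x20 || c == 0x7f || (c >= 0x80 && c < 0xa0) {
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"hello world", "it's", "\u0080", "日本語"} {
		fmt.Printf("%q -> %v\n", s, shellSafe(s))
	}
}
```

With such a helper, each fuzz body would reduce to a single `if !shellSafe(x) { return }` check, which keeps the two filters from drifting apart.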
diff --git a/interp/builtins/tests/uniq/uniq_fuzz_test.go b/interp/builtins/tests/uniq/uniq_fuzz_test.go new file mode 100644 index 00000000..140ca791 --- /dev/null +++ b/interp/builtins/tests/uniq/uniq_fuzz_test.go @@ -0,0 +1,216 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package uniq_test + +import ( + "bytes" + "context" + "fmt" + "os" + "path/filepath" + "testing" + "time" + + "github.com/DataDog/rshell/interp" + "github.com/DataDog/rshell/interp/builtins/testutil" +) + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return testutil.RunScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} + +// FuzzUniq fuzzes uniq with arbitrary file content. +// Edge cases: MaxLineBytes (1 MiB) cap, no-trailing-newline, null bytes, CRLF. +func FuzzUniq(f *testing.F) { + f.Add([]byte("a\na\nb\nb\nc\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte("a\x00b\nc\n")) + f.Add(bytes.Repeat([]byte("x\n"), 100)) + f.Add([]byte("\n\n\n")) + f.Add([]byte("AAA\naaa\nAAA\n")) + // All identical lines + f.Add(bytes.Repeat([]byte("same\n"), 1000)) + // All unique lines + f.Add([]byte("a\nb\nc\nd\ne\n")) + // Single line, no newline + f.Add([]byte("single")) + // CRLF lines + f.Add([]byte("a\r\na\r\nb\r\n")) + // Lines near the 1 MiB cap + f.Add(append(bytes.Repeat([]byte("a"), 1<<20-1), '\n')) + f.Add(append(bytes.Repeat([]byte("a"), 1<<20), '\n')) + // Null bytes in lines + f.Add([]byte("a\x00b\na\x00b\nc\n")) + // Invalid UTF-8 + f.Add([]byte{0xfc, 0x80, 0x80, '\n', 0xfc, 0x80, 0x80, '\n'}) + // countFieldWidth=7: count > 9999999 would overflow field + f.Add(bytes.Repeat([]byte("x\n"), 10000000/2)) + // CVE-2013-0222 pattern: long line with embedded null bytes followed by CRLF. 
+ // The SUSE i18n patch used alloca() sized by line length → stack overflow at 50MB. + // Our implementation uses fixed buffers; test at our MaxLineBytes (1 MiB) boundary. + f.Add(append(append([]byte("1"), bytes.Repeat([]byte{0x00}, 1<<19)...), '\n')) + // CRLF duplicate detection: lines identical except for trailing \r + f.Add([]byte("a\r\na\r\n")) + f.Add([]byte("a\r\na\n")) // CRLF vs LF — how are these compared? + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "uniq input.txt", dir) + if code != 0 && code != 1 { + t.Errorf("uniq unexpected exit code %d", code) + } + }) +} + +// FuzzUniqCount fuzzes uniq -c with arbitrary file content. +// Edge cases: countFieldWidth=7, very large repeat counts, overflow formatting. +func FuzzUniqCount(f *testing.F) { + f.Add([]byte("a\na\nb\nb\nc\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte("a\na\na\n")) + // Many duplicates — count field must not overflow + f.Add(bytes.Repeat([]byte("x\n"), 9999998)) + // Single occurrence + f.Add([]byte("unique\n")) + // CRLF + f.Add([]byte("a\r\na\r\nb\r\n")) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "uniq -c input.txt", dir) + if code != 0 && code != 1 { + t.Errorf("uniq -c unexpected exit code %d", code) + } + }) +} + +// FuzzUniqFlags fuzzes uniq with various flag combinations. 
+// Edge cases: -f/-s/-w field/char skipping (values above 100 are skipped by the filter), -i case folding, +// -d/-u output modes, -z NUL delimiter. +func FuzzUniqFlags(f *testing.F) { + f.Add([]byte("a\na\nb\nb\nc\n"), true, false, false, false, int64(0), int64(0), int64(0)) + f.Add([]byte("AAA\naaa\nAAA\n"), false, true, false, false, int64(0), int64(0), int64(0)) + f.Add([]byte(" a x\n a y\n b x\n"), false, false, false, false, int64(1), int64(0), int64(0)) + f.Add([]byte("aaa\naab\naac\n"), false, false, false, false, int64(0), int64(2), int64(0)) + f.Add([]byte("a\na\nb\n"), false, false, true, false, int64(0), int64(0), int64(0)) + // -w 3: compare only the first 3 characters + f.Add([]byte("abc123\nabc456\ndef\n"), false, false, false, false, int64(0), int64(0), int64(3)) + // -z NUL delimiter + f.Add([]byte("a\x00a\x00b\x00"), false, false, false, true, int64(0), int64(0), int64(0)) + // skipFields/skipChars at int32 max; skipped by the > 100 filter below, so the MaxCount clamp is not reached + f.Add([]byte("a b c\na b c\n"), false, false, false, false, int64(1<<31-1), int64(0), int64(0)) + f.Add([]byte("abcdef\nabcdef\n"), false, false, false, false, int64(0), int64(1<<31-1), int64(0)) + // -f large value (beyond any line): the comparison key is empty + f.Add([]byte("a b\na b\n"), false, false, false, false, int64(100), int64(0), int64(0)) + // -s large value: skips entire comparison key + f.Add([]byte("abcdef\nabcdef\n"), false, false, false, false, int64(0), int64(100), int64(0)) + // -d: only print duplicate lines + f.Add([]byte("a\na\nb\nc\nc\n"), true, false, false, false, int64(0), int64(0), int64(0)) + + f.Fuzz(func(t *testing.T, input []byte, repeated, ignoreCase, unique, nulDelim bool, skipFields, skipChars, checkChars int64) { + if len(input) > 1<<20 { + return + } + if skipFields < 0 || skipFields > 100 { + return + } + if skipChars < 0 || skipChars > 100 { + return + } + if checkChars < 0 || checkChars > 100 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + 
} + + flags := "" + if repeated { + flags += " -d" + } + if ignoreCase { + flags += " -i" + } + if unique { + flags += " -u" + } + if nulDelim { + flags += " -z" + } + if skipFields > 0 { + flags += fmt.Sprintf(" -f %d", skipFields) + } + if skipChars > 0 { + flags += fmt.Sprintf(" -s %d", skipChars) + } + if checkChars > 0 { + flags += fmt.Sprintf(" -w %d", checkChars) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "uniq"+flags+" input.txt", dir) + if code != 0 && code != 1 { + t.Errorf("uniq%s unexpected exit code %d", flags, code) + } + }) +} + +// FuzzUniqStdin fuzzes uniq reading from stdin. +func FuzzUniqStdin(f *testing.F) { + f.Add([]byte("a\na\nb\nb\nc\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte{0xfc, 0x80, 0x80, '\n', 0xfc, 0x80, 0x80, '\n'}) + f.Add([]byte("line1\r\nline1\r\nline2\r\n")) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "stdin.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "uniq < stdin.txt", dir) + if code != 0 && code != 1 { + t.Errorf("uniq stdin unexpected exit code %d", code) + } + }) +} diff --git a/interp/builtins/tests/wc/helpers_test.go b/interp/builtins/tests/wc/helpers_test.go new file mode 100644 index 00000000..954207ee --- /dev/null +++ b/interp/builtins/tests/wc/helpers_test.go @@ -0,0 +1,53 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. 
+ +package wc_test + +import ( + "bytes" + "context" + "errors" + "strings" + "testing" + + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +func runScriptCtx(ctx context.Context, t *testing.T, script, dir string, opts ...interp.RunnerOption) (string, string, int) { + t.Helper() + parser := syntax.NewParser() + prog, err := parser.Parse(strings.NewReader(script), "") + if err != nil { + t.Fatal(err) + } + var outBuf, errBuf bytes.Buffer + allOpts := append([]interp.RunnerOption{interp.StdIO(nil, &outBuf, &errBuf)}, opts...) + runner, err := interp.New(allOpts...) + if err != nil { + t.Fatal(err) + } + defer runner.Close() + if dir != "" { + runner.Dir = dir + } + runErr := runner.Run(ctx, prog) + exitCode := 0 + if runErr != nil { + var es interp.ExitStatus + if errors.As(runErr, &es) { + exitCode = int(es) + } else if ctx.Err() == nil { + t.Fatalf("unexpected error: %v", runErr) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +func cmdRunCtx(ctx context.Context, t *testing.T, script, dir string) (string, string, int) { + t.Helper() + return runScriptCtx(ctx, t, script, dir, interp.AllowedPaths([]string{dir})) +} diff --git a/interp/builtins/tests/wc/testdata/fuzz/.gitkeep b/interp/builtins/tests/wc/testdata/fuzz/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/interp/builtins/tests/wc/testdata/fuzz/FuzzWcDifferentialWords/1c6e2e9cd7371f3e b/interp/builtins/tests/wc/testdata/fuzz/FuzzWcDifferentialWords/1c6e2e9cd7371f3e new file mode 100644 index 00000000..9c6170db --- /dev/null +++ b/interp/builtins/tests/wc/testdata/fuzz/FuzzWcDifferentialWords/1c6e2e9cd7371f3e @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("\U00089249") diff --git a/interp/builtins/tests/wc/testdata/fuzz/FuzzWcDifferentialWords/ee500f173c25a234 b/interp/builtins/tests/wc/testdata/fuzz/FuzzWcDifferentialWords/ee500f173c25a234 new file mode 100644 index 00000000..e92700f2 --- /dev/null +++ 
b/interp/builtins/tests/wc/testdata/fuzz/FuzzWcDifferentialWords/ee500f173c25a234 @@ -0,0 +1,2 @@ +go test fuzz v1 +[]byte("\x1a") diff --git a/interp/builtins/tests/wc/wc_differential_fuzz_test.go b/interp/builtins/tests/wc/wc_differential_fuzz_test.go new file mode 100644 index 00000000..47e8c603 --- /dev/null +++ b/interp/builtins/tests/wc/wc_differential_fuzz_test.go @@ -0,0 +1,197 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +//go:build linux + +package wc_test + +import ( + "bytes" + "context" + "os" + "os/exec" + "path/filepath" + "strings" + "testing" + "time" +) + +// runGNUInDir runs a GNU command under LC_ALL=C.UTF-8 with its working +// directory set to dir. args[0] is the command name; args[1:] are arguments. +func runGNUInDir(t *testing.T, dir string, args []string) (stdout string, exitCode int) { + t.Helper() + if _, err := exec.LookPath(args[0]); err != nil { + t.Skipf("%s not found: %v", args[0], err) + } + + cmd := exec.Command(args[0], args[1:]...) + cmd.Dir = dir + cmd.Env = append(os.Environ(), "LC_ALL=C.UTF-8") + + var outBuf bytes.Buffer + cmd.Stdout = &outBuf + + err := cmd.Run() + exitCode = 0 + if err != nil { + if exitErr, ok := err.(*exec.ExitError); ok { + exitCode = exitErr.ExitCode() + } else { + t.Logf("gnu exec error: %v", err) + return "", -1 + } + } + return outBuf.String(), exitCode +} + +func isSandboxError(stderr string) bool { + lower := strings.ToLower(stderr) + return strings.Contains(lower, "permission denied") || + strings.Contains(lower, "not allowed") || + strings.Contains(lower, "sandbox") +} + +// FuzzWcDifferentialLines compares rshell wc -l output against GNU wc. 
+func FuzzWcDifferentialLines(f *testing.F) { + if os.Getenv("RSHELL_BASH_TEST") == "" { + f.Skip("set RSHELL_BASH_TEST=1 to run differential fuzz tests") + } + + f.Add([]byte("line1\nline2\nline3\n")) + f.Add([]byte("")) + f.Add([]byte("no newline")) + f.Add([]byte("a\nb\nc\n")) + f.Add([]byte("\n\n\n")) + f.Add([]byte("a\x00b\nc\n")) + f.Add([]byte("single line\n")) + f.Add(bytes.Repeat([]byte("x\n"), 100)) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 64*1024 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + rshellOut, rshellErr, rshellCode := cmdRunCtx(ctx, t, "wc -l input.txt", dir) + + if isSandboxError(rshellErr) { + t.Skip("skipping: sandbox restriction") + } + + gnuOut, gnuCode := runGNUInDir(t, dir, []string{"wc", "-l", "input.txt"}) + if gnuCode == -1 { + return + } + + if rshellOut != gnuOut { + t.Errorf("wc -l stdout mismatch:\nrshell: %q\ngnu: %q\ninput: %q", rshellOut, gnuOut, input) + } + if rshellCode != gnuCode { + t.Errorf("wc -l exit code mismatch: rshell=%d gnu=%d", rshellCode, gnuCode) + } + }) +} + +// FuzzWcDifferentialWords compares rshell wc -w output against GNU wc. 
+func FuzzWcDifferentialWords(f *testing.F) { + if os.Getenv("RSHELL_BASH_TEST") == "" { + f.Skip("set RSHELL_BASH_TEST=1 to run differential fuzz tests") + } + + f.Add([]byte("hello world\n")) + f.Add([]byte("")) + f.Add([]byte(" spaces \n")) + f.Add([]byte("one\ntwo three\n")) + f.Add([]byte("\t\ttabs\t\n")) + f.Add([]byte("a\x00b c\n")) + f.Add([]byte("word")) + f.Add(bytes.Repeat([]byte("a b "), 50)) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 64*1024 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + rshellOut, rshellErr, rshellCode := cmdRunCtx(ctx, t, "wc -w input.txt", dir) + + if isSandboxError(rshellErr) { + t.Skip("skipping: sandbox restriction") + } + + gnuOut, gnuCode := runGNUInDir(t, dir, []string{"wc", "-w", "input.txt"}) + if gnuCode == -1 { + return + } + + if rshellOut != gnuOut { + t.Errorf("wc -w stdout mismatch:\nrshell: %q\ngnu: %q\ninput: %q", rshellOut, gnuOut, input) + } + if rshellCode != gnuCode { + t.Errorf("wc -w exit code mismatch: rshell=%d gnu=%d", rshellCode, gnuCode) + } + }) +} + +// FuzzWcDifferentialBytes compares rshell wc -c output against GNU wc. 
+func FuzzWcDifferentialBytes(f *testing.F) { + if os.Getenv("RSHELL_BASH_TEST") == "" { + f.Skip("set RSHELL_BASH_TEST=1 to run differential fuzz tests") + } + + f.Add([]byte("hello\nworld\n")) + f.Add([]byte("")) + f.Add([]byte("no newline")) + f.Add([]byte("a\x00b\nc\n")) + f.Add([]byte{0xff, 0xfe, 0x00, 0x01}) + f.Add(bytes.Repeat([]byte("x"), 100)) + f.Add([]byte("\n\n\n")) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 64*1024 { + return + } + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644); err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + rshellOut, rshellErr, rshellCode := cmdRunCtx(ctx, t, "wc -c input.txt", dir) + + if isSandboxError(rshellErr) { + t.Skip("skipping: sandbox restriction") + } + + gnuOut, gnuCode := runGNUInDir(t, dir, []string{"wc", "-c", "input.txt"}) + if gnuCode == -1 { + return + } + + if rshellOut != gnuOut { + t.Errorf("wc -c stdout mismatch:\nrshell: %q\ngnu: %q\ninput: %q", rshellOut, gnuOut, input) + } + if rshellCode != gnuCode { + t.Errorf("wc -c exit code mismatch: rshell=%d gnu=%d", rshellCode, gnuCode) + } + }) +} diff --git a/interp/builtins/tests/wc/wc_fuzz_test.go b/interp/builtins/tests/wc/wc_fuzz_test.go new file mode 100644 index 00000000..3d6b35dd --- /dev/null +++ b/interp/builtins/tests/wc/wc_fuzz_test.go @@ -0,0 +1,207 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package wc_test + +import ( + "bytes" + "context" + "os" + "path/filepath" + "testing" + "time" +) + +// FuzzWc fuzzes wc (default mode: lines, words, bytes) with arbitrary file content. +// Edge cases: UTF-8 chunk boundary carry-over, wide chars, tab stops, CRLF. 
+func FuzzWc(f *testing.F) { + f.Add([]byte("hello world\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte("a\x00b\nc\n")) + f.Add(bytes.Repeat([]byte("x"), 4097)) + f.Add([]byte("\n\n\n")) + f.Add(bytes.Repeat([]byte("word "), 100)) + // Tab stops: wc -L counts tab as advancing to next 8-column boundary + f.Add([]byte("a\tb\tc\n")) + f.Add([]byte("\t\t\t\n")) + // CRLF: \r resets word state without starting newline + f.Add([]byte("a\r\nb\r\n")) + f.Add([]byte("word1\r\nword2\r\n")) + // Multibyte UTF-8: wc -m counts runes; wc -c counts bytes + f.Add([]byte("héllo\n")) // 2-byte é + f.Add([]byte("日本語\n")) // 3-byte CJK + f.Add([]byte("😀\n")) // 4-byte emoji + f.Add([]byte("こんにちは\n")) // wide chars (width 2 each for -L) + // UTF-8 split at 32 KiB chunk boundary (carry-over bytes logic) + f.Add(append(bytes.Repeat([]byte("a"), 32*1024-1), []byte("é")...)) + // Invalid UTF-8 (must not crash — processed as replacement char) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}) + f.Add([]byte{0xed, 0xa0, 0x80, '\n'}) // surrogate + f.Add([]byte{0x80, '\n'}) // continuation byte without lead + // Null bytes + f.Add([]byte{0x00, 0x00, '\n'}) + // High bytes + f.Add([]byte{0x80, 0x9f, 0xa0, 0xff, '\n'}) + // Only whitespace + f.Add([]byte(" \t \n")) + f.Add([]byte("\n\n\n\n\n")) + // Long line (tests -L max-line-length tracking) + f.Add(append(bytes.Repeat([]byte("a"), 1000), '\n')) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "wc input.txt", dir) + if code != 0 && code != 1 { + t.Errorf("wc unexpected exit code %d", code) + } + }) +} + +// FuzzWcLines fuzzes wc -l with arbitrary file content. 
+func FuzzWcLines(f *testing.F) { + f.Add([]byte("line1\nline2\nline3\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte("a\x00b\nc\n")) + f.Add(bytes.Repeat([]byte("x"), 4097)) + f.Add([]byte("\n\n\n")) + f.Add([]byte("line1\r\nline2\r\n")) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}) + f.Add(bytes.Repeat([]byte("a\n"), 10000)) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "wc -l input.txt", dir) + if code != 0 && code != 1 { + t.Errorf("wc -l unexpected exit code %d", code) + } + }) +} + +// FuzzWcBytes fuzzes wc -c with arbitrary file content. +func FuzzWcBytes(f *testing.F) { + f.Add([]byte("hello\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte("a\x00b\nc\n")) + f.Add(bytes.Repeat([]byte("x"), 4097)) + f.Add([]byte{0x00, 0x01, 0x02, 0xff, 0xfe}) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf}) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "wc -c input.txt", dir) + if code != 0 && code != 1 { + t.Errorf("wc -c unexpected exit code %d", code) + } + }) +} + +// FuzzWcChars fuzzes wc -m (character/rune count) with multibyte and invalid UTF-8. +// Edge cases: carry-over bytes at chunk boundaries, replacement chars for bad sequences. 
+func FuzzWcChars(f *testing.F) { + f.Add([]byte("hello\n")) + f.Add([]byte("héllo\n")) + f.Add([]byte("日本語\n")) + f.Add([]byte("😀\n")) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}) + f.Add([]byte{0x80, '\n'}) + f.Add([]byte{0xed, 0xa0, 0x80, '\n'}) + // Chunk boundary split: 3-byte rune straddling 32 KiB boundary + f.Add(append(bytes.Repeat([]byte("a"), 32*1024-1), []byte("日")...)) + // 4-byte emoji straddling boundary + f.Add(append(bytes.Repeat([]byte("a"), 32*1024-1), []byte("😀")...)) + f.Add([]byte{}) + f.Add([]byte("no newline")) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "input.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "wc -m input.txt", dir) + if code != 0 && code != 1 { + t.Errorf("wc -m unexpected exit code %d", code) + } + }) +} + +// FuzzWcStdin fuzzes wc reading from stdin via shell redirection. 
+func FuzzWcStdin(f *testing.F) { + f.Add([]byte("hello world\n")) + f.Add([]byte{}) + f.Add([]byte("no newline")) + f.Add([]byte("a\x00b\n")) + f.Add(bytes.Repeat([]byte("x"), 4097)) + f.Add([]byte("\n\n\n")) + f.Add([]byte("héllo\n")) + f.Add([]byte{0xfc, 0x80, 0x80, 0x80, 0x80, 0xaf, '\n'}) + + f.Fuzz(func(t *testing.T, input []byte) { + if len(input) > 1<<20 { + return + } + + dir := t.TempDir() + err := os.WriteFile(filepath.Join(dir, "stdin.txt"), input, 0644) + if err != nil { + t.Fatal(err) + } + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + _, _, code := cmdRunCtx(ctx, t, "wc < stdin.txt", dir) + if code != 0 && code != 1 { + t.Errorf("wc stdin unexpected exit code %d", code) + } + }) +} diff --git a/interp/builtins/wc/wc.go b/interp/builtins/wc/wc.go index 1b71418b..dd420d55 100644 --- a/interp/builtins/wc/wc.go +++ b/interp/builtins/wc/wc.go @@ -257,7 +257,6 @@ func countReader(ctx context.Context, r io.Reader) (counts, error) { tail = 0 } } - c.chars += int64(utf8.RuneCount(chunk)) // carryN bytes are subtracted here and will be re-added via // n += carryN at the top of the next iteration. c.bytes -= int64(carryN) @@ -265,6 +264,12 @@ func countReader(ctx context.Context, r io.Reader) (counts, error) { for i := 0; i < len(chunk); { r, size := utf8.DecodeRune(chunk[i:]) i += size + // Invalid UTF-8 byte: not a character in C.UTF-8 locale. + // Skip entirely — no char count, no word effect. + if r == utf8.RuneError && size == 1 { + continue + } + c.chars++ if r == '\n' { c.lines++ if lineLen > c.maxLineLen { @@ -281,6 +286,17 @@ func countReader(ctx context.Context, r io.Reader) (counts, error) { } else if r == ' ' || r == '\v' || r == '\f' { lineLen++ inWord = false + } else if unicode.IsControl(r) { + // Non-whitespace control chars (C0, DEL, C1) are transparent: + // they do not start or end words, matching GNU wc in POSIX locale. 
+ } else if unicode.Is(unicode.Zs, r) { + // Unicode space separators (NBSP, thin space, etc.) end words, + // matching GNU wc behaviour under C.UTF-8 locale. + lineLen++ + inWord = false + } else if !unicode.IsGraphic(r) && !unicode.Is(unicode.Cf, r) && !unicode.Is(unicode.Co, r) { + // Cn (unassigned codepoints): transparent like control chars -- + // they do not start or end words, matching GNU wc under C.UTF-8. } else { if !inWord { c.words++ @@ -292,7 +308,7 @@ func countReader(ctx context.Context, r io.Reader) (counts, error) { } if err == io.EOF { if carryN > 0 { - c.chars += int64(utf8.RuneCount(carry[:carryN])) + // Incomplete UTF-8 sequence at EOF: counts as bytes but not chars. c.bytes += int64(carryN) carryN = 0 } diff --git a/interp/builtins/wc/wc_gnu_compat_test.go b/interp/builtins/wc/wc_gnu_compat_test.go index 90966364..4d2255e4 100644 --- a/interp/builtins/wc/wc_gnu_compat_test.go +++ b/interp/builtins/wc/wc_gnu_compat_test.go @@ -148,16 +148,19 @@ func TestGNUCompatCharsMultibyte(t *testing.T) { assert.Equal(t, "5 file.txt\n", stdout) } -// TestGNUCompatControlCharIsWord — control byte \x01 counts as a word. +// TestGNUCompatControlCharIsWord — control byte \x01 does not count as a word. // -// GNU command: printf '\x01\n' | gwc -w -// Expected: "1\n" +// GNU wc in POSIX locale treats C0 control characters as transparent: +// they neither start nor end words. Only printable chars form words. +// +// GNU command (Debian/Ubuntu POSIX locale): printf '\x01\n' | wc -w +// Expected: "0\n" func TestGNUCompatControlCharIsWord(t *testing.T) { dir := t.TempDir() writeFile(t, dir, "file.txt", "\x01\n") stdout, _, code := cmdRun(t, "wc -w file.txt", dir) assert.Equal(t, 0, code) - assert.Equal(t, "1 file.txt\n", stdout) + assert.Equal(t, "0 file.txt\n", stdout) } // TestGNUCompatRejectedFlag — unknown flag exits 1. 
diff --git a/interp/builtins/wc/wc_test.go b/interp/builtins/wc/wc_test.go index 4707b0dd..090a1cca 100644 --- a/interp/builtins/wc/wc_test.go +++ b/interp/builtins/wc/wc_test.go @@ -140,7 +140,7 @@ func TestWcWordsControlChar(t *testing.T) { writeFile(t, dir, "file.txt", "\x01\n") stdout, _, code := cmdRun(t, "wc -w file.txt", dir) assert.Equal(t, 0, code) - assert.Equal(t, "1 file.txt\n", stdout) + assert.Equal(t, "0 file.txt\n", stdout) } // --- Bytes --- diff --git a/tests/allowed_symbols_test.go b/tests/allowed_symbols_test.go index 59d56f88..410df468 100644 --- a/tests/allowed_symbols_test.go +++ b/tests/allowed_symbols_test.go @@ -126,8 +126,16 @@ var builtinAllowedSymbols = []string{ "unicode.Cc", // unicode.Cf — format character category range table; pure data, no I/O. "unicode.Cf", + // unicode.Co — private-use character category range table; pure data, no I/O. + "unicode.Co", // unicode.Is — checks if rune belongs to a range table; pure function, no I/O. "unicode.Is", + // unicode.IsControl — reports whether rune is a control character; pure function, no I/O. + "unicode.IsControl", + // unicode.IsGraphic — reports whether rune is defined as a graphic character; pure function, no I/O. + "unicode.IsGraphic", + // unicode.Zs — Unicode space separator category range table; pure data, no I/O. + "unicode.Zs", // unicode.Me — enclosing mark category range table; pure data, no I/O. "unicode.Me", // unicode.Mn — nonspacing mark category range table; pure data, no I/O. @@ -140,8 +148,8 @@ var builtinAllowedSymbols = []string{ "unicode.RangeTable", // unicode/utf8.DecodeRune — decodes first UTF-8 rune from a byte slice; pure function, no I/O. "unicode/utf8.DecodeRune", - // unicode/utf8.RuneCount — counts UTF-8 runes in a byte slice; pure function, no I/O. - "unicode/utf8.RuneCount", + // unicode/utf8.RuneError — replacement character returned for invalid UTF-8; constant, no I/O. 
+ "unicode/utf8.RuneError", // unicode/utf8.UTFMax — maximum number of bytes in a UTF-8 encoding; constant, no I/O. "unicode/utf8.UTFMax", // unicode/utf8.Valid — checks if a byte slice is valid UTF-8; pure function, no I/O.