Conversation
Adds testing.F fuzz tests for head, cat, wc, tail, and grep builtins. Each command gets seed corpus covering empty input, no trailing newline, NUL bytes, buffer boundaries (4097 bytes), very long single lines, and all-newlines input. Fuzz functions use context.WithTimeout to catch hangs, assert exit codes are 0 or 1, and verify output invariants (e.g. head -n K produces at most K lines). Both file-based and stdin-via-redirection variants are included for each command. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
thieman
left a comment
There was a problem hiding this comment.
Review Summary
This PR adds native Go testing.F fuzz tests for five builtin commands: cat, grep, head, tail, and wc. Each command gets a dedicated <cmd>_fuzz_test.go in its own package under interp/builtins/tests/<cmd>/. The tests follow the existing testutil conventions, use context.WithTimeout to catch hangs, and assert invariants on output.
All seed corpus entries pass. No security issues are introduced — the PR is purely test code that exercises existing builtins through the sandbox via the normal interpreter path.
Overall assessment: safe to merge, with two minor quality issues flagged below.
|
I can only run on private repositories. |
thieman
left a comment
There was a problem hiding this comment.
Code Review — PR #63: Native Go fuzz tests for builtin commands
Overall assessment: safe to merge — the PR adds native Go fuzz tests (testing.F) for cat, grep, head, tail, and wc builtins. No security issues, no bash compatibility concerns, and no correctness bugs were found. All findings are P3 (style/hardening suggestions).
Finding Summary
Coverage Summary
| Code path | Scenario test | Fuzz test | Status |
|---|---|---|---|
| cat: raw streaming | existing | FuzzCat (strong: output == input) |
Covered |
| cat: line numbering (-n) | existing | FuzzCatNumberLines (exit code only) |
Partially covered |
| cat: stdin via redirect | existing | FuzzCatStdin (strong: output == input) |
Covered |
| grep: file content + arbitrary pattern | existing | FuzzGrepFileContent (exit code only) |
Partially covered |
| grep: stdin | existing | FuzzGrepStdin (exit code only) |
Partially covered |
| grep: -i/-v flags | existing | FuzzGrepFlags (exit code only) |
Partially covered |
| head -n N: file | existing | FuzzHeadLines (lineCount ≤ n) |
Covered |
| head -c N: file | existing | FuzzHeadBytes (byteCount ≤ n) |
Covered |
| head: stdin | existing | FuzzHeadStdin (exit code only) |
Partially covered |
| tail -n N: file | existing | FuzzTailLines (lineCount ≤ n) |
Covered |
| tail -c N: file | existing | FuzzTailBytes (byteCount ≤ n) |
Covered |
| tail +N: offset mode | existing | — | Missing |
| tail: stdin | existing | FuzzTailStdin (exit code only) |
Partially covered |
| wc: all modes | existing | FuzzWc, FuzzWcLines, FuzzWcBytes, FuzzWcStdin (exit code only) |
Partially covered |
Positive Observations
- All fuzz targets use
context.WithTimeout(5s)to prevent hangs — good practice - Input size cap of
1<<20(1 MiB) prevents excessive fuzz corpus I/O n > 10000clamping in head/tail fuzz prevents slow tests when the engine picks huge N values- Single-quote wrapping in
FuzzGrepFileContentwith filtering of',\x00,\nis a correct and sufficient approach to prevent shell-level injection - The assertion
stdout == string(input)inFuzzCatis a strong semantic invariant — good - Sandbox is properly respected: all tests use
interp.AllowedPaths([]string{dir})with temp directories
|
Iteration 1 self-review result: COMMENT
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 054f2637c1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
- Remove unused cmdRun function from all 5 fuzz test files (cat, grep, head, tail, wc); only cmdRunCtx (with context for timeout) is needed - Remove dead fixedPatterns() function and stale comment from grep fuzz tests (artifact of an earlier design approach) - Add utf8.ValidString guard in FuzzGrepFileContent to skip non-UTF-8 patterns that would be rejected by the shell parser before reaching the grep builtin, ensuring the fuzz corpus exercises grep logic - Add FuzzTailLinesOffset and FuzzTailBytesOffset to cover the +N offset code paths in tail (skip-first-N-lines/bytes mode), which were not previously fuzzed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
thieman
left a comment
There was a problem hiding this comment.
Review Summary
This PR adds native Go fuzz tests for 5 builtin commands (cat, grep, head, tail, wc). The commit under review (7edf529) addresses previous review comments by removing dead cmdRun functions, removing the unused fixedPatterns() stub, adding utf8.ValidString guard in FuzzGrepFileContent, and adding FuzzTailLinesOffset / FuzzTailBytesOffset to cover the +N offset code paths.
Overall assessment: safe to merge. No security findings. The only issues are two minor style/correctness items (P3) that do not affect CI or functionality.
Coverage Summary
| Code path | Fuzz test | Status |
|---|---|---|
cat arbitrary file content |
FuzzCat |
Covered |
cat -n numbered output |
FuzzCatNumberLines |
Covered |
cat stdin redirection |
FuzzCatStdin |
Covered |
grep arbitrary file + pattern |
FuzzGrepFileContent |
Covered |
grep stdin |
FuzzGrepStdin |
Covered |
grep -i/-v flags |
FuzzGrepFlags |
Covered |
head -n N last-N lines |
FuzzHeadLines |
Covered |
head -c N last-N bytes |
FuzzHeadBytes |
Covered |
head -n N stdin |
FuzzHeadStdin |
Covered |
tail -n N last-N lines |
FuzzTailLines |
Covered |
tail -c N last-N bytes |
FuzzTailBytes |
Covered |
tail -n N stdin |
FuzzTailStdin |
Covered |
tail -n +N offset lines |
FuzzTailLinesOffset |
Covered |
tail -c +N offset bytes |
FuzzTailBytesOffset |
Covered |
wc default (lines/words/bytes) |
FuzzWc |
Covered |
wc -l lines |
FuzzWcLines |
Covered |
wc -c bytes |
FuzzWcBytes |
Covered |
wc stdin |
FuzzWcStdin |
Covered |
Coverage: Adequate — all code paths in the new fuzz tests are backed by meaningful seed corpus and proper assertions.
Positive Observations
- All fuzz functions use
cmdRunCtxwith a per-iteration 5-second timeout, correctly protecting against hangs. - The 1<<20 byte input cap prevents out-of-memory conditions during fuzzing.
- The
utf8.ValidStringguard inFuzzGrepFileContentcorrectly prevents the fuzz corpus from exercising the parser error path instead of grep. FuzzTailLinesOffsetandFuzzTailBytesOffsetprovide meaningful coverage of theskipLines/skipBytescode paths which were entirely absent from previous fuzz coverage.- The
FuzzCatoutput equality assertion (stdout == string(input)) is stronger than a simple exit code check.
| } | ||
| }) | ||
| } | ||
|
|
There was a problem hiding this comment.
| "testing" | ||
| "time" | ||
|
|
||
| "unicode/utf8" |
There was a problem hiding this comment.
unicode/utf8 import in its own group — should be grouped with other stdlib imports
unicode/utf8 is a standard library package but is placed in a separate import group between the other stdlib packages and the third-party packages. goimports convention is: stdlib packages together, then a blank line, then third-party. gofmt doesn't flag this (it doesn't reorder imports), but editors with goimports configured will reformat on save.
| "unicode/utf8" | |
| import ( | |
| "bytes" | |
| "context" | |
| "os" | |
| "path/filepath" | |
| "testing" | |
| "time" | |
| "unicode/utf8" | |
| "github.com/DataDog/rshell/interp" | |
| "github.com/DataDog/rshell/interp/builtins/testutil" | |
| ) |
| }) | ||
| } | ||
|
|
||
| // FuzzCatNumberLines fuzzes cat -n with arbitrary file content. |
There was a problem hiding this comment.
Output equality assertion can fire as a false positive on context timeout
When the 5-second context deadline is exceeded, testutil.RunScriptCtx returns (partialStdout, "", 0) — exit code 0 with potentially incomplete output (because ctx.Err() != nil suppresses the t.Fatalf call but the exit code stays 0). The subsequent assertion code == 0 && stdout != string(input) would then record the timeout as a fuzz failure, potentially creating false-positive corpus entries.
This only affects extended fuzz runs (not the CI seed-corpus check), but could cause spurious findings during local fuzzing. Consider skipping the output assertion when the context is done:
| // FuzzCatNumberLines fuzzes cat -n with arbitrary file content. | |
| if code == 0 && ctx.Err() == nil && stdout != string(input) { | |
| t.Errorf("cat output differs from input: got %d bytes, want %d bytes", len(stdout), len(input)) | |
| } |
The same pattern applies to FuzzCatStdin at line 124.
Remove duplicate cmdRunCtx declarations by making helpers_test.go cross-platform (drop !windows build tag) and reusing it from native fuzz test files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Consolidated into #62 which now includes both differential and native fuzz tests. |
…eutils (#62) * Add GitHub Actions workflow for continuous fuzz testing in CI - Add .github/workflows/fuzz.yml: runs each Fuzz* target for 30s per push/PR across head, cat, wc, tail, and grep packages; caches corpus between runs; skips gracefully when no fuzz targets exist yet in a package. - Update .github/workflows/test.yml: add fuzz seed corpus regression step so any checked-in corpus entries that crash are caught on every PR. - Add testdata/fuzz/.gitkeep placeholders so corpus cache paths are consistent. - Document corpus retention policy in .gitignore. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Add memory benchmark assertions for streaming builtins (head, cat, wc, tail) Each benchmark file adds both standard BenchmarkXxx functions (runnable with go test -bench) and TestXxxMemoryBounded functions that call testing.Benchmark internally and assert AllocedBytesPerOp stays below a documented ceiling: - head: < 512 KB for 10 MB input (truly O(1): ~17 KB observed) - wc: < 512 KB for 10 MB input (truly O(1): ~44 KB observed) - cat: < 6 MB for 1 MB input (O(n) output buffer: ~3 MB observed) - tail: < 24 MB for 10 MB input (O(n) allocs, O(1) live ring: ~11 MB observed) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Add differential fuzz tests comparing rshell builtins against GNU coreutils Implements FuzzHeadDifferentialLines, FuzzHeadDifferentialBytes, FuzzCatDifferential, FuzzWcDifferentialLines, FuzzWcDifferentialWords, FuzzWcDifferentialBytes, and FuzzTailDifferential. Each test is gated behind RSHELL_BASH_TEST=1, uses exec.LookPath to skip if the GNU tool is not available, and runs GNU coreutils in the same working directory as rshell so filenames in output match exactly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Address review comments: fix build failures and CI issues - Add helpers_test.go to tests/cat, tests/head, tests/tail, tests/wc defining cmdRunCtx and runScriptCtx, fixing the undefined-symbol build failures that blocked all four differential fuzz packages from compiling - Remove silent skip on "too large"/"exceeds" in tail fuzz test; the existing n<=10000 and len(input)<=64KB guards already keep inputs within rshell limits, so these skips mask real differential failures - Remove grep from fuzz.yml matrix (no Go test files exist yet) - Remove "|| true" from fuzz step so fuzzer-found crashes surface as CI failures - Increase fuzz job timeout from 120s to 300s to accommodate wc's 3 targets Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Add native Go fuzz tests for builtin commands Adds testing.F fuzz tests for head, cat, wc, tail, and grep builtins. Each command gets seed corpus covering empty input, no trailing newline, NUL bytes, buffer boundaries (4097 bytes), very long single lines, and all-newlines input. Fuzz functions use context.WithTimeout to catch hangs, assert exit codes are 0 or 1, and verify output invariants (e.g. head -n K produces at most K lines). Both file-based and stdin-via-redirection variants are included for each command. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Address review comments on fuzz tests PR - Remove unused cmdRun function from all 5 fuzz test files (cat, grep, head, tail, wc); only cmdRunCtx (with context for timeout) is needed - Remove dead fixedPatterns() function and stale comment from grep fuzz tests (artifact of an earlier design approach) - Add utf8.ValidString guard in FuzzGrepFileContent to skip non-UTF-8 patterns that would be rejected by the shell parser before reaching the grep builtin, ensuring the fuzz corpus exercises grep logic - Add FuzzTailLinesOffset and FuzzTailBytesOffset to cover the +N offset code paths in tail (skip-first-N-lines/bytes mode), which were not previously fuzzed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Consolidate native fuzz tests from PR #63 into differential fuzz PR Remove duplicate cmdRunCtx declarations by making helpers_test.go cross-platform (drop !windows build tag) and reusing it from native fuzz test files. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add native fuzz tests for all remaining builtins Add fuzz tests for echo, cut, uniq, strings, testcmd, and ls. Update fuzz.yml workflow matrix to include all new packages. Tests cover: - echo: basic args and -e escape sequence parsing - cut: field/byte selection with specs, custom delimiters, stdin - uniq: basic, count, flag combinations, field/char skipping, stdin - strings: file content, min length, radix offset, stdin - test: string ops, integer ops, file ops, unary string tests - ls: flag combinations with random filenames, recursive traversal Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Expand fuzz seed corpuses across all builtins with implementation edge cases and CVE-class inputs Added targeted seed corpus entries to all 11 fuzz test packages based on: - Implementation constants and boundary conditions (MaxLineBytes=1MiB, MaxCount=1<<31-1, maxRecursionDepth=256, maxStringLen=1MiB, maxParenDepth=128, countFieldWidth=7) - Security-relevant edge cases: CRLF, null bytes, invalid UTF-8, high bytes, integer overflow, ReDoS-class regex patterns, printable byte boundaries - Existing unit test scenarios (CRLF in cut, complement, suppress, output-delimiter) - New fuzz functions: FuzzCatDisplayFlags, FuzzWcChars, FuzzGrepPatterns, FuzzEchoFlagInteraction, FuzzStringsRadix, FuzzTestNesting, FuzzLsHumanReadable, FuzzLsMultipleFiles, FuzzCutComplement Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Add CVE-derived fuzz seeds: terminal injection, fixed strings, format magic bytes Incorporates findings from GNU coreutils/binutils CVE research: - cat: ANSI/terminal escape injection sequences (OSC 2/50, ANSI color, DCS, cursor movement, ELF magic bytes) - grep: new FuzzGrepFixedStrings function targeting CVE-2015-1345 (-F mode Boyer-Moore-Horspool code path) and CVE-2012-5667 (integer overflow at line length boundary); regex metacharacters tested as fixed-string literals - uniq: CVE-2013-0222 pattern (long line with embedded null bytes near MaxLineBytes; distro alloca() stack overflow was triggered at 50MB — our fixed buffers prevent this class entirely); CRLF vs LF duplicate comparison edge case - strings: ELF/PE/ZIP/PDF binary format magic bytes (CVE-2014-8485 used crafted ELF headers to exploit libbfd; our raw-byte scanner is unaffected but confirms graceful handling) - testcmd: int64 max/min boundary seeds for integer comparison overflow behavior Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Add Step 9: Write fuzz tests to implement-posix-command skill Documents the three sources for fuzz seed corpus construction: a) Implementation edge cases — named constants, boundary checks, clamps, buffer sizes b) CVE and security history — integer overflow, long lines, null bytes, CRLF, invalid UTF-8, binary magic bytes, terminal escape injection, ReDoS patterns c) Existing test coverage — every input from unit tests and YAML scenarios Adds cmdRunCtxFuzz helper pattern (avoids redeclaration with existing test helpers), per-mode function guidance, seed verification commands, and CI fuzz.yml integration. Renumbers documentation update to Step 10. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Remove bench/memory tests — out of scope for this fuzz PR Memory benchmark tests belong in a separate PR per user request. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix fuzz input filters: add UTF-8 and C1 control char guards Fuzzer found two real bugs in seed filtering: - FuzzGrepFixedStrings: missing utf8.ValidString(pattern) check; non-UTF-8 byte (e.g. 0x85) embedded in shell script caused parse error - FuzzTestStringOps/FuzzTestStringUnary: C1 control chars (U+0080–U+009F) passed the utf8.Valid check but caused "EOF without closing quote" in the shell script parser Add corpus entries for fuzzer-discovered inputs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Add fuzz failure handling to fix-ci-tests and fix-local-tests skills Document that failing fuzzer-discovered inputs must be committed as corpus files (testdata/fuzz/<FuzzFuncName>/<hash>) to serve as permanent regression tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix fuzz seed corpus CI step: remove invalid -fuzztime=0s flag Go's -fuzztime flag does not accept 0s as a valid duration, causing the "Run fuzz seed corpus (regression test)" step to fail on all platforms. Remove the flag entirely — running go test -run '^Fuzz' without -fuzz is the correct way to execute seed corpus entries. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix fuzz CI: remove invalid -fuzztime=0s and add differential fuzz to bash job - fuzz.yml seed corpus step: remove -fuzztime=0s (invalid duration) - test.yml bash job: add 30s differential fuzz runs for cat/head/tail/wc against GNU tools under RSHELL_BASH_TEST=1 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix wc -w: C0 control chars are transparent in POSIX locale GNU wc in POSIX locale (Debian/Ubuntu) treats non-whitespace control characters (C0 0x00-0x1f, DEL 0x7f, C1 0x80-0x9f) as transparent: they neither start nor end words. Only printable chars form words. Previously rshell wc counted any non-whitespace byte as a word character, so a file containing just \x1a produced 1 word instead of 0. - wc.go: add unicode.IsControl branch in countReader word loop - wc_test.go, wc_gnu_compat_test.go: update expectations to 0 words - allowed_symbols_test.go: add unicode.IsControl to allowlist - add fuzz corpus entry ee500f173c25a234 (\x1a) as regression test Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Restrict differential fuzz tests to Linux with LC_ALL=C.UTF-8 - Change build tag from !windows to linux: differential tests only make sense on Linux where GNU coreutils behaviour is well-defined and C.UTF-8 locale is available - Set LC_ALL=C.UTF-8 on every GNU tool invocation in runGNUInDir: pins character handling to UTF-8 while keeping POSIX whitespace rules, making results reproducible across all Linux environments - Remove gnuCmd / darwin "g" prefix logic: no longer needed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix wc: invalid UTF-8 bytes don't count as chars or words in C.UTF-8 In C.UTF-8 locale, bytes that are not valid UTF-8 (e.g. \xff) are not characters — they contribute 0 to -m (char count) and are transparent to -w (word count), the same behaviour as C0 control chars. Previously, utf8.DecodeRune returned RuneError for invalid bytes and they fell through to the printable-char branch: counted as 1 word and 1 char. Also, incomplete UTF-8 sequences left in carry at EOF were being counted as chars. - Move char counting into the per-rune loop; skip RuneError/size==1 - Remove carry-at-EOF char count (incomplete sequence = not a char) - Remove utf8.RuneCount from allowlist (no longer called); add RuneError Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix wc -w for Cn/unassigned and Zs/space-separator codepoints; fuzz all functions on failure - Treat Unicode space separators (Zs: NBSP, thin space, etc.) as word delimiters, matching GNU wc under LC_ALL=C.UTF-8 - Treat unassigned codepoints (Cn) as transparent like C0/C1 controls, matching GNU wc under LC_ALL=C.UTF-8 (U+89249 was counting as 1 word) - Add unicode.Co, unicode.IsGraphic, unicode.Zs to the import allowlist - Add regression corpus entry FuzzWcDifferentialWords/1c6e2e9cd7371f3e - CI: continue fuzzing all functions even if one fails (OVERALL_STATUS pattern), so a single CI run surfaces all outstanding bugs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
testing.Ffuzz tests forhead,cat,wc,tail, andgrepbuiltinsinterp/builtins/tests/<cmd>/<cmd>_fuzz_test.goWhat's included
For each builtin:
FuzzHeadLines,FuzzHeadBytes,FuzzHeadStdin— verifies exit codes 0/1 and that-n Kproduces at most K newlinesFuzzCat,FuzzCatNumberLines,FuzzCatStdin— verifies output exactly matches input for the plaincatcaseFuzzWc,FuzzWcLines,FuzzWcBytes,FuzzWcStdin— verifies exit codes are saneFuzzTailLines,FuzzTailBytes,FuzzTailStdin— verifies exit codes and output line/byte count invariantsFuzzGrepFileContent,FuzzGrepStdin,FuzzGrepFlags— verifies exit codes 0, 1, or 2Seed corpus
Each fuzz function covers edge cases:
[]byte{})[]byte("a\x00b\n"))[]byte("\n\n\n"))Safety guards
return)n < 0 → return,n > 10000 → n = 10000context.WithTimeout(5s)on every test to detect hangsTest plan
go build ./...passesgo test ./interp/builtins/tests/... -run Fuzz -timeout 60s— all seed corpus entries passgo test ./interp/builtins/tests/head/... -fuzz FuzzHeadLines -fuzztime 30sAgent context (for resuming work)
Original prompt: Research formal testing approaches for rshell builtin implementations. Implement Option 1: native Go fuzz tests using
testing.F.Research findings that shaped this PR:
testing.Fwithf.Add(seedValue)for seed corpus andf.Fuzz(func(t *testing.T, ...))for the fuzz body. Coverage-guided, persists findings intestdata/fuzz/<FuncName>/.go test -run '^FuzzXxx'(no-fuzz=) executes only seed corpus entries as regular tests — zero overhead in normal CI.go test -fuzz=FuzzXxx -fuzztime=30s ./pkg/— runs in-process, no external tools needed.head -n 5must produce ≤5 newlines;catmust produce output matching input byte-for-byte; grep exit code must be 0, 1, or 2.context.WithTimeout(5s)catches infinite loops / hangs that would otherwise block the fuzzer indefinitely.nclamped to 10000 prevents the fuzzer from generatinghead -n 99999999which would be valid but slow.Key design decisions:
interp/builtins/tests/<cmd>/alongside existing unit tests, sharing thecmdRun/cmdRunCtxhelper patterngrepfuzz tests useFuzzGrepFlagsto explore flag combinations (e.g.-i,-v,-c) — important since grep has the most flag surface areaMerge note: This PR and PR #62 (differential fuzz) both write to
interp/builtins/tests/<cmd>/. Merge one first to avoid conflicts on helper files.🤖 Generated with Claude Code