Conversation
- Add .github/workflows/fuzz.yml: runs each Fuzz* target for 30s per push/PR across head, cat, wc, tail, and grep packages; caches corpus between runs; skips gracefully when no fuzz targets exist yet in a package. - Update .github/workflows/test.yml: add fuzz seed corpus regression step so any checked-in corpus entries that crash are caught on every PR. - Add testdata/fuzz/.gitkeep placeholders so corpus cache paths are consistent. - Document corpus retention policy in .gitignore. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
thieman
left a comment
There was a problem hiding this comment.
Review Summary
This PR adds a GitHub Actions fuzz testing workflow (fuzz.yml) and a fuzz seed corpus regression step to the main CI (test.yml). The change is infrastructure-only — no shell interpreter code or builtin logic is modified, so there are no sandbox integrity, bash compatibility, or correctness dimensions to evaluate.
Overall assessment: safe to merge with minor improvements noted.
What's good
actions/checkoutandactions/setup-goare correctly pinned to exact SHAs, consistent with the other workflows in this repo.permissions: contents: readis explicitly declared and appropriately minimal.fail-fast: falsein the matrix strategy is correct: a crash in one package should not suppress fuzzing of others.- The corpus save step uses
if: always()so corpus entries are preserved even when upstream steps fail — this is important for crash reproduction. - The seed corpus replay design (run fuzz targets as unit tests on every push) means any checked-in crash-reproducing input will be caught deterministically and cannot be overlooked.
- Graceful skip when no
Fuzz*functions exist keeps the workflow non-breaking as new fuzz targets are added incrementally. - The
cutpackage is omitted from the matrix because it has noFuzz*functions — correct.
Findings
thieman
left a comment
There was a problem hiding this comment.
Review Summary
This PR adds a GitHub Actions fuzz testing workflow and seeds corpus infrastructure for the rshell builtin packages. The overall approach is sound — continuous fuzzing is a valuable addition to a safety-critical shell interpreter — but there are two issues that need fixing before merge.
Overall assessment: needs fixes
Findings
Positive Observations
- Corpus is correctly persisted via
actions/cachewith SHA-keyed entries and prefix-based restore. - Differential fuzz tests are correctly excluded from the non-Docker fuzz runs by filtering on
Differential. - Graceful skip when no fuzz targets exist prevents the workflow from ever being a blocking false-positive.
- The seed corpus regression step added to
test.ymlensures corpus crash entries are caught on every PR.
|
I can only run on private repositories. |
|
Iteration 1 self-review result: COMMENT 4 findings:
Proceeding to address-pr-comments to fix these issues. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 49e15305bf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - name: Run tests with race detector | ||
| run: go test -race -v ./... | ||
| - name: Run fuzz seed corpus (regression test) | ||
| run: go test -run '^Fuzz' -fuzztime=0s ./interp/builtins/... -timeout 120s |
There was a problem hiding this comment.
Use a valid -fuzztime value in CI regression test
The new CI step passes -fuzztime=0s, which is rejected by the Go test runner (invalid value "0s" for flag -test.fuzztime) and causes the main test job to fail before completing package tests; I reproduced this locally with the repo’s Go 1.25 toolchain. go help testflag documents -fuzztime as a duration or Nx, so this should be changed (or removed for seed-only runs), and the same invalid value is also present in .github/workflows/fuzz.yml.
Useful? React with 👍 / 👎.
| fi | ||
| for FUNC in $FUZZ_FUNCS; do | ||
| echo "Fuzzing $FUNC..." | ||
| go test -fuzz="^${FUNC}$" -fuzztime=30s ${{ matrix.pkg }} -timeout 120s || true |
There was a problem hiding this comment.
Remove
|| true so fuzz crashes fail the workflow
Each fuzz target invocation is wrapped with || true, so even if go test -fuzz finds a panic/crash and exits non-zero, the step still succeeds and the workflow remains green. That masks real fuzz regressions on push/PR runs and defeats the workflow’s purpose as a signal for newly discovered failures.
Useful? React with 👍 / 👎.
- Fix || true silencing fuzzer crashes: accumulate failures with FAILED=0/FAILED=1 and exit with the accumulated code so any crash fails the step while still running all fuzz targets in the loop (P1) - Pin actions/cache and actions/cache/save to SHA 0057852bfaa89a56745cba8c7296529d2fc39830 (v4.3.0) matching the supply-chain pinning discipline for checkout and setup-go (P2) - Restrict push trigger from branches: ['**'] to branches: [main] to match the scope convention in test.yml and avoid burning CI minutes on WIP branches (P3) - Use -timeout 60s per fuzz invocation instead of shared -timeout 120s so the timeout does not accumulate across multiple fuzz targets in one package (P3) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
-fuzztime=0s is rejected by the Go test binary as an invalid duration when used without -fuzz=. Drop the flag; -run '^Fuzz' without -fuzz= is all that's needed to run seed corpus entries as regular tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Superseded by #62 which includes all CI workflow changes plus fuzz tests. |
Summary
.github/workflows/fuzz.yml: a new workflow that runs allFuzz*targets (excluding differential tests) for 30s each on every push and PR, across thehead,cat,wc,tail, andgrepbuiltin packages..github/workflows/test.yml: adds a fuzz seed corpus regression step to the main CI matrix so any checked-in corpus entries that cause crashes are caught on every PR, not just during fuzz runs.testdata/fuzz/.gitkeepplaceholders in each builtin test package to ensure corpus cache paths are stable..gitignore.The workflow is resilient: if no
Fuzz*functions exist yet in a package, steps skip gracefully with an informational message rather than failing. This means the workflow is immediately usable as new fuzz tests are added by other PRs.Corpus is persisted via
actions/cachekeyed bygithub.shawith a fallback prefix key, so each run benefits from previous fuzzing sessions.Test plan
Agent context (for resuming work)
Original prompt: Research formal testing approaches for rshell builtin implementations (memory benchmarking, property/fuzz testing). Implement Option 6: a GitHub Actions CI workflow for continuous fuzzing.
Research findings that shaped this PR:
testing.F/f.Fuzz()is coverage-guided and persists corpus intestdata/fuzz/<FuncName>/. Running with-run '^Fuzz'(no-fuzz=) executes only seed corpus entries as regular tests — ideal for CI regression checking.actions/cachewith SHA key + prefix fallback allows corpus to grow across runs without stale-key misses.|| truewas deliberately avoided: a failure accumulator (FAILED=0/FAILED=1) +exit $FAILEDcorrectly surfaces fuzzer crashes while still running all targets.-fuzztime=0sflag is invalid without-fuzz=— seed corpus regression step uses-run '^Fuzz'only.Key design decisions:
head,cat,wc,tail(grep removed — no fuzz tests yet, would be a no-op)fail-fast: falseso all targets run even if one crashespushtomain+pull_requestonly (not on schedules/forks to control cost)🤖 Generated with Claude Code