ClawSweeper

ClawSweeper is the conservative maintenance bot for OpenClaw repositories. It currently covers openclaw/openclaw, openclaw/clawhub, and self-review for openclaw/clawsweeper.

It has two independent lanes:

  • issue/PR sweeper: keeps one markdown report per open issue or PR, publishes one durable Codex automated review comment when useful, and only closes items when the evidence is strong
  • commit sweeper: reviews code-bearing commits that land on main, writes one canonical markdown report per commit, and optionally publishes a GitHub Check Run for that commit

Capabilities

  • Repository profiles: per-repository rules live in src/repository-profiles.ts, so OpenClaw, ClawHub, and ClawSweeper can share the same engine while keeping different apply limits.
  • Issue and PR intake: scheduled runs scan open issues and pull requests, while target repositories can forward exact issue/PR events with repository_dispatch for low-latency one-item reviews.
  • Codex review reports: each issue or PR becomes records/<repo-slug>/items/<number>.md with the decision, evidence, proposed maintainer-facing comment, runtime metadata, and GitHub snapshot hash.
  • Durable review comments: ClawSweeper syncs one marker-backed public review comment per item and edits it in place instead of posting repeated comments. When a review starts and no ClawSweeper comment exists yet, it posts a short crustacean-friendly status placeholder first, then replaces that same comment with the completed review. Completed comments include a dedicated security review section for supply-chain, permission, secret-handling, and code execution concerns. Pull request comments include hidden verdict markers, and actionable PR follow-up includes a hidden clawsweeper-action:fix-required marker for the trusted ClawSweeper repair loop. See docs/pr-review-comments.md.
  • Guarded apply: apply mode re-fetches live GitHub state, checks labels, maintainer authorship, paired issue/PR state, snapshot drift, and repository profile rules before commenting or closing anything.
  • Archive and reopen handling: reports for items that apply closed, or that were already closed, move to records/<repo-slug>/closed/<number>.md; reopened archived items move back to items/ as stale work.
  • Generated state: openclaw/clawsweeper-state stores durable records/, jobs/, results/, and rendered dashboard output so this repo stays focused on source, workflows, docs, and tests.
  • Workflow status state: pnpm run status updates tracked per-repository status JSON under results/sweep-status/ in the state repo so long-running workflows can publish progress without changing report data.
  • Audit: pnpm run audit compares live GitHub state with report storage and can publish audit state under results/audit/ in the state repo without mutating issues or PRs.
  • Reconcile: pnpm run reconcile repairs report placement drift such as reopened archived records or closed items still sitting in items/.
  • Work candidates: valid, narrow items can be marked as queue_fix_pr candidates for manual ClawSweeper repair promotion.
  • Commit review: push events on target main branches can dispatch to .github/workflows/commit-review.yml, which expands the commit range, skips non-code-only commits cheaply, starts one Codex worker per code-bearing commit, and writes records/<repo-slug>/commits/<sha>.md.
  • Manual reruns and backfills: both lanes support manual workflow dispatch. Commit review supports exact SHAs, historic ranges with before_sha, and an additional_prompt input for one-off review instructions.
  • Commit report queries: pnpm commit-reports -- --since 24h, combined with the --findings, --non-clean, --repo, and --author filters, makes the flat per-SHA commit storage easy to review by time window without date folders.
  • Optional commit checks: commit reports are the source of truth; target commit Check Runs are disabled by default and can be enabled per run or repo.
  • ClawSweeper repair dispatch: commit reports with result: findings can dispatch to the repair intake, where an audit record is written and a PR is created only when the finding is narrow, non-security, and still relevant on latest main.
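
The report locations named in the list above follow a small, regular layout. As a minimal sketch, assuming the repo slug is the owner/name pair with the slash replaced by a hyphen (consistent with records/openclaw-clawhub/ mentioned later in this README), the paths could be derived like this. The function names are hypothetical and are not taken from the ClawSweeper source.

```typescript
// Hypothetical helpers for illustration; not ClawSweeper's actual API.
type ItemFolder = "items" | "closed";

// Assumption: "openclaw/clawhub" maps to the slug "openclaw-clawhub".
function repoSlug(targetRepo: string): string {
  return targetRepo.replace("/", "-");
}

// Open reports live under items/, archived reports under closed/.
function itemReportPath(targetRepo: string, itemNumber: number, folder: ItemFolder): string {
  return `records/${repoSlug(targetRepo)}/${folder}/${itemNumber}.md`;
}

// Commit reports are flat: exactly one file per full 40-character SHA.
function commitReportPath(targetRepo: string, sha: string): string {
  if (!/^[0-9a-f]{40}$/.test(sha)) {
    throw new Error(`expected a full 40-character SHA, got: ${sha}`);
  }
  return `records/${repoSlug(targetRepo)}/commits/${sha}.md`;
}
```

Because the commit path depends only on the repository and the SHA, a rerun can overwrite the same file without looking anything up.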

Guardrails

ClawSweeper may propose a close only when the item is clearly one of these:

  • implemented on current main
  • not reproducible on current main
  • better suited for ClawHub skill/plugin work than core
  • duplicate or superseded by a canonical issue/PR
  • concrete but not actionable in this source repo
  • incoherent enough that no action can be taken
  • stale issue older than 60 days with too little data to verify

Maintainer-authored items are never auto-closed. Everything else stays open. Issues with an open PR that references them using GitHub closing syntax such as Fixes #123 stay open until that PR merges or is closed. Open issue/PR pairs from the same author stay open together unless the paired item is already resolved or a maintainer explicitly asks to close one side.

Repository profiles can further narrow apply. ClawHub and ClawSweeper self-review are intentionally stricter: they review issues and PRs, but apply may close only PRs where current main already implements the proposed change with source-backed evidence.
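
The guardrails above amount to a short chain of vetoes. The following is a sketch under stated assumptions, not the actual implementation: the CloseCandidate shape, its field names, and the strictProfile flag are hypothetical, and only the implemented_on_main reason identifier appears elsewhere in this README.

```typescript
// Hypothetical shape for illustration; field names are assumptions.
interface CloseCandidate {
  kind: "issue" | "pull_request";
  closeReason: string | null;        // null when none of the allowed close reasons applies
  maintainerAuthored: boolean;
  hasOpenClosingPr: boolean;         // an open PR references this issue with e.g. "Fixes #123"
  hasOpenSameAuthorPair: boolean;    // a paired issue/PR from the same author is still open
  maintainerRequestedClose: boolean; // a maintainer explicitly asked to close this side
}

function mayProposeClose(c: CloseCandidate, strictProfile: boolean): boolean {
  if (c.closeReason === null) return false; // everything else stays open
  if (c.maintainerAuthored) return false;   // never auto-closed
  if (c.kind === "issue" && c.hasOpenClosingPr) return false;
  if (c.hasOpenSameAuthorPair && !c.maintainerRequestedClose) return false;
  if (strictProfile) {
    // Stricter profiles: only PRs that current main already implements.
    return c.kind === "pull_request" && c.closeReason === "implemented_on_main";
  }
  return true;
}
```

Each veto returns early, so a candidate is proposed for close only when every guardrail passes.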

Maintainer Commands

Maintainers can steer ClawSweeper from target-repo issue and PR comments. The preferred form is @clawsweeper .... The router also accepts @clawsweeper[bot] ..., @openclaw-clawsweeper ..., @openclaw-clawsweeper[bot] ..., and legacy slash aliases such as /clawsweeper ..., /review, /automerge, and /autoclose <reason>.

Common commands:

@clawsweeper status
@clawsweeper re-review
@clawsweeper review
@clawsweeper fix ci
@clawsweeper address review
@clawsweeper rebase
@clawsweeper autofix
@clawsweeper automerge
@clawsweeper approve
@clawsweeper explain
@clawsweeper stop
@clawsweeper why did automerge stop here?

  • status and explain post a short target summary.
  • review and re-review dispatch a fresh ClawSweeper issue/PR review without starting repair.
  • Command status replies are marker-backed and edited in place per issue/PR, intent, and head SHA, so repeated review nudges do not leave a trail of duplicate lobster notes.
  • Freeform @clawsweeper ... mentions dispatch a read-only assist review that answers the maintainer request in the next ClawSweeper comment. Action-looking prose still maps through existing safe markers and deterministic gates.
  • fix ci, address review, and rebase dispatch the repair worker only for ClawSweeper PRs or PRs already opted into clawsweeper:autofix or clawsweeper:automerge.
  • autofix labels an open PR, creates or reuses the adopted job, dispatches review, and enters the bounded review/fix loop without merging.
  • automerge labels an open PR, creates or reuses the adopted job, dispatches review, and enters the bounded review/fix/merge loop. Draft PRs are fix-only until GitHub marks them ready for review.
  • User-facing OpenClaw fix, feat, and perf automerge PRs must include a CHANGELOG.md entry before ClawSweeper will merge them.
  • Security-sensitive findings can be repaired only after explicit autofix/automerge opt-in; ClawSweeper still will not merge until a later exact-head review is clean.
  • approve lets a maintainer clear a ClawSweeper human-review pause and merge only after the normal exact-head, checks, mergeability, and gate checks pass.
  • stop adds clawsweeper:human-review; /autoclose <reason> closes the item and bounded linked same-repo targets with an explicit maintainer reason.

Only commands from maintainers are accepted. The router checks repository collaborator permission (admin, maintain, or write) and falls back to trusted author_association values when the permission lookup is unavailable. Commands from other contributors are ignored without a reply. Scheduled comment routing is a dry run unless CLAWSWEEPER_COMMENT_ROUTER_EXECUTE=1; workflow dispatch with execute=true can be used for one-off live routing.
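
The authorization rule can be sketched as a permission check with a fallback. This is an illustration only: the function names are hypothetical, and the set of trusted author_association values is an assumption, since this README does not list them.

```typescript
// GitHub collaborator permission levels accepted by the router, per the text above.
const ACCEPTED_PERMISSIONS = new Set<string>(["admin", "maintain", "write"]);

// Assumption: the README does not say which author_association values are trusted.
const TRUSTED_ASSOCIATIONS = new Set<string>(["OWNER", "MEMBER", "COLLABORATOR"]);

// permission is undefined when the collaborator permission lookup was unavailable.
function isAcceptedMaintainer(permission: string | undefined, authorAssociation: string): boolean {
  if (permission !== undefined) {
    return ACCEPTED_PERMISSIONS.has(permission);
  }
  return TRUSTED_ASSOCIATIONS.has(authorAssociation);
}

// Scheduled routing only mutates when the env flag is exactly "1";
// a manual workflow dispatch can opt in with execute=true.
function isLiveRouting(env: Record<string, string | undefined>, dispatchExecute: boolean): boolean {
  return env["CLAWSWEEPER_COMMENT_ROUTER_EXECUTE"] === "1" || dispatchExecute;
}
```

Note that the fallback is consulted only when the lookup fails; a successful lookup that returns read or triage is a rejection even if the association looks trusted.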

Dashboard

Live dashboard and generated state: https://github.com/openclaw/clawsweeper-state

How It Works

ClawSweeper is split into two operational systems:

  • issue/PR sweeper: scheduler, review lane, apply lane, audit, reconcile, and durable state publishing
  • commit sweeper: main-branch commit dispatch, cheap code/non-code classification, one Codex review worker per code-bearing commit, report publishing, and optional target commit checks

Scheduler

The issue/PR scheduler decides what to scan and how often. New and active items get more attention; older quiet items fall back to a slower cadence.

  • hot/new and recently active items are checked hourly, with a 5-minute intake schedule for the newest queue edge
  • target repositories can forward issue and PR events with repository_dispatch; those exact item runs use a dedicated single job to review one item, sync the durable comment, and apply only safe close proposals for that same item
  • pull requests and issues younger than 30 days are checked daily once they leave the hot window
  • older inactive issues are checked weekly
  • apply wakes every 15 minutes and exits quickly when there are no unchanged high-confidence close proposals
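
The cadence tiers above can be expressed as a single classification function. This is a rough sketch with two labeled assumptions: the length of the hot window is not stated in this README, so it is passed in as a parameter, and the daily tier is read as "all pull requests plus issues younger than 30 days". The OpenItem shape is hypothetical.

```typescript
type Cadence = "hourly" | "daily" | "weekly";

// Hypothetical item shape for illustration.
interface OpenItem {
  kind: "issue" | "pull_request";
  createdAt: Date;
  lastActivityAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// hotWindowDays is an assumption supplied by the caller; the README gives no value.
function reviewCadence(item: OpenItem, now: Date, hotWindowDays: number): Cadence {
  const ageDays = (now.getTime() - item.createdAt.getTime()) / DAY_MS;
  const quietDays = (now.getTime() - item.lastActivityAt.getTime()) / DAY_MS;
  // Hot: new or recently active items.
  if (ageDays <= hotWindowDays || quietDays <= hotWindowDays) return "hourly";
  // Daily: pull requests, and issues younger than 30 days.
  if (item.kind === "pull_request" || ageDays < 30) return "daily";
  // Everything else: older inactive issues.
  return "weekly";
}
```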

Review Lane

Review is proposal-only. It never closes items.

  • A planner scans open issues and PRs, then assigns exact item numbers to shards.
  • Manual runs can pass item_number or comma-separated item_numbers to review exact Audit Health findings without scanning for a normal batch.
  • Each shard checks out the selected target repository at main.
  • Codex reviews with gpt-5.5, high reasoning effort, the fast service tier, and a 10-minute per-item timeout.
  • Each item becomes a flat report under records/<repo-slug>/items/<number>.md with the decision, evidence, Codex /review-style PR findings, suggested comment, runtime metadata, and GitHub snapshot hash.
  • High-confidence allowed close decisions become proposed_close.
  • After publish, the lane checks the selected items' single marker-backed Codex review comment. Missing comments and missing metadata are synced immediately; existing comments are refreshed only when stale, which currently means older than one week.
  • PR review comments keep the top-level note concise, put source links and full evidence in collapsed details, and use hidden verdict/action markers for the trusted ClawSweeper repair loop; see docs/pr-review-comments.md.

Apply Lane

Apply reads existing reports and mutates GitHub only when the stored review is still valid.

  • Updates the single marker-backed Codex automated review comment in place.
  • Closes only unchanged high-confidence proposals.
  • Reuses the review comment when closing; no duplicate close comment.
  • Moves closed or already-closed reports to records/<repo-slug>/closed/<number>.md.
  • Moves reopened archived reports back to the repo’s items/ folder as stale.
  • Commits checkpoints and machine-readable status during long runs.

Apply wakes every 15 minutes, no-ops when there are no unchanged high-confidence close proposals, and narrows scheduled runs to the currently eligible proposal list so idle runs do not scan unrelated keep-open records. It defaults to all item kinds, no age floor, a 2-second close delay, and 50 fresh closes per checkpoint. If it reaches the requested limit, it queues another apply run with the same settings.

Exact event runs skip the bulk planner, shard matrix, artifact upload, and separate publish job. They still use the same review and apply code paths, but only for the selected item number and only with immediate-safe reasons enabled by default: implemented_on_main and duplicate_or_superseded. stale_insufficient_info is never applied to young items; apply requires those issue reports to be at least 30 days old unless a manual run explicitly changes the threshold.

The external state dashboard is fleet-scoped. Each configured repository gets its own record folder, status JSON, audit state, cadence counts, and recent activity section. The state repo aggregates those repository snapshots so event runs from one repo do not hide the state of another.

There is still one deterministic apply path for writes. Review can propose and sync stale public review comments, but closing remains guarded by apply so a fresh GitHub snapshot, labels, maintainer-authorship, and unchanged item state are checked immediately before mutation.
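
The pre-mutation guard described above can be pictured as a function that collects blockers and closes only when the list is empty. This is a sketch under assumptions: the StoredReport and LiveState shapes and their field names are hypothetical, and the snapshot hash is assumed to already exclude the bot's own review comment.

```typescript
// Hypothetical shapes for illustration; field names are assumptions.
interface StoredReport {
  decision: "proposed_close" | "keep_open";
  confidence: "high" | "medium" | "low";
  snapshotHash: string; // hash of the GitHub snapshot taken at review time
}

interface LiveState {
  open: boolean;
  labels: string[];
  maintainerAuthored: boolean;
  snapshotHash: string; // hash of the freshly re-fetched snapshot
}

// Returns the reasons a close must not happen; an empty list means apply may proceed.
function applyBlockers(report: StoredReport, live: LiveState, protectedLabels: Set<string>): string[] {
  const blockers: string[] = [];
  if (report.decision !== "proposed_close") blockers.push("no close proposal");
  if (report.confidence !== "high") blockers.push("confidence below high");
  if (!live.open) blockers.push("item already closed");
  if (live.maintainerAuthored) blockers.push("maintainer-authored");
  if (live.labels.some((label) => protectedLabels.has(label))) blockers.push("protected label");
  if (live.snapshotHash !== report.snapshotHash) blockers.push("snapshot drift");
  return blockers;
}
```

Collecting every blocker, rather than stopping at the first, makes the reason for a skipped close easy to record in status output.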

Commit Review Lane

Commit review is intentionally separate from issue/PR cleanup. It never closes items, never writes comments, and never fixes code.

  • Target repositories forward push events from main with repository_dispatch.
  • Manual runs can pass commit_sha, optional before_sha, optional additional_prompt, enabled, and create_checks.
  • The receiver verifies the selected commits are reachable from origin/main.
  • Before selecting and reviewing commits, the receiver waits 15 minutes by default (CLAWSWEEPER_COMMIT_REVIEW_SETTLE_SECONDS=900) so a push range has time to settle across GitHub and the runner.
  • The plan job expands ranges, pages large backfills at GitHub's matrix limit, and classifies each commit before Codex starts.
  • Pure documentation, changelog, README/license, and asset-only commits get a skipped report without spending Codex time.
  • Mixed commits and code-bearing commits start one Codex worker per commit. The worker checks out current target main and reviews the selected commit by SHA/range instead of detaching the whole repository at that commit.
  • Codex is prompted to read beyond the diff: changed files, callers/callees, runtime entry points, adjacent tests/docs, dependency manifests, release notes, advisories, web sources, and focused live tests when useful.
  • Each commit writes exactly one report at records/<repo-slug>/commits/<40-char-sha>.md.
  • Reruns overwrite the same report, including reruns with an additional_prompt.
  • Report results are nothing_found, findings, inconclusive, failed, or skipped_non_code.
  • Optional GitHub Checks use the ClawSweeper Commit Review name on the target commit. Clean or skipped reports are green; high-confidence high/critical findings fail; lower-severity, inconclusive, and failed reviews are neutral.
  • Finding reports are dispatched to the repair intake when CLAWSWEEPER_COMMIT_FINDINGS_ENABLED is not false. ClawSweeper owns the audit log and any repair PR.
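
The mapping from report result to Check Run conclusion described in the list above can be sketched as follows. The result names come from this README and the conclusion values are standard GitHub Check Run conclusions; the Finding shape and its severity and confidence scales are assumptions for illustration.

```typescript
type ReportResult = "nothing_found" | "findings" | "inconclusive" | "failed" | "skipped_non_code";
type CheckConclusion = "success" | "failure" | "neutral";

// Hypothetical finding shape; the exact scales are assumptions.
interface Finding {
  severity: "low" | "medium" | "high" | "critical";
  confidence: "low" | "medium" | "high";
}

function checkConclusion(result: ReportResult, findings: Finding[]): CheckConclusion {
  // Clean or skipped reports are green.
  if (result === "nothing_found" || result === "skipped_non_code") return "success";
  if (result === "findings") {
    // Only high-confidence high/critical findings fail the check.
    const blocking = findings.some(
      (f) => f.confidence === "high" && (f.severity === "high" || f.severity === "critical"),
    );
    return blocking ? "failure" : "neutral";
  }
  // Inconclusive and failed reviews are neutral.
  return "neutral";
}
```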

Use pnpm commit-reports -- --since 24h to review recent reports and add --findings, --non-clean, --repo, or --author to narrow the list. The storage stays flat so a rerun can overwrite exactly one file for a commit without rediscovering a date bucket.
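
Because the storage is flat, a time-window query is a filter over report metadata rather than a directory walk. The sketch below illustrates the idea only: parseSince and reportsSince are hypothetical names, the reviewedAt field is an assumption, and only the h and d suffixes are handled because those are the forms shown in this README.

```typescript
// Hypothetical report metadata; the real report format may differ.
interface CommitReportMeta {
  sha: string;
  reviewedAt: Date;
}

// Parses windows such as "24h" or "7d" into milliseconds.
function parseSince(spec: string): number {
  const match = /^(\d+)([hd])$/.exec(spec);
  if (match === null) {
    throw new Error(`unsupported window: ${spec}`);
  }
  const amount = Number(match[1]);
  const unitMs = match[2] === "h" ? 60 * 60 * 1000 : 24 * 60 * 60 * 1000;
  return amount * unitMs;
}

// Keeps reports whose review time falls inside the window ending at `now`.
function reportsSince(reports: CommitReportMeta[], spec: string, now: Date): CommitReportMeta[] {
  const cutoff = now.getTime() - parseSince(spec);
  return reports.filter((report) => report.reviewedAt.getTime() >= cutoff);
}
```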

Safety Model

  • Maintainer-authored items are excluded from automated closes.
  • Protected labels block close proposals.
  • Open PRs with GitHub closing references block issue closes until the PR is resolved.
  • Open same-author issue/PR pairs block one-sided closes.
  • Codex runs without GitHub write tokens.
  • Issue/PR event jobs create target write and report-push credentials only after Codex exits.
  • Commit review workers give Codex only a read-scoped target token as GH_TOKEN so it can inspect mentioned issues, PRs, workflow runs, and commit metadata.
  • Commit write/check credentials are created only after Codex exits.
  • CI makes the target checkout read-only for reviews.
  • Reviews fail if Codex leaves tracked or untracked changes behind.
  • Snapshot changes block apply unless the only change is the bot’s own review comment.
  • Commit Check Runs are optional and disabled by default.

Audit

pnpm run audit compares live GitHub state with generated records without moving files. It reports missing open records, archived open records, stale records, duplicates, protected-label proposed closes, and stale review-status records. Protected proposed closes are reported only for active repo items/ records because archived repo closed/ records are historical and cannot be applied. Missing open records are classified as eligible, maintainer-authored, protected, or recently created so strict audit mode can flag actionable drift without treating expected queue lag or excluded items as failures.

Use --update-dashboard to publish the latest audit state under results/audit/ in openclaw/clawsweeper-state without making every normal status update scan all open GitHub items. The state repo renders reviewable findings such as missing eligible records, reopened archived records, and stale reviews from that state. The workflow refreshes audit state on a separate six-hour schedule, and it can be run manually with audit_dashboard=true.
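
The classification of missing open records can be sketched as a small decision function. This is an illustration with labeled assumptions: the MissingOpenItem shape is hypothetical, the order in which the classes are checked is a guess, and the threshold for "recently created" is not given in this README, so it is a parameter.

```typescript
type MissingRecordClass = "eligible" | "maintainer_authored" | "protected" | "recently_created";

// Hypothetical shape of an open GitHub item that has no report yet.
interface MissingOpenItem {
  maintainerAuthored: boolean;
  labels: string[];
  createdAt: Date;
}

const HOUR_MS = 60 * 60 * 1000;

// recentHours is an assumption supplied by the caller.
function classifyMissingRecord(
  item: MissingOpenItem,
  protectedLabels: Set<string>,
  now: Date,
  recentHours: number,
): MissingRecordClass {
  if (item.maintainerAuthored) return "maintainer_authored"; // excluded from closes anyway
  if (item.labels.some((label) => protectedLabels.has(label))) return "protected";
  const ageHours = (now.getTime() - item.createdAt.getTime()) / HOUR_MS;
  if (ageHours < recentHours) return "recently_created";     // expected queue lag
  return "eligible";                                         // actionable drift
}
```

Under this reading, only the eligible class would count as a failure in strict audit mode.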

Local Run

Requires Node 24.

Issue/PR sweeper:

source ~/.profile
corepack enable
pnpm install
pnpm run build
pnpm run plan -- \
  --target-repo openclaw/openclaw \
  --batch-size 5 \
  --shard-count 100 \
  --max-pages 250 \
  --codex-model gpt-5.5 \
  --codex-reasoning-effort high \
  --codex-service-tier fast
pnpm run review -- \
  --target-repo openclaw/openclaw \
  --target-dir ../openclaw \
  --batch-size 5 \
  --max-pages 250 \
  --artifact-dir artifacts/reviews \
  --codex-model gpt-5.5 \
  --codex-reasoning-effort high \
  --codex-service-tier fast \
  --codex-timeout-ms 600000
pnpm run apply-artifacts -- --target-repo openclaw/openclaw --artifact-dir artifacts/reviews --skip-dashboard
pnpm run audit -- --target-repo openclaw/openclaw --max-pages 250 --sample-limit 25 --update-dashboard
pnpm run reconcile -- --target-repo openclaw/openclaw --dry-run

Apply unchanged proposals later:

source ~/.profile
corepack enable
pnpm run apply-decisions -- --target-repo openclaw/openclaw --limit 20 --apply-kind all --skip-dashboard

Sync durable review comments without closing:

source ~/.profile
corepack enable
pnpm run apply-decisions -- --target-repo openclaw/openclaw --sync-comments-only --comment-sync-min-age-days 7 --processed-limit 1000 --limit 0 --skip-dashboard

List commit reports:

source ~/.profile
corepack enable
pnpm run build
pnpm commit-reports -- --since 24h
pnpm commit-reports -- --since 24h --findings
pnpm commit-reports -- --repo openclaw/openclaw --author steipete --since 7d

Manually rerun commit review through GitHub Actions:

gh workflow run commit-review.yml \
  --repo openclaw/clawsweeper \
  --ref main \
  -f target_repo=openclaw/openclaw \
  -f commit_sha=<commit-sha> \
  -f before_sha=<parent-or-range-start-sha> \
  -f create_checks=false \
  -f enabled=true \
  -f additional_prompt='Optional extra review focus.'

Omit before_sha for a single-commit review. Pass before_sha to review the historic range before_sha..commit_sha.

Manual review runs are proposal-only. Use apply_existing=true to apply unchanged proposals later. Scheduled apply runs process both issues and pull requests by default, subject to the selected repository profile; pass target_repo, apply_kind=issue, or apply_kind=pull_request to narrow a manual run.

Scheduled runs cover the configured product profiles. openclaw/openclaw keeps the existing cadence; openclaw/clawhub runs on offset review/apply/audit crons so its reports live under records/openclaw-clawhub/ without colliding with default repo records. openclaw/clawsweeper is available for manual and event self-review smoke tests. Broad hot-intake sweeps cap scheduled fan-out at 50 one-item shards per run; exact event reviews still use one shard, and normal review backfills can fan out to 100 shards when explicitly configured.

Target repositories can opt into event-level latency by installing the dispatcher workflow in docs/target-dispatcher.md. The dispatcher sends repository_dispatch events to this repository with the target repo and exact item number; ClawSweeper then runs one event job that reviews, comments, and checks immediate safe apply instead of waiting for the next hot-intake cron or bulk publish lane.

Target repositories can opt into main-branch commit review with docs/commit-dispatcher.md. That dispatcher sends push ranges to this repository, where ClawSweeper expands the range and writes one commit report per SHA.

Checks

pnpm run check
pnpm run oxformat

oxformat is an alias for oxfmt; there is no separate oxformat pnpm package. The CI GitHub Actions workflow uses the latest Node release and runs pnpm run check on pushes, pull requests, and manual dispatches. The check gate includes the full test suite, a strict changed-surface coverage threshold, and a full compiled-repo coverage ratchet.

GitHub Actions Setup

Required secrets:

  • OPENAI_API_KEY: OpenAI API key used to log Codex in before review shards run.
  • CLAWSWEEPER_APP_CLIENT_ID: public GitHub App client ID for clawsweeper. Currently Iv23liOECG0slfuhz093.
  • CLAWSWEEPER_APP_PRIVATE_KEY: private key for clawsweeper; plan/review jobs use a short-lived GitHub App installation token for read-heavy target API calls, commit review uses a read-scoped target token while Codex runs, and apply/comment-sync/check jobs use the app token for comments, closes, and optional checks. Keep App credentials scoped to the actions/create-github-app-token step. Review shards run Codex over attacker-controlled issue/PR text, so codexEnv() also strips these App variables before spawning Codex.
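
The environment stripping mentioned above can be illustrated with a small filter. This is not the real codexEnv(), whose implementation is not shown in this README; sketchCodexEnv is a hypothetical stand-in, and the list of stripped variables is limited to the secret names this README actually mentions.

```typescript
// Variables named in this README that should not reach the Codex child process.
// The real codexEnv() may strip more; this list is an assumption.
const STRIPPED_VARIABLES = new Set<string>([
  "OPENAI_API_KEY",
  "CLAWSWEEPER_APP_CLIENT_ID",
  "CLAWSWEEPER_APP_PRIVATE_KEY",
]);

// Returns a copy of the environment without the sensitive variables.
function sketchCodexEnv(env: Record<string, string | undefined>): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const name of Object.keys(env)) {
    const value = env[name];
    if (value !== undefined && !STRIPPED_VARIABLES.has(name)) {
      clean[name] = value;
    }
  }
  return clean;
}
```

Building a fresh object, rather than deleting keys from the parent environment, leaves the workflow's own credentials intact for the steps that run after Codex exits.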

Token flow:

  • Review shards log Codex in with OPENAI_API_KEY, then run without OpenAI or Codex token environment variables.
  • ClawSweeper uses the clawsweeper GitHub App token for read-heavy target context.
  • Apply mode uses the same app token for review comments and closes, so GitHub attributes mutations to the app bot account instead of a PAT user.
  • Commit review passes Codex only a read-scoped target token as GH_TOKEN for issue/PR/workflow/commit hydration, then creates write/check credentials only after Codex exits.
  • The ClawSweeper GitHub App commits generated reports back to openclaw/clawsweeper-state.

Required clawsweeper app permissions:

  • Contents: read/write, for report commits, repair branches, and repository dispatch inputs that need a contents-scoped installation token.
  • Issues: read/write, for issue comments, labels, closes, and maintainer command authorization context.
  • Pull requests: read/write, for PR comments, labels, merge readiness, repair PRs, and guarded automerge.
  • Actions: read/write on openclaw/clawsweeper, for run cancellation, manual dispatch, self-heal, and commit-review continuations.
  • Checks: write on target repositories when commit Check Runs should be published.

ClawSweeper no longer falls back to PAT-based write tokens. If the GitHub App installation does not grant the requested permission set, the workflow fails at token creation instead of silently switching identity.

Target repository setup:

  • install the issue/PR dispatcher from docs/target-dispatcher.md for exact item event reviews
  • install the commit dispatcher from docs/commit-dispatcher.md for main commit reviews
  • set CLAWSWEEPER_COMMIT_REVIEW_ENABLED=false to disable commit dispatch without code changes
  • set CLAWSWEEPER_COMMIT_REVIEW_CREATE_CHECKS=true only if commit Check Runs should be published
  • optionally set CLAWSWEEPER_COMMIT_REVIEW_SETTLE_SECONDS=0 for manual backfills where the target commit range is already settled; the default is 900
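
The three commit-review variables above have different defaults, which is easy to get wrong. A minimal sketch of how they could be read, assuming plain string comparison and a fallback to 900 seconds for missing or invalid values (the fallback for invalid input is an assumption, as are the function and interface names):

```typescript
// Hypothetical config shape for illustration.
interface CommitReviewConfig {
  enabled: boolean;      // on unless explicitly set to "false"
  createChecks: boolean; // off unless explicitly set to "true"
  settleSeconds: number; // default 900
}

function readCommitReviewConfig(env: Record<string, string | undefined>): CommitReviewConfig {
  const rawSettle = env["CLAWSWEEPER_COMMIT_REVIEW_SETTLE_SECONDS"];
  const parsedSettle = rawSettle === undefined || rawSettle.trim() === "" ? NaN : Number(rawSettle);
  return {
    enabled: env["CLAWSWEEPER_COMMIT_REVIEW_ENABLED"] !== "false",
    createChecks: env["CLAWSWEEPER_COMMIT_REVIEW_CREATE_CHECKS"] === "true",
    // Assumption: negative or non-numeric values fall back to the default.
    settleSeconds: Number.isFinite(parsedSettle) && parsedSettle >= 0 ? parsedSettle : 900,
  };
}
```

For example, an empty environment yields enabled commit review, no Check Runs, and a 900-second settle delay.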

About

ClawSweeper scans all issues and PRs and suggests what we can close, and why. It reviews every PR and issue once a week.
