Skip to content

Extend validate-action-pins with list, updates, and branch-pin support#123

Merged
lex57ukr merged 33 commits intomainfrom
122-extend-action-pin-tooling
Apr 19, 2026
Merged

Extend validate-action-pins with list, updates, and branch-pin support#123
lex57ukr merged 33 commits intomainfrom
122-extend-action-pin-tooling

Conversation

@lex57ukr
Copy link
Copy Markdown
Contributor

Summary

Extends validate-action-pins from a single SHA-vs-tag verifier into a three-mode audit tool and lays in the BATS harness we'd been missing. The driving use case was knight-owl-dev/homebrew-tap#50Homebrew/actions/setup-homebrew@main is SHA-pinned to a branch, Dependabot can't track that kind of drift, and we had no tool-supported way to see it.

The tool now supports check (verify pins, plus branch-pin drift reporting), list (offline pin inventory, optionally API-backed for classification), and updates (per-pin upgrade inventory across minor and major bumps), each with a shared --only=tag|branch|all filter for automation scenarios like "monitor only branch-pinned actions for drift".

Related Issues

Refs #122

Changes

  • Three subcommands on a single binary — check, list, updates — with bare validate-action-pins FILE... still routing to check (no caller changes needed in Makefile, scripts/ci-tools/verify.sh, or scripts/verify-deb-install.sh)
  • Branch-pin support in check: Dependabot-style # <ref> comment may be tag OR branch; tag mismatches remain FAIL, branch drift is WARN with commit-behind count from /compare/{base}...{head}
  • list subcommand: offline by default; taps the API for authoritative filtering only when --only=tag|branch is set
  • updates subcommand: reports every tag strictly newer than the current pin (minor and major — Dependabot doesn't open major PRs by default), plus branch HEAD for stale branch pins; always exits 0 (informational)
  • Shared --format=plain|tsv for list/updates and --only=all|tag|branch across all three subcommands; subcommand word may appear anywhere in the arg list
  • New BATS harness under tests/bats/images/ci-tools/validate-action-pins/ — 83 cases across six files organized by abstraction layer (cli / preflight / helpers / subcommand-*), using file:// fixtures in place of the GitHub API
  • Wired into CI via a new bats job parallel to lint, reusing the ci-tools container
  • Operational hardening: curl retries + --max-time, distinct WARN paths for auth / 429 / transport / rate-limit-exhausted failure modes, VALIDATE_ACTION_PINS_VERBOSE escape hatch, DOCKER_TTY Makefile knob for colourised interactive runs
  • Man page rewritten to cover the new surface and four-mode skip taxonomy

Further Comments

The design evolved across the session — subcommand split vs. separate binary, updates as oracle vs. inventory, permissive flag placement, CI wiring, plus multiple reviewer-hat passes. Trade-offs and the shifting punt list are captured in the comment thread on #122 (approach summary, deferred items, post-iteration addendum).

Deferred to follow-ups with reasoning in the punt-list comment: parallelizing API calls, TSV schema unification across subcommands, nightly smoke test against the real GitHub API.

Squash-merge expected.

lex57ukr and others added 30 commits April 18, 2026 14:39
Refactor validate-action-pins to be sourceable and retargetable so it
can be unit- and integration-tested. Behavior is byte-identical when
env vars are unset (verified by diff-guard against current output).

Testability changes:
- Wrap top-level body in main(); guard with BASH_SOURCE[0] == $0 so the
  script can be sourced without running.
- GITHUB_API_BASE env var (default https://api.github.com) threads into
  gh_api and the connectivity probe. Tests set it to file://fixtures.
- VALIDATE_ACTION_PINS_SKIP_CONNECTIVITY env var bypasses the
  rate_limit probe for tests that do not provide that fixture.
- Move RESOLVE_CACHE, FAILURES, FOUND_PINS, seen_in_file, and the
  auth_header population out of module scope into main() so functions
  no longer depend on module-level state; init_auth_header is callable
  from tests when needed.

Test harness:
- New tests/bats/ root, mirroring images/<name>/ under
  tests/bats/images/<name>/ per convention.
- tests/bats/helpers/common.bash loads bats-support/assert/file and
  normalises the test environment (GITHUB_API_BASE to fixtures,
  GITHUB_TOKEN cleared, connectivity probe skipped).
- 14 baseline tests pinning --help, --version, no-args, missing-file,
  tag match/mismatch, annotated-tag deref, duplicate-pin dedupe,
  connectivity failure, missing curl/jq, and sourced resolve_tag unit
  calls.
- Makefile target "make test-bats" runs the suite inside the ci-tools
  image; lint-sh extended to cover tests/bats/**.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split arg parsing from pin verification so the tool can grow new
subcommands. Bare `validate-action-pins FILE...` continues to route
to the pin-check path; `validate-action-pins check FILE...` is the
explicit equivalent. Other shapes (`list`, `updates`) are reserved
but not yet wired up.

- main() now handles a three-stage parse: scan-all-args for
  --help/--version, detect a known subcommand at position 1 (falling
  through to the default `check` otherwise), then collect files while
  rejecting unknown flags with exit 2.
- The pin-check body moves verbatim into cmd_check so it can be
  dispatched by name. Exit code semantics for that path are
  preserved — 0 on match, 1 on mismatch, 0 on transient skip.
- Unknown flags previously fell through to the file loop where they
  were silently skipped; they now fail fast with exit 2. Non-flag
  words that do not match a subcommand are still treated as files,
  so any workflow path — even one spelled like a reserved word — is
  accepted.
- Man page gains a SUBCOMMANDS section and an EXIT STATUS 2 entry.
- 7 new bats cases cover the dispatch surface: bare vs explicit
  equivalence, `check --version`, long/short unknown flags,
  non-subcommand word → file, `check` with no files, and `--`
  terminator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
validate-action-pins list inventories every `uses:` pin found in the
input workflow files with no API calls. Useful for audits and for
piping a machine-readable view into other tooling.

- parse_uses_line extracts action, ref, kind, and trailing-comment
  word from a `uses:` line using a loose regex. Kind is "sha" for
  40-hex refs and "ref" for symbolic refs; distinguishing tag from
  branch requires API calls and is deferred to resolver-backed
  subcommands. Local `./` actions and `docker://` schemes are
  skipped.
- cmd_list iterates files, parses each line, and emits records in
  either plain (`<basename>: <action>@<ref>  (# <comment>)`) or TSV
  (`<file>\t<action>\t<ref>\t<kind>\t<comment>`) format. No
  deduplication — every occurrence is reported.
- main gains --format=plain|tsv parsing (two-token and = forms),
  with invalid values rejected with exit 2. --format is currently
  list-scoped; check accepts it silently.
- check_api_preflight extracts the curl/jq dep check, auth init,
  and rate_limit connectivity probe into a reusable helper, called
  only from cmd_check. cmd_list skips all of it, so listing works
  offline and without jq installed.
- Man page gains a list synopsis, a SUBCOMMANDS entry with output
  formats and the skip rules, a --format OPTION, and two new
  EXAMPLES.
- 11 new bats cases cover list output in both formats, the skip
  rules for non-remote actions, --format parsing and validation,
  offline behaviour (no curl/jq, no probe), and sourced
  parse_uses_line unit calls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SHA-pinned actions with a branch-name comment — e.g.
`Homebrew/actions/setup-homebrew@<sha> # main` — were previously
reported as an unresolvable ref because the resolver only probed
`/git/ref/tags`. This extends check to follow the Dependabot comment
convention: probe the tag endpoint first and fall back to
`/git/ref/heads`, classifying the result as either a tag or a branch.

- resolve_ref replaces resolve_tag, returning a tab-separated
  `<sha>\t<tag|branch>` tuple so callers can route on the kind.
  Annotated-tag dereference is preserved.
- compare_behind calls `/repos/.../compare/base...head` and returns
  the `.behind_by` integer. A branch pin that has moved is not a
  failure — the repo owner chose to track HEAD — so cmd_check emits
  WARN with the commit count, or "diverges from HEAD" when
  `.behind_by == 0` (the pin is on an unrelated line of history).
- The cache sentinel flips from `"!"` to `"NONE"` so it no longer
  collides with the tab-delimited resolved value. A sibling
  compare_cache memoises compare_behind calls.
- Unresolvable-ref message flips from "could not resolve tag" to
  "could not resolve ref" to reflect that either endpoint may have
  answered.
- Man page gains a Dependabot-convention paragraph, documents the
  WARN-not-FAIL branch-drift semantics, and shows the four possible
  check output shapes.
- 6 new integration cases cover branch match, behind HEAD, diverged,
  unresolvable, tag-mismatch regression, and duplicate branch-pin
  dedupe; 5 new unit cases cover resolve_ref and compare_behind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
validate-action-pins updates scans workflow files and, for each unique
pin, reports the newest upgrade available — the latest same-major
tag and the overall latest tag for tag pins, or the current branch
HEAD for branch pins. Output is informational; the subcommand always
exits 0 on a successful run regardless of whether upgrades exist.

- latest_tag fetches /repos/.../tags and uses jq to keep strict
  three-part semver names only (v6.0.2), optionally filter to a given
  major, and return the numerically-highest entry. Pre-release and
  non-semver tag names are skipped silently. A single page of /tags
  is consulted, which suffices for typical action repos; this is
  called out as a known limitation in the man page.
- head_sha returns the commit SHA at the head of a branch via
  /git/ref/heads; reused from the same endpoint pattern resolve_ref
  uses internally.
- cmd_updates classifies each pin's effective ref (the trailing
  comment for SHA pins, or the symbolic ref itself) by pattern: a
  three-part semver triggers the tag-lookup path, everything else
  goes through head_sha. Results are deduplicated per unique
  action@effective-ref.
- check_api_preflight now accepts an operation-name argument so WARN
  messages read correctly for the invoking subcommand ("update check"
  vs. the default "pin validation"). The default preserves existing
  output and tests.
- main routes `updates` through the same --format=plain|tsv parser
  as list. The plain form shows "up-to-date" or "current=X  latest=Y
  (major: Z)" with a kind suffix; the TSV form is six columns
  (file, action, current_ref, major_latest, overall_latest, kind).
- Usage and man page gain updates documentation, including a note
  that GITHUB_TOKEN is effectively required at scale (unauthenticated
  rate limit is 60 req/hr). A new EXAMPLES entry shows piping
  --format=tsv output into awk.
- 6 new integration cases cover tag with new major + newer patch,
  up-to-date tag, branch at HEAD, stale branch, TSV column count,
  and the exit-0 contract. 5 new sourced cases cover latest_tag
  major filter, overall picking, pre-release skip, non-semver skip,
  and head_sha.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Within a single file, seen_in_file short-circuits cmd_check before the
resolve lookup, so same-file duplicates never hit the API. Across
files, seen_in_file is reset per file and dedup relies on
resolve_cache instead. The existing "duplicate pins within one file"
test covered the first path; this adds a fixture and test for the
second, asserting exactly one OK line per file when the same pin
appears in two workflows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first cut of `validate-action-pins updates` tried to pick "the"
next version for each pin — the latest same-major tag and the overall
latest — which forces a policy on an intentionally unopinionated
audit tool. Real-world action tagging is messy (v6, v6.0, v6.0.2,
0.35.0 without v, major-only aliases, mixed conventions), and
Dependabot does not open major-version PRs by default, so the value
of this subcommand is listing everything a periodic audit should
consider, not choosing for the user.

Behaviour changes:

- list_newer_tags replaces latest_tag. It emits every tag strictly
  newer than the current ref, sorted ascending, so both minor and
  major bumps are visible in one place.
- Ref classification in cmd_updates is loosened to accept one-, two-,
  and three-part numeric tags (with or without a v prefix), so
  `@v6`-style major aliases route through the tag path instead of
  falling through to a branch lookup.
- Comparison normalises v6, v6.0, and v6.0.0 to [6,0,0] so major-only
  pins correctly report every three-part tag above them.
- Plain format: `newer=<t1> <t2> ...` replaces `latest=... (major: ...)`
  for tag pins; `head=<short-sha>` replaces the previous two-column
  latest form for stale branch pins.
- TSV format drops from six columns to five: file, action, ref,
  available (space-separated newer tags or short HEAD SHA), kind.
  Consumers split column four on spaces to iterate versions.
- Man page and EXAMPLES updated for the new shape and the awk
  pipeline now checks column 4.

Tests:

- "classifies major-only aliases (e.g. @v6) as tag pins" — covers
  the broader classification regex and the normalisation rule.
- "tsv puts the space-separated newer list in column 4" — pins the
  TSV column assignment.
- list_newer_tags sourced cases cover empty result, major-only ref,
  pre-release skip, and non-semver skip.
- Old latest_tag unit cases removed; the function no longer exists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both subcommand descriptions had grown into single long paragraphs
once the tsv field definitions, parsing rules, and rate-limit caveats
were stacked on top of the prose. Adding .Pp separators between the
intro, plain-output, tsv-output, semantics, and API-cost portions
makes the rendered output easier to skim without changing any prose.

Same idiom the DESCRIPTION section already uses — valid mdoc inside
a .Bl -tag item, no rendering tradeoffs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SC2030 and SC2031 fired on `export GITHUB_API_BASE=...` inside @test
blocks because shellcheck sees bats's per-test subshell and warns
that the export is scoped to it. The idiomatic shell workaround is
inlining the env var on the `run` invocation — same scoping, no
warning — so rewrite the two connectivity-probe tests to do that and
drop both codes from the file-level disable.

SC2154 stays disabled. bats sets `output`, `BATS_TEST_DIRNAME`, and
`BATS_TEST_TMPDIR` at runtime, and `common_setup` exports
`REPO_ROOT`, `FIXTURES_DIR`, `API_FIXTURES_DIR`, and `SCRIPT` —
shellcheck can't trace any of them, and declaring each at file top
would be noisier than the blanket disable. This matches the bats
community norm for mixing shellcheck with bats-core.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each .bats file had to carry `# shellcheck disable=SC2154` at the
top because bats sets variables (`output`, `BATS_TEST_DIRNAME`,
`BATS_TEST_TMPDIR`, ...) and `common_setup` exports more that
shellcheck can't trace. A per-file pragma is a tax on every future
suite.

A subdirectory .shellcheckrc would duplicate the root rules
(shellcheck uses the nearest .shellcheckrc, not a merged stack), so
that route is fragile. Instead, pass `-e SC2154` on the bats-only
shellcheck invocation in `lint-sh`. The root .shellcheckrc's strict
rules still apply; only this one code is waived, and only for files
under tests/bats/.

- Makefile: `shellcheck -e SC2154 tests/bats/*/*.bash tests/bats/*/*/*.bats`
- Drop the file-level `# shellcheck disable=SC2154` comment block
  from validate-action-pins.bats; new suites inherit the exemption
  automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three orthogonal improvements, behaviour-preserving, covered by the
existing 55 bats cases:

1. cmd_check now uses parse_uses_line instead of its own inline
   regex. The tool had two parsers for the same `uses:` syntax — the
   strict one in cmd_check and the looser one used by list and
   updates — which meant any future tweak to the pin shape had to
   land in two places. cmd_check filters parse_uses_line's output to
   `kind=sha && comment!=""` so its contract (validate SHA pins with
   an explicit ref comment) is unchanged, and every subcommand now
   shares a single source of truth for "what is a uses line?"

2. resolve_ref's nested three-level if-tree is split into
   _resolve_as_tag (handles lightweight and annotated tags) and a
   reuse of head_sha for the branch fallback. resolve_ref becomes a
   straight-through "try tag, try branch, else 1" with
   printf-formatted tuple output, easier to read at a glance and
   trivially extendable if we ever need another ref kind.

3. Cmd_check's four scattered `echo "OK   ${basename}: ${action}@..."`
   statements collapse into _emit_check calls. The 4-char level
   column, the `%s@<sha:0:12>...` prefix, and the trailing detail
   now live in one printf format string, so a future output tweak
   (say, colour codes or a JSON mode) is a one-function change.

No rendered-output or exit-code change from the repo's own workflows;
real-world `check` against .github/workflows/*.yml produces the same
line-for-line output. All bats cases pass without modification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both cmd_check and cmd_updates were open-coding the same patterns:

- "check if key is in associative array, skip if so, else mark it
  seen" — five lines repeated in each subcommand.
- "for SHA pins use the comment as the ref hint, else use the ref
  itself" — the Dependabot convention, expressed differently in
  each caller.

Extract two small helpers:

- _once_per <cache> <key> — nameref-based first-occurrence gate.
  Callers keep their own associative array and decide when to reset
  it, so the three subcommands still have three different dedup
  scopes (per-file for check, per-run for updates, none for list),
  but the bookkeeping is one line at each use site.
- _effective_ref <kind> <ref> <comment> — returns the ref hint a
  resolver should consult, centralising the Dependabot rule in one
  place.

The scope difference is deliberate and stays: check reports per
file because file context matters ("which workflow has the bad
pin"), updates reports once per unique pin because it's a single
action item, list reports every occurrence because it's a raw
inventory.

SC2034 ("appears unused") suppressed on the seen_in_{file,run}
declarations and resets, since shellcheck can't trace uses through
nameref. The disables are one-line-scoped and annotated.

No behaviour or output changes — 55 bats cases pass unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous dedup consolidation used a single _once_per helper
that took the associative-array name by string and accessed it via
declare -n. That worked but needed a SC2034 suppression at every
`local -A seen_in_*` and every per-file reset, because shellcheck
does not trace reads/writes through namerefs.

Move the two sets to module scope as seen_in_file and seen_in_run,
and split the helper into two scope-specialised functions that
access the corresponding global directly:

- _once_per_file / _once_per_run: one-line predicates; first-seen
  returns 0, already-seen returns 1. Call sites read as
  `_once_per_file "${key}" || continue`.
- seen_in_file resets per file in cmd_check's loop; seen_in_run
  resets on entry to cmd_updates. Each subcommand also resets
  its set on entry so a sourced test that invokes the subcommand
  twice starts from a clean state.

With the access now visible inside the helpers, every SC2034
suppression is gone. The scope boundaries that matter (per-file vs
per-run vs none) are now encoded in the helper names rather than
implicit in where the caller chose to declare a local. cmd_list
continues to skip dedup entirely.

No behaviour, output, or test change — 55 bats cases green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Automation scenarios — e.g. a scheduled workflow that only cares
about branch-pinned actions drifting from HEAD — previously had to
grep the subcommand output or post-process TSV in awk. The
driving use case is knight-owl-dev/homebrew-tap, which SHA-pins
Homebrew/actions/setup-homebrew@main and needs a one-liner to
monitor that kind of pin without tripping on unrelated tag pins.

--only=all|tag|branch filters check and updates output by the
classified kind (default all).

- check applies the filter on the authoritative kind returned by
  the API (resolve_ref's tag vs branch tuple). Unresolvable refs are
  classified as "unknown" and reported only with --only=all.
- updates applies the filter on the routing kind (which API endpoint
  was consulted — /tags for tag-shaped refs, /git/ref/heads for
  everything else). Same "unknown" classification for pins that
  have no ref hint at all.
- list is offline by design and cannot authoritatively classify
  refs without API calls, so main accepts and ignores the flag
  there; list output is unaffected.

Naming consistency while I was in the area: plain updates output
now says `[unknown]` to match the tsv kind column (was
`[no ref hint]`). One classification token in both formats.

The filter runs post-API in check and post-classification in
updates, so API call counts are unchanged — this is strictly an
output filter, not a processing shortcut.

A shared _include_pin helper encapsulates the one-line comparison
so both subcommands route through the same predicate. No separate
"classify" helper — the inline classification already exists in
cmd_updates as part of the routing decision, and the check path
reuses resolve_ref's authoritative output.

Man page and usage updated; 9 new bats cases cover the filter
behavior across both subcommands plus the invalid-value exit 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
list was strictly offline, so --only had to be ignored there. That
cut users off from the most useful inventory query: "list only the
branch-pinned actions across my workflows" without writing a grep
pipeline.

Keep list offline by default (--only=all, no API calls, no
authentication) and switch to API-backed classification only when
the caller explicitly narrows with --only=tag or --only=branch.
Under that path, each unique pin's effective ref (comment for SHA
pins, the @ref itself otherwise) is resolved through resolve_ref
and the authoritative tag-vs-branch result drives the filter.
Unresolvable refs and pins with no ref hint classify as "unknown"
and are emitted only with --only=all.

API cost mirrors check: 1–2 calls per unique (action, effective_ref)
pair, memoised in a local cache. The same preflight that check and
updates use handles missing deps / unreachable API with a WARN and
exit 0 — list is only offline in the default mode, not in the
filtered mode.

Man page, usage, and synopsis updated to reflect the dual-mode
semantics. Replaces the "filter is ignored by list" test with three
new cases: filter-by-tag, filter-by-branch, and the preflight-fail
warning for --only=X when the API is unreachable. Added a guard
test that --only=all (default) still performs no API calls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t awareness

Operator-visibility improvements surfaced by the DevOps review of
the #122 branch:

- All curl calls use `--retry 3 --retry-delay 2` so transient 5xx,
  DNS blips, and GitHub incident-window hiccups no longer read as
  "unresolvable ref". --retry is silently ignored on file://, so
  tests keep working.
- VALIDATE_ACTION_PINS_VERBOSE=1 passes curl stderr through
  instead of swallowing it. Previously a bad GITHUB_TOKEN or a
  malformed URL produced no trail at all — operators had no way
  to tell auth failure from network failure. A module-scope
  _CURL_ERRFD routes stderr uniformly.
- check_api_preflight now distinguishes four failure modes in its
  WARN message: missing curl/jq, auth rejection (HTTP 401/403),
  transport failure (curl exit → code "000"), and any other
  unexpected HTTP status. The auth branch includes the specific
  actionable hint (`check GITHUB_TOKEN`) that the old generic
  "cannot reach" message hid.
- Rate-limit remaining is read from the /rate_limit probe response
  and a WARN is emitted (non-fatal) when it falls below 20. Gives
  operators a heads-up on a large sweep before individual pin
  lookups start 403-ing mid-run.
- The file:// path continues to be the test harness; the
  rate-limit check now runs in that path too so the threshold
  warning is test-covered.

Two new bats cases cover the rate-limit WARN and verbose curl-
stderr passthrough. 69/69 green, shellcheck/shfmt/mandoc clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The grammar for "what looks like a semver-shaped ref" appeared in
two places: the bash classifier inside cmd_updates routing, and the
jq test() filter inside list_newer_tags. If that grammar evolves
(calver support, pre-release inclusion, etc.), both needed to stay
in sync.

Define _SEMVER_RE once at module scope with a comment on the
rationale (what's included, what's deliberately excluded, why).
cmd_updates uses it directly in its =~ match; list_newer_tags
passes it to jq via --arg re so the same string drives both engines
without escape translation.

Pure refactor — 69/69 bats cases pass unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parse_uses_line regex-matched any line containing `uses: owner/repo@ref`,
including lines whose first non-whitespace character was `#` — so a
commented-out pin like

  # - uses: actions/checkout@<old-sha> # v5
  - uses: actions/checkout@<new-sha> # v6

was parsed as two live pins. check would emit two OK/FAIL/WARN
lines from the same logical occurrence, and updates would report
twice for the same action.

Add a line-prefix guard that rejects `^[[:space:]]*#` lines before
the main regex runs. Fixture `commented-uses.yml` exercises both
the leading-hash and indented-comment-with-hash cases. Two new
bats cases cover the parse-level and check-level behaviour.

Edge cases deliberately not handled: block-literal YAML (`|-`),
flow-mapping inline values (`uses: [...]`). Those are not produced
by real workflow authors and not supported by GitHub Actions'
parser either.

71/71 green, shellcheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test harness points GITHUB_API_BASE at file://fixtures/api so
every gh_api call becomes a file read. New contributors opening the
directory see paths like

  api/repos/foo/br-behind/compare/cccc...bbbb

and have to reverse-engineer why there's a literal `...` in a
filename and why nothing has a .json extension. Add a README that
documents the mirroring rule, the per-repo fixture inventory, and
a step-by-step for adding a new case.

Doc only — no code or test change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related changes that make the test suite survive adding more
ci-tools binaries later.

1. Layered split — the monolithic validate-action-pins.bats (67
   tests across dispatch, parse, check, list, updates, helpers,
   preflight) was hard to navigate because each subcommand mixed
   CLI-surface tests, sourced unit tests, and integration tests.
   Files are now grouped by abstraction level:

     cli.bats        — --help, --version, subcommand dispatch,
                       flag validation, `--` terminator
     preflight.bats  — missing deps, connectivity, auth-vs-reach,
                       rate-limit WARN, verbose mode
     helpers.bats    — sourced unit tests for parse_uses_line,
                       resolve_ref, compare_behind, list_newer_tags,
                       head_sha
     check.bats      — `check` subcommand end-to-end
     list.bats       — `list` subcommand end-to-end
     updates.bats    — `updates` subcommand end-to-end

   Unchanged test bodies; only the file a given @test lives in has
   moved. Test count (71) is preserved.

2. Per-tool subdirectory — each binary under ci-tools now owns a
   namespace under tests/bats/images/ci-tools/. validate-action-pins
   specifically moves from

     tests/bats/images/ci-tools/{validate-action-pins-*.bats,fixtures/}

   to

     tests/bats/images/ci-tools/validate-action-pins/
     ├── {cli,preflight,helpers,check,list,updates}.bats
     └── fixtures/{workflows,api,README.md}

   When we add check-deps or another ci-tools binary, it gets its
   own sibling directory with no naming collisions on fixtures or
   test-file names. The fixtures README documenting the URL-to-path
   mirroring convention moves with the fixtures.

Mechanical consequences:
- `load ../../helpers/common` → `load ../../../helpers/common`
  (one directory deeper).
- `make lint-sh` shellcheck glob gains a path level:
  `tests/bats/*/*/*.bats` → `tests/bats/*/*/*/*.bats`.

71/71 pass; full lint + test-bats green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The three subcommand integration files (check, list, updates) now
share a `subcommand-` prefix, making the tier visually obvious
when listing the directory:

    cli.bats
    helpers.bats
    preflight.bats
    subcommand-check.bats
    subcommand-list.bats
    subcommand-updates.bats

Cross-cutting files (cli, preflight, helpers) keep their bare
names since they don't share a tier. File moves only — no test
body or expectation change. 71/71 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three related changes so that the new bats suite runs on every PR,
and the preflight doesn't degrade silently when GitHub rate-limits.

CI integration:
- New `bats` job in .github/workflows/ci.yml, parallel to `lint`,
  running in the same ghcr.io/knight-owl-dev/ci-tools:latest
  container. Calls `make test-bats` for convention, with the
  BATS_RUNNER override pointed at the container-native `bats` so
  we skip a docker-in-docker round trip.

Makefile:
- test-bats gains a BATS_RUNNER ?= variable. Default is the
  docker-wrapped invocation (for macOS hosts without bats). CI
  overrides to run bats directly. Same target, one knob.

Preflight tightening:
- When GitHub reports `remaining=0`, preflight now returns 1 with
  a single "rate limit exhausted — skipping" WARN instead of
  letting the main loop run and emit one "could not resolve" per
  pin. The old low-but-nonzero path (< 20) still just warns and
  proceeds.
- New bats case in preflight.bats pins the behavior: remaining=0
  fixture → skip cleanly, no per-pin output.

72/72 green, shellcheck/shfmt/mandoc clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DevOps round-2 findings against gh_api + preflight:

1. Retry policy was duplicated. gh_api used _CURL_FLAGS
   (-fsSL --retry 3 --retry-delay 2); the preflight built its own
   inline `curl -sS --retry 3 --retry-delay 2 -o ... -w ...`.
   Tune one and the other drifts. Split into two named constants
   with the same retry policy — _CURL_FLAGS (gh_api, with --fail)
   and _CURL_PROBE_FLAGS (preflight, without --fail so it can
   inspect error bodies) — and reuse each.

2. No per-request timeout. --retry 3 with --retry-delay 2 meant a
   single stalled DNS or SYN could eat 15–20s before curl gave up,
   and that time was multiplied across pins in large workflows.
   Add --max-time 10 to both flag sets; file:// URLs silently
   ignore it.

3. HTTP 429 (secondary rate limit) was hidden in the "unexpected
   code" bucket, emitting a generic WARN instead of naming the
   cause. Extract the status-code dispatch into
   _preflight_classify_status — a pure function callable per-code
   from sourced tests — and give 429 its own branch:
   "secondary rate limit hit (HTTP 429) — skipping {op}; slow
   down, set GITHUB_TOKEN, or retry after the Retry-After
   interval". Matching the tone of the primary-limit messages.

6 new bats cases cover each classifier branch (200/401/403/429/000/
503) by sourcing the script and calling the helper directly — no
fake HTTP server needed.

78/78 green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The preflight creates a tempfile with mktemp then has to rm it
before every return. Four exit points today, each needed its own
`rm -f "${body_file}"` plus one at function end — five places
that must stay in sync if another branch lands.

Replace with a single `trap 'rm -f "${body_file:-}"' RETURN` at
the top of the function. Bash's RETURN trap fires on any return
from the enclosing function, including exit codes from nested
calls and the implicit fall-through return. The :- guard handles
the pathological case where mktemp fails before the assignment.

No behavior change — tempfiles still get cleaned up on every
path, and `_preflight_classify_status` can now be called with
a simple `|| return 1` pattern instead of an if-then wrapper
that had to manually rm before returning.

Shorter preflight, one cleanup site, new branches get cleanup
for free. 78/78 green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
main() had grown to ~110 lines: the empty-arg check, the --help/
--version scan, subcommand detection, a 40-line flag loop, two
validation switches, an empty-files check, and the dispatch. Each
concern is load-bearing but the body read as a single long
procedure. Next new flag (--max-time? --format=json?) would push
it over a readability cliff.

Pull the two largest concerns into named helpers:

- _short_circuit_help_version — scans for --help / --version at
  any arg position, exits 0 after printing if found. Preserves the
  `check --version` and other-order cases that the existing tests
  pin.
- _parse_cli_args — owns subcommand detection + the flag loop +
  the two validation switches. Publishes its results through four
  module-scope globals (cli_subcmd, cli_format, cli_only,
  cli_files) rather than threading tuples through return values.

main() now reads top-to-bottom as: no-args guard → short-circuit →
parse → empty-files guard → dispatch. ~18 lines.

Module-scope publishing is consistent with the project's other
shared state (auth_header, seen_in_file, seen_in_run) — declared
once, mutated by its owning helper, read where needed. The cli_
prefix namespaces them so nothing inside cmd_check / cmd_list /
cmd_updates can shadow them accidentally.

No behavior change — CLI output byte-identical for --help,
--version, --bogus, and no-args paths; 78/78 bats cases pass
unchanged; full lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The preflight had an `if [[ GITHUB_API_BASE == file://* ]]` branch
that did a bare curl (no HTTP semantics) and an `else` branch that
did curl-with-status-capture and routed through the classifier.
Both branches silently converged on a shared rate-limit check after
the `esac`. A reader tracing the flow saw two inline curl blocks
and had to mentally match "file:// skips classifier" against
"HTTP uses classifier" with no signposts.

Extract each branch into a named helper:

- _preflight_probe_file — file:// path. Runs curl with the failure-
  on-error _CURL_FLAGS, writes the body to the caller's tempfile,
  emits the "cannot reach" WARN on exit nonzero.
- _preflight_probe_http — HTTP(s) path. Uses _CURL_PROBE_FLAGS
  (no --fail) to capture the status code even on errors, then
  delegates to _preflight_classify_status for the code-by-code
  WARN routing.

Both have the same signature — `helper body_file op` → 0/1 — so
the preflight body becomes:

    if [[ "${GITHUB_API_BASE}" == file://* ]]; then
      _preflight_probe_file "${body_file}" "${op}" || return 1
    else
      _preflight_probe_http "${body_file}" "${op}" || return 1
    fi

The convergence on the rate-limit-remaining check afterwards is
now obviously intentional — both probes have established the body
in the same file, and the shared follow-up logic reads the same
field regardless of transport.

Pure refactor — 78/78 bats cases pass unchanged; full lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two doc-only edits to make internal helpers easier to read without
having to chase behavior across functions:

- _include_pin's $1 description had a parenthetical about "unknown"
  that actually belongs to $2. main() validates $1 as all|tag|branch
  only; "unknown" is a classified-kind value that the callers pass
  as $2 for refs they couldn't authoritatively classify. Move the
  parenthetical to where it applies.

- parse_uses_line's regex was a wall with no legend — the body
  indexes BASH_REMATCH[1], [2], [4] with no in-file note of what
  each captures. Add a four-line capture-group map right above
  the regex. Reader can now trace action/ref/comment assignments
  without pattern-matching in their head.

No behavior change — 78/78 bats cases pass unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lex57ukr and others added 3 commits April 18, 2026 17:46
cmd_updates had ~35 inline lines of output formatting: two nearly-
identical TSV printfs (differing only in column 3) plus a case-
within-case for the plain path with four distinct printf strings.
The main loop body read as "classify → fetch → then thirty-five
lines of format juggling". Two structural tidies:

1. The TSV branch had an if/else that flipped column 3 between
   `ref` and `effective_ref` purely to handle the unknown case.
   Lift the choice into a `col_ref` local and call printf once:

       local col_ref="${effective_ref}"
       [[ "${effective_kind}" == "unknown" ]] && col_ref="${ref}"
       printf '%s\t%s\t%s\t%s\t%s\n' \
         "${file}" "${action}" "${col_ref}" "${available}" "${effective_kind}"

   One emission, one format string, semantic difference named.

2. Extract a `_emit_update_plain` helper that owns all four plain-
   format shapes (unknown / tag up-to-date / tag newer / branch
   up-to-date / branch head-drifted). Mirrors the `_emit_check`
   pattern the `check` subcommand already uses — when it's time to
   tweak an output line later, it's a one-function change.

cmd_updates' main loop body drops from ~40 lines to ~20, reading
straight-through as classify → filter → emit. The output lives in
two named helpers — _emit_check for check, _emit_update_plain for
updates — with matching signatures for uniformity.

No behavior change — 78/78 bats cases pass unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Arch-4 from the round-1 review. The subcommand word had to be $1,
so `validate-action-pins --format=tsv list workflows/*.yml` didn't
route correctly — `--format=tsv` was $1, no subcommand match,
default `check`, and `list` got consumed as a file path. `kubectl`
and `gh` accept flag-subcommand ordering either way; we do too now.

Move the check|list|updates detection inside the main flag loop
with a single `subcmd_seen` latch. First matching positional word
becomes the subcommand; later `check|list|updates` tokens are
positional files (handles the rare "a workflow literally named
list" case). `--` still terminates subcommand scanning along with
flag scanning, matching the existing idiom.

Tradeoff accepted: a real workflow file named exactly `check`,
`list`, or `updates` (no path) needs `--` to disambiguate. Users
normally pass `.github/workflows/*.yml`, where the path prefix
already disambiguates; the narrow edge case is documented in the
man page's new SUBCOMMANDS note.

5 new bats cases cover: flag-then-subcommand, files-then-
subcommand, --only-then-subcommand, second occurrence as file,
and the `-- check` quoting escape hatch. Man page gains a
SUBCOMMANDS-level note about the permissive ordering and the `--`
workaround. 83/83 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Full pass against current behavior turned up several drifts from
when the subcommand/filter/preflight work last updated the page.

- Date bumped (Feb 15 → Apr 18). The whole SUBCOMMANDS + --only
  story landed after that date.
- DESCRIPTION's "skipped successfully" paragraph listed only two
  skip conditions (unreachable API, missing dep). Real list after
  the preflight hardening is six: missing curl/jq, unreachable,
  auth rejection (401/403), secondary rate limit (429), exhausted
  primary rate limit, unexpected HTTP status. Each produces its
  own named WARN; document them. Also point at
  VALIDATE_ACTION_PINS_VERBOSE as the escape hatch.
- `check` SUBCOMMANDS synopsis was missing --only (top SYNOPSIS
  had it); `updates` SUBCOMMANDS synopsis was missing --only too.
  Bring both in line with the top synopsis.
- `check` SUBCOMMANDS description now notes the "SHA pin + #
  comment" scope rule so a reader jumping straight there knows
  what check operates on.
- Updates plain-output shape list was three entries; add the
  `[unknown]` shape for SHA pins with no ref hint (shipped back in
  the updates-refactor commit) and explain when it's emitted.
- ENVIRONMENT gained VALIDATE_ACTION_PINS_VERBOSE, a real operator
  knob. Test-only env vars (GITHUB_API_BASE,
  VALIDATE_ACTION_PINS_SKIP_CONNECTIVITY) stay in the script header
  only, not the man page.

Rendered output via mandoc looks clean — paragraph breaks land in
the right places, no formatting artefacts. `make lint-man` clean;
83/83 bats cases unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lex57ukr lex57ukr added enhancement New feature or request github-actions GitHub Actions workflow changes labels Apr 19, 2026
@lex57ukr lex57ukr merged commit 8b96608 into main Apr 19, 2026
8 checks passed
@lex57ukr lex57ukr deleted the 122-extend-action-pin-tooling branch April 19, 2026 00:11
lex57ukr added a commit to knight-owl-dev/homebrew-tap that referenced this pull request Apr 20, 2026
Dependabot doesn't refresh SHA pins that track a branch (e.g.
`Homebrew/actions/setup-homebrew@<sha> # main`), so they drift silently.
The new `validate-action-pins updates --format=tsv --only branch` subcommand
(knight-owl-dev/devops#123) makes drift machine-detectable: for each branch
pin it emits the current branch head SHA, or empty when up-to-date.

Wire that into a weekly scheduled workflow that files a deduped GitHub
issue (via the pre-created `action-pin-update` label, same pattern as
`cve-monitor` in devops) when any branch pin has fallen behind. Render
the issue body from `docs/action-pin-issue.md`. The workflow runs on
ubuntu-latest and installs ci-tools via `scripts/install-ci-tools.sh`,
which adds the apt.knight-owl.dev repo after verifying its GPG key
fingerprint against the one published in knight-owl-dev/apt.

Also bring `docs/how-to/security.md` in line with the updated tool: the
WARN it emits for branch pins now reports drift distance rather than
tag-resolution failure, and the periodic review is no longer manual.
lex57ukr added a commit to knight-owl-dev/homebrew-tap that referenced this pull request Apr 20, 2026
Dependabot doesn't refresh SHA pins that track a branch (e.g.
`Homebrew/actions/setup-homebrew@<sha> # main`), so they drift silently.
The new `validate-action-pins updates --format=tsv --only branch` subcommand
(knight-owl-dev/devops#123) makes drift machine-detectable: for each branch
pin it emits the current branch head SHA, or empty when up-to-date.

Wire that into a weekly scheduled workflow that files a deduped GitHub
issue (via the pre-created `action-pin-update` label, same pattern as
`cve-monitor` in devops) when any branch pin has fallen behind. Render
the issue body from `docs/action-pin-issue.md`. The workflow runs on
ubuntu-latest and installs ci-tools via `scripts/install-ci-tools.sh`,
which adds the apt.knight-owl.dev repo after verifying its GPG key
fingerprint against the one published in knight-owl-dev/apt.

Also bring `docs/how-to/security.md` in line with the updated tool: the
WARN it emits for branch pins now reports drift distance rather than
tag-resolution failure, and the periodic review is no longer manual.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request github-actions GitHub Actions workflow changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant