Skip to content

feat(identity): hardening + kickstart sync quick wins#21

Merged
sabbour merged 5 commits intodevfrom
squad/identity-quick-wins
Apr 21, 2026
Merged

feat(identity): hardening + kickstart sync quick wins#21
sabbour merged 5 commits intodevfrom
squad/identity-quick-wins

Conversation

@sabbour
Copy link
Copy Markdown
Owner

@sabbour sabbour commented Apr 21, 2026

Summary

Working as EECOM (Environmental, Electrical & Consumables Manager)

Implements comprehensive identity hardening based on two proposals:

  • docs/proposals/identity-hardening-roadmap-2026-04-20.md
  • docs/proposals/kickstart-identity-sync-2026-04-20.md

Changes

SDK (packages/squad-sdk/src/identity/)

  • TokenResolveError structured type — kind: 'not-configured' | 'runtime' with human message
  • H-01 fetch timeoutAbortController + Promise.race 10-second cap on installation token requests
  • H-02 PEM validationcreatePrivateKey() validates key before signing; rejects with descriptive error on bad PEM
  • H-03 partial env detection — loud error when only 1–2 of 3 required env vars are set (no silent fallthrough)
  • H-07 mock hookSQUAD_IDENTITY_MOCK=1 bypasses real credentials; SQUAD_IDENTITY_MOCK_TOKEN sets custom value
  • resolveTokenWithDiagnostics() — full diagnostic result ({ token, resolvedRoleKey, error })
  • clearTokenCache() — for test isolation
  • Cache key fix — keyed by ${projectRoot}:${roleKey} to prevent cross-test pollution
  • scribe added to RoleSlug union; ALL_ROLES constant exported from SDK
  • Role aliasesresolveRoleSlug() maps shorthand (core, ui, qa, ops, writer, sec, ml, note) to canonical slugs

Template (packages/squad-cli/templates/scripts/resolve-token.mjs)

  • ESM dual-mode guardisCliInvocation IIFE prevents CLI side-effects when imported as a module
  • --required flag — exits 1 with error message when token cannot be resolved
  • Parity with SDK — same timeout, PEM validation, mock hook, role aliases, cache key format

CLI (packages/squad-cli/src/cli/commands/identity.ts)

  • Added scribe role to ALL_ROLES / ROLE_DESCRIPTIONS
  • ensureKeysIgnored() call + mode 0o600 writes for new PEM files

Tests

  • test/identity/hardening.test.ts — 51 new acceptance tests (H-01 through H-07 + sync contract)
  • Updated exec.test.ts to mock resolveTokenWithDiagnostics (new API)
  • Updated tokens.test.ts partial env test to reflect new loud-error behavior
  • vitest.config.tstestTimeout: 15000 for H-01 real 10s timer test

Test Results

142 / 142 tests pass

Test Files  12 passed (12)
     Tests  142 passed (142)

Leela Lead Bot and others added 2 commits April 18, 2026 04:45
Squashed 67 commits from squad/agent-github-identity onto upstream/dev.
- TokenResolveError structured type (kind: not-configured | runtime)
- H-01: AbortController + Promise.race 10s fetch timeout
- H-02: PEM validation via createPrivateKey before signing
- H-03: partial env detection — loud error when 1-2 of 3 vars set
- H-07: SQUAD_IDENTITY_MOCK / SQUAD_IDENTITY_MOCK_TOKEN mock hook
- Role aliases: resolveRoleSlug() maps shorthand to canonical slugs
- scribe added to RoleSlug union; ALL_ROLES constant exported from SDK
- isCliInvocation ESM dual-mode guard in resolve-token.mjs
- resolveTokenWithDiagnostics() + clearTokenCache() public API
- Cache keyed by projectRoot:roleKey (prevents cross-test pollution)
- All 142 identity tests pass

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Owner Author

@sabbour sabbour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flight review — PR #21

Verdict: Request changes (posted as comment due to GitHub self-review limitation)

Findings verification

  • Sync #1 resolveTokenWithDiagnostics: ✅ — New SDK export with { token, resolvedRoleKey, error } result type. Correctly implemented in both tokens.ts and resolve-token.mjs (CLI templates). Backward-compat resolveToken() wrapper delegates to it and returns token ?? null.
  • Sync #2 --required flag: ✅ in CLI templates copy / ❌ stale in 3 other copies — parseCliArgs extracts --required; exits 1 + stderr when token fails. Test confirms exit 0 without flag (back-compat). See Blocking #2.
  • Sync #3 isCliInvocation dual-mode ESM guard: ✅ in CLI templates / ❌ missing in 3 other copies — IIFE comparing process.argv[1] === fileURLToPath(import.meta.url). Import test confirms no CLI side effects when loaded as module.
  • Sync #5 partial env detection: ✅ — Both SDK (resolveEnvCredentials returns { error: "Incomplete..." }) and CLI template (returns { error: "Partial env config..." }) detect 1-2 of 3 env vars as kind: 'runtime' error, naming the missing variables.
  • Sync #6 scribe role: ✅ — Added to RoleSlug union and ALL_ROLES in types.ts. Tests confirm compile-time type check and runtime acceptance.
  • Sync #7 execWithRoleToken dead catch: ✅ — exec.ts uses resolveTokenWithDiagnostics (never throws), surfaces errors via stderr. spawn.ts similarly wraps with debugLog on catch. No dead catch blocks remain.
  • H-01 AbortController timeout: ✅ — Both SDK and CLI template use AbortController + Promise.race + setTimeout(10_000). Real-timer test proves timeout fires at 10s (H-01 test ran in 10.5s).
  • H-02 PEM validation: ✅ in SDK + CLI template / ❌ stale in 3 other copies — createPrivateKey(privateKeyPem) in buildJWT before createSign. Throws "Invalid PEM format for role: ...". Tests cover empty, truncated, garbage, and base64-stripped PEM.
  • H-04 error taxonomy: ✅ — TokenResolveError with kind: 'not-configured' | 'runtime' applied consistently. not-configured for missing credentials/registration/PEM. runtime for PEM invalid, network failure, partial env, timeout.
  • H-05 mode 0o600: ✅ — Three writeFileSync(..., { mode: 0o600 }) and one chmodSync call in identity.ts (lines 380, 597, 1052). Read-time permission warning in both tokens.ts (line 345) and CLI template (line 211). Platform guard for Windows.
  • H-06 .gitignore auto-append: ✅ — ensureKeysIgnored() in identity.ts called from 3 CLI entry points (create, import, rotate). Root .gitignore also updated.
  • H-07 SQUAD_IDENTITY_MOCK: ✅ — Both SDK and CLI template check SQUAD_IDENTITY_MOCK === '1', return mock-token-{role} or custom SQUAD_IDENTITY_MOCK_TOKEN. No filesystem/network I/O performed.
  • H-08 nowOverride: ✅ — buildJWT(appId, pem, nowOverride?) in both SDK and CLI template. generateAppJWT passes through. Tests verify deterministic JWT payload with fixed timestamp.

Blocking issues

1. Changeset identity-hardening.md uses wrong package names (BLOCKING)

# Current (WRONG):
"@squad/sdk": minor
"@squad/cli": minor

# Correct (matches all other changesets in this PR):
'@bradygaster/squad-sdk': minor
'@bradygaster/squad-cli': minor

Every other changeset in this PR (identity-module.md, env-var-credentials.md, exec-with-role-token.md, wire-gh-token-spawn.md, etc.) uses @bradygaster/squad-*. This one will be silently ignored by @changesets/cli during version bump — no error, just a no-op. The feature will ship without a changelog entry or version bump.

Fix: Replace "@squad/sdk"'@bradygaster/squad-sdk' and "@squad/cli"'@bradygaster/squad-cli'.

2. Three of four resolve-token.mjs copies are stale (BLOCKING)

File Lines Hardened?
packages/squad-cli/templates/scripts/resolve-token.mjs 283 ✅ All 13 findings
templates/scripts/resolve-token.mjs 224 ❌ Old version
packages/squad-sdk/templates/scripts/resolve-token.mjs 224 ❌ Old version
.squad-templates/scripts/resolve-token.mjs 224 ❌ Old version

The three stale copies lack: --required flag, isCliInvocation guard, PEM validation (createPrivateKey), AbortController timeout, partial env detection, SQUAD_IDENTITY_MOCK hook, role alias resolution, clearTokenCache export.

Users who receive their template from the SDK package or root templates directory will get the unhardened version. Since squad init/upgrade may source from any of these, this is a real user-facing gap.

Fix: Sync all four copies to the 283-line hardened version.

Non-blocking observations

  1. Role slug resolution asymmetry — CLI template's resolveTokenWithDiagnostics calls resolveRoleSlug(roleKey) internally (line 174, mapping aliases like core→backend). SDK's version does NOT — uses roleKey as-is. spawn.ts correctly calls resolveRoleSlug(role) before resolveToken() (line 122), so this works, but it's a design divergence worth documenting. SDK callers must know to resolve aliases themselves.

  2. H-06 test coverage is simulation-based — The .gitignore tests (hardening.test.ts lines 760-830) simulate appendFileSync calls rather than exercising ensureKeysIgnored() from identity.ts directly. This is acceptable for now but means the guard logic (idempotency check) is tested by replication rather than invocation.

  3. H-01 first test runs for ~10.5s real time — The "fetch hangs" test uses real wall-clock time (no fake timers). This is correct for validating the timeout actually fires, but makes the test suite slow. Consider flagging it with a @slow tag if test runtime becomes a concern.

Test quality

  • 142 tests, 12 files, all green ✅
  • Failure paths well-covered: timeout, PEM corruption (empty, truncated, garbage, base64-stripped), partial env (1-of-3, 2-of-3 with named missing vars), missing config, missing PEM file
  • FIDO's fake-timer concern: Resolved. Promise.race with explicit timeoutPromise + AbortController.signal.addEventListener works correctly with vi.useFakeTimers()
  • FIDO's stderr spy concern: Resolved. vi.spyOn(process.stderr, 'write') correctly captures the permission warning
  • Determinism stress tests (10x parallel) pass cleanly
  • Protected-file compliance verified: CLI template imports only node:crypto, node:fs, node:path, node:url — zero npm deps, zero SDK imports ✅

Verdict reasoning

EECOM delivered an excellent implementation. All 13 findings are correctly implemented in the SDK (tokens.ts) and the primary CLI template. The error taxonomy is consistent, the timeout is wired correctly, PEM validation fails loudly, and the mock hook is clean. Test quality is high with strong failure-path coverage.

Two issues require changes before merge:

  1. The changeset naming typo will silently break the release pipeline. This is a one-line fix but the consequence of missing it is that the entire feature ships without a changelog entry — exactly the kind of silent failure we're trying to eliminate.

  2. The stale template copies create a real user-facing gap where squad init could stamp an unhardened resolve-token.mjs. Given that H-01 (timeout) and H-02 (PEM validation) are both CRITICAL-severity findings, shipping three copies without those fixes defeats the purpose of the hardening PR.

Both fixes are mechanical (copy + rename). After these two changes, this is a clean approve.

— Flight

Leela Lead Bot and others added 2 commits April 21, 2026 00:53
Reviewed 13 findings from identity-hardening + kickstart-sync proposals.
Verdict: request changes (2 blocking: changeset naming, stale template copies).
142/142 tests green.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…js copies

- .changeset/identity-hardening.md: @squad/sdk -> @bradygaster/squad-sdk,
  @squad/cli -> @bradygaster/squad-cli (unknown names silently ignored by
  changesets tooling)
- Sync .squad-templates/scripts/resolve-token.mjs to canonical hardened version
- Sync packages/squad-sdk/templates/scripts/resolve-token.mjs likewise
- Sync templates/scripts/resolve-token.mjs likewise
  (stamp code picks up packages/squad-cli/templates/ from dist and templates/
  from bundled root; .squad-templates/ and squad-sdk/templates/ are legacy paths
  that may be resolved in edge cases — dedup refactor tracked in decision inbox)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Owner Author

@sabbour sabbour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flight re-review — APPROVE

Both blockers resolved in aeaba5c:

  1. ✅ Changeset package names corrected: @bradygaster/squad-sdk / @bradygaster/squad-cli
  2. ✅ All four resolve-token.mjs copies synced at 283 lines, byte-identical

Build green, 142/142 identity tests pass. No regressions.

Verdict: Approve — ready for Ahmed's merge decision.

— Flight

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant