
fix(migrate): automated migration experience for v0.5.4 → v0.8.x#199

Closed
williamhallatt wants to merge 12 commits into bradygaster:dev from williamhallatt:williamhallatt/197-migration-experience

Conversation

Contributor

@williamhallatt williamhallatt commented Mar 5, 2026

Closes #197

Summary

Implements a complete automated migration experience to address failed migrations from v0.5.4 to v0.8.x (particularly for non-global/npx installs).

Changes

New command: squad migrate

  • Backs up, cleans, and reinitialises .squad/ in one step
  • Supports --dry-run, --backup-dir, and --restore [path] flags
  • Auto-rollback on failure (restores from backup if any step fails)
  • Path containment guards on --restore and --backup-dir (security)
  • config.json preserved across migration (user config survives)
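
The flow described in the bullets above (back up, clean, reinitialise, restore user files, roll back on failure) could look roughly like the sketch below. This is not the PR's `migrate.ts`: the function name `migrateWithRollback`, the `reinit` callback, and the choice to wipe the whole directory in the clean step are assumptions made for illustration, and only `config.json` is treated as user-owned here.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper (not the PR's code): backup, clean, reinit, restore,
// with an automatic rollback if any step after the backup throws.
export function migrateWithRollback(
  root: string,
  reinit: (squadDir: string) => void, // stands in for the real re-initialisation step
): { ok: boolean; backupDir: string } {
  const squadDir = path.join(root, ".squad");
  const backupDir = path.join(root, `.squad-backup-${Date.now()}`);

  // Step 1: back up. If this throws, nothing has been modified yet.
  fs.cpSync(squadDir, backupDir, { recursive: true });

  try {
    // Step 2: clean (assumption: remove the whole directory).
    fs.rmSync(squadDir, { recursive: true, force: true });
    // Step 3: reinitialise into a fresh directory.
    fs.mkdirSync(squadDir, { recursive: true });
    reinit(squadDir);
    // Step 4: restore user-owned files (only config.json in this sketch).
    const savedConfig = path.join(backupDir, "config.json");
    if (fs.existsSync(savedConfig)) {
      fs.copyFileSync(savedConfig, path.join(squadDir, "config.json"));
    }
    return { ok: true, backupDir };
  } catch {
    // Auto-rollback: discard the half-migrated state and restore the snapshot.
    fs.rmSync(squadDir, { recursive: true, force: true });
    fs.cpSync(backupDir, squadDir, { recursive: true });
    return { ok: false, backupDir };
  }
}

// Demo: a reinit step that fails should leave the original files in place.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "squad-migrate-"));
fs.mkdirSync(path.join(demoRoot, ".squad"));
fs.writeFileSync(path.join(demoRoot, ".squad", "team.md"), "# Team\n");
const failed = migrateWithRollback(demoRoot, () => {
  throw new Error("simulated reinit failure");
});
console.log(failed.ok, fs.existsSync(path.join(demoRoot, ".squad", "team.md")));
```

Taking the backup outside the `try` block means a failed backup aborts before anything is deleted, which is the property the rollback depends on.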

Bug fixes

  • Shell re-init after migrate — deletes stale .first-run/.init-prompt markers so the shell doesn't treat a migrated install as brand-new
  • Casting registry missing after migrate — restores or synthesises casting/registry.json post-reinit (Case A: restore from backup for v0.8.x repos; Case B: generate from agents/ for v0.5.x repos)
  • Cast overwrite guard: createTeam() no longer overwrites existing charter.md or non-empty history.md
  • scrub-emails default path — fixed from .ai-team/ to .squad/

Tests (57 total, all passing)

  • test/migrate-command.test.ts — 23 unit/integration tests
  • test/cast-guard.test.ts — 11 tests for createTeam() overwrite guard
  • test/migrate-e2e.test.ts — 23 E2E tests via execFileSync against built CLI

Documentation

  • docs/get-started/migration.md — full rewrite covering all 9 scenarios
  • docs/scenarios/upgrading.md, troubleshooting.md — updated
  • docs/reference/cli.md — migrate command documented
  • docs/scenarios/disaster-recovery.md, keep-my-squad.md — updated
  • docs/whatsnew.md — v0.8.21 section added
  • docs/blog/021-the-migration.md — squad migrate callout added
  • README.md — updated

Related PRs and issues

| # | What | Relationship |
| --- | --- | --- |
| PR #203 | fix: only install Squad-framework workflows during init | Independent; targets the same dev base. Recommended to merge first (smaller, lower risk). Also contains the squad.agent.md spawn template fix that addresses the session-metadata root cause found during this branch's review. |
| #201 | squad init installs generic CI/CD workflows | Fixed by PR #203; no overlap with this PR's code. |
| #202 | squad link should auto-add .squad/config.json to .gitignore | Follow-up issue. Both this PR and PR #203 add .squad/config.json to .gitignore manually as a patch. #202 tracks the programmatic fix in link.ts / init-remote.ts. |

Reviewer notes

Notes

Working as @copilot (Coding Agent)

⚠️ This task was flagged as 🟡 needs-review — please have a squad member review before merging.

@williamhallatt
Contributor Author

williamhallatt commented Mar 5, 2026

CI check status

❌ Squad CI — test failures

Two pre-existing test failures on dev itself (confirmed by run 22697808051, which predates this PR):

| Test | Failure |
| --- | --- |
| test/cli/init.test.ts:94 | Expects .squad/decisions/decisions.md merge=union but runInit writes .squad/decisions.md merge=union (stale test) |
| test/cli/consult.test.ts:178 | Expects "no personal squad" error but gets "Unknown command: consult" (command not yet implemented) |

Neither file is touched by this PR (git diff confirms zero changes to both). These failures exist on dev independent of this work.

This PR introduces no new test failures. Our 57 new tests all pass locally (npm test in packages/squad-cli/).

- Shell scaffolds .squad/ when team.md is absent instead of blocking all
  input with a circular 'Run /init' message
- Add 'squad migrate' command: backup → clean → reinit → restore user files
  Supports --dry-run and --backup-dir flags
- Fix scrub-emails default directory (.ai-team/ → .squad/)
- Rewrite docs/scenarios/upgrading.md (remove npx github: references,
  add global/local/npx coverage)
- Update docs/scenarios/troubleshooting.md (Node.js 22→20, remove SSH hang)
- Update docs/get-started/migration.md (Node.js 18→20, @latest, all install
  methods, squad migrate usage)

Closes bradygaster#197

- Add 'If Something Went Wrong' subsection to Scenario 2 showing
  how to restore from the .squad-backup-{timestamp}/ backup
- Add 'How do I restore from a migrate backup?' troubleshooting entry
  with ls -d tip for finding the backup folder

Closes bradygaster#197 (partial)

- Add runRestore() to restore the entire .squad/ from a backup snapshot
- squad migrate --restore auto-detects the most recent .squad-backup-*/
- squad migrate --restore <path> restores from a specific backup
- Wrap migration steps 2-4 in try/catch: on any error, auto-restore the
  backup and exit with a clear message before the user is left stranded
- Update docs to replace manual rm/cp restore instructions with
  squad migrate --restore
- Update help text to document the new --restore flag
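
The auto-detection of the most recent `.squad-backup-*/` folder mentioned above could be sketched as follows. This is an illustrative re-implementation, not the PR's `findLatestBackup`: the name `findLatestBackupSketch` is hypothetical, and it assumes the timestamp suffix sorts lexicographically in chronological order (for example a fixed-width or ISO-style stamp), which the PR text does not confirm.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

const BACKUP_PREFIX = ".squad-backup-";

// Hypothetical sketch: return the newest backup directory under `root`,
// or null when no backup exists.
export function findLatestBackupSketch(root: string): string | null {
  const candidates = fs
    .readdirSync(root, { withFileTypes: true })
    .filter((entry) => entry.isDirectory() && entry.name.startsWith(BACKUP_PREFIX))
    .map((entry) => entry.name)
    .sort(); // assumption: names sort in chronological order
  if (candidates.length === 0) return null;
  return path.join(root, candidates[candidates.length - 1]);
}

// Demo: three backups with made-up timestamps, created out of order.
const restoreRoot = fs.mkdtempSync(path.join(os.tmpdir(), "squad-restore-"));
for (const stamp of ["20260301T100000", "20260305T062700", "20260302T090000"]) {
  fs.mkdirSync(path.join(restoreRoot, BACKUP_PREFIX + stamp));
}
const latestBackup = findLatestBackupSketch(restoreRoot);
console.log(latestBackup === null ? "no backup found" : path.basename(latestBackup));
```

If the real timestamps are not fixed-width, sorting by a parsed date or by directory modification time would be the safer choice.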
Update cli.md, disaster-recovery.md, keep-my-squad.md, README.md
to reflect squad migrate, --restore flag, and current .squad/ paths.

Refs bradygaster#197

After migrate runs sdkInitSquad() in step 3, it creates .first-run and
potentially .init-prompt. The shell treats these as signals that no team
exists yet and prompts the user to cast a team — even though migrate just
restored a full team roster. If the user follows that prompt and confirms,
createTeam() runs without guards and overwrites all existing agent files.

Two fixes:
1. migrate.ts: after sdkInitSquad(), delete .first-run and .init-prompt if
   present so the shell sees a normal (not first-run) state.
2. cast.ts createTeam(): guard charter.md and history.md writes so they only
   happen when the file is absent or empty — existing agent work is preserved
   even if init mode accidentally triggers.
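
A minimal sketch of the two fixes listed above. The helper names `clearFirstRunMarkers` and `writeIfAbsentOrEmpty` are hypothetical and are not taken from `migrate.ts` or `cast.ts`. Two further assumptions: the marker files live directly inside the directory passed in, and a whitespace-only file counts as empty.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Fix 1 (sketch): delete stale first-run markers so the shell does not
// treat a migrated install as brand-new. Returns the markers it removed.
export function clearFirstRunMarkers(squadDir: string): string[] {
  const removed: string[] = [];
  for (const marker of [".first-run", ".init-prompt"]) {
    const markerPath = path.join(squadDir, marker);
    if (fs.existsSync(markerPath)) {
      fs.rmSync(markerPath);
      removed.push(marker);
    }
  }
  return removed;
}

// Fix 2 (sketch): write a file only when it is absent or empty, so existing
// agent work survives an accidental init. Returns true if it wrote.
export function writeIfAbsentOrEmpty(file: string, content: string): boolean {
  const hasContent =
    fs.existsSync(file) && fs.readFileSync(file, "utf8").trim().length > 0;
  if (hasContent) return false;
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, content);
  return true;
}

// Demo: the second write must not clobber the existing charter.
const guardDir = fs.mkdtempSync(path.join(os.tmpdir(), "squad-guard-"));
const charter = path.join(guardDir, "agents", "hockney", "charter.md");
writeIfAbsentOrEmpty(charter, "# Original charter\n");
const overwritten = writeIfAbsentOrEmpty(charter, "# Fresh template\n");
fs.writeFileSync(path.join(guardDir, ".first-run"), "");
const removedMarkers = clearFirstRunMarkers(guardDir);
console.log(overwritten, removedMarkers);
```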
… guard

- test/migrate-command.test.ts: 19 unit/integration tests for runMigrate,
  runRestore, findLatestBackup — dry-run, auto-rollback, .first-run cleanup,
  legacy .ai-team/ migration, restore (auto and explicit)
- test/cast-guard.test.ts: 11 tests for createTeam() charter/history guard —
  verifies existing non-empty files are not overwritten on /init
- test/migrate-e2e.test.ts: 23 E2E tests exercising the built CLI via
  execFileSync — full flow, restore, shell reinit guard assertions

All 53 tests pass. Closes gap in coverage for bradygaster#197 migration fixes.

Two-case fix for casting/registry.json not existing after squad migrate:
- Case A (v0.8.x repos): backup had casting/ → restore entire casting/
  directory from backup, preserving existing names and universe assignments
- Case B (v0.5.x repos): backup lacked casting/ → scan restored agents/
  and create registry.json with each agent as legacy_named:true, plus
  default policy.json and empty history.json

Fixes squad doctor ❌ casting/registry.json after migration.

Adds 4 new tests covering both migration cases (57 total, all pass).
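
The two cases above could be sketched like this. The registry shape used here (`{ agents: { <name>: { legacy_named: true } } }`) is an assumption; the PR text only states that each agent is recorded as `legacy_named:true`. The function name `ensureCastingRegistry` is hypothetical, and the default `policy.json` and empty `history.json` are left out because their contents are not described.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed schema, for illustration only.
interface RegistrySketch {
  agents: Record<string, { legacy_named: true }>;
}

export function ensureCastingRegistry(
  squadDir: string,
  backupDir: string,
): "restored" | "synthesised" {
  const castingDir = path.join(squadDir, "casting");
  const backedUpCasting = path.join(backupDir, "casting");

  // Case A: the backup has casting/, so restore the whole directory.
  if (fs.existsSync(backedUpCasting)) {
    fs.cpSync(backedUpCasting, castingDir, { recursive: true });
    return "restored";
  }

  // Case B: no casting/ in the backup, so build a registry from agents/.
  const agentsDir = path.join(squadDir, "agents");
  const names = fs.existsSync(agentsDir)
    ? fs
        .readdirSync(agentsDir, { withFileTypes: true })
        .filter((entry) => entry.isDirectory())
        .map((entry) => entry.name)
        .sort()
    : [];
  const registry: RegistrySketch = { agents: {} };
  for (const name of names) registry.agents[name] = { legacy_named: true };

  fs.mkdirSync(castingDir, { recursive: true });
  fs.writeFileSync(
    path.join(castingDir, "registry.json"),
    JSON.stringify(registry, null, 2) + "\n",
  );
  return "synthesised";
}

// Demo for Case B: a v0.5.x-style layout with agents/ but no casting/ backup.
const castRoot = fs.mkdtempSync(path.join(os.tmpdir(), "squad-cast-"));
const castSquad = path.join(castRoot, ".squad");
const castBackup = path.join(castRoot, ".squad-backup-demo");
fs.mkdirSync(path.join(castSquad, "agents", "hockney"), { recursive: true });
fs.mkdirSync(path.join(castSquad, "agents", "breedan"), { recursive: true });
fs.mkdirSync(castBackup);
const outcome = ensureCastingRegistry(castSquad, castBackup);
const written = JSON.parse(
  fs.readFileSync(path.join(castSquad, "casting", "registry.json"), "utf8"),
);
console.log(outcome, Object.keys(written.agents));
```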
- Security: constrain --restore and --backup-dir paths to cwd (Baer)
- Security: prevent path traversal in copyRecursive restore targets
- Correctness: track backupComplete flag; gate rollback on verified backup (Fenster)
- Feature: add config.json to USER_OWNED so user config survives migration (Keaton)
- UX: replace sdkInitSquad internal name in --dry-run output (Marquez)
- UX: style rollback/error path with error()/warn()/info() helpers (Marquez)
- UX: add success() completion markers to Steps 2 and 4 (Marquez)
- Docs: add v0.8.21 section to whatsnew.md (McManus)
- Docs: fix cli.md install section — replace 'squad init' with correct insider install command (McManus)
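
The path constraint in the first two security bullets above is commonly implemented by resolving the user-supplied path against the base directory and checking that the relative path does not climb out of it. The sketch below shows that general technique. `isContained` is a hypothetical name, not the PR's guard, and the sketch does not resolve symlinks.

```typescript
import * as path from "node:path";

// Hypothetical guard: true when `userPath` resolves to `baseDir` itself or
// to a location inside it.
export function isContained(baseDir: string, userPath: string): boolean {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, userPath);
  const rel = path.relative(base, target);
  // Escapes show up as ".." or "../..." segments; an absolute result means
  // the target sits on a different root (for example another Windows drive).
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

// Demo with made-up paths.
const insideOk = isContained("/repo", ".squad-backup-20260305");
const traversal = isContained("/repo", "../elsewhere/backup");
const absoluteOutside = isContained("/repo", "/etc");
const dotDotPrefix = isContained("/repo", "..hidden-backups");
console.log(insideOk, traversal, absoluteOutside, dotDotPrefix);
```

Comparing against `".." + path.sep` rather than a bare `".."` prefix keeps a legitimately named folder such as `..hidden-backups` from being rejected.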
@williamhallatt williamhallatt force-pushed the williamhallatt/197-migration-experience branch from 0127cfb to 4f9a5f9 Compare March 5, 2026 06:27
…octor hint

- Add process.env.NODE_NO_WARNINGS='1' before first import (re-adds line
  from ceb599b that was dropped in subsequent cli-entry.ts restructure)
- --version / -v now outputs bare semver without 'squad ' prefix
- Empty/whitespace-only args now show help and exit 0 (was: exit 1 + unknown command)
- Unknown command error now mentions 'squad doctor' as remediation hint

Fixes test/cli-p0-regressions.test.ts (all 6 assertions now pass) and
test/first-run-gating.test.ts bradygaster#624 NODE_NO_WARNINGS assertion.

The banner spacer test (first-run-gating.test.ts:658) remains failing —
pre-existing App.tsx regex mismatch on upstream/dev, not in scope for this PR.
@williamhallatt
Contributor Author

CI status update — post-rebase

✅ Fixed in this push (commit 0e08594)

Four pre-existing regression failures that were already broken on upstream/dev have been fixed, since they're all in cli-entry.ts which this PR already modifies:

| Test | Issue | Fix |
| --- | --- | --- |
| cli-p0-regressions.test.ts (bare semver) | --version output included squad prefix | Removed prefix, outputs VERSION directly |
| cli-p0-regressions.test.ts (whitespace args) | Empty/whitespace args exited 1 with "unknown command" | Filter empty args; fall through to --help (exit 0) |
| cli-p0-regressions.test.ts (doctor hint) | Unknown command error had no squad doctor reference | Updated error message to include squad doctor |
| first-run-gating.test.ts #624 (NODE_NO_WARNINGS) | process.env.NODE_NO_WARNINGS = '1' was absent from cli-entry.ts | Restored before first import (was dropped in a later cli-entry.ts restructure) |
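
The first three fixes in the table can be illustrated with a small argument handler. This is a sketch of the behaviour described, not the actual `cli-entry.ts`: `handleArgs`, the placeholder `VERSION` value, and the exact wording of the help and error strings are assumptions.

```typescript
const VERSION = "0.0.0-example"; // placeholder, not the real package version

// Hypothetical handler mirroring the described behaviour.
export function handleArgs(rawArgs: string[]): { output: string; exitCode: number } {
  // Drop empty and whitespace-only arguments before dispatching.
  const args = rawArgs.filter((arg) => arg.trim() !== "");
  const command = args[0];

  // No real arguments left: show help and exit 0 instead of erroring.
  if (command === undefined || command === "--help") {
    return { output: "Usage: squad <command> [options]", exitCode: 0 };
  }
  // Bare semver, with no "squad " prefix.
  if (command === "--version" || command === "-v") {
    return { output: VERSION, exitCode: 0 };
  }
  // Unknown command: point the user at 'squad doctor' as a remediation hint.
  return {
    output: `Unknown command: ${command}. Run 'squad doctor' to check your setup.`,
    exitCode: 1,
  };
}

const whitespaceOnly = handleArgs(["   ", ""]);
const versionFlag = handleArgs(["-v"]);
const unknown = handleArgs(["consult"]);
console.log(whitespaceOnly.exitCode, versionFlag.output, unknown.exitCode);
```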

⚠️ Remaining pre-existing failures (not in scope for this PR)

Three failures remain that are also failing on upstream/dev HEAD and predate this branch:

  1. first-run-gating.test.ts:658 — banner spacer — The test regex /const headerElement[\s\S]*?useMemo\(\(\)\s*=>\s*\(/ expects an arrow function returning a parenthesised JSX expression (() => (...)), but App.tsx's headerElement useMemo now uses a block body (() => { if/else; return ... }). The regex cannot match, so the assertion expect(headerBlock).not.toBeNull() fails. Not in scope — we don't modify App.tsx.

  2. aspire-integration.test.ts — Playwright browser binary (chrome-headless-shell) is not installed in the CI runner. Infrastructure issue. Not fixable in code.

  3. docs-build.test.ts:166 — Build script exceeds the 5 000 ms test timeout. Infrastructure/environment issue. Not fixable in code.

All three are confirmed pre-existing on upstream/dev's current HEAD by comparing dev's cli-entry.ts and App.tsx against the same test expectations.
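
The banner spacer mismatch in item 1 can be reproduced in isolation. The regex below is the one quoted above; the two source snippets are made-up stand-ins for the old and new shapes of `headerElement`, not copies of `App.tsx`.

```typescript
// Regex quoted from the failing test.
const headerRegex = /const headerElement[\s\S]*?useMemo\(\(\)\s*=>\s*\(/;

// Made-up example: arrow function returning a parenthesised JSX expression.
const expressionBody = [
  "const headerElement = useMemo(() => (",
  "  <Box />",
  "), []);",
].join("\n");

// Made-up example: arrow function with a block body.
const blockBody = [
  "const headerElement = useMemo(() => {",
  "  if (compact) return <Compact />;",
  "  return <Full />;",
  "}, []);",
].join("\n");

const matchesExpression = headerRegex.test(expressionBody);
const matchesBlock = headerRegex.test(blockBody);
console.log(matchesExpression, matchesBlock); // true false
```

The pattern requires an opening parenthesis right after the arrow, so a block body that opens with a brace can never satisfy it, regardless of what the block contains.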

…nfig.json

Remove requester attribution and branch name from breedan and hockney
history entries. Add .squad/config.json to .gitignore (machine-local
teamRoot path, must not be committed). See bradygaster#202 for follow-up.
@williamhallatt
Contributor Author

🧹 Session metadata cleanup applied

Same pattern as PR #203 — stripped requester attribution and branch name from two history file entries that were added during this session:

  • .squad/agents/breedan/history.md — removed Requested by: William Hallatt + branch reference
  • .squad/agents/hockney/history.md — removed branch name from section heading
  • .gitignore — added .squad/config.json entry (machine-local teamRoot path, must not be committed)

Root cause and the spawn template fix that prevents recurrence are documented in #201 / PR #203. Follow-up for auto-gitignore on squad link / squad init --remote tracked in #202.

tmcclell pushed a commit to tmcclell/squad that referenced this pull request Mar 5, 2026
…tes (bradygaster#185, bradygaster#188, bradygaster#191, bradygaster#192, bradygaster#195, bradygaster#196, bradygaster#199, bradygaster#201, bradygaster#203, bradygaster#206, bradygaster#207)

Documentation Epic bradygaster#182 — complete:

Docs Content (McManus):
- Architecture overview: SDK ↔ CLI ↔ SquadUI system design
- Migration guide: Beta → v1 with 10-step checklist
- Global CLI install guide: npm, npx, GitHub native
- VS Code integration guide: client compatibility, extension patterns
- SDK API reference: 574 lines, all 30+ exports documented

Docs Site Engine (Keaton):
- Static site generator: node docs/build.js → docs/dist/
- GitHub Pages ready, responsive design, sidebar nav
- Index landing page linking all guides

Mechanical Updates (Fenster):
- .ai-team/ → .squad/ across 25 doc files (bradygaster#191)
- CLI invocation references verified current (bradygaster#192)
- Beta repo URLs updated to squad-pr (bradygaster#195)

Docs Tests (Hockney):
- 17 docs validation tests: headings, code blocks, links, build
- Fixed link checker for parent-dir refs, Windows rmSync

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bradygaster
Owner

Thank you, @williamhallatt, for this comprehensive and thoughtful migration tooling PR. Your work demonstrates excellent engineering — the 57-test suite, --dry-run flag, --backup-dir, and --restore flow are exactly the patterns we'd want in a formal migration system.

Since you opened this PR, the squad dev branch has evolved its own migration infrastructure (migrate-directory.ts handling .ai-team→.squad renames, migrations.ts for version-based additive runners, and --migrate-directory flag on the upgrade command). Rather than merge parallel implementations, we're treating this as invaluable feedback for our next formal migration release cycle.

Separately, your fixes to cli-entry.ts are valuable: bare semver output, whitespace args handling, doctor hint, and NODE_NO_WARNINGS env. The team will address these 4 regressions independently. The squad project is stronger because you dug in.

This PR will inform our migration strategy as we move toward v1.0. Cheers to the collaboration. 🙏

@williamhallatt
Contributor Author

@bradygaster you're welcome! Wrt future collaboration, if there are things you'd like me to do differently (or not at all), please let me know. This is a great tool and I'll keep hammering at things I find as long as you'll let me 😄


Development

Successfully merging this pull request may close these issues.

Improve migration experience: fix shell init, add migrate command, update docs for all install methods

2 participants