Skip to content

Add configurable repo/worktree storage root with re-clone flow#280

Merged
PureWeen merged 11 commits intoPureWeen:mainfrom
vitek-karas:Add-a-way-to-specify-polypilot-clone-root-7299
Mar 10, 2026
Merged

Add configurable repo/worktree storage root with re-clone flow#280
PureWeen merged 11 commits intoPureWeen:mainfrom
vitek-karas:Add-a-way-to-specify-polypilot-clone-root-7299

Conversation

@vitek-karas
Copy link
Copy Markdown
Contributor

Summary

  • add ConnectionSettings.RepositoryStorageRoot and normalize/validate it
  • update RepoManager to resolve clone/worktree roots from the setting while keeping repos.json under ~/.polypilot
  • preserve existing worktree cleanup behavior by persisting WorktreeInfo.BareClonePath
  • add bulk RecloneAllRepositoriesToCurrentRootAsync and lazy migration for existing repos on new worktree/fetch/branch operations
  • add Settings UI to pick/reset storage root and trigger "Re-clone all registered repos in new root"
  • add unit tests for settings + repo storage behavior

Validation

  • dotnet build -f net10.0-maccatalyst /p:CopilotSkipCliDownload=true
  • dotnet test /p:CopilotSkipCliDownload=true --filter "FullyQualifiedName~ConnectionSettingsTests"
  • dotnet test /p:CopilotSkipCliDownload=true --filter "FullyQualifiedName~RepoManagerStorageTests"
  • dotnet test /p:CopilotSkipCliDownload=true --filter "FullyQualifiedName!~CliPathResolutionTests"

PureWeen added a commit that referenced this pull request Mar 5, 2026
Root cause of wrong-remote pushes:
- Workers used 'git fetch origin pull/<N>/head:pr-<N>' which creates a branch
  with NO tracking info. Subsequent 'git push' defaulted to origin, silently
  pushing to PureWeen/PolyPilot even for fork PRs.
- Workers also used 'git rebase + --force-with-lease' which is unnecessary
  when using merge.

Fix:
- .squad/routing.md: updated Fix Process to use 'gh pr checkout <N>' (sets
  correct tracking) and 'git merge origin/main' (no force push needed)
- .squad/decisions.md: explicit rules against force push and manual fetch
- push-to-pr.sh: script that derives correct remote from gh pr view metadata
  and pushes without --force, with pre-push safety checks

Verified: 'gh pr checkout' correctly sets tracking to fork remote for PR #280
(vitek-karas) and to origin for same-repo PRs like #247 (PureWeen).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Mar 8, 2026
…t + VsCodeVariant Editor

- Keep both ConnectionSettings.RepositoryStorageRoot (PR #280) and Editor (main)
- Keep both NormalizeRepositoryStorageRoot (PR #280) and NormalizeEnumFields (main)
- Merge RepoManager: instance-based ReposDir/WorktreesDir (reads from settings)
  with SetBaseDirForTesting static method (for test isolation from main)
- Keep Load() BackfillWorktreeClonePaths AND _loadedSuccessfully = true
- Fix duplicate else-if and repo variable scope issue in RemoveWorktreeAsync

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen force-pushed the Add-a-way-to-specify-polypilot-clone-root-7299 branch 2 times, most recently from f94d528 to 79a2981 Compare March 8, 2026 22:46
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen force-pushed the Add-a-way-to-specify-polypilot-clone-root-7299 branch from 79a2981 to 5c35b15 Compare March 8, 2026 23:16
PureWeen and others added 2 commits March 8, 2026 19:04
Consolidate infrastructure/tooling settings under the Developer group:
- Move CLI Source from its own 'Copilot CLI' group into Developer
- Move Repository Storage from Connection group into Developer
- Remove the now-empty Copilot CLI group
- Developer group is now always visible (not gated on GitAutoUpdate)
- Update search keywords for the reorganized groups

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Mar 9, 2026
## Problem

PR review workers were pushing to the wrong remote (e.g., pushing to
`origin/PureWeen` when the PR comes from a fork like `vitek-karas`), and
using `--force-with-lease` after rebasing unnecessarily.

### Root Cause (verified by testing)

Workers were using manual checkout:
```bash
git fetch origin pull/<N>/head:pr-<N>
git checkout pr-<N>
```

This creates a branch with **no tracking info** (`Remote: NONE`, `Merge:
NONE`). When the worker then runs `git push` or `git push origin`, git
defaults to `origin` (PureWeen/PolyPilot) — even if the PR came from a
fork.

In contrast, `gh pr checkout <N>` correctly sets up tracking:
- Same-repo PR #247: `Remote: origin`, `Merge:
refs/heads/copilot/build-notification-and-timers` ✅
- Fork PR #280: `Remote: vitek-karas`, `Merge:
refs/heads/Add-a-way-to-specify-polypilot-clone-root-7299` ✅

The second problem was `git rebase origin/main` + `--force-with-lease`,
which requires force pushing. Using `git merge origin/main` adds a merge
commit instead — no force push needed.

## Changes

### `push-to-pr.sh`
A shell script that:
1. Reads PR metadata via `gh pr view` to find the correct owner and
branch
2. Maps the owner to the correct git remote
3. Pushes to that remote **without** `--force`
4. Verifies the push landed by comparing local and remote HEADs
5. Refuses to push if current branch doesn't match the PR branch (guards
against wrong-branch pushes)

### `.squad/routing.md`
Updated Fix Process replacing rebase+force-push with merge+safe-push,
with clear explanation of why `gh pr checkout` is required.

### `.squad/decisions.md`
Explicit shared rules:
- Never force push
- Always use `gh pr checkout` (not manual fetch)
- Always merge (not rebase)
- Always verify push target before pushing
- Use `push-to-pr.sh` for all PR pushes

### `.squad/team.md`
Team definition file (required for squad directory to be recognized by
PolyPilot's squad discovery).

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen and others added 2 commits March 8, 2026 22:43
…erPicker guard

- Developer nav button now shows when GitAutoUpdate.IsAvailable OR PlatformHelper.IsDesktop
- BrowseRepositoryStorageRoot() wrapped in #if MACCATALYST || WINDOWS for GTK compat

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen force-pushed the Add-a-way-to-specify-polypilot-clone-root-7299 branch from efba135 to 10d7af4 Compare March 9, 2026 05:15
PureWeen and others added 2 commits March 9, 2026 11:22
- Add spacing between input and action buttons (margin-top on actions)
- Fix Use Default button to reliably clear input (use empty string)
- Add GetDefaultStorageRoot() to show true default path in placeholder
- Style placeholder text as italic with reduced opacity for clarity
- Add margin-top to settings-field for consistent section spacing
- Change Use Default button from red (stop-btn) to green (start-btn)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@vitek-karas vitek-karas marked this pull request as ready for review March 9, 2026 19:05
@vitek-karas
Copy link
Copy Markdown
Contributor Author

I fixed some UI bugs - and manually tested it - it works well for me.

PureWeen added 2 commits March 9, 2026 15:30
…lot-clone-root-7299' into Add-a-way-to-specify-polypilot-clone-root-7299
@PureWeen
Copy link
Copy Markdown
Owner

PR #280 Re-Review — "Add configurable repo/worktree storage root with re-clone flow"

CI Status: ⚠️ No checks configured on this branch
Prior review comments: None found (first formal review round)
Author note: "I fixed some UI bugs - and manually tested it - it works well for me."

5-Model Consensus Review (claude-opus-4.6 ×2, claude-sonnet-4.6, gemini-3-pro-preview, gpt-5.3-codex)

4/5 models completed; synthesized from 4 results.


🔴 CRITICAL — GetStorageRootDir() calls ConnectionSettings.Load() (disk I/O) on every ReposDir/WorktreesDir access

PolyPilot/Services/RepoManager.cs ~L17, L85–90
Consensus: 4/4 models

The PR replaces the old cached-static pattern:

// OLD: computed once, locked
private static string? _reposDir;
private static string ReposDir { get { lock (_pathLock) return _reposDir ??= GetReposDir(); } }

with uncached instance properties:

// NEW: disk read on every access
private string ReposDir => Path.Combine(GetStorageRootDir(), "repos");

where GetStorageRootDir() calls ConnectionSettings.Load()File.ReadAllText(settings.json) + JSON deserialization on every access. A single EnsureRepoCloneInCurrentRootAsync call accesses ReposDir 2–3 times (via GetDesiredBareClonePath, Directory.CreateDirectory(ReposDir)). Since EnsureRepoCloneInCurrentRootAsync is now called from FetchAsync, GetBranchesAsync, CreateWorktreeAsync, CreateWorktreeFromPrAsync, and AddRepositoryAsync, every git operation now pays multiple unnecessary disk reads just to compute paths.

Beyond performance, this is a correctness risk: if the user saves new settings during a long-running clone, two accesses to ReposDir within the same method can resolve to different paths mid-operation.

Fix: Cache the resolved storage root (like the old _reposDir ??= pattern) and invalidate when settings are saved or SetBaseDirForTesting is called.


🟡 MODERATE — FetchAsync and GetBranchesAsync silently trigger full clone/fetch for pre-existing repos

PolyPilot/Services/RepoManager.cs ~L820, L832
Consensus: 4/4 models

Both methods now call EnsureRepoCloneInCurrentRootAsync(repo, null, ct) before their actual git work. For repos added before this PR where repo.BareClonePath is null, the early-return guard is skipped and a full git clone --bare or git fetch is triggered — with no onProgress callback (passed as null). UI code that calls GetBranchesAsync expecting a fast local read can silently block for minutes on a large repo. At minimum, pass through a progress callback from callers; ideally, implicit re-clone should not be a side effect of read-like operations.


🟡 MODERATE — repo.BareClonePath mutated and Save() called without _stateLock → potential JSON corruption

PolyPilot/Services/RepoManager.cs ~L187–190
Consensus: 2/4 models

EnsureRepoCloneInCurrentRootAsync sets repo.BareClonePath = targetBarePath and then calls Save() (which writes repos.json) without holding _stateLock. Two concurrent callers for the same repo (e.g., a FetchAsync and GetBranchesAsync racing on startup) both fail the early-return (null BareClonePath), both run, and both call File.WriteAllText concurrently on repos.json. Concurrent WriteAllText calls can corrupt the file. Fix: acquire _stateLock around the field write and save, or use a per-repo SemaphoreSlim.


🟡 MODERATE — RecloneAllRegisteredRepos saves ALL settings before re-clone (no rollback, saves unintended changes)

PolyPilot/Components/Pages/Settings.razor ~L1007
Consensus: 2/4 models

settings.Save();   // ← commits ALL settings fields, not just RepositoryStorageRoot
recloneBusy = true;
await RepoManager.RecloneAllRepositoriesToCurrentRootAsync(...);
// no rollback if above throws

Two problems: (1) If the user changed other settings (mode, port, password) without clicking "Save & Reconnect", clicking "Re-clone all" permanently commits those unintended changes. (2) If the clone fails, the setting is already saved pointing to the new (incomplete) root with no way to roll back. Fix: save only RepositoryStorageRoot selectively, or make the button unavailable until settings are explicitly saved first; add rollback on failure.


🟡 MODERATE — Old bare clone directories never cleaned up after storage root change (storage leak)

PolyPilot/Services/RepoManager.cs ~L158–200
Consensus: 2/4 models

EnsureRepoCloneInCurrentRootAsync clones to the new root but never removes the old bare clone at the original path. When a user migrates storage root, disk usage doubles. The old clones are orphaned with no mechanism for cleanup. Fix: after a successful clone to the new location, offer to delete or explicitly move (instead of clone) the old directory.


Test Coverage Assessment

Good:

  • RepoManagerStorageTests.cs (new) — backfill logic, JSON round-trip, PathsEqual ✅
  • ConnectionSettingsTests.cs — NormalizeRepositoryStorageRoot, round-trip ✅
  • Tests use SetBaseDirForTesting via [ModuleInitializer] — correct isolation ✅

Gaps:

  • No test for concurrent EnsureRepoCloneInCurrentRootAsync calls on the same repo (the JSON corruption race)
  • No test for GetBranchesAsync behavior when BareClonePath is null (post-upgrade scenario)
  • No test verifying GetStorageRootDir() returns consistent results when called multiple times within one operation

Recommended Action: ⚠️ Request Changes

Blocking (must fix before merge):

  1. Cache the resolved storage root — call ConnectionSettings.Load() once (or on-demand with invalidation), not on every property access
  2. Guard repo.BareClonePath mutation and Save() in EnsureRepoCloneInCurrentRootAsync under _stateLock

Non-blocking (recommend fixing):
3. Add progress reporting to FetchAsync/GetBranchesAsync's EnsureRepoCloneInCurrentRootAsync calls, or remove the implicit auto-migrate from read operations
4. Fix RecloneAllRegisteredRepos save ordering — save settings after successful clone, not before; save only the changed field
5. Add cleanup/move semantics for old bare clones to avoid storage doubling

PureWeen and others added 2 commits March 9, 2026 22:36
- Cache GetStorageRootDir() result to avoid ConnectionSettings.Load() disk I/O
  on every ReposDir/WorktreesDir access (was ~10 calls per operation)
- Add InvalidateStorageRoot() for settings changes in Settings.razor
- Add _stateLock protection around BareClonePath mutations and Save()
  in EnsureRepoCloneInCurrentRootAsync
- Wrap BackfillWorktreeClonePaths iteration with _stateLock
- Clear _storageRoot in SetBaseDirForTesting for test isolation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Copy link
Copy Markdown
Owner

🔍 PR #280 Re-Review — Post-Fix Assessment

"Add configurable repo/worktree storage root with re-clone flow"

Fix commit: 6ae0cf9e — "fix: cache storage root, add thread safety for repo state mutations"
Method: 5-model ensemble (Claude Opus 4.6 ×2, Claude Sonnet 4.6, Gemini 3 Pro, GPT-5.3-Codex) with ≥2/5 consensus filter


Previous Findings Status

# Finding Severity Status
1 Uncached ConnectionSettings.Load() disk I/O on every ReposDir/WorktreesDir access 🔴 CRITICAL FIXED_storageRoot cache + GetCachedStorageRoot() with lock eliminates per-access disk reads
2 Silent re-clone side-effects in FetchAsync/GetBranchesAsync 🟡 MODERATE ⚠️ By design — migration pattern; EnsureRepoCloneInCurrentRootAsync is the intended safety net
3 Concurrent Save() race on BareClonePath mutation 🟡 MODERATE FIXEDlock (_stateLock) now wraps BareClonePath assignment + BackfillWorktreeClonePaths + Save()
4 Settings saved before reclone validation completes 🟡 MODERATE ⚠️ Acceptable — ensure-on-use pattern (finding 2) acts as safety net; low practical risk
5 Orphaned old clones after storage root change 🟡 MODERATE ⚠️ Conservative design — no auto-deletion is safer than accidental data loss

New Findings Evaluation

The re-review raised 4 new concerns. All were investigated and resolved:

Finding Models Verdict Reason
lock (_stateLock) spans await points 3/5 False positive C# compiler error CS1996 prevents await inside lock. The lock wraps only synchronous mutations after async git operations complete. CI passes.
InvalidateStorageRoot() missing in SaveAndApply 3/5 Already fixed Fix commit adds RepoManager.InvalidateStorageRoot() in both RecloneAllRegisteredRepos and SaveAndApply paths
BackfillWorktreeClonePaths runs before BareClonePath update 3/5 Incorrect Inside the lock block, order is: (1) repo.BareClonePath = targetBarePath, (2) BackfillWorktreeClonePaths(repo), (3) Save() — correct sequence
_storageRoot not cleared in SetBaseDirForTesting() 2/5 Already present Fix commit adds _storageRoot = null in SetBaseDirForTesting()

Test Coverage

  • ✅ 2382 tests pass (0 failures) after fix commit
  • RepoManagerStorageTests.cs — 6 new tests cover storage root behavior
  • ✅ Test isolation verified (SetBaseDirForTesting clears _storageRoot)

CI Status

⚠️ Pre-existing infrastructure failures (not PR-specific)


Recommendation: ✅ Approve

Both critical findings from the initial review are resolved:

  1. Disk I/O hot path — eliminated via caching with proper invalidation
  2. Thread safety_stateLock protects concurrent state mutations

Remaining items (2, 4, 5) are deliberate design tradeoffs with acceptable risk profiles. No new blocking issues found.

@PureWeen PureWeen merged commit e92013d into PureWeen:main Mar 10, 2026
arisng pushed a commit to arisng/PolyPilot that referenced this pull request Apr 4, 2026
## Problem

PR review workers were pushing to the wrong remote (e.g., pushing to
`origin/PureWeen` when the PR comes from a fork like `vitek-karas`), and
using `--force-with-lease` after rebasing unnecessarily.

### Root Cause (verified by testing)

Workers were using manual checkout:
```bash
git fetch origin pull/<N>/head:pr-<N>
git checkout pr-<N>
```

This creates a branch with **no tracking info** (`Remote: NONE`, `Merge:
NONE`). When the worker then runs `git push` or `git push origin`, git
defaults to `origin` (PureWeen/PolyPilot) — even if the PR came from a
fork.

In contrast, `gh pr checkout <N>` correctly sets up tracking:
- Same-repo PR PureWeen#247: `Remote: origin`, `Merge:
refs/heads/copilot/build-notification-and-timers` ✅
- Fork PR PureWeen#280: `Remote: vitek-karas`, `Merge:
refs/heads/Add-a-way-to-specify-polypilot-clone-root-7299` ✅

The second problem was `git rebase origin/main` + `--force-with-lease`,
which requires force pushing. Using `git merge origin/main` adds a merge
commit instead — no force push needed.

## Changes

### `push-to-pr.sh`
A shell script that:
1. Reads PR metadata via `gh pr view` to find the correct owner and
branch
2. Maps the owner to the correct git remote
3. Pushes to that remote **without** `--force`
4. Verifies the push landed by comparing local and remote HEADs
5. Refuses to push if current branch doesn't match the PR branch (guards
against wrong-branch pushes)

### `.squad/routing.md`
Updated Fix Process replacing rebase+force-push with merge+safe-push,
with clear explanation of why `gh pr checkout` is required.

### `.squad/decisions.md`
Explicit shared rules:
- Never force push
- Always use `gh pr checkout` (not manual fetch)
- Always merge (not rebase)
- Always verify push target before pushing
- Use `push-to-pr.sh` for all PR pushes

### `.squad/team.md`
Team definition file (required for squad directory to be recognized by
PolyPilot's squad discovery).

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
arisng pushed a commit to arisng/PolyPilot that referenced this pull request Apr 4, 2026
…een#280)

## Summary
- add `ConnectionSettings.RepositoryStorageRoot` and normalize/validate
it
- update `RepoManager` to resolve clone/worktree roots from the setting
while keeping `repos.json` under `~/.polypilot`
- preserve existing worktree cleanup behavior by persisting
`WorktreeInfo.BareClonePath`
- add bulk `RecloneAllRepositoriesToCurrentRootAsync` and lazy migration
for existing repos on new worktree/fetch/branch operations
- add Settings UI to pick/reset storage root and trigger "Re-clone all
registered repos in new root"
- add unit tests for settings + repo storage behavior

## Validation
- `dotnet build -f net10.0-maccatalyst /p:CopilotSkipCliDownload=true`
- `dotnet test /p:CopilotSkipCliDownload=true --filter
"FullyQualifiedName~ConnectionSettingsTests"`
- `dotnet test /p:CopilotSkipCliDownload=true --filter
"FullyQualifiedName~RepoManagerStorageTests"`
- `dotnet test /p:CopilotSkipCliDownload=true --filter
"FullyQualifiedName!~CliPathResolutionTests"`

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Shane Neuville (HE/HIM) <shneuvil@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants