Skip to content

fix: stop recurring Keychain password prompts on server recovery#462

Merged
PureWeen merged 1 commit intomainfrom
fix/stop-recurring-keychain-prompts
Apr 1, 2026
Merged

fix: stop recurring Keychain password prompts on server recovery#462
PureWeen merged 1 commit intomainfrom
fix/stop-recurring-keychain-prompts

Conversation

@PureWeen
Copy link
Copy Markdown
Owner

@PureWeen PureWeen commented Apr 1, 2026

Problem

Every couple of hours, the user gets prompted to "allow copilot-cli" and must enter their macOS login password ~5 times. This is a regression from PR #456 — that fix addressed the 10-second polling loop but missed a second cache-clearing path.

Root Cause

TryRecoverPersistentServerAsync() clears _resolvedGitHubToken = null on every recovery cycle (line 1339). Recovery is triggered by:

  • Watchdog timeouts (token expiry ~1-8h)
  • Wake/sleep health check failures
  • Auth polling success → server restart

After clearing, the next CheckAuthStatusAsync() call sees _resolvedGitHubToken == null → enters the lazy Keychain resolution path → spawns security find-generic-password -s copilot-cli -wmacOS password dialog.

Re-reading the Keychain is useless — the stored token is a static snapshot from copilot login. If it expired, re-reading returns the same expired token. Only running copilot login again would write a new one.

Fix

Remove the _resolvedGitHubToken = null from TryRecoverPersistentServerAsync. The cached token is still forwarded to the new server via tokenToForward. Only two paths now clear the cache:

  • ReconnectAsync — explicit user action (settings change)
  • ReauthenticateAsync — explicit user action (Re-authenticate button)

When the token expires, the auth banner appears and the user clicks Re-authenticate → fresh Keychain read (correct — user-initiated).

Changes

File Change
CopilotService.cs Remove _resolvedGitHubToken = null from recovery path
auth-token-safety/SKILL.md Update INV-A3 to reflect new invariant

Tests

3064/3064 pass ✅

Related

TryRecoverPersistentServerAsync was clearing _resolvedGitHubToken = null
on every recovery cycle (watchdog timeout, wake/sleep, auth polling).
This caused CheckAuthStatusAsync to re-enter the lazy Keychain resolution
path, spawning `security find-generic-password` which triggers a macOS
password dialog each time.

Re-reading the Keychain is pointless after token expiration — the stored
token is static and only changes when `copilot login` is run. The same
expired token comes back, triggering another dialog on the next cycle.

Fix: preserve the cached token through recovery. Only ReauthenticateAsync
(explicit user action) and ReconnectAsync (settings change) clear it.
When the forwarded token expires, the auth banner appears and the user
can click Re-authenticate for a fresh Keychain read.

Updates INV-A3 in auth-token-safety skill to reflect the new invariant.

Tests: 3064/3064 pass ✅

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Copy link
Copy Markdown
Owner Author

PureWeen commented Apr 1, 2026

🔍 Multi-Model Code Review — PR #462 R1

Models: Claude Opus 4.6 · Claude Sonnet 4.6 · GPT-5.3-Codex
CI: ⚠️ No checks configured | PR: fix: stop recurring Keychain password prompts on server recovery


Summary

The PR removes _resolvedGitHubToken = null from TryRecoverPersistentServerAsync, keeping the cached token across automatic server recovery. This prevents the lazy-resolution path in CheckAuthStatusAsync from re-reading the macOS Keychain (which triggers a password dialog) when the token hasn't changed.


✅ All 3 Models Agree: No Issues Found

Regressions — None ✅
Traced the full expired-token recovery path:

  1. Watchdog triggers TryRecoverPersistentServerAsync → starts server with cached (possibly expired) token
  2. CheckAuthStatusAsync sees _resolvedGitHubToken != null → skips lazy Keychain resolution (no dialog!)
  3. Falls through to AuthNotice banner + auth polling
  4. User runs copilot login → clicks Re-authenticate → ReauthenticateAsync does fresh ResolveGitHubTokenForServer() → overwrites cache → recovery works

This is more stable than the old behavior (clear → lazy resolve → read same expired token → loop with Keychain dialogs).

Security — None ✅
Retaining an expired token in a private field on a singleton adds no attack surface. The token was already passed to the server process via COPILOT_GITHUB_TOKEN env var.

Race Conditions — None ✅
Removing the null write actually reduces contention. All write paths remain serialized: _recoveryLock guards recovery, _tokenResolutionLock guards lazy resolution, ReauthenticateAsync/ReconnectAsync are UI-thread actions.

Bugs — None ✅

  • tokenToForward local variable (line 1341) correctly captures cached value
  • ReconnectAsync (line 1224) still clears cache for settings changes
  • ReauthenticateAsync (line 332) overwrites with fresh resolution

📋 Test Coverage

Path Tested?
Recovery mode guard (non-persistent)
Recovery failure (server start fails)
Concurrent recovery callers (30s window)
Token value forwarded after recovery ⚠️ Pre-existing gap — no test asserts cached token is preserved

Verdict: ✅ Approve

Clean, well-scoped fix with thorough inline documentation. Aligns with auth-token-safety skill invariants (INV-A2, INV-A3). No blocking issues. Consider adding a test asserting token preservation across recovery as a follow-up.

@PureWeen
Copy link
Copy Markdown
Owner Author

PureWeen commented Apr 1, 2026

Multi-Model Code Review: PR #462

PR: fix: stop recurring Keychain password prompts on server recovery
Models: claude-opus-4.6 ✅, gpt-5.3-codex ✅, claude-sonnet-4.6 ⏱️ (timed out — excluded)
CI: ⚠️ No checks reported on the branch
Prior reviews: None (0 comments, 0 reviews)


Consensus Findings

No blocking issues found. Both models independently verified:

Area Verdict Details
Regressions ✅ Clean Expired token → auth banner → user Re-authenticates. More stable than before (old code caused Keychain re-read loop with same expired token).
Security ✅ Clean Retaining an expired token in a private singleton field adds no attack surface. Token was already held in tokenToForward local + server process env.
Logic bugs ✅ Clean tokenToForward correctly captures cached value. Both explicit-clear paths (ReconnectAsync line 1224, ReauthenticateAsync line 332) are unaffected.
Data loss ✅ Clean Change is auth-cache-only; sessions unaffected.
Race conditions ✅ Clean Removing the null write reduces write contention. Existing serialization (_recoveryLock, _tokenResolutionLock) provides adequate safety.

Test Coverage Note (2/2 models flagged — 🟢 MINOR)

PolyPilot.Tests/ServerRecoveryTests.cs ~line 319: No test asserts that _resolvedGitHubToken is preserved after TryRecoverPersistentServerAsync. A regression test using StubServerManager.LastGitHubToken to verify the cached token survives recovery would harden this invariant. Both models flagged this as a pre-existing gap (not introduced by this PR), not a blocker.

Summary

The fix is correct and well-scoped. The one-line removal of _resolvedGitHubToken = null eliminates the root cause of recurring macOS Keychain password prompts. The SKILL.md update accurately documents the new invariant. Both Opus and Codex independently traced the full recovery → auth-check → banner → re-authenticate path and confirmed no regressions.

Recommended action: ✅ Approve

@PureWeen PureWeen merged commit 8ee458e into main Apr 1, 2026
@PureWeen PureWeen deleted the fix/stop-recurring-keychain-prompts branch April 1, 2026 05:21
PureWeen added a commit that referenced this pull request Apr 1, 2026
## Problem

After PR #462 merged, users still get macOS "allow copilot-cli" password
dialogs every ~1-2 hours. Process monitoring confirmed the parent PID is
the PolyPilot .NET process itself (`dotnet exec`), not external copilot
binaries.

## Root Cause

PR #462 prevented `TryRecoverPersistentServerAsync` from clearing
`_resolvedGitHubToken`. But the token was **never set in the first
place** — on startup, `ResolveGitHubTokenFromEnv()` returns null (no env
var), and when `CheckAuthStatusAsync` finds the server authenticated, it
returns `true` without setting the field. So `_resolvedGitHubToken`
stays `null`.

Later, any transient auth failure (connection blip, client recreation,
server token rotation) calls `CheckAuthStatusAsync` again. This time
auth fails, the guard `_resolvedGitHubToken == null` is satisfied, and
the lazy path fires:
1. `ResolveGitHubTokenForServer()` → `TryReadCopilotKeychainToken()`
2. Tries 3 service names: `copilot-cli`, `github-copilot`, `GitHub
Copilot`
3. Each spawns `/usr/bin/security find-generic-password -s <name> -w`
(3s timeout)
4. **3 macOS password dialogs** (PolyPilot not in the Keychain ACL for
this entry)

## Evidence

Process monitor output confirmed PolyPilot as the source:
```
security PID=66918, Parent PID=66646
  66646 66630 /usr/local/share/dotnet/dotnet exec --runtimeconfig ...
security PID=67068, Parent PID=66646  (3s later)
security PID=67146, Parent PID=66646  (3s later)
```

## Fix

Set `_resolvedGitHubToken ??= string.Empty` when `CheckAuthStatusAsync`
finds the server authenticated. The empty string sentinel means "server
can self-authenticate, no Keychain read needed." The lazy path guard
`_resolvedGitHubToken == null` is no longer satisfied.

`ServerManager.StartServerAsync` uses
`!string.IsNullOrEmpty(githubToken)` so the empty sentinel is never
forwarded as a `COPILOT_GITHUB_TOKEN` env var.

## Behavior Matrix

| Scenario | Before | After |
|----------|--------|-------|
| Startup, server self-auths | `_resolvedGitHubToken` stays null | Set
to `""` (sentinel) |
| Later transient auth failure | Lazy path fires → 3 password dialogs |
Lazy path skipped → auth banner shown |
| Server genuinely cant auth | Lazy path fires once (correct) | Same —
`""` not set because auth failed |
| ReauthenticateAsync (user) | Fresh Keychain read | Same — clears to
null first |
| ReconnectAsync (settings) | Clears to null | Same |

## Tests
3064/3064 pass ✅

## Related
- Fixes remaining issue after #462

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Apr 1, 2026
## Problem

PR #446 added Keychain-reading code (TryReadCopilotKeychainToken,
ResolveGitHubTokenForServer, RunProcessWithTimeout) to help users
whose headless server could not self-authenticate. This caused a
4-PR regression chain (#446#456#462#463):

1. Each /usr/bin/security call triggers a macOS password dialog
   (PolyPilot is not in the copilot-cli Keychain ACL)
2. TryReadCopilotKeychainToken tried 3 service names sequentially,
   each spawning a separate dialog (3× password prompts per call)
3. Clicking Allow/Deny rewrote the Keychain ACL, breaking the
   server own native keytar access
4. The server fell back to its own /usr/bin/security calls, creating
   a spiral of recurring password prompts every 1-2 hours

## Root Cause Analysis

The headless copilot server authenticates on its own at startup via
its native credential store. This has worked reliably across dozens
of worktree switches (different binary paths each time). The original
PR #446 user issue ("Session was not created with authentication
info") was a server auth loss that should have been solved by
restarting the server — which TryRecoverPersistentServerAsync already
does — not by reading the Keychain from the UI process.

## Changes

- Remove ResolveGitHubTokenForServer (Keychain + gh CLI reads)
- Remove TryReadCopilotKeychainToken (3-service-name loop)
- Remove RunProcessWithTimeout (only used by above)
- Remove _tokenResolutionLock SemaphoreSlim (guarded lazy path)
- Remove lazy Keychain resolution path in CheckAuthStatusAsync
- Remove sentinel logic (_resolvedGitHubToken ??= string.Empty)
- Simplify ReauthenticateAsync to just restart the server
- Keep ResolveGitHubTokenFromEnv (env vars are safe, no prompts)
- Keep auth banner + Re-authenticate button (correct UX)
- Rewrite auth-token-safety skill doc with new invariant
- Remove 7 tests for deleted methods, keep 3 env var tests

## For users who cannot self-authenticate

The auth banner says: "run copilot login in a terminal, then click
Re-authenticate." Re-authenticate restarts the server, which picks
up the fresh credentials. This was the original PR #446 design before
Keychain code was added during review rounds.

Tests: 3057/3057 pass ✅

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Apr 1, 2026
## Problem

PR #446 added Keychain-reading code (TryReadCopilotKeychainToken,
ResolveGitHubTokenForServer, RunProcessWithTimeout) to help users
whose headless server could not self-authenticate. This caused a
4-PR regression chain (#446#456#462#463):

1. Each /usr/bin/security call triggers a macOS password dialog
   (PolyPilot is not in the copilot-cli Keychain ACL)
2. TryReadCopilotKeychainToken tried 3 service names sequentially,
   each spawning a separate dialog (3× password prompts per call)
3. Clicking Allow/Deny rewrote the Keychain ACL, breaking the
   server own native keytar access
4. The server fell back to its own /usr/bin/security calls, creating
   a spiral of recurring password prompts every 1-2 hours

## Root Cause Analysis

The headless copilot server authenticates on its own at startup via
its native credential store. This has worked reliably across dozens
of worktree switches (different binary paths each time). The original
PR #446 user issue ("Session was not created with authentication
info") was a server auth loss that should have been solved by
restarting the server — which TryRecoverPersistentServerAsync already
does — not by reading the Keychain from the UI process.

## Changes

- Remove ResolveGitHubTokenForServer (Keychain + gh CLI reads)
- Remove TryReadCopilotKeychainToken (3-service-name loop)
- Remove RunProcessWithTimeout (only used by above)
- Remove _tokenResolutionLock SemaphoreSlim (guarded lazy path)
- Remove lazy Keychain resolution path in CheckAuthStatusAsync
- Remove sentinel logic (_resolvedGitHubToken ??= string.Empty)
- Simplify ReauthenticateAsync to just restart the server
- Keep ResolveGitHubTokenFromEnv (env vars are safe, no prompts)
- Keep auth banner + Re-authenticate button (correct UX)
- Rewrite auth-token-safety skill doc with new invariant
- Remove 7 tests for deleted methods, keep 3 env var tests

## For users who cannot self-authenticate

The auth banner says: "run copilot login in a terminal, then click
Re-authenticate." Re-authenticate restarts the server, which picks
up the fresh credentials. This was the original PR #446 design before
Keychain code was added during review rounds.

Tests: 3057/3057 pass ✅

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Apr 1, 2026
## Problem

PR #446 added Keychain-reading code to help users whose headless server
couldn't self-authenticate. This caused a **4-PR regression chain**
(#446#456#462#463) of recurring macOS password dialogs:

1. `TryReadCopilotKeychainToken` spawned `/usr/bin/security
find-generic-password` for 3 service names sequentially — **3 password
dialogs per call**
2. Clicking Allow/Deny on those dialogs **rewrote the Keychain ACL**,
breaking the server's own native keytar access
3. The server fell back to its own `/usr/bin/security` calls → **more
dialogs every 1-2 hours**
4. PRs #456, #462, #463 each fixed one trigger path but the core
approach was wrong

## Root Cause

The headless copilot server **authenticates on its own** at startup via
its native credential store. This has worked reliably across dozens of
worktree switches (different binary paths each time) without any
Keychain intervention from PolyPilot.

The original PR #446 user issue ("Session was not created with
authentication info") was a server auth loss that should have been
solved by **restarting the server** — which
`TryRecoverPersistentServerAsync` already does — not by reading the
Keychain from the UI process.

## Fix

Remove all Keychain-reading code entirely (**-612 lines, +42 lines**):

| Removed | Why |
|---------|-----|
| `ResolveGitHubTokenForServer()` | Keychain + gh CLI reads — triggers
password dialogs |
| `TryReadCopilotKeychainToken()` | 3-service-name loop — 3× password
dialogs per call |
| `RunProcessWithTimeout()` | Only used by above |
| `_tokenResolutionLock` SemaphoreSlim | Guarded the lazy Keychain path
|
| Lazy resolution path in `CheckAuthStatusAsync` | The whole Keychain
auto-resolution mechanism |
| Sentinel logic (`_resolvedGitHubToken ??= ""`) | No longer needed
without the lazy path |

| Kept | Why |
|------|-----|
| `ResolveGitHubTokenFromEnv()` | Env vars are safe, no prompts |
| `CheckAuthStatusAsync` + auth banner | Correct — shows "run copilot
login" guidance |
| `TryRecoverPersistentServerAsync` | Correct — restarts server which
re-authenticates on its own |
| Re-authenticate button | Correct — restarts server to pick up fresh
`copilot login` credentials |

## For users who cannot self-authenticate

The auth banner says: _"run `copilot login` in a terminal, then click
Re-authenticate."_ This was the **original PR #446 design** before
Keychain code was added during review rounds.

## Timeline of the regression

| PR | What it did | What went wrong |
|----|-------------|-----------------|
| #446 | Added Keychain reads to forward token to server | Triggered
password dialogs; corrupted Keychain ACL |
| #456 | Made Keychain reads lazy (only on first auth failure) | Still
fired on every server recovery cycle |
| #462 | Stopped recovery from clearing the token cache | Token was
never set in the first place (no env var) |
| #463 | Added sentinel on auth success | Server's own internal Keychain
reads still fired |
| **#465** | **Removed all Keychain code** | **The right fix — server
self-authenticates** |

## Tests
3057/3057 pass ✅ (7 tests for deleted methods removed, 3 env var tests
kept)

## Why server self-authentication works

The copilot headless server binary bundled inside PolyPilot.app and the
Homebrew `copilot` binary are both signed with the **same GitHub
Developer ID certificate**. macOS Keychain ACLs use code-signing
identity (not binary path) to control access, so:

- `copilot login` (Homebrew binary) writes the token to Keychain with an
ACL scoped to the GitHub Developer ID
- `copilot --headless` (bundled binary) reads the token via keytar —
**same Developer ID = no password prompt**
- PolyPilot's `/usr/bin/security` calls used a **different signer**
(Apple's built-in tool), which triggered the ACL mismatch and the
password dialogs
- Clicking Allow/Deny on those dialogs **rewrote the ACL**, breaking the
server's native keytar access — creating the regression spiral

This was verified via `codesign -dvvv` on both binaries — they share the
same Identifier, Authority chain, and team certificate. The Keychain
entry's partition list (visible in `securityd` logs) confirms it uses
team-id-based access control, not path-based.

**Conclusion:** The server has always been able to self-authenticate.
The only thing that broke it was PolyPilot calling `/usr/bin/security`,
which used a different code-signing identity and corrupted the ACL.
Removing those calls is the correct fix.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
arisng pushed a commit to arisng/PolyPilot that referenced this pull request Apr 4, 2026
…eWeen#462)

## Problem

Every couple of hours, the user gets prompted to "allow copilot-cli" and
must enter their macOS login password ~5 times. This is a regression
from PR PureWeen#456 — that fix addressed the 10-second polling loop but missed
a second cache-clearing path.

## Root Cause

`TryRecoverPersistentServerAsync()` clears `_resolvedGitHubToken = null`
on every recovery cycle (line 1339). Recovery is triggered by:
- Watchdog timeouts (token expiry ~1-8h)
- Wake/sleep health check failures
- Auth polling success → server restart

After clearing, the next `CheckAuthStatusAsync()` call sees
`_resolvedGitHubToken == null` → enters the lazy Keychain resolution
path → spawns `security find-generic-password -s copilot-cli -w` →
**macOS password dialog**.

**Re-reading the Keychain is useless** — the stored token is a static
snapshot from `copilot login`. If it expired, re-reading returns the
same expired token. Only running `copilot login` again would write a new
one.

## Fix

Remove the `_resolvedGitHubToken = null` from
`TryRecoverPersistentServerAsync`. The cached token is still forwarded
to the new server via `tokenToForward`. Only two paths now clear the
cache:
- `ReconnectAsync` — explicit user action (settings change)
- `ReauthenticateAsync` — explicit user action (Re-authenticate button)

When the token expires, the auth banner appears and the user clicks
Re-authenticate → fresh Keychain read (correct — user-initiated).

## Changes

| File | Change |
|------|--------|
| `CopilotService.cs` | Remove `_resolvedGitHubToken = null` from
recovery path |
| `auth-token-safety/SKILL.md` | Update INV-A3 to reflect new invariant
|

## Tests
3064/3064 pass ✅

## Related
- Fixes regression from PureWeen#456 (which fixed regression from PureWeen#446)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
arisng pushed a commit to arisng/PolyPilot that referenced this pull request Apr 4, 2026
…eWeen#463)

## Problem

After PR PureWeen#462 merged, users still get macOS "allow copilot-cli" password
dialogs every ~1-2 hours. Process monitoring confirmed the parent PID is
the PolyPilot .NET process itself (`dotnet exec`), not external copilot
binaries.

## Root Cause

PR PureWeen#462 prevented `TryRecoverPersistentServerAsync` from clearing
`_resolvedGitHubToken`. But the token was **never set in the first
place** — on startup, `ResolveGitHubTokenFromEnv()` returns null (no env
var), and when `CheckAuthStatusAsync` finds the server authenticated, it
returns `true` without setting the field. So `_resolvedGitHubToken`
stays `null`.

Later, any transient auth failure (connection blip, client recreation,
server token rotation) calls `CheckAuthStatusAsync` again. This time
auth fails, the guard `_resolvedGitHubToken == null` is satisfied, and
the lazy path fires:
1. `ResolveGitHubTokenForServer()` → `TryReadCopilotKeychainToken()`
2. Tries 3 service names: `copilot-cli`, `github-copilot`, `GitHub
Copilot`
3. Each spawns `/usr/bin/security find-generic-password -s <name> -w`
(3s timeout)
4. **3 macOS password dialogs** (PolyPilot not in the Keychain ACL for
this entry)

## Evidence

Process monitor output confirmed PolyPilot as the source:
```
security PID=66918, Parent PID=66646
  66646 66630 /usr/local/share/dotnet/dotnet exec --runtimeconfig ...
security PID=67068, Parent PID=66646  (3s later)
security PID=67146, Parent PID=66646  (3s later)
```

## Fix

Set `_resolvedGitHubToken ??= string.Empty` when `CheckAuthStatusAsync`
finds the server authenticated. The empty string sentinel means "server
can self-authenticate, no Keychain read needed." The lazy path guard
`_resolvedGitHubToken == null` is no longer satisfied.

`ServerManager.StartServerAsync` uses
`!string.IsNullOrEmpty(githubToken)` so the empty sentinel is never
forwarded as a `COPILOT_GITHUB_TOKEN` env var.

## Behavior Matrix

| Scenario | Before | After |
|----------|--------|-------|
| Startup, server self-auths | `_resolvedGitHubToken` stays null | Set
to `""` (sentinel) |
| Later transient auth failure | Lazy path fires → 3 password dialogs |
Lazy path skipped → auth banner shown |
| Server genuinely cant auth | Lazy path fires once (correct) | Same —
`""` not set because auth failed |
| ReauthenticateAsync (user) | Fresh Keychain read | Same — clears to
null first |
| ReconnectAsync (settings) | Clears to null | Same |

## Tests
3064/3064 pass ✅

## Related
- Fixes remaining issue after PureWeen#462

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
arisng pushed a commit to arisng/PolyPilot that referenced this pull request Apr 4, 2026
…eWeen#465)

## Problem

PR PureWeen#446 added Keychain-reading code to help users whose headless server
couldn't self-authenticate. This caused a **4-PR regression chain**
(PureWeen#446PureWeen#456PureWeen#462PureWeen#463) of recurring macOS password dialogs:

1. `TryReadCopilotKeychainToken` spawned `/usr/bin/security
find-generic-password` for 3 service names sequentially — **3 password
dialogs per call**
2. Clicking Allow/Deny on those dialogs **rewrote the Keychain ACL**,
breaking the server's own native keytar access
3. The server fell back to its own `/usr/bin/security` calls → **more
dialogs every 1-2 hours**
4. PRs PureWeen#456, PureWeen#462, PureWeen#463 each fixed one trigger path but the core
approach was wrong

## Root Cause

The headless copilot server **authenticates on its own** at startup via
its native credential store. This has worked reliably across dozens of
worktree switches (different binary paths each time) without any
Keychain intervention from PolyPilot.

The original PR PureWeen#446 user issue ("Session was not created with
authentication info") was a server auth loss that should have been
solved by **restarting the server** — which
`TryRecoverPersistentServerAsync` already does — not by reading the
Keychain from the UI process.

## Fix

Remove all Keychain-reading code entirely (**-612 lines, +42 lines**):

| Removed | Why |
|---------|-----|
| `ResolveGitHubTokenForServer()` | Keychain + gh CLI reads — triggers
password dialogs |
| `TryReadCopilotKeychainToken()` | 3-service-name loop — 3× password
dialogs per call |
| `RunProcessWithTimeout()` | Only used by above |
| `_tokenResolutionLock` SemaphoreSlim | Guarded the lazy Keychain path
|
| Lazy resolution path in `CheckAuthStatusAsync` | The whole Keychain
auto-resolution mechanism |
| Sentinel logic (`_resolvedGitHubToken ??= ""`) | No longer needed
without the lazy path |

| Kept | Why |
|------|-----|
| `ResolveGitHubTokenFromEnv()` | Env vars are safe, no prompts |
| `CheckAuthStatusAsync` + auth banner | Correct — shows "run copilot
login" guidance |
| `TryRecoverPersistentServerAsync` | Correct — restarts server which
re-authenticates on its own |
| Re-authenticate button | Correct — restarts server to pick up fresh
`copilot login` credentials |

## For users who cannot self-authenticate

The auth banner says: _"run `copilot login` in a terminal, then click
Re-authenticate."_ This was the **original PR PureWeen#446 design** before
Keychain code was added during review rounds.

## Timeline of the regression

| PR | What it did | What went wrong |
|----|-------------|-----------------|
| PureWeen#446 | Added Keychain reads to forward token to server | Triggered
password dialogs; corrupted Keychain ACL |
| PureWeen#456 | Made Keychain reads lazy (only on first auth failure) | Still
fired on every server recovery cycle |
| PureWeen#462 | Stopped recovery from clearing the token cache | Token was
never set in the first place (no env var) |
| PureWeen#463 | Added sentinel on auth success | Server's own internal Keychain
reads still fired |
| **PureWeen#465** | **Removed all Keychain code** | **The right fix — server
self-authenticates** |

## Tests
3057/3057 pass ✅ (7 tests for deleted methods removed, 3 env var tests
kept)

## Why server self-authentication works

The copilot headless server binary bundled inside PolyPilot.app and the
Homebrew `copilot` binary are both signed with the **same GitHub
Developer ID certificate**. macOS Keychain ACLs use code-signing
identity (not binary path) to control access, so:

- `copilot login` (Homebrew binary) writes the token to Keychain with an
ACL scoped to the GitHub Developer ID
- `copilot --headless` (bundled binary) reads the token via keytar —
**same Developer ID = no password prompt**
- PolyPilot's `/usr/bin/security` calls used a **different signer**
(Apple's built-in tool), which triggered the ACL mismatch and the
password dialogs
- Clicking Allow/Deny on those dialogs **rewrote the ACL**, breaking the
server's native keytar access — creating the regression spiral

This was verified via `codesign -dvvv` on both binaries — they share the
same Identifier, Authority chain, and team certificate. The Keychain
entry's partition list (visible in `securityd` logs) confirms it uses
team-id-based access control, not path-based.

**Conclusion:** The server has always been able to self-authenticate.
The only thing that broke it was PolyPilot calling `/usr/bin/security`,
which used a different code-signing identity and corrupted the ACL.
Removing those calls is the correct fix.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant