fix: set sentinel on auth success to prevent lazy Keychain reads #463
After startup, _resolvedGitHubToken stays null when no env var is set and the server self-authenticates. The lazy Keychain path in CheckAuthStatusAsync is guarded by (_resolvedGitHubToken == null), so any later transient auth failure triggers ResolveGitHubTokenForServer → TryReadCopilotKeychainToken → 3 service names × 3s timeout each = 3 macOS "allow copilot-cli" password dialogs.

Fix: set _resolvedGitHubToken to string.Empty when auth succeeds. This sentinel means "server can self-authenticate, no Keychain needed." ServerManager.StartServerAsync uses !string.IsNullOrEmpty() so the empty sentinel is never forwarded as an env var.

Confirmed via process monitoring: parent PID of /usr/bin/security calls is the PolyPilot .NET process (dotnet exec), not external copilot binaries.

Tests: 3064/3064 pass ✅ (2 flaky timing tests pass on retry)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
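The sentinel guard described above can be sketched roughly as follows. This is a minimal illustration, not the PolyPilot implementation: `AuthTracker`, `KeychainReadCount`, and the `serverAuthenticated` parameter are hypothetical stand-ins, and the sketch does not model what the real lazy path stores after a read. Only the `??= string.Empty` assignment and the `== null` guard come from the description.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for the auth-tracking state; not the real PolyPilot class.
public sealed class AuthTracker
{
    // null = never resolved, "" = sentinel ("server can self-authenticate").
    private string? _resolvedGitHubToken;

    // Counts how often the (simulated) lazy Keychain path would have fired.
    public int KeychainReadCount { get; private set; }

    public string? ResolvedGitHubToken => _resolvedGitHubToken;

    // serverAuthenticated simulates what the server reports for this check.
    public bool CheckAuthStatus(bool serverAuthenticated)
    {
        if (serverAuthenticated)
        {
            // ??= only assigns when the field is null, so a real token is preserved.
            _resolvedGitHubToken ??= string.Empty;
            return true;
        }

        if (_resolvedGitHubToken == null)
        {
            // Stand-in for the lazy Keychain read that caused the dialogs.
            KeychainReadCount++;
        }

        return false;
    }
}
```

In this sketch, a successful check followed by a failed one leaves `KeychainReadCount` at zero, while a failed check with no prior success increments it.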
Multi-Model Code Review: PR #463 (R1)

PR: fix: set sentinel on auth success to prevent lazy Keychain reads

Core Fix Verification: ✅ Correct (3/3 models agree)

All three models independently verified:

Consensus Findings

🟢 MINOR: Non-atomic

| Check | Result |
|---|---|
| Regressions ("" at all StartServerAsync sites) | ✅ Safe |
| ??= semantics (preserves real tokens) | ✅ Correct |
| ReauthenticateAsync / ReconnectAsync interaction | ✅ Correct |
| Security | ✅ No risk |
| Thread safety | 🟢 Theoretical race, pre-existing pattern |
| Data loss / sessions | ✅ None |
| Test coverage | 🟢 No direct tests, acceptable for bugfix |
| SKILL.md documentation | ✅ Accurate |

Recommended action: ✅ Approve
## Problem

PR #446 added Keychain-reading code (TryReadCopilotKeychainToken, ResolveGitHubTokenForServer, RunProcessWithTimeout) to help users whose headless server could not self-authenticate. This caused a 4-PR regression chain (#446 → #456 → #462 → #463):

1. Each /usr/bin/security call triggers a macOS password dialog (PolyPilot is not in the copilot-cli Keychain ACL)
2. TryReadCopilotKeychainToken tried 3 service names sequentially, each spawning a separate dialog (3× password prompts per call)
3. Clicking Allow/Deny rewrote the Keychain ACL, breaking the server's own native keytar access
4. The server fell back to its own /usr/bin/security calls, creating a spiral of recurring password prompts every 1-2 hours

## Root Cause Analysis

The headless copilot server authenticates on its own at startup via its native credential store. This has worked reliably across dozens of worktree switches (different binary paths each time). The original PR #446 user issue ("Session was not created with authentication info") was a server auth loss that should have been solved by restarting the server, which TryRecoverPersistentServerAsync already does, not by reading the Keychain from the UI process.

## Changes

- Remove ResolveGitHubTokenForServer (Keychain + gh CLI reads)
- Remove TryReadCopilotKeychainToken (3-service-name loop)
- Remove RunProcessWithTimeout (only used by above)
- Remove _tokenResolutionLock SemaphoreSlim (guarded lazy path)
- Remove lazy Keychain resolution path in CheckAuthStatusAsync
- Remove sentinel logic (_resolvedGitHubToken ??= string.Empty)
- Simplify ReauthenticateAsync to just restart the server
- Keep ResolveGitHubTokenFromEnv (env vars are safe, no prompts)
- Keep auth banner + Re-authenticate button (correct UX)
- Rewrite auth-token-safety skill doc with new invariant
- Remove 7 tests for deleted methods, keep 3 env var tests

## For users who cannot self-authenticate

The auth banner says: "run copilot login in a terminal, then click Re-authenticate." Re-authenticate restarts the server, which picks up the fresh credentials. This was the original PR #446 design before Keychain code was added during review rounds.

Tests: 3057/3057 pass ✅

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Problem

PR #446 added Keychain-reading code to help users whose headless server couldn't self-authenticate. This caused a **4-PR regression chain** (#446 → #456 → #462 → #463) of recurring macOS password dialogs:

1. `TryReadCopilotKeychainToken` spawned `/usr/bin/security find-generic-password` for 3 service names sequentially: **3 password dialogs per call**
2. Clicking Allow/Deny on those dialogs **rewrote the Keychain ACL**, breaking the server's own native keytar access
3. The server fell back to its own `/usr/bin/security` calls → **more dialogs every 1-2 hours**
4. PRs #456, #462, #463 each fixed one trigger path but the core approach was wrong

## Root Cause

The headless copilot server **authenticates on its own** at startup via its native credential store. This has worked reliably across dozens of worktree switches (different binary paths each time) without any Keychain intervention from PolyPilot. The original PR #446 user issue ("Session was not created with authentication info") was a server auth loss that should have been solved by **restarting the server**, which `TryRecoverPersistentServerAsync` already does, not by reading the Keychain from the UI process.

## Fix

Remove all Keychain-reading code entirely (**-612 lines, +42 lines**):

| Removed | Why |
|---------|-----|
| `ResolveGitHubTokenForServer()` | Keychain + gh CLI reads; triggers password dialogs |
| `TryReadCopilotKeychainToken()` | 3-service-name loop; 3× password dialogs per call |
| `RunProcessWithTimeout()` | Only used by above |
| `_tokenResolutionLock` SemaphoreSlim | Guarded the lazy Keychain path |
| Lazy resolution path in `CheckAuthStatusAsync` | The whole Keychain auto-resolution mechanism |
| Sentinel logic (`_resolvedGitHubToken ??= ""`) | No longer needed without the lazy path |

| Kept | Why |
|------|-----|
| `ResolveGitHubTokenFromEnv()` | Env vars are safe, no prompts |
| `CheckAuthStatusAsync` + auth banner | Correct: shows "run copilot login" guidance |
| `TryRecoverPersistentServerAsync` | Correct: restarts server which re-authenticates on its own |
| Re-authenticate button | Correct: restarts server to pick up fresh `copilot login` credentials |

## For users who cannot self-authenticate

The auth banner says: _"run `copilot login` in a terminal, then click Re-authenticate."_ This was the **original PR #446 design** before Keychain code was added during review rounds.

## Timeline of the regression

| PR | What it did | What went wrong |
|----|-------------|-----------------|
| #446 | Added Keychain reads to forward token to server | Triggered password dialogs; corrupted Keychain ACL |
| #456 | Made Keychain reads lazy (only on first auth failure) | Still fired on every server recovery cycle |
| #462 | Stopped recovery from clearing the token cache | Token was never set in the first place (no env var) |
| #463 | Added sentinel on auth success | Server's own internal Keychain reads still fired |
| **#465** | **Removed all Keychain code** | **The right fix: server self-authenticates** |

## Tests

3057/3057 pass ✅ (7 tests for deleted methods removed, 3 env var tests kept)

## Why server self-authentication works

The copilot headless server binary bundled inside PolyPilot.app and the Homebrew `copilot` binary are both signed with the **same GitHub Developer ID certificate**. macOS Keychain ACLs use code-signing identity (not binary path) to control access, so:

- `copilot login` (Homebrew binary) writes the token to Keychain with an ACL scoped to the GitHub Developer ID
- `copilot --headless` (bundled binary) reads the token via keytar: **same Developer ID = no password prompt**
- PolyPilot's `/usr/bin/security` calls used a **different signer** (Apple's built-in tool), which triggered the ACL mismatch and the password dialogs
- Clicking Allow/Deny on those dialogs **rewrote the ACL**, breaking the server's native keytar access and creating the regression spiral

This was verified via `codesign -dvvv` on both binaries: they share the same Identifier, Authority chain, and team certificate. The Keychain entry's partition list (visible in `securityd` logs) confirms it uses team-id-based access control, not path-based.

**Conclusion:** The server has always been able to self-authenticate. The only thing that broke it was PolyPilot calling `/usr/bin/security`, which used a different code-signing identity and corrupted the ACL. Removing those calls is the correct fix.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
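For illustration, the env-only resolution that remains once the Keychain reads are gone might look like the sketch below. It is an assumption-laden outline, not the actual `ResolveGitHubTokenFromEnv`: the class name `TokenResolution`, the injectable `lookup` delegate, and the trimming behaviour are all hypothetical, and `COPILOT_GITHUB_TOKEN` is the only variable name taken from the PR text. The point is that reading an environment variable never spawns `/usr/bin/security`, so it cannot trigger a password dialog.

```csharp
#nullable enable
using System;

// Hypothetical helper; the real ResolveGitHubTokenFromEnv may differ.
public static class TokenResolution
{
    // lookup is injectable so the logic can be exercised without
    // touching the real process environment.
    public static string? ResolveFromEnv(Func<string, string?> lookup, params string[] names)
    {
        foreach (var name in names)
        {
            var value = lookup(name);
            if (!string.IsNullOrWhiteSpace(value))
            {
                return value.Trim();
            }
        }

        // No env var set: return null and let the server self-authenticate.
        return null;
    }

    // Convenience overload that reads the real environment.
    public static string? ResolveFromEnv() =>
        ResolveFromEnv(Environment.GetEnvironmentVariable, "COPILOT_GITHUB_TOKEN");
}
```

A null result here simply means no token is forwarded; under the approach described above, the server is then expected to authenticate from its own credential store.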
…eWeen#463)

## Problem

After PR PureWeen#462 merged, users still get macOS "allow copilot-cli" password dialogs every ~1-2 hours. Process monitoring confirmed the parent PID is the PolyPilot .NET process itself (`dotnet exec`), not external copilot binaries.

## Root Cause

PR PureWeen#462 prevented `TryRecoverPersistentServerAsync` from clearing `_resolvedGitHubToken`. But the token was **never set in the first place**: on startup, `ResolveGitHubTokenFromEnv()` returns null (no env var), and when `CheckAuthStatusAsync` finds the server authenticated, it returns `true` without setting the field. So `_resolvedGitHubToken` stays `null`.

Later, any transient auth failure (connection blip, client recreation, server token rotation) calls `CheckAuthStatusAsync` again. This time auth fails, the guard `_resolvedGitHubToken == null` is satisfied, and the lazy path fires:

1. `ResolveGitHubTokenForServer()` → `TryReadCopilotKeychainToken()`
2. Tries 3 service names: `copilot-cli`, `github-copilot`, `GitHub Copilot`
3. Each spawns `/usr/bin/security find-generic-password -s <name> -w` (3s timeout)
4. **3 macOS password dialogs** (PolyPilot not in the Keychain ACL for this entry)

## Evidence

Process monitor output confirmed PolyPilot as the source:

```
security PID=66918, Parent PID=66646
66646 66630 /usr/local/share/dotnet/dotnet exec --runtimeconfig ...
security PID=67068, Parent PID=66646 (3s later)
security PID=67146, Parent PID=66646 (3s later)
```

## Fix

Set `_resolvedGitHubToken ??= string.Empty` when `CheckAuthStatusAsync` finds the server authenticated. The empty string sentinel means "server can self-authenticate, no Keychain read needed." The lazy path guard `_resolvedGitHubToken == null` is no longer satisfied.

`ServerManager.StartServerAsync` uses `!string.IsNullOrEmpty(githubToken)` so the empty sentinel is never forwarded as a `COPILOT_GITHUB_TOKEN` env var.

## Behavior Matrix

| Scenario | Before | After |
|----------|--------|-------|
| Startup, server self-auths | `_resolvedGitHubToken` stays null | Set to `""` (sentinel) |
| Later transient auth failure | Lazy path fires → 3 password dialogs | Lazy path skipped → auth banner shown |
| Server genuinely can't auth | Lazy path fires once (correct) | Same: `""` not set because auth failed |
| ReauthenticateAsync (user) | Fresh Keychain read | Same: clears to null first |
| ReconnectAsync (settings) | Clears to null | Same |

## Tests

3064/3064 pass ✅

## Related

- Fixes remaining issue after PureWeen#462

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
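The forwarding guard mentioned in the Fix section can be illustrated with a small sketch. It assumes a hypothetical `ServerEnvironment.Build` helper that assembles the environment for the server process; the real `ServerManager.StartServerAsync` is not shown in this PR text and may be structured differently. Only the `!string.IsNullOrEmpty(githubToken)` check and the `COPILOT_GITHUB_TOKEN` name come from the description.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical helper; stands in for the env setup inside StartServerAsync.
public static class ServerEnvironment
{
    public static Dictionary<string, string> Build(string? githubToken)
    {
        var env = new Dictionary<string, string>();

        // Both null (never resolved) and "" (sentinel) are skipped,
        // so only a real token is ever forwarded to the server.
        if (!string.IsNullOrEmpty(githubToken))
        {
            env["COPILOT_GITHUB_TOKEN"] = githubToken;
        }

        return env;
    }
}
```

Because the check treats null and the empty string alike, introducing the sentinel does not change what the server process receives.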