Skip to content

fix: Recover orphaned sessions lost from active-sessions.json#508

Merged
PureWeen merged 4 commits intomainfrom
fix/recover-orphaned-sessions
Apr 5, 2026
Merged

fix: Recover orphaned sessions lost from active-sessions.json#508
PureWeen merged 4 commits intomainfrom
fix/recover-orphaned-sessions

Conversation

@PureWeen
Copy link
Copy Markdown
Owner

@PureWeen PureWeen commented Apr 5, 2026

Problem

Sessions can be silently lost from active-sessions.json when a new session replaces an old one with the same display name. The merge logic in MergeSessionEntries drops the persisted entry, orphaning all its history on disk.

Discovered when: Session 67e8ab33 (24.9 MB, 12,640 events of history across 15+ turns of work including PR reviews and commits) was orphaned during a relaunch cycle. A scan of ~/.copilot/session-state/ found 84 orphaned PolyPilot sessions from the last 7 days alone.

Root cause: When a new session is created with the same display name as an existing session (e.g., during reconnect or lazy-resume fallback), the merge logic checks activeNames.Contains(existing.DisplayName) and silently drops the persisted entry — even though it has a different session ID and valuable history.

Fix

1. MergeSessionEntries: Rename instead of drop

When a persisted entry has a name collision with an active entry (different session ID, same display name), the persisted entry is now kept under a renamed display name ("MyChat (previous)") instead of being silently dropped.

2. RecoverOrphanedSessionsAsync: Startup orphan scan

New background scan that runs after normal session restore:

  • Scans ~/.copilot/session-state/ for sessions whose workspace.yaml points to a PolyPilot worktree
  • Filters to recent sessions (events.jsonl modified within last 7 days)
  • Excludes sessions already tracked in _sessions or _closedSessionIds
  • Filters out worker/evaluator sessions via IsLikelyWorkerSession() heuristic
  • Recovered sessions are loaded with full history and added to the default group

3. IsLikelyWorkerSession heuristic

Detects orchestrated worker sessions by:

  • Branch patterns: -worker-, -orchestrator-, evaluator, Skill-Validator
  • Summary prefixes: You are the Implementer, You are the Challenger, You are a PR reviewer

Testing

  • All 3,129 tests pass (0 failures)
  • Updated Merge_DuplicateDisplayName_ActiveWins_PersistedDropped → now verifies the persisted entry is kept with (previous) suffix
  • Added 7 new tests:
    • Name collision recovery with multiple persisted entries
    • Closed entries still excluded during name collision
    • Missing dir entries still excluded during name collision
    • IsLikelyWorkerSession for worker/evaluator/normal branches and summaries

@PureWeen
Copy link
Copy Markdown
Owner Author

PureWeen commented Apr 5, 2026

PR #508 Final Review (R4) — ✅ SHIP IT

3/3 reviewers completed. R4 blocker (infinite loop) fixed. All clear.

R4 Finding Status
While loop could infinite-loop (2/3 flagged) ✅ Fixed — local counter
Clone completeness (all 15 properties) ✅ Verified
"abort" event type correct ✅ Verified
closedNames ordering correct ✅ Verified
Test assertions correct ✅ Verified

Merging.

When MergeSessionEntries encounters a persisted entry with the same
display name as an active entry but a different session ID, it now
renames the persisted entry to 'Name (previous)' instead of silently
dropping it. This prevents data loss when a session is replaced during
restore (e.g., CLI resume failure → CreateSessionAsync fallback).

Also adds IsLikelyWorkerSession() heuristic for future use in
filtering orchestrated worker/evaluator sessions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen force-pushed the fix/recover-orphaned-sessions branch from b2bb36b to 60960da Compare April 5, 2026 17:00
PureWeen and others added 3 commits April 5, 2026 12:16
When the last event in events.jsonl is 'abort' (user cancelled), the
session was incorrectly treated as still processing during restore.
This caused the session to appear stuck with a spinner, blocking user
interaction. During rapid relaunches (where the abort was <600s old
and the staleness check didn't save us), this led users to create new
sessions — which then triggered the merge name collision that dropped
the original session's history.

Adding 'abort' to the terminal events list is the root cause fix.
The merge rename (previous commit) is the safety net.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…code

- Clone ActiveSessionEntry before renaming to avoid mutating caller's
  persisted list (MergeSessionEntries is a static method)
- Use while loop for fallback name generation to guarantee uniqueness
  even with colliding sessionId prefixes
- Remove IsLikelyWorkerSession and its 5 tests — dead code with no
  production callers. Will re-add when orphan recovery is implemented.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The while loop used allMergedNames.Count as a suffix, which is invariant
inside the loop — if the generated name already existed, the loop would
spin forever. Use an incrementing local counter instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen merged commit e134fc6 into main Apr 5, 2026
@PureWeen PureWeen deleted the fix/recover-orphaned-sessions branch April 5, 2026 18:02
PureWeen added a commit that referenced this pull request Apr 5, 2026
…#510)

Split from #508. Handles mobile bridge prompts arriving while a session
is mid-turn.

- Introduces `SessionBusyException` typed exception (replaces fragile
string matching)
- `DispatchBridgePromptAsync` catches busy sessions and queues via
`EnqueueMessage`
- Guards orchestrator path: drops message with log instead of silently
bypassing pipeline
- Fixes 3 pre-existing CS1503 build errors in `WsBridgeIntegrationTests`
(ChangeModelAsync signature)

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant