Simplify built-in presets: keep PR Review Squad + add Implement & Challenge by PureWeen · Pull Request #225 · PureWeen/PolyPilot

PureWeen · 2026-02-26T15:30:34Z

Summary

Simplify built-in multi-agent presets from 6 down to 2:

Kept

📋 PR Review Squad — Orchestrator mode, 5 specialized reviewers
⚔️ Implement & Challenge (new) — OrchestratorReflect adversarial loop

Removed

Code Review Team (overlapped with PR Review Squad)
Multi-Perspective Analysis (broadcast mode, niche use case)
Quick Reflection Cycle (overlapped with Implement & Challenge)
Deep Research (niche use case)

Implement & Challenge Preset

Uses OrchestratorReflect mode with two workers:

Implementer (claude-sonnet-4.6): Codes the solution
Challenger (claude-opus-4.6): Reviews, finds issues, suggests improvements

The loop iterates until the challenger is satisfied ([[GROUP_REFLECT_COMPLETE]]), with stall detection and max 5 iterations (overridable with --max N).

Users who need the removed presets can recreate them as custom presets or via .squad/ team definitions.

Test Changes

Updated all test references from removed presets to remaining ones
All 1400 tests pass

PureWeen · 2026-02-26T16:01:22Z

TODO: Remaining work for this PR and follow-ups

What this PR does

Simplifies built-in multi-agent presets from 6 → 2:

📋 PR Review Squad (kept) — Orchestrator mode with 5 specialized code reviewers
⚔️ Implement & Challenge (new) — OrchestratorReflect adversarial loop with an implementer (claude-sonnet-4.6) and challenger (claude-opus-4.6)

Removed: Code Review Team, Multi-Perspective Analysis, Quick Reflection Cycle, Deep Research (redundant or niche).

How Implement & Challenge works

Uses OrchestratorReflect mode — loops until challenger emits [[GROUP_REFLECT_COMPLETE]] sentinel
Default MaxIterations = 5, user can set up to 100 via the Iterations spinner in the group header
Stall detection: exits if 2 consecutive iterations have >90% Jaccard token similarity
Challenger system prompt instructs it to be adversarial — find bugs, edge cases, missing tests
Implementer system prompt instructs it to write production-quality code and address all feedback

Related PRs

PR Fix Android session never loading for long conversations #214 (merged) — Fixed Android session loading: thread-safe history snapshot + capped turn-end history to 200 messages
PR Harden bridge History thread-safety with comprehensive HistoryLock coverage #215 (open) — Bridge history thread-safety hardening: HistoryLock + GetHistorySnapshot() on AgentSessionInfo, locked all background History writes, server-side MaxHistoryPayloadMessages=500 cap, retry logic for failed history requests

Open items from code review of PR #214 (addressed in PR #215)

HistoryLock + GetHistorySnapshot() added to AgentSessionInfo for thread-safe reads
All 6+ background-thread write sites in CopilotService.Bridge.cs now lock on HistoryLock
Server-side MaxHistoryPayloadMessages = 500 cap in WsBridgeServer.SendSessionHistoryToClient
Retry logic for failed history requests in SyncRemoteSessions

Still outstanding (future work)

Silent return on snapshot failure — If GetHistorySnapshot() fails in SendSessionHistoryToClient, we log + return without sending. Client has _requestedHistorySessions keyed, so it won't auto-retry. PR Harden bridge History thread-safety with comprehensive HistoryLock coverage #215 added retry logic in SyncRemoteSessions but a more robust retry/backoff mechanism would help.
UI-thread History reads still unlocked — ~60 sites in Dashboard.razor read session.History without locking. These tolerate stale data but could theoretically see torn reads. A proper fix would be a snapshot method or ReadOnlyCollection wrapper on AgentSessionInfo.History so all callers benefit.
LoadFullRemoteHistoryAsync has no server-side cap — passes limit: null. Server now caps at 500 via MaxHistoryPayloadMessages, but the client-side API still allows requesting unlimited.
Consider raising UI max iterations above 100 — Currently the Iterations spinner in Dashboard.razor and SessionSidebar.razor caps at max="100". For truly long-running implement & challenge loops, this could be raised or made unlimited.
Implement & Challenge could benefit from shared worktree context — The preset sets DefaultWorktreeStrategy = WorktreeStrategy.Shared and SharedContext explaining the workflow, but the orchestrator routing prompt could be enhanced with more specific task-splitting guidance.

Key files

PolyPilot/Models/ModelCapabilities.cs — GroupPreset definitions, BuiltIn array
PolyPilot/Models/ReflectionCycle.cs — Reflection state machine, stall detection, sentinel parsing
PolyPilot/Services/CopilotService.Organization.cs — SendViaOrchestratorReflectAsync, orchestration engine
PolyPilot/Models/AgentSessionInfo.cs — HistoryLock, GetHistorySnapshot() (in PR Harden bridge History thread-safety with comprehensive HistoryLock coverage #215)
PolyPilot/Services/CopilotService.Bridge.cs — Remote mode event handlers, history sync
PolyPilot/Services/WsBridgeServer.cs — WebSocket server, SendSessionHistoryToClient

Testing

All 1400 tests pass. Test files updated:

SessionOrganizationTests.cs — preset count, group creation scenarios
MultiAgentRegressionTests.cs — renamed persona test to Implement & Challenge
SquadDiscoveryTests.cs — updated built-in preset name reference
multi-agent-scenarios.json — all scenario preset references updated

Two-agent OrchestratorReflect preset: an Implementer builds the solution, a Challenger reviews it, and they loop until the Challenger approves (emits [[GROUP_REFLECT_COMPLETE]]) or max iterations reached. - Implementer: claude-sonnet-4.6 (fast, good at coding) - Challenger: claude-opus-4.6 (thorough, good at finding bugs) - Orchestrator: claude-opus-4.6 (routes between the two) - RoutingContext enforces strict alternation pattern Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…llenge Remove 4 redundant built-in presets (Code Review Team, Multi-Perspective Analysis, Quick Reflection Cycle, Deep Research) and add new Implement & Challenge preset using OrchestratorReflect mode. Built-in presets now: - PR Review Squad (Orchestrator, 5 reviewers) - Implement & Challenge (OrchestratorReflect, implementer + challenger loop) The Implement & Challenge preset uses an adversarial loop where an implementer codes a solution and a challenger reviews it, iterating until the challenger approves via [[GROUP_REFLECT_COMPLETE]] sentinel. Default max iterations: 5, overridable with --max N. Update all test references to use remaining presets. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…for repo presets, update NuGet packages - Built-in presets are now always shown in the Built-in section (repo/user presets with same name are skipped) - Added delete button (trash icon) for From Repo presets in the preset picker - Added UserPresets.DeleteRepoPreset() to remove .squad/ directories - Updated NuGet packages: GitHub.Copilot.SDK 0.1.26, Markdig 1.0.0, MauiDevFlow 0.11.0, Test.Sdk 18.3.0 - Updated test: GetAll_RepoDoesNotOverride_BuiltInByName Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…eset priority docs Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

FlushCurrentResponse on AssistantTurnEndEvent was clearing CurrentResponse before CompleteResponse (on SessionIdleEvent) could read it for the TCS result. This caused SendPromptAndWaitAsync to return "" and ParseTaskAssignments to find 0 worker assignments. Add FlushedResponse StringBuilder to SessionState that accumulates text flushed mid-turn. CompleteResponse now combines FlushedResponse + CurrentResponse for the TCS result, ensuring orchestrator dispatch gets the full plan text. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Worker sessions dispatched by orchestrators have no interactive user to approve tool permission requests. Without OnPermissionRequest, the SDK returns 'denied-no-approval-rule-and-could-not-request-from-user' for every tool call (view, edit, powershell, grep, glob, etc). Set OnPermissionRequest = AutoApprovePermissions on all SessionConfig and ResumeSessionConfig instances so tools execute without prompting. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Update worker prompts: Implementer must make actual code changes, build, test, and commit. Challenger must run git diff and verify changes. - Fix RoutingContext: reference actual worker names (worker-1, worker-2) instead of abstract role names (Implementer, Challenger). - Add MaxReflectIterations to GroupPreset and SessionGroup so presets can configure how many reflect loops to run (Implement & Challenge = 10). - Wire up MaxReflectIterations in Dashboard and SessionSidebar UI so the preset default is used when the user hasn't overridden it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The dropdown now defaults to 'Preset default' instead of 'Each agent gets own worktree'. This lets each preset define its own optimal strategy — e.g., Implement & Challenge defaults to Shared (workers alternate, need to see each other's file changes), while PR Review Squad defaults to FullyIsolated (workers run in parallel on different branches). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

AutoStartReflectionIfNeeded was only called from the group-level input boxes (SendToMultiAgentGroup/SendToExpandedMultiAgentGroup), but NOT from the orchestrator's own chat input path at line 1278. This caused ReflectionState to remain null, making SendViaOrchestratorReflectAsync fall back to plain orchestrator mode — no worker dispatch, no reflect loop. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@worker

The orchestrator prompt had two issues causing it to use SDK tools (task, grep, edit) directly instead of producing @worker: blocks: 1. Line 989 said 'If you can handle the request entirely yourself, just respond normally without any @worker blocks' — giving the model explicit permission to skip delegation. 2. The RoutingContext used tool-like language ('run git diff', 'build, test, commit') that the model pattern-matched to its SDK tools. Fix: Rewrite the planning prompt to say the orchestrator is a DISPATCHER ONLY with no tools, and reword the RoutingContext to describe the dispatch pattern without tool-like language. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Worker sessions that were created but never received messages have no events.jsonl, so the CLI server returns 'Session not found' on resume. This caused multi-agent workers to silently vanish from the sidebar after relaunch. Now falls back to CreateSessionAsync when resume fails with 'Session not found', preserving the session in _sessions so it remains visible and functional. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Session deletion triggered rapid render batch churn: 1. Session.DisposeAsync() talked to CLI, triggering SDK events 2. OnStateChanged fired from CloseSessionAsync 3. ReconcileOrganization fired another OnStateChanged This caused Blazor render batch ordering races ('r.parentNode.removeChild' on null) and 'unexpected acknowledgement for render batch N' errors. Fix: move DisposeAsync to fire-and-forget after UI update, and remove redundant ReconcileOrganization call (session already removed from _sessions, reconciliation just caused extra state churn). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Previous fix removed the JS DOM portal but kept the dialog in Blazor's render tree. The overlay showed but the dialog was clipped/invisible due to ancestor overflow constraints that position:fixed alone can't escape in WebView with scoped CSS. Adopt the approach from PR #226: render the close-confirmation dialog entirely via JS (window.showCloseSessionDialog), creating DOM elements imperatively and appending to document.body. Blazor never tracks these elements, so no render batch desync on open/close/delete. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Track dispatched workers across iterations in the reflect loop - Override [[GROUP_REFLECT_COMPLETE]] if not all workers have participated - Include RoutingContext in synthesis prompt so orchestrator remembers I&C pattern - Show worker participation status (missing workers) in synthesis prompt - Add DISPATCHER ONLY language to replan prompt for iteration 2+ - Pass RoutingContext to replan prompt for consistent routing awareness Fixes: orchestrator prematurely completing after only worker-1 (Implementer) runs, without waiting for worker-2 (Challenger) to review. Also fixes the loop stopping after Challenger finds issues instead of continuing to have Implementer fix them. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Track only successfully completed workers (not just dispatched) so failed workers don't count as having participated - Replace continue statements with flow-through logic so stall detection, SaveOrganization, and UI updates always run - Add comment documenting HashSet sequential access safety Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SDK 0.1.26 (PR github/copilot-sdk#485) now natively maps maccatalyst RIDs to osx RIDs for CLI download. Our _FixCopilotRidForMacCatalyst workaround's condition ('$(_CopilotPlatform)' == '') no longer fires since the SDK sets it first, causing _CopilotOriginalRid to never be set, which skipped both the maccatalyst-arm64 runtime copy AND the MonoBundle copy. Clean builds would produce an app with no copilot CLI. Changes: - Remove _FixCopilotRidForMacCatalyst (SDK handles this now) - Update _CopyCopilotCliForMacCatalyst to detect maccatalyst via GetTargetPlatformIdentifier instead of _CopilotOriginalRid - Update _IncludeCopilotCliInBundle with same condition fix - Use $(RuntimeIdentifier) directly for the maccatalyst output path Fixes: github/copilot-sdk#454 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

SDK 0.1.26 (github/copilot-sdk#485) handles the download mapping, but MAUI still won't bundle the binary because the SDK's ContentWithTargetPath registration lacks PublishFolderType metadata. The fix is a single target that re-registers the item with PublishFolderType=Assembly and a flat TargetPath so MAUI places it in MonoBundle. Replaced 5 custom MSBuild targets (55 lines) with 1 target (10 lines). Verified via clean build: copilot binary lands in .app/Contents/MonoBundle. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

CloseSessionAsync removed sessions from _sessions dict and active-sessions.json but NOT from Organization.Sessions. The ReconcileOrganization call was removed earlier to fix render crashes. On restart, the session metadata in organization.json caused the deleted session to reappear. Fix: explicitly remove from Organization.Sessions and call SaveOrganization() in CloseSessionAsync, avoiding the full ReconcileOrganization() that triggers render batch churn. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Make DISPATCHER ONLY language conditional via dispatcherOnly parameter so PR Review Squad orchestrator can still use tools (only I&C reflect mode sets dispatcherOnly=true) - Track attempted vs successful workers separately to distinguish 'never dispatched' from 'dispatched but failed' in override messages - Hoist allWorkersDispatched computation before evaluator/self-eval branch to eliminate variable duplication Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When multiple active-sessions.json entries share the same display name but different session IDs (e.g., from mode switches or reconnects), CloseSessionAsync only tracked the current session's ID in _closedSessionIds. The other entries survived the merge because their IDs weren't in the closed set, and their session-state directories still existed on disk. Fix: add _closedSessionNames tracking alongside _closedSessionIds. MergeSessionEntries now filters by both session ID and display name, ensuring all entries for a closed session are excluded. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- CloseSessionAsync: use FlushSaveOrganization() instead of debounced SaveOrganization() to prevent ReconcileOrganization re-adding the deleted session during the 2-second debounce window - DeleteRepoPreset: validate SourcePath ends with .squad or .ai-team before recursive delete to prevent accidental deletion of unrelated dirs - SessionErrorEvent: reorder FlushCurrentResponse before FlushedResponse .Clear() so partial response text is preserved in history on errors - Add 3 tests for closedNames merge filtering: by name, case-insensitive, and duplicate session IDs with same display name Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Strip [[GROUP_REFLECT_COMPLETE]] from worker responses in BuildSynthesisPrompt to prevent Challenger's approval sentinel from leaking into synthesis and causing premature reflection loop termination - Remove unused allWorkersAttempted variable (dead code) - Add guard in StartGroupReflection to skip if reflection is already active - Add 6 tests: sentinel stripping (4), reflection guard (2) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… FlushedResponse - SaveOrganization debounce: timer callback now re-serializes live state instead of writing a pre-captured stale JSON snapshot. Prevents race where FlushSaveOrganization writes newer state, then stale timer overwrites it. - LoadOrganization self-healing: detects groups with IsMultiAgent=false that have orchestrator sessions and restores IsMultiAgent=true on load. - GetOrCreateRepoGroup defensive check: skips groups with orchestrator sessions even if IsMultiAgent is somehow false, preventing repo group overwrite. - AbortAsync: now combines FlushedResponse + CurrentResponse for partial response preservation (was only using CurrentResponse). - Fix BuildOrchestratorPlanningPrompt test parameter count mismatch. - Add 3 new tests: LoadOrganization healing, GetOrCreateRepoGroup defensive skip, SaveOrganization debounce writes live state. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…story, add [Collection] to flaky test - CopilotService.Events.cs: wrap FlushCurrentResponse(state) call in Invoke() in ToolExecutionStartEvent handler to ensure it runs on the UI thread - CopilotService.cs: AbortSessionAsync uses only CurrentResponse for partial history save — FlushedResponse content was already committed to History by FlushCurrentResponse, so combining both caused duplicate history entries - SessionOrganizationTests.cs: add [Collection("BaseDir")] to GroupingStabilityTests to prevent parallel SetBaseDirForTesting calls causing flaky test failures Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… guard - SaveOrganization timer callback now marshals to UI thread via InvokeOnUI to prevent ThreadPool/UI thread race on List<T> serialization - DeleteRepoPreset: added Path.GetFullPath containment check to prevent path traversal attacks via crafted SourcePath - Reflection guard: added debug logging, documented UI-thread-only - Fixed test compilation: CreateTestService -> CreateService, GroupPreset.DeleteRepoPreset -> UserPresets.DeleteRepoPreset - Added 7 new tests: sentinel stripping (2), path traversal (2), debounce InvokeOnUI validation (1), debounce live-state (1), nonexistent preset (1) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Move History.Add calls into Invoke/InvokeOnUI blocks to prevent concurrent List<ChatMessage> writes from background SDK event threads and UI thread. - ToolExecutionStartEvent: moved FlushCurrentResponse + History.Add + event invocations into single Invoke() block - ApplyReasoningUpdate: wrapped new reasoning message History.Add in InvokeOnUI() Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PR #222 fixed the test race that corrupts repos.json, but the damage was already done — real repos were replaced with test data (repo-1). The bare clones still exist on disk but aren't tracked in repos.json. Load() now calls HealMissingRepos() which scans the repos/ directory for bare clones that exist on disk but aren't in the state file, reads their remote URL from git config, and re-adds them. Also reconstructs missing worktree entries from the worktrees/ directory. Added 4 tests for the self-healing logic. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- ApplyReasoningUpdate: PendingReasoningMessages ConcurrentDictionary prevents duplicate reasoning ChatMessages when rapid deltas arrive before InvokeOnUI fires. Messages registered in map before posting to UI thread; removed after History.Add completes. Map cleared in CompleteResponse, SendPromptAsync, AbortSessionAsync, error handler, watchdog, and reconnect paths. - SendViaOrchestratorReflectAsync: moved reflectState.IsActive=false and CompletedAt into finally block so they always run, even on OperationCanceledException. Previously, cancellation would permanently block future reflections for that group. - Added 4 reasoning dedup tests (ReasoningDedupTests.cs) - Added 2 reflection cancellation tests (ReflectionCycleTests.cs) - All 1546 tests pass Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When ResumeSessionAsync fails with 'Session not found' / 'corrupt' / 'session file' errors, the fallback path was calling CreateSessionAsync without recovering history from the old session's events.jsonl. This caused sessions to appear empty after app relaunch. The fix adds history recovery to the fallback path: 1. Load history from old session via LoadHistoryFromDisk(entry.SessionId) 2. Inject recovered messages into the new session's Info.History 3. Set MessageCount and LastReadMessageCount 4. Call RestoreUsageStats(entry) to preserve CreatedAt, token counts 5. Sync recovered history to chat DB under the new session ID 6. Add a system message indicating the session was recreated Bug introduced in PR #225 (commit 19219f1), worsened by PR #308 (commit 72886a2) which expanded the catch conditions without adding history recovery. Adds 5 structural regression tests guarding the fallback path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

## Problem When `ResumeSessionAsync` fails during app restart ("Session not found" / "corrupt" / "session file" errors), the fallback path called `CreateSessionAsync` which created a **blank** session with zero messages. The old session's `events.jsonl` was never loaded, so all conversation history was lost. **User impact:** Sessions appeared empty after app relaunch — the user would see their session restored but with 0 messages, forcing them to create duplicate sessions. ## Root Cause Introduced in PR #225 (commit `19219f1`) which added the "Session not found" fallback. Worsened in PR #308 which expanded the catch conditions to include "corrupt" and "session file" errors. Neither PR added history recovery to the fallback path. `ResumeSessionAsync` (the success path) correctly loads history via `LoadHistoryFromDisk(sessionId)`. The fallback path skipped this entirely. ## Fix In the fallback path of `RestorePreviousSessionsAsync`: 1. **Load history** from the old session's `events.jsonl` via `LoadHistoryFromDisk(entry.SessionId)` before creating the new session 2. **Inject history** into the recreated session's `Info.History` 3. **Set message counts** (`MessageCount`, `LastReadMessageCount`) so the UI shows the correct state 4. **Restore usage stats** (`CreatedAt`, token counts) via `RestoreUsageStats(entry)` 5. **Sync to DB** via `BulkInsertAsync` under the new session ID 6. **Add indicator** system message: "🔄 Session recreated — conversation history recovered from previous session." ## Duplicate Session Issue The user also observed a duplicate session (`session-20260311-001729`) created with the same repo as the original. This was a downstream consequence: after seeing the original session restored with zero messages, the user manually created a new session. With this fix, the full history is preserved, eliminating the need for duplicates. ## Tests Added 5 structural regression tests in `SessionPersistenceTests.cs`: - `RestoreFallback_LoadsHistoryFromOldSession` — verifies `LoadHistoryFromDisk` appears before `CreateSessionAsync` - `RestoreFallback_InjectsHistoryIntoRecreatedSession` — verifies history injection + message count - `RestoreFallback_RestoresUsageStats` — verifies `RestoreUsageStats` call - `RestoreFallback_SyncsHistoryToDatabase` — verifies `BulkInsertAsync` call - `RestoreFallback_AddsReconnectionIndicator` — verifies system message All 2464 tests pass. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dinisdopion-sys · 2026-03-13T05:51:33Z

TODO: Осталась работа над этим PR и последующими действиями.

Что делает этот PR?

Упрощает настройку встроенных многоагентных предустановок с 6 до 2:

📋 Команда по проверке запросов на слияние (сохранена) — режим оркестратора с 5 специализированными рецензентами кода.

⚔️ Реализация и оспаривание (новое) — Цикл состязательного взаимодействия OrchestratorReflect с реализатором (claude-sonnet-4.6) и претендентом (claude-opus-4.6)

Удалено: Группа проверки кода, Многосторонний анализ, Цикл быстрой рефлексии, Глубокое исследование (избыточное или узкоспециализированное).

Как работает метод внедрения и анализа

Использует OrchestratorReflectрежим — цикл повторяется до тех пор, пока претендент не выпустит [[GROUP_REFLECT_COMPLETE]]сигнальный модуль.

По умолчанию MaxIterations = 5пользователь может установить до 100 итераций с помощью выпадающего списка в заголовке группы.

Обнаружение задержек: завершает работу, если две последовательные итерации имеют сходство токенов Жаккара более 90%.

Система Challenger дает указание действовать враждебно — находить ошибки, граничные случаи, отсутствующие тесты.

Система, запускаемая программистом, предписывает ему писать код производственного качества и учитывать все отзывы.

Связанные пресс-релизы

Исправлена ошибка загрузки сессии Android при длительных диалогах #214 (объединено) — Исправлена загрузка сессии Android: потокобезопасный снимок истории + ограничение истории завершения реплик до 200 сообщений

PR Harden bridge History thread-safety with comprehensive HistoryLock coverage #215 (open) — Bridge history thread-safety hardening: HistoryLock + GetHistorySnapshot() on AgentSessionInfo, locked all background History writes, server-side MaxHistoryPayloadMessages=500 cap, retry logic for failed history requests

Open items from code review of PR #214 (addressed in PR #215)

HistoryLock + GetHistorySnapshot() added to AgentSessionInfo for thread-safe reads

All 6+ background-thread write sites in CopilotService.Bridge.cs now lock on HistoryLock

Server-side MaxHistoryPayloadMessages = 500 cap in WsBridgeServer.SendSessionHistoryToClient

Retry logic for failed history requests in SyncRemoteSessions

Still outstanding (future work)

Silent return on snapshot failure — If GetHistorySnapshot() fails in SendSessionHistoryToClient, we log + return without sending. Client has _requestedHistorySessions keyed, so it won't auto-retry. PR Harden bridge History thread-safety with comprehensive HistoryLock coverage #215 added retry logic in SyncRemoteSessions but a more robust retry/backoff mechanism would help.

UI-thread History reads still unlocked — ~60 sites in Dashboard.razor read session.History without locking. These tolerate stale data but could theoretically see torn reads. A proper fix would be a snapshot method or ReadOnlyCollection wrapper on AgentSessionInfo.History so all callers benefit.

LoadFullRemoteHistoryAsync has no server-side cap — passes limit: null. Server now caps at 500 via MaxHistoryPayloadMessages, but the client-side API still allows requesting unlimited.

Consider raising UI max iterations above 100 — Currently the Iterations spinner in Dashboard.razor and SessionSidebar.razor caps at max="100". For truly long-running implement & challenge loops, this could be raised or made unlimited.

Implement & Challenge could benefit from shared worktree context — The preset sets DefaultWorktreeStrategy = WorktreeStrategy.Shared and SharedContext explaining the workflow, but the orchestrator routing prompt could be enhanced with more specific task-splitting guidance.

Key files

PolyPilot/Models/ModelCapabilities.cs — GroupPreset definitions, BuiltIn array

PolyPilot/Models/ReflectionCycle.cs — Reflection state machine, stall detection, sentinel parsing

PolyPilot/Services/CopilotService.Organization.cs — SendViaOrchestratorReflectAsync, orchestration engine

PolyPilot/Models/AgentSessionInfo.cs — HistoryLock, GetHistorySnapshot() (in PR Harden bridge History thread-safety with comprehensive HistoryLock coverage #215)

PolyPilot/Services/CopilotService.Bridge.cs — Remote mode event handlers, history sync

PolyPilot/Services/WsBridgeServer.cs — WebSocket server, SendSessionHistoryToClient

Testing

All 1400 tests pass. Test files updated:

SessionOrganizationTests.cs — preset count, group creation scenarios

MultiAgentRegressionTests.cs — renamed persona test to Implement & Challenge

SquadDiscoveryTests.cs — updated built-in preset name reference

multi-agent-scenarios.json — all scenario preset references updated

## Problem When `ResumeSessionAsync` fails during app restart ("Session not found" / "corrupt" / "session file" errors), the fallback path called `CreateSessionAsync` which created a **blank** session with zero messages. The old session's `events.jsonl` was never loaded, so all conversation history was lost. **User impact:** Sessions appeared empty after app relaunch — the user would see their session restored but with 0 messages, forcing them to create duplicate sessions. ## Root Cause Introduced in PR PureWeen#225 (commit `19219f1`) which added the "Session not found" fallback. Worsened in PR PureWeen#308 which expanded the catch conditions to include "corrupt" and "session file" errors. Neither PR added history recovery to the fallback path. `ResumeSessionAsync` (the success path) correctly loads history via `LoadHistoryFromDisk(sessionId)`. The fallback path skipped this entirely. ## Fix In the fallback path of `RestorePreviousSessionsAsync`: 1. **Load history** from the old session's `events.jsonl` via `LoadHistoryFromDisk(entry.SessionId)` before creating the new session 2. **Inject history** into the recreated session's `Info.History` 3. **Set message counts** (`MessageCount`, `LastReadMessageCount`) so the UI shows the correct state 4. **Restore usage stats** (`CreatedAt`, token counts) via `RestoreUsageStats(entry)` 5. **Sync to DB** via `BulkInsertAsync` under the new session ID 6. **Add indicator** system message: "🔄 Session recreated — conversation history recovered from previous session." ## Duplicate Session Issue The user also observed a duplicate session (`session-20260311-001729`) created with the same repo as the original. This was a downstream consequence: after seeing the original session restored with zero messages, the user manually created a new session. With this fix, the full history is preserved, eliminating the need for duplicates. ## Tests Added 5 structural regression tests in `SessionPersistenceTests.cs`: - `RestoreFallback_LoadsHistoryFromOldSession` — verifies `LoadHistoryFromDisk` appears before `CreateSessionAsync` - `RestoreFallback_InjectsHistoryIntoRecreatedSession` — verifies history injection + message count - `RestoreFallback_RestoresUsageStats` — verifies `RestoreUsageStats` call - `RestoreFallback_SyncsHistoryToDatabase` — verifies `BulkInsertAsync` call - `RestoreFallback_AddsReconnectionIndicator` — verifies system message All 2464 tests pass. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen force-pushed the bridge-history-hardening-v2 branch from 9a1a6b7 to bc3b5b9 Compare February 26, 2026 17:34

PureWeen mentioned this pull request Feb 26, 2026

Rework multi-agent orchestrator: intercept SDK task tool to surface sub-agents as PolyPilot sessions #229

Closed

PureWeen force-pushed the bridge-history-hardening-v2 branch 4 times, most recently from d4d406d to 4ecb7e2 Compare February 27, 2026 01:26

PureWeen and others added 23 commits February 26, 2026 21:53

Update copilot instructions: add --no-build ban, Windows commands, pr…

37a1838

…eset priority docs Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen and others added 5 commits February 26, 2026 21:53

Fix flaky WsBridge tests: Task.Delay -> WaitForAsync

4bebd97

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen force-pushed the bridge-history-hardening-v2 branch from 95f7421 to c383c9f Compare February 27, 2026 03:59

PureWeen merged commit 19219f1 into main Feb 27, 2026

PureWeen deleted the bridge-history-hardening-v2 branch February 27, 2026 04:49

PureWeen mentioned this pull request Mar 12, 2026

Fix: Restore fallback preserves conversation history #364

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify built-in presets: keep PR Review Squad + add Implement & Challenge#225

Simplify built-in presets: keep PR Review Squad + add Implement & Challenge#225
PureWeen merged 29 commits intomainfrom
bridge-history-hardening-v2

PureWeen commented Feb 26, 2026

Uh oh!

PureWeen commented Feb 26, 2026

Uh oh!

dinisdopion-sys commented Mar 13, 2026

TODO: Осталась работа над этим PR и последующими действиями.

Что делает этот PR?

Как работает метод внедрения и анализа

Связанные пресс-релизы

Open items from code review of PR #214 (addressed in PR #215)

Still outstanding (future work)

Key files

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

PureWeen commented Feb 26, 2026

Summary

Kept

Removed

Implement & Challenge Preset

Test Changes

Uh oh!

PureWeen commented Feb 26, 2026

TODO: Remaining work for this PR and follow-ups

What this PR does

How Implement & Challenge works

Related PRs

Open items from code review of PR #214 (addressed in PR #215)

Still outstanding (future work)

Key files

Testing

Uh oh!

dinisdopion-sys commented Mar 13, 2026

TODO: Осталась работа над этим PR и последующими действиями.

Что делает этот PR?

Как работает метод внедрения и анализа

Связанные пресс-релизы

Open items from code review of PR #214 (addressed in PR #215)

Still outstanding (future work)

Key files

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants