Simplify built-in presets: keep PR Review Squad + add Implement & Challenge#225
Merged
Simplify built-in presets: keep PR Review Squad + add Implement & Challenge#225
Conversation
Owner
Author
TODO: Remaining work for this PR and follow-upsWhat this PR doesSimplifies built-in multi-agent presets from 6 → 2:
Removed: Code Review Team, Multi-Perspective Analysis, Quick Reflection Cycle, Deep Research (redundant or niche). How Implement & Challenge works
Related PRs
Open items from code review of PR #214 (addressed in PR #215)
Still outstanding (future work)
Key files
TestingAll 1400 tests pass. Test files updated:
|
9a1a6b7 to
bc3b5b9
Compare
d4d406d to
4ecb7e2
Compare
Two-agent OrchestratorReflect preset: an Implementer builds the solution, a Challenger reviews it, and they loop until the Challenger approves (emits [[GROUP_REFLECT_COMPLETE]]) or max iterations reached. - Implementer: claude-sonnet-4.6 (fast, good at coding) - Challenger: claude-opus-4.6 (thorough, good at finding bugs) - Orchestrator: claude-opus-4.6 (routes between the two) - RoutingContext enforces strict alternation pattern Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…llenge Remove 4 redundant built-in presets (Code Review Team, Multi-Perspective Analysis, Quick Reflection Cycle, Deep Research) and add new Implement & Challenge preset using OrchestratorReflect mode. Built-in presets now: - PR Review Squad (Orchestrator, 5 reviewers) - Implement & Challenge (OrchestratorReflect, implementer + challenger loop) The Implement & Challenge preset uses an adversarial loop where an implementer codes a solution and a challenger reviews it, iterating until the challenger approves via [[GROUP_REFLECT_COMPLETE]] sentinel. Default max iterations: 5, overridable with --max N. Update all test references to use remaining presets. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…for repo presets, update NuGet packages - Built-in presets are now always shown in the Built-in section (repo/user presets with same name are skipped) - Added delete button (trash icon) for From Repo presets in the preset picker - Added UserPresets.DeleteRepoPreset() to remove .squad/ directories - Updated NuGet packages: GitHub.Copilot.SDK 0.1.26, Markdig 1.0.0, MauiDevFlow 0.11.0, Test.Sdk 18.3.0 - Updated test: GetAll_RepoDoesNotOverride_BuiltInByName Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eset priority docs Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
FlushCurrentResponse on AssistantTurnEndEvent was clearing CurrentResponse before CompleteResponse (on SessionIdleEvent) could read it for the TCS result. This caused SendPromptAndWaitAsync to return "" and ParseTaskAssignments to find 0 worker assignments. Add FlushedResponse StringBuilder to SessionState that accumulates text flushed mid-turn. CompleteResponse now combines FlushedResponse + CurrentResponse for the TCS result, ensuring orchestrator dispatch gets the full plan text. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Worker sessions dispatched by orchestrators have no interactive user to approve tool permission requests. Without OnPermissionRequest, the SDK returns 'denied-no-approval-rule-and-could-not-request-from-user' for every tool call (view, edit, powershell, grep, glob, etc). Set OnPermissionRequest = AutoApprovePermissions on all SessionConfig and ResumeSessionConfig instances so tools execute without prompting. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Update worker prompts: Implementer must make actual code changes, build, test, and commit. Challenger must run git diff and verify changes. - Fix RoutingContext: reference actual worker names (worker-1, worker-2) instead of abstract role names (Implementer, Challenger). - Add MaxReflectIterations to GroupPreset and SessionGroup so presets can configure how many reflect loops to run (Implement & Challenge = 10). - Wire up MaxReflectIterations in Dashboard and SessionSidebar UI so the preset default is used when the user hasn't overridden it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The dropdown now defaults to 'Preset default' instead of 'Each agent gets own worktree'. This lets each preset define its own optimal strategy — e.g., Implement & Challenge defaults to Shared (workers alternate, need to see each other's file changes), while PR Review Squad defaults to FullyIsolated (workers run in parallel on different branches). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
AutoStartReflectionIfNeeded was only called from the group-level input boxes (SendToMultiAgentGroup/SendToExpandedMultiAgentGroup), but NOT from the orchestrator's own chat input path at line 1278. This caused ReflectionState to remain null, making SendViaOrchestratorReflectAsync fall back to plain orchestrator mode — no worker dispatch, no reflect loop. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The orchestrator prompt had two issues causing it to use SDK tools (task, grep, edit) directly instead of producing @worker: blocks: 1. Line 989 said 'If you can handle the request entirely yourself, just respond normally without any @worker blocks' — giving the model explicit permission to skip delegation. 2. The RoutingContext used tool-like language ('run git diff', 'build, test, commit') that the model pattern-matched to its SDK tools. Fix: Rewrite the planning prompt to say the orchestrator is a DISPATCHER ONLY with no tools, and reword the RoutingContext to describe the dispatch pattern without tool-like language. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Worker sessions that were created but never received messages have no events.jsonl, so the CLI server returns 'Session not found' on resume. This caused multi-agent workers to silently vanish from the sidebar after relaunch. Now falls back to CreateSessionAsync when resume fails with 'Session not found', preserving the session in _sessions so it remains visible and functional. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Session deletion triggered rapid render batch churn:
1. Session.DisposeAsync() talked to CLI, triggering SDK events
2. OnStateChanged fired from CloseSessionAsync
3. ReconcileOrganization fired another OnStateChanged
This caused Blazor render batch ordering races ('r.parentNode.removeChild'
on null) and 'unexpected acknowledgement for render batch N' errors.
Fix: move DisposeAsync to fire-and-forget after UI update, and remove
redundant ReconcileOrganization call (session already removed from
_sessions, reconciliation just caused extra state churn).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previous fix removed the JS DOM portal but kept the dialog in Blazor's render tree. The overlay showed but the dialog was clipped/invisible due to ancestor overflow constraints that position:fixed alone can't escape in WebView with scoped CSS. Adopt the approach from PR #226: render the close-confirmation dialog entirely via JS (window.showCloseSessionDialog), creating DOM elements imperatively and appending to document.body. Blazor never tracks these elements, so no render batch desync on open/close/delete. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Track dispatched workers across iterations in the reflect loop - Override [[GROUP_REFLECT_COMPLETE]] if not all workers have participated - Include RoutingContext in synthesis prompt so orchestrator remembers I&C pattern - Show worker participation status (missing workers) in synthesis prompt - Add DISPATCHER ONLY language to replan prompt for iteration 2+ - Pass RoutingContext to replan prompt for consistent routing awareness Fixes: orchestrator prematurely completing after only worker-1 (Implementer) runs, without waiting for worker-2 (Challenger) to review. Also fixes the loop stopping after Challenger finds issues instead of continuing to have Implementer fix them. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Track only successfully completed workers (not just dispatched) so failed workers don't count as having participated - Replace continue statements with flow-through logic so stall detection, SaveOrganization, and UI updates always run - Add comment documenting HashSet sequential access safety Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
SDK 0.1.26 (PR github/copilot-sdk#485) now natively maps maccatalyst RIDs to osx RIDs for CLI download. Our _FixCopilotRidForMacCatalyst workaround's condition ('$(_CopilotPlatform)' == '') no longer fires since the SDK sets it first, causing _CopilotOriginalRid to never be set, which skipped both the maccatalyst-arm64 runtime copy AND the MonoBundle copy. Clean builds would produce an app with no copilot CLI. Changes: - Remove _FixCopilotRidForMacCatalyst (SDK handles this now) - Update _CopyCopilotCliForMacCatalyst to detect maccatalyst via GetTargetPlatformIdentifier instead of _CopilotOriginalRid - Update _IncludeCopilotCliInBundle with same condition fix - Use $(RuntimeIdentifier) directly for the maccatalyst output path Fixes: github/copilot-sdk#454 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
SDK 0.1.26 (github/copilot-sdk#485) handles the download mapping, but MAUI still won't bundle the binary because the SDK's ContentWithTargetPath registration lacks PublishFolderType metadata. The fix is a single target that re-registers the item with PublishFolderType=Assembly and a flat TargetPath so MAUI places it in MonoBundle. Replaced 5 custom MSBuild targets (55 lines) with 1 target (10 lines). Verified via clean build: copilot binary lands in .app/Contents/MonoBundle. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CloseSessionAsync removed sessions from _sessions dict and active-sessions.json but NOT from Organization.Sessions. The ReconcileOrganization call was removed earlier to fix render crashes. On restart, the session metadata in organization.json caused the deleted session to reappear. Fix: explicitly remove from Organization.Sessions and call SaveOrganization() in CloseSessionAsync, avoiding the full ReconcileOrganization() that triggers render batch churn. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Make DISPATCHER ONLY language conditional via dispatcherOnly parameter so PR Review Squad orchestrator can still use tools (only I&C reflect mode sets dispatcherOnly=true) - Track attempted vs successful workers separately to distinguish 'never dispatched' from 'dispatched but failed' in override messages - Hoist allWorkersDispatched computation before evaluator/self-eval branch to eliminate variable duplication Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When multiple active-sessions.json entries share the same display name but different session IDs (e.g., from mode switches or reconnects), CloseSessionAsync only tracked the current session's ID in _closedSessionIds. The other entries survived the merge because their IDs weren't in the closed set, and their session-state directories still existed on disk. Fix: add _closedSessionNames tracking alongside _closedSessionIds. MergeSessionEntries now filters by both session ID and display name, ensuring all entries for a closed session are excluded. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- CloseSessionAsync: use FlushSaveOrganization() instead of debounced SaveOrganization() to prevent ReconcileOrganization re-adding the deleted session during the 2-second debounce window - DeleteRepoPreset: validate SourcePath ends with .squad or .ai-team before recursive delete to prevent accidental deletion of unrelated dirs - SessionErrorEvent: reorder FlushCurrentResponse before FlushedResponse .Clear() so partial response text is preserved in history on errors - Add 3 tests for closedNames merge filtering: by name, case-insensitive, and duplicate session IDs with same display name Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Strip [[GROUP_REFLECT_COMPLETE]] from worker responses in BuildSynthesisPrompt to prevent Challenger's approval sentinel from leaking into synthesis and causing premature reflection loop termination - Remove unused allWorkersAttempted variable (dead code) - Add guard in StartGroupReflection to skip if reflection is already active - Add 6 tests: sentinel stripping (4), reflection guard (2) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… FlushedResponse - SaveOrganization debounce: timer callback now re-serializes live state instead of writing a pre-captured stale JSON snapshot. Prevents race where FlushSaveOrganization writes newer state, then stale timer overwrites it. - LoadOrganization self-healing: detects groups with IsMultiAgent=false that have orchestrator sessions and restores IsMultiAgent=true on load. - GetOrCreateRepoGroup defensive check: skips groups with orchestrator sessions even if IsMultiAgent is somehow false, preventing repo group overwrite. - AbortAsync: now combines FlushedResponse + CurrentResponse for partial response preservation (was only using CurrentResponse). - Fix BuildOrchestratorPlanningPrompt test parameter count mismatch. - Add 3 new tests: LoadOrganization healing, GetOrCreateRepoGroup defensive skip, SaveOrganization debounce writes live state. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…story, add [Collection] to flaky test
- CopilotService.Events.cs: wrap FlushCurrentResponse(state) call in Invoke()
in ToolExecutionStartEvent handler to ensure it runs on the UI thread
- CopilotService.cs: AbortSessionAsync uses only CurrentResponse for partial
history save — FlushedResponse content was already committed to History by
FlushCurrentResponse, so combining both caused duplicate history entries
- SessionOrganizationTests.cs: add [Collection("BaseDir")] to GroupingStabilityTests
to prevent parallel SetBaseDirForTesting calls causing flaky test failures
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… guard - SaveOrganization timer callback now marshals to UI thread via InvokeOnUI to prevent ThreadPool/UI thread race on List<T> serialization - DeleteRepoPreset: added Path.GetFullPath containment check to prevent path traversal attacks via crafted SourcePath - Reflection guard: added debug logging, documented UI-thread-only - Fixed test compilation: CreateTestService -> CreateService, GroupPreset.DeleteRepoPreset -> UserPresets.DeleteRepoPreset - Added 7 new tests: sentinel stripping (2), path traversal (2), debounce InvokeOnUI validation (1), debounce live-state (1), nonexistent preset (1) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move History.Add calls into Invoke/InvokeOnUI blocks to prevent concurrent List<ChatMessage> writes from background SDK event threads and UI thread. - ToolExecutionStartEvent: moved FlushCurrentResponse + History.Add + event invocations into single Invoke() block - ApplyReasoningUpdate: wrapped new reasoning message History.Add in InvokeOnUI() Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR #222 fixed the test race that corrupts repos.json, but the damage was already done — real repos were replaced with test data (repo-1). The bare clones still exist on disk but aren't tracked in repos.json. Load() now calls HealMissingRepos() which scans the repos/ directory for bare clones that exist on disk but aren't in the state file, reads their remote URL from git config, and re-adds them. Also reconstructs missing worktree entries from the worktrees/ directory. Added 4 tests for the self-healing logic. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
95f7421 to
c383c9f
Compare
- ApplyReasoningUpdate: PendingReasoningMessages ConcurrentDictionary prevents duplicate reasoning ChatMessages when rapid deltas arrive before InvokeOnUI fires. Messages registered in map before posting to UI thread; removed after History.Add completes. Map cleared in CompleteResponse, SendPromptAsync, AbortSessionAsync, error handler, watchdog, and reconnect paths. - SendViaOrchestratorReflectAsync: moved reflectState.IsActive=false and CompletedAt into finally block so they always run, even on OperationCanceledException. Previously, cancellation would permanently block future reflections for that group. - Added 4 reasoning dedup tests (ReasoningDedupTests.cs) - Added 2 reflection cancellation tests (ReflectionCycleTests.cs) - All 1546 tests pass Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen
added a commit
that referenced
this pull request
Mar 12, 2026
When ResumeSessionAsync fails with 'Session not found' / 'corrupt' / 'session file' errors, the fallback path was calling CreateSessionAsync without recovering history from the old session's events.jsonl. This caused sessions to appear empty after app relaunch. The fix adds history recovery to the fallback path: 1. Load history from old session via LoadHistoryFromDisk(entry.SessionId) 2. Inject recovered messages into the new session's Info.History 3. Set MessageCount and LastReadMessageCount 4. Call RestoreUsageStats(entry) to preserve CreatedAt, token counts 5. Sync recovered history to chat DB under the new session ID 6. Add a system message indicating the session was recreated Bug introduced in PR #225 (commit 19219f1), worsened by PR #308 (commit 72886a2) which expanded the catch conditions without adding history recovery. Adds 5 structural regression tests guarding the fallback path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen
added a commit
that referenced
this pull request
Mar 13, 2026
## Problem
When `ResumeSessionAsync` fails during app restart ("Session not found"
/ "corrupt" / "session file" errors), the fallback path called
`CreateSessionAsync` which created a **blank** session with zero
messages. The old session's `events.jsonl` was never loaded, so all
conversation history was lost.
**User impact:** Sessions appeared empty after app relaunch — the user
would see their session restored but with 0 messages, forcing them to
create duplicate sessions.
## Root Cause
Introduced in PR #225 (commit `19219f1`) which added the "Session not
found" fallback. Worsened in PR #308 which expanded the catch conditions
to include "corrupt" and "session file" errors. Neither PR added history
recovery to the fallback path.
`ResumeSessionAsync` (the success path) correctly loads history via
`LoadHistoryFromDisk(sessionId)`. The fallback path skipped this
entirely.
## Fix
In the fallback path of `RestorePreviousSessionsAsync`:
1. **Load history** from the old session's `events.jsonl` via
`LoadHistoryFromDisk(entry.SessionId)` before creating the new session
2. **Inject history** into the recreated session's `Info.History`
3. **Set message counts** (`MessageCount`, `LastReadMessageCount`) so
the UI shows the correct state
4. **Restore usage stats** (`CreatedAt`, token counts) via
`RestoreUsageStats(entry)`
5. **Sync to DB** via `BulkInsertAsync` under the new session ID
6. **Add indicator** system message: "🔄 Session recreated — conversation
history recovered from previous session."
## Duplicate Session Issue
The user also observed a duplicate session (`session-20260311-001729`)
created with the same repo as the original. This was a downstream
consequence: after seeing the original session restored with zero
messages, the user manually created a new session. With this fix, the
full history is preserved, eliminating the need for duplicates.
## Tests
Added 5 structural regression tests in `SessionPersistenceTests.cs`:
- `RestoreFallback_LoadsHistoryFromOldSession` — verifies
`LoadHistoryFromDisk` appears before `CreateSessionAsync`
- `RestoreFallback_InjectsHistoryIntoRecreatedSession` — verifies
history injection + message count
- `RestoreFallback_RestoresUsageStats` — verifies `RestoreUsageStats`
call
- `RestoreFallback_SyncsHistoryToDatabase` — verifies `BulkInsertAsync`
call
- `RestoreFallback_AddsReconnectionIndicator` — verifies system message
All 2464 tests pass.
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
arisng
pushed a commit
to arisng/PolyPilot
that referenced
this pull request
Apr 4, 2026
## Problem
When `ResumeSessionAsync` fails during app restart ("Session not found"
/ "corrupt" / "session file" errors), the fallback path called
`CreateSessionAsync` which created a **blank** session with zero
messages. The old session's `events.jsonl` was never loaded, so all
conversation history was lost.
**User impact:** Sessions appeared empty after app relaunch — the user
would see their session restored but with 0 messages, forcing them to
create duplicate sessions.
## Root Cause
Introduced in PR PureWeen#225 (commit `19219f1`) which added the "Session not
found" fallback. Worsened in PR PureWeen#308 which expanded the catch conditions
to include "corrupt" and "session file" errors. Neither PR added history
recovery to the fallback path.
`ResumeSessionAsync` (the success path) correctly loads history via
`LoadHistoryFromDisk(sessionId)`. The fallback path skipped this
entirely.
## Fix
In the fallback path of `RestorePreviousSessionsAsync`:
1. **Load history** from the old session's `events.jsonl` via
`LoadHistoryFromDisk(entry.SessionId)` before creating the new session
2. **Inject history** into the recreated session's `Info.History`
3. **Set message counts** (`MessageCount`, `LastReadMessageCount`) so
the UI shows the correct state
4. **Restore usage stats** (`CreatedAt`, token counts) via
`RestoreUsageStats(entry)`
5. **Sync to DB** via `BulkInsertAsync` under the new session ID
6. **Add indicator** system message: "🔄 Session recreated — conversation
history recovered from previous session."
## Duplicate Session Issue
The user also observed a duplicate session (`session-20260311-001729`)
created with the same repo as the original. This was a downstream
consequence: after seeing the original session restored with zero
messages, the user manually created a new session. With this fix, the
full history is preserved, eliminating the need for duplicates.
## Tests
Added 5 structural regression tests in `SessionPersistenceTests.cs`:
- `RestoreFallback_LoadsHistoryFromOldSession` — verifies
`LoadHistoryFromDisk` appears before `CreateSessionAsync`
- `RestoreFallback_InjectsHistoryIntoRecreatedSession` — verifies
history injection + message count
- `RestoreFallback_RestoresUsageStats` — verifies `RestoreUsageStats`
call
- `RestoreFallback_SyncsHistoryToDatabase` — verifies `BulkInsertAsync`
call
- `RestoreFallback_AddsReconnectionIndicator` — verifies system message
All 2464 tests pass.
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Simplify built-in multi-agent presets from 6 down to 2:
Kept
Removed
Implement & Challenge Preset
Uses OrchestratorReflect mode with two workers:
The loop iterates until the challenger is satisfied (
[[GROUP_REFLECT_COMPLETE]]), with stall detection and max 5 iterations (overridable with--max N).Users who need the removed presets can recreate them as custom presets or via
.squad/team definitions.Test Changes