Skip to content

fix: bridge SendMessage bypasses orchestration routing for multi-agent groups#297

Merged
PureWeen merged 3 commits intomainfrom
fix/bridge-orchestration-routing
Mar 7, 2026
Merged

fix: bridge SendMessage bypasses orchestration routing for multi-agent groups#297
PureWeen merged 3 commits intomainfrom
fix/bridge-orchestration-routing

Conversation

@PureWeen
Copy link
Copy Markdown
Owner

@PureWeen PureWeen commented Mar 7, 2026

Problem

When sending a message to an orchestrator session from the mobile app (via WebSocket bridge), the message went directly to instead of through the multi-agent dispatch pipeline. The orchestrator responded as a normal chat session — no worker dispatch, no task planning, no synthesis.

This was the root cause of the PR Review Squad orchestrator not working after the latest deploy.

Root Cause

WsBridgeServer.HandleClientMessage called SendPromptAsync directly via Task.Run, completely bypassing the orchestration routing that exists in Dashboard.razor. The bridge had no awareness of orchestrator sessions.

Evidence from diagnostics:

  • Successful orchestrator sends: thread=1 (UI thread, via Dashboard.razor → orchestration pipeline)
  • Failed sends at 02:00/02:04: thread=44/65 (background, from Task.Run in WsBridgeServer)
  • No [DISPATCH-ROUTE] or [DISPATCH] logs — orchestration pipeline never executed

Fix

  1. WsBridgeServer.cs — Now calls GetOrchestratorGroupId() before sending. If the target session is an orchestrator, routes through SendToMultiAgentGroupAsync instead of direct SendPromptAsync.
  2. CopilotService.cs (CreateSessionAsync queue drain) — Same pattern for newly created orchestrator sessions.
  3. RightClickContextMenuTests.cs — Fix pre-existing test failure (proximity threshold too tight after new attributes were added to the session-item div).

Tests

@PureWeen
Copy link
Copy Markdown
Owner Author

PureWeen commented Mar 7, 2026

PR #297 Review -- 5-Model Consensus Report

PR: fix: bridge SendMessage bypasses orchestration routing for multi-agent groups
CI Status: ⚠️ No checks configured (pre-existing)
Models: claude-opus-4.6 (×2), claude-sonnet-4.6, gemini-3-pro-preview, gpt-5.3-codex
Consensus filter: 2+ models


Findings

🔴 HIGH -- Thread-unsafe read of Organization.Sessions in GetOrchestratorGroupId (2+ models)

WsBridgeServer.cs / CopilotService.Organization.cs -- GetOrchestratorGroupId() reads Organization.Sessions (plain List) on the WebSocket background thread. Organization.Sessions is UI-thread-only. Concurrent enumeration + mutation throws InvalidOperationException or corrupts internal array.

Fix: Marshal the call to UI thread via InvokeOnUIAsync, or take a .ToArray() snapshot under lock, or cache the orchestrator group ID at session creation time.

🔴 HIGH -- SendToMultiAgentGroupAsync reads Organization state from thread pool (2+ models)

CopilotService.Organization.cs -- SendToMultiAgentGroupAsync is called from Task.Run (via WsBridgeServer). It reads Organization.Groups and Organization.Sessions without marshaling to UI thread. Same thread-safety issue as above.

Fix: Marshal the entire method body to the UI thread, or snapshot the needed data (group members list) on the UI thread before dispatching to background.

🟡 MEDIUM -- Silent orchestration bypass when images are present (2+ models)

When imagePaths is non-empty, the code silently falls back to direct SendPromptAsync instead of routing through the orchestrator. No user feedback is shown -- the message just goes to the wrong session. Users sending images in a multi-agent group won't understand why the orchestrator didn't route their message.

Fix: Show a toast/system message like 'Images sent directly (orchestration routing does not support images yet)' so users understand the behavior.


Recommended Action: ⚠️ Request Changes

The two HIGH thread-safety issues must be fixed before merge. The Organization.Sessions and Organization.Groups lists are plain List -- accessing them from the WS background thread is a data race that can crash the app or corrupt state.

@PureWeen
Copy link
Copy Markdown
Owner Author

PureWeen commented Mar 7, 2026

PR #297 Re-Review — 5-Model Consensus Report

PR: fix: bridge SendMessage bypasses orchestration routing for multi-agent groups
Commit: 4fb9768c (unchanged since last review)
CI Status: ⚠️ No checks configured (pre-existing)
Models: claude-opus-4.6 (×2), claude-sonnet-4.6, gemini-3-pro-preview, gpt-5.3-codex
Consensus filter: 2+ models must flag independently


Previous Findings — All 3 Confirmed STILL PRESENT (5/5 models)

🔴 HIGH — Thread-unsafe read of Organization.Sessions in GetOrchestratorGroupId (5/5 models)

WsBridgeServer.cs:~395_copilot.GetOrchestratorGroupId(sendSession) is called on the WebSocket background thread. This reads Organization.Sessions (plain List<SessionMeta>) which is only safe on the UI thread. Moving it before Task.Run (vs. inside) changes which background thread reads the list, not whether it is safe. Concurrent UI-thread mutations (CreateGroup, ReconcileOrganization, etc.) can cause InvalidOperationException or corrupted reads.

Fix: Wrap in await InvokeOnUIAsync(() => _copilot.GetOrchestratorGroupId(sendSession)), or snapshot the needed data on the UI thread before dispatching.

🔴 HIGH — SendToMultiAgentGroupAsync reads Organization state from thread pool (5/5 models)

WsBridgeServer.cs:~402SendToMultiAgentGroupAsync runs inside Task.Run on a thread-pool thread. It internally reads Organization.Groups and Organization.Sessions — both plain List<T> with UI-thread-only access contracts. Same data race as Finding 1, deeper in the call stack.

Fix: Marshal the orchestration decision + dispatch to the UI thread as a single atomic operation, or snapshot the member list on the UI thread before dispatching to background.

🟡 MODERATE — Silent orchestration bypass when images are present (5/5 models)

CopilotService.cs:~1701 — When nextImagePaths is non-empty, the code silently falls back to SendPromptAsync (direct to orchestrator) instead of routing through the multi-agent pipeline. No user feedback explains why orchestration was skipped. The bridge path in WsBridgeServer.cs does not check for images at all — if image support is added to the bridge protocol later, orchestration will silently not work.

Fix: Add a system message or toast: "Images sent directly — orchestration routing does not support images yet".


New Findings (2+ model consensus)

🟡 MODERATE — TOCTOU race: orchestration decision computed before execution (3/5 models)

WsBridgeServer.cs:~395-402orchGroupId is evaluated eagerly outside Task.Run, then consumed inside the lambda. If group membership changes between the check and the actual SendToMultiAgentGroupAsync call, the message routes down the wrong path. The orchestration decision and execution should be atomic.

🟡 MODERATE — agentMode silently dropped on orchestration path (2/5 models)

WsBridgeServer.cs:~402, CopilotService.cs:~1701 — When routing through orchestration, SendToMultiAgentGroupAsync(orchGroupId, sendMessage, ct) does not forward sendAgentMode / nextAgentMode. If the caller specified an agent mode, it is silently lost on the orchestration path.

🟢 MINOR — RightClickContextMenuTests assertion weakened (3/5 models)

RightClickContextMenuTests.cs:69 — Character proximity threshold doubled from 200→400. This is a fragile structural assertion that gets weaker over time — another attribute addition could break it again. Low risk but noted.


Test Quality Assessment (consensus)

  • Functional logic coverage is good: 4 unit tests thoroughly validate GetOrchestratorGroupId across orchestrator-reflect, sequential, non-multi-agent, and PR Review Squad scenarios.
  • Integration tests exist but are weak regression guards: The 3 bridge tests verify messages appear in session history, but both SendPromptAsync and SendToMultiAgentGroupAsync produce the same observable effect in demo mode. The tests would pass even if the orchestration routing branch were deleted entirely (2/5 models flagged this).
  • Thread safety is untested: No test exercises concurrent access to Organization.Sessions. The two 🔴 HIGH bugs cannot be caught by the current test suite.
  • Image fallback path is untested: No test covers the case where images cause orchestration bypass.

Recommended Action: ⚠️ Request Changes

The two 🔴 HIGH thread-safety issues from the first review remain unaddressed. Specific asks:

  1. Wrap the orchestration check + dispatch in InvokeOnUIAsync (or snapshot Organization state on UI thread before entering Task.Run)
  2. Forward agentMode through the orchestration path (or document why it is intentionally dropped)
  3. Add user feedback for the image fallback path (even a Console.WriteLine log would be better than silence)

The core bug fix (routing bridge sends through orchestration) is correct in intent. The thread-safety issues are the blocker.

@PureWeen PureWeen force-pushed the fix/bridge-orchestration-routing branch from b3348dd to 90e6d86 Compare March 7, 2026 15:17
@PureWeen
Copy link
Copy Markdown
Owner Author

PureWeen commented Mar 7, 2026

PR #297 — Re-Review Round 3 (5-Model Consensus)

Commit: 4fb9768c (unchanged)

Previous Findings Status

# Sev Finding Status Agreement
1 🔴 HIGH GetOrchestratorGroupId reads Organization.Sessions on WS bg thread STILL PRESENT 5/5
2 🔴 HIGH SendToMultiAgentGroupAsync from Task.Run without UI-thread marshaling STILL PRESENT 5/5
3 🟡 MOD Silent orchestration bypass when images present STILL PRESENT 5/5
4 🟡 MOD TOCTOU race in orchestrator group ID lookup STILL PRESENT 5/5
5 🟡 MOD Dropped agentMode parameter STILL PRESENT 5/5

Required Fix

Marshal GetOrchestratorGroupId + SendToMultiAgentGroupAsync to UI thread (or snapshot under lock).

Verdict: ⚠️ Request Changes — 3rd review pass, all findings confirmed by 5 models unanimously.

PureWeen and others added 3 commits March 7, 2026 14:29
…t groups

When a mobile client sent a message to an orchestrator session via the
WebSocket bridge, WsBridgeServer.HandleClientMessage called SendPromptAsync
directly, bypassing the multi-agent dispatch pipeline. The orchestrator
responded as a normal chat session instead of planning and dispatching
tasks to workers.

Root cause: WsBridgeServer had no awareness of orchestrator sessions.
All bridge sends went through SendPromptAsync regardless of session role.

Fix:
- WsBridgeServer now calls GetOrchestratorGroupId before sending. If the
  target session is an orchestrator, routes through SendToMultiAgentGroupAsync.
- CreateSessionAsync queue drain also checks for orchestrator routing.
- Fix pre-existing RightClickContextMenuTests failure (proximity threshold).

Evidence from diagnostics: successful orchestrator sends ran on thread=1
(UI thread, via Dashboard.razor). Failed sends at 02:00/02:04 ran on
thread=44/65 (background, from Task.Run in WsBridgeServer) with no
DISPATCH-ROUTE or DISPATCH logs — orchestration pipeline never executed.

Tests: 7 new tests (4 unit + 3 integration), all 2066 passing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- WsBridgeServer: Replace Task.Run with InvokeOnUIAsync so
  GetOrchestratorGroupId + SendToMultiAgentGroupAsync run atomically
  on UI thread (fixes thread-unsafe Organization reads and TOCTOU race)
- Dashboard.razor + CopilotService.cs: Add system message warning when
  orchestrator has images and falls back to direct send
- agentMode is now correctly forwarded on the non-orchestrator path

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…and create-drain paths

Mirror Dashboard.razor's AutoStartReflectionIfNeeded behavior in WsBridgeServer and
the CreateSessionAsync queue-drain path. Previously, messages to OrchestratorReflect
groups via the mobile bridge silently fell back to single-pass Orchestrator mode because
StartGroupReflection was never called (no ReflectionState was initialized).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen force-pushed the fix/bridge-orchestration-routing branch from 90e6d86 to 8b76099 Compare March 7, 2026 21:29
@PureWeen
Copy link
Copy Markdown
Owner Author

PureWeen commented Mar 7, 2026

PR #297 — Final Re-Review (Round 4, 3-Model Consensus)

Commit: \8b76099\ (3rd commit — fix StartGroupReflection for OrchestratorReflect)
CI Status: ⚠️ No checks configured (pre-existing)
Models: claude-opus-4.6 (×2), gemini-3-pro-preview
Test results: 2022/2030 non-integration tests passing (8 failures are pre-existing: symlink tests + DB resilience tests)


All Previous Findings — Status

# Sev Finding Status
1 🔴 \GetOrchestratorGroupId\ reads \Organization.Sessions\ on WS bg thread ✅ FIXED
2 🔴 \SendToMultiAgentGroupAsync\ from \Task.Run\ ✅ FIXED
3 🟡 Silent orchestration bypass when images present ✅ FIXED
4 �� TOCTOU race in orchestrator group ID lookup ✅ FIXED
5 🟡 \�gentMode\ parameter dropped ✅ BY DESIGN (consistent with Dashboard.razor)
6 🟡 Bridge path skips \StartGroupReflection\ for OrchestratorReflect ✅ FIXED (commit 3)
7 🟡 CreateSessionAsync drain skips \StartGroupReflection\ ✅ FIXED (commit 3)

New Findings (3-model consensus)

None. The fix is clean and complete.

One model (Gemini) flagged thread-safety issues at WsBridgeServer.cs:612 (\MultiAgentBroadcast) and :621 (\MultiAgentCreateGroup) — but these are pre-existing code paths not modified by this PR, and no other model agreed (1/3 models, below the 2-model consensus threshold).

Summary of Applied Fixes

  • Commit 2 (\90e6d86): Replaced \Task.Run\ with \InvokeOnUIAsync\ for atomic UI-thread dispatch of \GetOrchestratorGroupId\ + \SendToMultiAgentGroupAsync. Added system message warning for image-bypass fallback in both Dashboard.razor and the CreateSessionAsync drain path.

  • Commit 3 (\8b76099): Mirror \AutoStartReflectionIfNeeded\ behavior in WsBridgeServer and the CreateSessionAsync drain path — call \StartGroupReflection\ before \SendToMultiAgentGroupAsync\ when the group is \OrchestratorReflect\ mode, so the reflection loop starts correctly for bridge-originated messages.

Verdict: ✅ Ready to merge

All 7 previously-reported findings are resolved. The fix correctly routes bridge sends through the multi-agent orchestration pipeline with proper thread safety, image fallback feedback, TOCTOU elimination, and OrchestratorReflect support.

@PureWeen PureWeen merged commit 18b0af8 into main Mar 7, 2026
@PureWeen PureWeen deleted the fix/bridge-orchestration-routing branch March 7, 2026 22:31
arisng pushed a commit to arisng/PolyPilot that referenced this pull request Apr 4, 2026
…t groups (PureWeen#297)

## Problem

When sending a message to an orchestrator session from the **mobile
app** (via WebSocket bridge), the message went directly to instead of
through the multi-agent dispatch pipeline. The orchestrator responded as
a normal chat session — no worker dispatch, no task planning, no
synthesis.

This was the root cause of the PR Review Squad orchestrator not working
after the latest deploy.

## Root Cause

`WsBridgeServer.HandleClientMessage` called `SendPromptAsync` directly
via `Task.Run`, completely bypassing the orchestration routing that
exists in `Dashboard.razor`. The bridge had no awareness of orchestrator
sessions.

**Evidence from diagnostics:**
- Successful orchestrator sends: thread=1 (UI thread, via
Dashboard.razor → orchestration pipeline)
- Failed sends at 02:00/02:04: thread=44/65 (background, from `Task.Run`
in WsBridgeServer)
- No `[DISPATCH-ROUTE]` or `[DISPATCH]` logs — orchestration pipeline
never executed

## Fix

1. **`WsBridgeServer.cs`** — Now calls `GetOrchestratorGroupId()` before
sending. If the target session is an orchestrator, routes through
`SendToMultiAgentGroupAsync` instead of direct `SendPromptAsync`.
2. **`CopilotService.cs`** (CreateSessionAsync queue drain) — Same
pattern for newly created orchestrator sessions.
3. **`RightClickContextMenuTests.cs`** — Fix pre-existing test failure
(proximity threshold too tight after new attributes were added to the
session-item div).

## Tests

- 4 new unit tests in `MultiAgentRegressionTests.cs` (Bug PureWeen#8 region)
- 3 new integration tests in `WsBridgeIntegrationTests.cs` (bridge
orchestration routing)
- Fix for 1 pre-existing test failure
- **All 2066 tests passing**

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant