fix: prevent duplicate groups via race condition in GetOrCreateRepoGroup#638
fix: prevent duplicate groups via race condition in GetOrCreateRepoGroup#638
Conversation
GetOrCreateRepoGroup and GetOrCreateLocalFolderGroup had a check-then-create pattern without holding _organizationLock. During app startup, concurrent callers (session restore on ThreadPool + ReconcileOrganization) could both see 'no existing group' and create duplicates. This caused the '4 maui groups' bug where manual cleanup was undone by every restart. Fix: wrap the entire check-then-create in lock(_organizationLock). Monitor is reentrant so nested AddGroup() calls are safe. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Design-level concerns (outside diff)🟡 MODERATE —
|
There was a problem hiding this comment.
Multi-Model Review — PR #638: Fix duplicate groups via race condition
Summary
The core fix is correct and well-targeted: wrapping check-then-create in lock(_organizationLock) prevents the duplicate group race between ReconcileOrganization (ThreadPool) and session restore. Monitor reentrancy with nested AddGroup() is sound. The fix addresses the stated bug.
However, two MODERATE issues should be resolved before merge.
Findings (ranked by severity)
| # | Severity | Consensus | Location | Issue |
|---|---|---|---|---|
| 1 | 🟡 MODERATE | 3/3 | Lines 1474–1477, 1522–1523 (in diff) | OnStateChanged?.Invoke() and SaveOrganization() called inside _organizationLock — breaks file-wide convention (all 4 Focus methods release lock before notifying), extends lock hold time with arbitrary subscriber code, and creates a latent deadlock surface |
| 2 | 🟡 MODERATE | 3/3 | Lines 1536–1588 (outside diff) | PromoteOrCreateLocalFolderGroup has the same TOCTOU pattern, entirely unprotected — concurrent callers can race on promotion, producing non-deterministic LocalPath |
| 3 | 🟢 MINOR | 3/3 | Tests (outside diff) | No concurrent stress test validates the fix; all existing tests are single-threaded |
CI Status
CI checks not available for validation.
Recommendation
REQUEST_CHANGES — The two MODERATE findings are straightforward to address:
- Move
SaveOrganization()+OnStateChanged?.Invoke()outside the lock (matching the pattern used everywhere else in the file) - Wrap
PromoteOrCreateLocalFolderGroupbody inlock(_organizationLock)for consistency
Both fixes are small, low-risk, and align with the PR author's own reasoning about Monitor reentrancy.
Generated by Expert Code Review (auto) for issue #638
| AddGroup(group); | ||
| SaveOrganization(); | ||
| OnStateChanged?.Invoke(); | ||
| return group; |
There was a problem hiding this comment.
🟡 MODERATE — OnStateChanged?.Invoke() and SaveOrganization() called inside _organizationLock (3/3 reviewers)
Every other method in this file that holds _organizationLock follows the pattern of releasing the lock before notifying — see AddToFocus (line 930), RemoveFromFocus (949), PromoteFocusSession (964), DemoteFocusSession (983):
lock (_organizationLock) { /* mutate */ }
if (changed) { SaveOrganization(); OnStateChanged?.Invoke(); }This PR inverts that established convention. OnStateChanged?.Invoke() fires synchronously on the calling thread while the lock is held, invoking all subscribers including SessionSidebar.RefreshSessions and Dashboard.RefreshState.
While there is no deadlock today (Monitor is reentrant on the same thread), this creates a latent deadlock surface: any future OnStateChanged subscriber that synchronously acquires a second lock L2 while another path holds L2 then _organizationLock would deadlock.
Fix: Keep check+create+AddGroup() inside the lock, move SaveOrganization() and OnStateChanged?.Invoke() outside:
SessionGroup? group;
lock (_organizationLock)
{
var existing = /* ... */;
if (existing != null) return existing;
// ... early-return guards ...
Organization.DeletedRepoGroupRepoIds.Remove(repoId);
group = new SessionGroup { /* ... */ };
AddGroup(group);
}
SaveOrganization();
OnStateChanged?.Invoke();
return group;Same fix applies to GetOrCreateLocalFolderGroup at line 1523.
Multi-Model Code Review — PR #638SummaryThe core fix is correct and necessary — wrapping the check-then-create pattern in CI Status: ✅ All checks passing (first commit); pending (second commit)Findings (Initial Review)
Re-Review After Fixes (3/3 reviewers)
New Findings: NoneAll 3 reviewers independently confirmed no new bugs, regressions, security issues, or data-loss risks. One reviewer noted Test Coverage
Recommendation✅ Approve — All previous findings addressed. No new issues found. The race-condition fix is correct, follows file-wide conventions, and has concurrent test coverage. |
…erGroup Address review findings: 1. Move SaveOrganization() + OnStateChanged?.Invoke() outside _organizationLock in GetOrCreateRepoGroup and GetOrCreateLocalFolderGroup — matches the file-wide convention (lock → mutate → release → notify) and eliminates latent deadlock risk from event subscribers. 2. Wrap PromoteOrCreateLocalFolderGroup in _organizationLock to fix the same TOCTOU race (concurrent callers could race on candidate promotion). 3. Add 3 concurrent stress tests validating that each method produces exactly one group under 20 parallel Task.Run callers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem
After the worktree fix (PR #527) and the previous session's cleanup script, duplicate "maui" groups kept reappearing on every app restart — 4 groups for the same repo.
Root Cause
GetOrCreateRepoGroup()andGetOrCreateLocalFolderGroup()had a check-then-create race condition. The pattern was:During startup, multiple threads call these concurrently:
ThreadPoolviaTask.RunReconcileOrganization()callsGetOrCreateRepoGroupfor each session and tracked repoBoth threads see "no existing group" and both create one → duplicates.
Fix
Wrap the entire check-then-create body in
lock(_organizationLock). C#Monitoris reentrant, so the nestedAddGroup()call (which also takes the lock) is safe.Testing
SessionOrganizationTestspassdotnet-maui-local-66cccd41repo entry