Build coordinator for system-wide MSBuild node management #13335

Closed
JakeRadMSFT wants to merge 2 commits into dotnet:main from JakeRadMSFT:build-coordinator
@JakeRadMSFT commented Mar 8, 2026

Replaced by #13338 (and #13336, #13337)

Fix 3 macOS/Unix node reuse bugs:
- Node reuse handshake timeout was 0ms (poll-only), increased to 1000ms
- getsid() returns per-terminal values on Unix, breaking cross-terminal reuse; use sessionId=0
- ClientConnectTimeout was 60s, reduced to 5s
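
The session ID bug can be illustrated with a small sketch. It assumes, for illustration only, that the session ID is one of the fields a client and a node must agree on during the handshake; `handshake_key` and its fields are hypothetical and are not MSBuild's actual C# implementation. The session IDs below are made up; on a real Unix system they would come from `getsid()`.

```python
import hashlib


def handshake_key(tool_version: str, session_id: int) -> str:
    """Hypothetical handshake key: a digest over fields both sides must match."""
    payload = f"{tool_version}|{session_id}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


# Two terminals normally live in different sessions, so their session IDs differ.
terminal_a = handshake_key("17.0", session_id=4121)  # made-up session ID
terminal_b = handshake_key("17.0", session_id=5977)  # made-up session ID
keys_differ = terminal_a != terminal_b  # a node started in A never matches B

# Pinning the session ID to 0, as the fix does on Unix, makes the keys agree.
fixed_a = handshake_key("17.0", session_id=0)
fixed_b = handshake_key("17.0", session_id=0)
keys_match = fixed_a == fixed_b
```

With per-terminal session IDs the two keys never match, so a node left idle by one terminal cannot be reused from another; with `session_id=0` they match.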

Add hash-based pipe naming on Unix:
- Nodes create pipes as MSBuild-{hash}-{PID} for O(1) discovery
- Replaces probing all dotnet processes on the system
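
A minimal sketch of the naming idea, assuming pipes appear as files in a single directory and that the hash is a short digest of the handshake. The function names, the digest choice, and the hash length are hypothetical; only the `MSBuild-{hash}-{PID}` pattern comes from this PR.

```python
import hashlib
import os
from typing import List

PIPE_PREFIX = "MSBuild"


def compute_hash(handshake: str) -> str:
    """Hypothetical: a short, deterministic digest of the handshake string."""
    return hashlib.sha256(handshake.encode("utf-8")).hexdigest()[:12]


def pipe_name(handshake: str, pid: int) -> str:
    """Build a name following the MSBuild-{hash}-{PID} pattern."""
    return f"{PIPE_PREFIX}-{compute_hash(handshake)}-{pid}"


def find_nodes(pipe_dir: str, handshake: str) -> List[int]:
    """Return PIDs of nodes whose pipe name carries the matching hash."""
    prefix = f"{PIPE_PREFIX}-{compute_hash(handshake)}-"
    pids = []
    for entry in os.listdir(pipe_dir):
        suffix = entry[len(prefix):]
        # Keep only entries with the right prefix and a numeric PID suffix.
        if entry.startswith(prefix) and suffix.isdigit():
            pids.append(int(suffix))
    return sorted(pids)
```

In this sketch a client filters one directory listing by name prefix and only contacts nodes whose hash already matches, instead of attempting a handshake with every candidate process.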

Add build coordinator (dotnet msbuild --coordinator):
- System-wide node budget management across concurrent MSBuild instances
- Epoch-based heartbeat-gated promotion prevents budget overrun
- Fair-share budget with dynamic rebalancing as builds start/finish
- PID-aware staleness reaper for crashed builds
- Pipe file watchdog for self-healing
- ShutdownExcessNodes for mid-build node trimming
- Coordinator-aware node lifecycle (no-reuse on shutdown)
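
The fair-share idea can be sketched as an even split of a fixed node budget that is recomputed whenever a build registers or finishes. This is an assumption about the policy, not the coordinator's actual algorithm; `fair_share` is a hypothetical helper, and a real coordinator also needs a rule for the case where there are more builds than nodes.

```python
from typing import Dict, List


def fair_share(total_budget: int, build_ids: List[str]) -> Dict[str, int]:
    """Split total_budget nodes evenly; earlier builds absorb the remainder."""
    if not build_ids:
        return {}
    base, extra = divmod(total_budget, len(build_ids))
    shares = {}
    for index, build_id in enumerate(build_ids):
        # The first `extra` builds each get one additional node.
        shares[build_id] = base + (1 if index < extra else 0)
    return shares


# Rebalancing: recompute the shares as builds start and finish.
one_build = fair_share(8, ["a"])               # {"a": 8}
three_builds = fair_share(8, ["a", "b", "c"])  # {"a": 3, "b": 3, "c": 2}
after_a_done = fair_share(8, ["b", "c"])       # {"b": 4, "c": 4}
```

Because the shares always sum to the budget, the total node count stays bounded regardless of how many builds run at once.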

Results: 10 concurrent builds peak at 11 nodes (previously 55+), with immediate cleanup.

Add tests and tighten the node connection timeout:
- BuildCoordinator_Tests.cs: 29 tests covering coordinator protocol,
  fair-share budgeting, client retry logic, concurrent connections,
  staleness reaper, and cleanup behavior
- HashBasedPipeNaming_Tests.cs: 14 tests covering ComputeHash
  determinism/caching, GetHashBasedPipeName format, FindNodesByHandshakeHash
  file matching, and SessionId=0 fix on Unix
- Reduce DefaultNodeConnectionTimeout from 15 minutes to 30 seconds so
  idle nodes clean up promptly instead of lingering
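
One plausible reading of the PID-aware staleness reaper mentioned above, sketched under stated assumptions: each registered build is tracked by PID and last heartbeat time, and a registration is dropped when its process no longer exists or its heartbeat is too old. The data layout, the either/or rule, and both function names are hypothetical. The liveness probe is Unix-only.

```python
import os
from typing import Dict, List


def pid_alive(pid: int) -> bool:
    """Unix-only probe: signal 0 checks for existence without sending a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


def reap_stale(registrations: Dict[int, float], now: float, timeout_s: float) -> List[int]:
    """Remove registrations whose process is gone or whose heartbeat timed out.

    `registrations` maps PID -> last heartbeat timestamp (seconds).
    Returns the reaped PIDs in sorted order.
    """
    reaped = []
    for pid, last_seen in list(registrations.items()):
        if not pid_alive(pid) or now - last_seen > timeout_s:
            del registrations[pid]
            reaped.append(pid)
    return sorted(reaped)
```

Checking the PID lets the coordinator reclaim a crashed build's budget right away rather than waiting for the heartbeat timeout to expire.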
@JakeRadMSFT closed this Mar 9, 2026