Fix flaky TestResourceManagerIsSafeForConcurrentAccessAndEnumeration timeout #125573

Merged
danmoseley merged 1 commit into dotnet:main from lewing:fix/resource-manager-test-timeout
Mar 16, 2026

Conversation

@lewing
Member

@lewing lewing commented Mar 15, 2026

Problem

TestResourceManagerIsSafeForConcurrentAccessAndEnumeration is flaky (#125448, 3 hits in 7 days). This same test has been filed as a Known Build Error three times over 3+ years (#80277, #86013, #125448).

The test spins up 10 threads that concurrently enumerate a resource set with Thread.Sleep(1) between entries, then waits with:

Assert.True(Task.WaitAll(tasks, TimeSpan.FromSeconds(30)));

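On loaded CI agents, 30 seconds isn't always enough. The failure message is unhelpful (just Assert.True() Failure: Expected: True, Actual: False) because when Task.WaitAll times out it returns false without surfacing exceptions from any tasks that have already faulted, so a real thread-safety failure is indistinguishable from a slow run.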
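To illustrate the diagnostic gap described above, here is a minimal standalone sketch, not the test itself. The class name WaitAllTimeoutDemo and its Run method are hypothetical; the pre-faulted task and the gated task stand in for a worker that hit a bug and a worker that is merely slow.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitAllTimeoutDemo
{
    // Returns what Task.WaitAll reports when one task has already faulted
    // and another is still running when the timeout expires.
    public static bool Run()
    {
        // Stand-in for a worker that hit a real bug.
        Task faulted = Task.FromException(
            new InvalidOperationException("simulated thread-safety bug"));

        // Stand-in for a slow worker: stays incomplete until released.
        var gate = new TaskCompletionSource<bool>();

        // The timeout elapses first, so WaitAll returns false and does not
        // throw, even though 'faulted' carries an exception.
        bool finished = Task.WaitAll(
            new Task[] { faulted, gate.Task },
            TimeSpan.FromMilliseconds(100));

        gate.SetResult(true); // release the slow task so nothing stays pending
        return finished;
    }
}
```

Wrapping that return value in Assert.True is what produces the bare "Expected: True, Actual: False" message with no hint of the underlying exception.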

Fix

Task allTasks = Task.WhenAll(tasks);
Task completed = await Task.WhenAny(allTasks, Task.Delay(TimeSpan.FromSeconds(120)));
Assert.True(completed == allTasks, "Timed out waiting for concurrent enumeration tasks");
await allTasks; // propagates any real exceptions

Three improvements:

  1. Task.WhenAll instead of Task.WaitAll — if a real thread-safety bug is hit, the actual exception propagates instead of being swallowed behind Assert.True(false)
  2. 120s instead of 30s — 4x more headroom for loaded CI agents while still providing a timeout safety net
  3. Clear timeout message — on timeout, the assertion message says what happened instead of a bare Assert.True failure

The WhenAny/Task.Delay pattern is used instead of Task.WhenAll(...).WaitAsync(...) for .NET Framework TFM compatibility. The method signature changes from void to async Task, which xUnit handles natively.
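The pattern can also be sketched as a reusable helper. This is an illustration under assumptions, not code from this PR: the names TimeoutWaitSketch and WaitAllOrTimeoutAsync are hypothetical, and the helper returns a bool rather than asserting so that it stays independent of xUnit.

```csharp
using System;
using System.Threading.Tasks;

public static class TimeoutWaitSketch
{
    // Hypothetical helper: returns true if every task finished within the
    // timeout and false if the timeout elapsed first. If the tasks finished
    // and any of them faulted, awaiting the aggregate rethrows the failure.
    public static async Task<bool> WaitAllOrTimeoutAsync(Task[] tasks, TimeSpan timeout)
    {
        Task allTasks = Task.WhenAll(tasks);

        // WhenAny completes as soon as either the aggregate or the delay does.
        Task completed = await Task.WhenAny(allTasks, Task.Delay(timeout)).ConfigureAwait(false);
        if (completed != allTasks)
        {
            return false; // timed out; the caller decides how to report it
        }

        await allTasks.ConfigureAwait(false); // propagates any real exception
        return true;
    }
}
```

A test could then write Assert.True(await TimeoutWaitSketch.WaitAllOrTimeoutAsync(tasks, TimeSpan.FromSeconds(120)), "Timed out waiting for concurrent enumeration tasks"). One trade-off of this shape is that the Task.Delay timer keeps running until it expires even when the tasks finish early; passing it a CancellationToken and cancelling after WhenAny returns would avoid that.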

Fixes #125448

@dotnet-policy-service
Contributor

Tagging subscribers to this area: @dotnet/area-system-resources
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

This PR aims to de-flake PreserializedResourceWriterTests.TestResourceManagerIsSafeForConcurrentAccessAndEnumeration by changing how the test waits for its concurrent worker tasks to complete. It increases the timeout headroom and is intended to improve how exceptions surface when the test fails.

Changes:

  • Converted the test from void to async Task.
  • Replaced Task.WaitAll(..., 30s) with await Task.WhenAll(...).WaitAsync(120s).

…timeout

The test uses Task.WaitAll with a 30-second timeout to wait for 10
threads concurrently enumerating resources. On loaded CI agents, 30s
is not always enough, causing Assert.True(false) with no useful
diagnostic.

Switch to await Task.WhenAll().WaitAsync(120s):
- WhenAll propagates actual thread-safety exceptions instead of
  swallowing them behind Assert.True(false)
- 120s gives 4x more headroom for slow CI agents while still
  providing a timeout safety net

This test has been filed as a Known Build Error three times (dotnet#80277,
dotnet#86013, dotnet#125448) across 3+ years.

Fixes dotnet#125448

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lewing lewing force-pushed the fix/resource-manager-test-timeout branch from 7f04738 to edc976e Compare March 15, 2026 04:06
Member Author

@lewing lewing left a comment

Fixed: replaced Task.WhenAll(...).WaitAsync(120s) with the Task.WhenAny(Task.WhenAll(tasks), Task.Delay(120s)) pattern, which compiles on all TFMs including $(NetFrameworkCurrent). Thanks for catching that!

@lewing lewing requested a review from Copilot March 15, 2026 04:08
Contributor

Copilot AI left a comment

Pull request overview

Reduces flakiness in TestResourceManagerIsSafeForConcurrentAccessAndEnumeration by increasing the timeout and improving exception propagation when concurrent enumeration fails.

Changes:

  • Makes the test method async Task to enable async waiting patterns in xUnit.
  • Replaces Task.WaitAll(..., 30s) with an async timeout-based wait around Task.WhenAll(...).
  • Adds a clearer assertion message on timeout and awaits the aggregated task to propagate real exceptions.

@danmoseley
Member

/ba-g generic chrome crash.

@danmoseley
Member

merging -- i did /ba-g and it still won't pass 5 mins later.

@danmoseley
Member

ah, I apparently don't have the power 😐

@danmoseley
Member

/ba-g chrome generic crash.

@danmoseley danmoseley merged commit 8e05ac9 into dotnet:main Mar 16, 2026
90 of 95 checks passed
Development

Successfully merging this pull request may close these issues.

System.Resources.Extensions.Tests.PreserializedResourceWriterTests.TestResourceManagerIsSafeForConcurrentAccessAndEnumeration