Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

[release/2.1] Ensure ConcurrentBag's TryTake is linearizable (#30947)#31009

Merged
danmoseley merged 1 commit into
dotnet:release/2.1from
stephentoub:port30947
Jul 17, 2018
Merged

[release/2.1] Ensure ConcurrentBag's TryTake is linearizable (#30947)#31009
danmoseley merged 1 commit into
dotnet:release/2.1from
stephentoub:port30947

Conversation

@stephentoub
Copy link
Copy Markdown
Member

@stephentoub stephentoub commented Jul 12, 2018

Port #30947 to release/2.1.
Fixes #30781

Note: Reverted in #31132 from 2.1.3, will be submitted again later - see #30781 for more details.

For .NET Core 2.0, I ported the ThreadPool's work-stealing implementation to ConcurrentBag, leading to significant performance throughput and allocation improvements.  However, there's a subtle difference in the concurrency guarantees the ThreadPool's implementation provided from what ConcurrentBag needs, which ends up breaking certain usage patterns on top of ConcurrentBag.

Specifically, ThreadPool's "steal" implementation need not be fully linearizable.  It's possible for a thread to see the bag's count as 1, and then while the thread is doing a take/steal for its count to never drop below 1, but for the steal to still fail, even though there was always an item available.  This is ok for the thread pool because it manages a known count of work items in the queues separately, and if it sees that there are still items available after a steal has failed, it'll try again.  That "try again" logic provided above the work-stealing queue thus didn't make it over to ConcurrentBag, which breaks some usages of ConcurrentBag, in particular cases where a type like BlockingCollection is wrapping the bag and managing its own count.  It's possible now for BlockingCollection to know that there's an item in the bag but to then fail to take it, which causes problems such as exceptions being thrown.

The fix is to port back the relevant portion of ConcurrentBag from .NET Core 1.x / .NET Framework, where local push operations on a list track the number of times the list transitions from empty to non-empty.  A steal operation then looks at those counts prior to doing the steal, and if the steal fails, it looks again after: if the count has increased, it retries.  This unfortunately means that local pushes on small lists are now more expensive than in .NET Core 2.0/2.1, as if there are <= 2 items in the list, it takes the lock, but this seems unavoidable given the work-stealing design.
@stephentoub stephentoub added area-System.Collections Servicing-consider Issue for next servicing release review labels Jul 12, 2018
@stephentoub stephentoub added this to the 2.1.x milestone Jul 12, 2018
@stephentoub stephentoub added the * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) label Jul 12, 2018
@danmoseley
Copy link
Copy Markdown
Member

approved

@danmoseley danmoseley merged commit b6be5d7 into dotnet:release/2.1 Jul 17, 2018
@danmoseley danmoseley removed the * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) label Jul 17, 2018
@danmoseley
Copy link
Copy Markdown
Member

Pulling this temporarily as it missed 2.1.3 and they want a clean branch in case they need to rebuild 2.1.3 for some reason.

danmoseley added a commit that referenced this pull request Jul 17, 2018
danmoseley added a commit that referenced this pull request Jul 18, 2018
@danmoseley danmoseley removed the Servicing-consider Issue for next servicing release review label Aug 17, 2018
@stephentoub stephentoub deleted the port30947 branch September 24, 2018 14:54
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants