Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

[WIP] Lazy alloc ThreadPool local queues#21695

Closed
benaadams wants to merge 3 commits into
dotnet:masterfrom
benaadams:lazy-local-tpqueues
Closed

[WIP] Lazy alloc ThreadPool local queues#21695
benaadams wants to merge 3 commits into
dotnet:masterfrom
benaadams:lazy-local-tpqueues

Conversation

@benaadams
Copy link
Copy Markdown
Member

@benaadams benaadams commented Dec 28, 2018

Defer local queue creation (and add to list of local queues to steal from) to first local item queued to that thread rather than thread creation.

This means if 300 threadpool threads are active but no local queuing has happened; there will be no time spent checking if there is something to steal from other threads.

This is using an assumption that for the scenario outlined in https://github.com/dotnet/coreclr/issues/19088; that if you are using a MinWorkerThreads an order of magnitude higher than CPU count (x10 - x20) to get work done then you are less likely to be queuing local items to these 160-320 threads; so adding their queues to the list of queues to check can be deferred until something is queued.

Also am only going for "Contributes to" rather than "Resolves" as it could quickly return to every thread having a queue based on the workload (e.g. releases of blocked SemaphoreSlim.WaitAsync which queue locally)

Should also improve the performance for apps that use the threadpool, but only use the global queue and never end up queuing items to the thread's local queues for the app lifetime - though not sure how common that would be?*

*Could perhaps remove a local queue if its not had anything on it for a while.

Contributes to https://github.com/dotnet/coreclr/issues/19088

/cc @stephentoub @vancem @kouvel @VSadov

@stephentoub
Copy link
Copy Markdown
Member

stephentoub commented Dec 28, 2018

Contributes to #19088

How? Under the assumption that most pool threads never have items queued and thus don't need to be checked for steals?

@benaadams
Copy link
Copy Markdown
Member Author

benaadams commented Dec 28, 2018

Under the assumption that most pool threads never have items queued and thus don't need to be checked for steals?

Yes, more or less.

Under the assumption that if you are using a MinWorkerThreads an order of magnitude higher than CPU count (x10 - x20) to get work done then you are less likely to be queuing local items to these 160-320 threads; so adding their queues to the list of queues to check can be deferred until something is queued.

Deferring the queue creation to first queue on that thread rather than thread creation shouldn't be a significant performance impact to regular behaviour?

@benaadams
Copy link
Copy Markdown
Member Author

Also am only going for "Contributes to" rather than "Resolves" as it could quickly return to every thread having a queue based on the workload.

@stephentoub
Copy link
Copy Markdown
Member

stephentoub commented Dec 28, 2018

Deferring the queue creation to first queue on that thread rather than thread creation shouldn't be a significant performance impact to regular behaviour?

I don't know. Hopefully not, but there is an extra null check / branch now on every local queuing.

@benaadams benaadams changed the title [Wip] Lazy alloc ThreadPool local queues Lazy alloc ThreadPool local queues Dec 28, 2018
@benaadams benaadams closed this Dec 28, 2018
@benaadams benaadams reopened this Dec 28, 2018
@benaadams benaadams force-pushed the lazy-local-tpqueues branch 2 times, most recently from 5f199d8 to bac0251 Compare December 28, 2018 21:56
@benaadams
Copy link
Copy Markdown
Member Author

Updated summary

@stephentoub
Copy link
Copy Markdown
Member

if you are using a MinWorkerThreads an order of magnitude higher than CPU count (x10 - x20) to get work done then you are less likely to be queuing local items to these 160-320 threads

I'm not understanding this part. Why is it less likely? You're assuming some kind of heterogeneous workload where only some work items queue locally and they somehow aren't involved in bursty work? I'd guess that if any pool thread had it's queue Initialized, most would.

I see the value of a change like this, but primarily only if nothing in the app ever queues to a local queue, such that no local queues are Initialized. Is that the case in a bare-bones kestrel app? MVC? Typical app that lead to #19088?

If there's no measurable cost to doing this, then ok. I'm just skeptical it'll have the desired impact on real apps. Happy to be proven wrong.

@benaadams
Copy link
Copy Markdown
Member Author

benaadams commented Dec 29, 2018

Is that the case in a bare-bones kestrel app? MVC?

Yes(ish) currently; though there are many paths that end up queuing locally and the local queues are a good mechanism. The main issue I see (with this change) is they only ratchet up, so once you have a local queue its always there to be checked.

@benaadams
Copy link
Copy Markdown
Member Author

Is that the case in a bare-bones kestrel app?

The Platform benchmarks do occasionally from what looks to be a TCS with RunContinuationsAsynchronously

Creating Local Queue
   at System.Threading.ThreadPoolWorkQueue.CreateLocalQueue(ThreadPoolWorkQueueThreadLocals locals)
   at System.Threading.ThreadPoolWorkQueue.Enqueue(Object callback, Boolean forceGlobal)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AwaitUnsafeOnCompleted[TAwaiter,TStateMachine](TAwaiter& awaiter, TStateMachine& stateMachine)
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.SocketTransport.HandleConnectionAsync(SocketConnection connection)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.MoveNext(Thread threadPoolThread)
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(IAsyncStateMachineBox box, Boolean allowInlining)
   at System.Threading.Tasks.Task.RunContinuations(Object continuationObject)
   at System.Threading.Tasks.Task`1.TrySetResult(TResult result)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.SetExistingTaskResult(TResult result)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.SetResult()
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.Internal.SocketConnection.StartAsync()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.MoveNext(Thread threadPoolThread)
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(IAsyncStateMachineBox box, Boolean allowInlining)
   at System.Threading.Tasks.Task.RunContinuations(Object continuationObject)
   at System.Threading.Tasks.Task`1.TrySetResult(TResult result)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.SetExistingTaskResult(TResult result)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.SetResult()
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.Internal.SocketConnection.DoSend()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.MoveNext(Thread threadPoolThread)
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(IAsyncStateMachineBox box, Boolean allowInlining)
   at System.Threading.Tasks.Task.RunContinuations(Object continuationObject)
   at System.Threading.Tasks.Task`1.TrySetResult(TResult result)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.SetExistingTaskResult(TResult result)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder.SetResult()
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.Internal.SocketConnection.ProcessSends()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1.AsyncStateMachineBox`1.MoveNext(Thread threadPoolThread)
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.Internal.IOQueue.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

Although its only occasional, the main issue is the queues ratchet up, but never down (in this PR).

@benaadams benaadams changed the title Lazy alloc ThreadPool local queues [WIP] Lazy alloc ThreadPool local queues Dec 29, 2018
@benaadams
Copy link
Copy Markdown
Member Author

I think can just go full on, and use idle time to remove the queues #21713

@benaadams benaadams closed this Dec 30, 2018
@benaadams benaadams deleted the lazy-local-tpqueues branch December 30, 2018 02:13
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants