Is your feature request related to a problem? Please describe.
Now that System.Net.Sockets has added a mode to inline continuations on Unix which can be enabled with export DOTNET_SYSTEM_NET_SOCKETS_INLINE_COMPLETIONS=1 (dotnet/runtime#34945), we should add a similar option for Kestrel's Socket Transport.
If you set DOTNET_SYSTEM_NET_SOCKETS_INLINE_COMPLETIONS=1 and run the TechEmpower JSON platform benchmark with Kestrel, this degrades performance by ~12% even though there's no blocking I/O. Kestrel's own scheduling seems to negate the benefits of inlining Socket completions. If we change Kestrel's Socket transport to inline its own continuations, inlining Socket completions as well yields a ~7% RPS improvement.
Here's the change I used to test inlining in Kestrel's Socket Transport:
--- a/src/Servers/Kestrel/Transport.Sockets/src/Internal/SocketConnection.cs
+++ b/src/Servers/Kestrel/Transport.Sockets/src/Internal/SocketConnection.cs
@@ -68,8 +68,8 @@ namespace Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.Internal
maxReadBufferSize ??= 0;
maxWriteBufferSize ??= 0;
- var inputOptions = new PipeOptions(MemoryPool, PipeScheduler.ThreadPool, scheduler, maxReadBufferSize.Value, maxReadBufferSize.Value / 2, useSynchronizationContext: false);
- var outputOptions = new PipeOptions(MemoryPool, scheduler, PipeScheduler.ThreadPool, maxWriteBufferSize.Value, maxWriteBufferSize.Value / 2, useSynchronizationContext: false);
+ var inputOptions = new PipeOptions(MemoryPool, PipeScheduler.Inline, PipeScheduler.Inline, maxReadBufferSize.Value, maxReadBufferSize.Value / 2, useSynchronizationContext: false);
+ var outputOptions = new PipeOptions(MemoryPool, PipeScheduler.Inline, PipeScheduler.Inline, maxWriteBufferSize.Value, maxWriteBufferSize.Value / 2, useSynchronizationContext: false);
var pair = DuplexPipe.CreateConnectionPair(inputOptions, outputOptions);
Below are the benchmark results. I included two runs for each scenario to give a rough idea of the variance.
Sockets Default + Kestrel Default: 1,168,142 RPS & 1,156,151 RPS
Sockets Inline + Kestrel Inline: 1,248,695 RPS & 1,243,950 RPS
Sockets Inline + Kestrel Default: 1,031,740 RPS & 1,022,775 RPS (this is what I measured before, and it's still worse, as expected)
Sockets Default + Kestrel Inline: 1,113,734 RPS & 1,155,986 RPS
Kestrel used to have a similar concept with KestrelServerOptions.ApplicationSchedulingMode, but that always used a "pubternal" SchedulingMode type and didn't affect transport-level scheduling post-2.0. We removed this API when refactoring Kestrel's transport abstraction in 3.0.
Describe the solution you'd like
We should add a boolean property to SocketTransportOptions that can be set to tell the Socket transport to use PipeScheduler.Inline for both the readers and writers of the Input and Output pipes instead of dispatching to the ThreadPool and IOQueue like we do by default.
Given that this can cause big problems, particularly if application code blocks, we should come up with a sufficiently scary name for this. Something along the lines of DangerousInlinePipeScheduling.
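To make the proposal concrete, here is a rough sketch of how such a flag could select the schedulers for the two pipes. It mirrors the diff above but is not the actual Kestrel code: SketchSocketTransportOptions, PipeOptionsSketch, and the ioScheduler parameter (standing in for the transport's IOQueue) are hypothetical names, and DangerousInlinePipeScheduling is only the placeholder name suggested above.

```csharp
using System.Buffers;
using System.IO.Pipelines;

// Hypothetical stand-in for SocketTransportOptions; the property name is a
// placeholder from this issue, not an existing API.
public sealed class SketchSocketTransportOptions
{
    public bool DangerousInlinePipeScheduling { get; set; }
}

public static class PipeOptionsSketch
{
    // ioScheduler is an assumption: it stands in for the transport's IOQueue.
    public static (PipeOptions Input, PipeOptions Output) Create(
        SketchSocketTransportOptions options,
        MemoryPool<byte> memoryPool,
        PipeScheduler ioScheduler,
        long maxReadBufferSize,
        long maxWriteBufferSize)
    {
        bool inline = options.DangerousInlinePipeScheduling;

        // Default: application continuations go to the ThreadPool and
        // transport continuations go to the I/O queue. Opt-in: run both inline.
        PipeScheduler appScheduler = inline ? PipeScheduler.Inline : PipeScheduler.ThreadPool;
        PipeScheduler transportScheduler = inline ? PipeScheduler.Inline : ioScheduler;

        // Input pipe: the application reads, the transport writes.
        var input = new PipeOptions(memoryPool, appScheduler, transportScheduler,
            maxReadBufferSize, maxReadBufferSize / 2, useSynchronizationContext: false);

        // Output pipe: the transport reads, the application writes.
        var output = new PipeOptions(memoryPool, transportScheduler, appScheduler,
            maxWriteBufferSize, maxWriteBufferSize / 2, useSynchronizationContext: false);

        return (input, output);
    }
}
```

With the flag off, this yields the same scheduler arrangement as the removed lines in the diff; with it on, it matches the added lines.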