Skip to content
This repository was archived by the owner on Dec 18, 2018. It is now read-only.

Making performance improvements to sockets#2366

Closed
davidfowl wants to merge 2 commits into
devfrom
davidfowl/sockets-perf
Closed

Making performance improvements to sockets#2366
davidfowl wants to merge 2 commits into
devfrom
davidfowl/sockets-perf

Conversation

@davidfowl
Copy link
Copy Markdown
Member

  • Run SocketAwaitable continuations asynchronously for reads (Didn't seem to make much of a difference for writes)
  • Run using the global thread pool queues instead
    of using local queues.

Theories:

  • We're freeing up the IO thread immediately to do more work (get more completions) so we can do more work if we dispatch.
  • Queuing to the local queue incurred more work stealing instead of the current thread picking it up to do more.

Plaintext Linux

Libuv:

Thread count default

[02:36:04.381] RequestsPerSecond:           1552129
[02:36:04.381] Latency on load (ms):        2.9
[02:36:04.381] Max CPU (%):                 98
[02:36:04.381] WorkingSet (MB):             403
[02:36:04.382] Startup Main (ms):           492
[02:36:04.382] First Request (ms):          192.4
[02:36:04.382] Latency (ms):                0.7
[02:36:04.382] Socket Errors:               0
[02:36:04.382] Bad Responses:               0
[02:36:04.382] SDK:                         2.2.0-preview1-007522
[02:36:04.382] Runtime:                     2.1.0-preview2-26225-03
[02:36:04.382] ASP.NET Core:                2.1.0-preview2-30242

Thread count 2

[02:40:17.288] RequestsPerSecond:           1591637
[02:40:17.288] Latency on load (ms):        1.6
[02:40:17.288] Max CPU (%):                 97
[02:40:17.288] WorkingSet (MB):             405
[02:40:17.288] Startup Main (ms):           477
[02:40:17.288] First Request (ms):          191
[02:40:17.288] Latency (ms):                0.8
[02:40:17.288] Socket Errors:               0
[02:40:17.288] Bad Responses:               0
[02:40:17.289] SDK:                         2.2.0-preview1-007522
[02:40:17.289] Runtime:                     2.1.0-preview2-26225-03
[02:40:17.289] ASP.NET Core:                2.1.0-preview2-30242

Sockets:

Before:

[02:42:05.259] RequestsPerSecond:           1450885
[02:42:05.259] Latency on load (ms):        2.5
[02:42:05.259] Max CPU (%):                 99
[02:42:05.259] WorkingSet (MB):             407
[02:42:05.259] Startup Main (ms):           407
[02:42:05.260] First Request (ms):          192.3
[02:42:05.260] Latency (ms):                0.9
[02:42:05.260] Socket Errors:               0
[02:42:05.260] Bad Responses:               0
[02:42:05.260] SDK:                         2.2.0-preview1-007522
[02:42:05.260] Runtime:                     2.1.0-preview2-26225-03
[02:42:05.260] ASP.NET Core:                2.1.0-preview2-30242

After:

[02:43:48.984] RequestsPerSecond:           1534075
[02:43:48.984] Latency on load (ms):        1.5
[02:43:48.984] Max CPU (%):                 99
[02:43:48.984] WorkingSet (MB):             410
[02:43:48.984] Startup Main (ms):           410
[02:43:48.984] First Request (ms):          189
[02:43:48.985] Latency (ms):                0.9
[02:43:48.985] Socket Errors:               0
[02:43:48.985] Bad Responses:               0
[02:43:48.985] SDK:                         2.2.0-preview1-007522
[02:43:48.985] Runtime:                     2.1.0-preview2-26225-03
[02:43:48.985] ASP.NET Core:                2.1.0-preview2-30242

Plaintext Windows

Libuv:

Thread count default

[02:31:14.077] RequestsPerSecond:           1569046
[02:31:14.077] Latency on load (ms):        3.6
[02:31:14.077] Max CPU (%):                 92
[02:31:14.078] WorkingSet (MB):             385
[02:31:14.078] Startup Main (ms):           440
[02:31:14.078] First Request (ms):          143.9
[02:31:14.078] Latency (ms):                0.8
[02:31:14.078] Socket Errors:               0
[02:31:14.078] Bad Responses:               0
[02:31:14.078] SDK:                         2.2.0-preview1-007522
[02:31:14.078] Runtime:                     2.1.0-preview2-26225-03
[02:31:14.078] ASP.NET Core:                2.1.0-preview2-30242

Thread count 2

[02:38:43.195] RequestsPerSecond:           1833182
[02:38:43.195] Latency on load (ms):        1.3
[02:38:43.195] Max CPU (%):                 92
[02:38:43.195] WorkingSet (MB):             385
[02:38:43.195] Startup Main (ms):           418
[02:38:43.195] First Request (ms):          146.3
[02:38:43.195] Latency (ms):                0.8
[02:38:43.195] Socket Errors:               0
[02:38:43.195] Bad Responses:               0
[02:38:43.195] SDK:                         2.2.0-preview1-007522
[02:38:43.196] Runtime:                     2.1.0-preview2-26225-03
[02:38:43.196] ASP.NET Core:                2.1.0-preview2-30242

Sockets:

Before

[02:33:22.076] RequestsPerSecond:           1637189
[02:33:22.076] Latency on load (ms):        9.2
[02:33:22.076] Max CPU (%):                 93
[02:33:22.076] WorkingSet (MB):             386
[02:33:22.077] Startup Main (ms):           377
[02:33:22.077] First Request (ms):          141.9
[02:33:22.077] Latency (ms):                0.8
[02:33:22.077] Socket Errors:               0
[02:33:22.077] Bad Responses:               0
[02:33:22.077] SDK:                         2.2.0-preview1-007522
[02:33:22.077] Runtime:                     2.1.0-preview2-26225-03
[02:33:22.077] ASP.NET Core:                2.1.0-preview2-30242

After

[01:54:15.852] RequestsPerSecond:           1810187
[01:54:15.852] Latency on load (ms):        1.7
[01:54:15.852] Max CPU (%):                 93
[01:54:15.852] WorkingSet (MB):             379
[01:54:15.852] Startup Main (ms):           380
[01:54:15.852] First Request (ms):          140.6
[01:54:15.853] Latency (ms):                0.7
[01:54:15.853] Socket Errors:               0
[01:54:15.853] Bad Responses:               0

Json Linux

Libuv

[02:50:53.012] RequestsPerSecond:           404439
[02:50:53.012] Latency on load (ms):        1.1
[02:50:53.012] Max CPU (%):                 98
[02:50:53.012] WorkingSet (MB):             403
[02:50:53.012] Startup Main (ms):           468
[02:50:53.012] First Request (ms):          282.1
[02:50:53.012] Latency (ms):                0.8
[02:50:53.012] Socket Errors:               0
[02:50:53.012] Bad Responses:               0
[02:50:53.012] SDK:                         2.2.0-preview1-007522
[02:50:53.012] Runtime:                     2.1.0-preview2-26225-03
[02:50:53.013] ASP.NET Core:                2.1.0-preview2-30242

Sockets

Before

[02:48:02.883] RequestsPerSecond:           258183
[02:48:02.883] Latency on load (ms):        1.4
[02:48:02.883] Max CPU (%):                 98
[02:48:02.883] WorkingSet (MB):             403
[02:48:02.883] Startup Main (ms):           409
[02:48:02.884] First Request (ms):          283
[02:48:02.884] Latency (ms):                0.7
[02:48:02.884] Socket Errors:               0
[02:48:02.884] Bad Responses:               0
[02:48:02.884] SDK:                         2.2.0-preview1-007522
[02:48:02.884] Runtime:                     2.1.0-preview2-26225-03
[02:48:02.884] ASP.NET Core:                2.1.0-preview2-30242

After

[02:46:09.887] RequestsPerSecond:           304399
[02:46:09.887] Latency on load (ms):        1
[02:46:09.887] Max CPU (%):                 98
[02:46:09.887] WorkingSet (MB):             407
[02:46:09.887] Startup Main (ms):           404
[02:46:09.888] First Request (ms):          292.7
[02:46:09.888] Latency (ms):                0.7
[02:46:09.888] Socket Errors:               0
[02:46:09.888] Bad Responses:               0
[02:46:09.888] SDK:                         2.2.0-preview1-007522
[02:46:09.888] Runtime:                     2.1.0-preview2-26225-03
[02:46:09.888] ASP.NET Core:                2.1.0-preview2-30242

Json Windows

Libuv

[02:52:30.244] RequestsPerSecond:           451311
[02:52:30.244] Latency on load (ms):        0.7
[02:52:30.244] Max CPU (%):                 85
[02:52:30.244] WorkingSet (MB):             386
[02:52:30.244] Startup Main (ms):           423
[02:52:30.244] First Request (ms):          235.5
[02:52:30.244] Latency (ms):                0.8
[02:52:30.244] Socket Errors:               0
[02:52:30.245] Bad Responses:               0
[02:52:30.245] SDK:                         2.2.0-preview1-007522
[02:52:30.245] Runtime:                     2.1.0-preview2-26225-03
[02:52:30.245] ASP.NET Core:                2.1.0-preview2-30242

Sockets

Before

[02:54:28.433] RequestsPerSecond:           299985
[02:54:28.433] Latency on load (ms):        7
[02:54:28.433] Max CPU (%):                 84
[02:54:28.434] WorkingSet (MB):             372
[02:54:28.434] Startup Main (ms):           376
[02:54:28.434] First Request (ms):          229.2
[02:54:28.434] Latency (ms):                0.8
[02:54:28.434] Socket Errors:               0
[02:54:28.434] Bad Responses:               0
[02:54:28.434] SDK:                         2.2.0-preview1-007522
[02:54:28.434] Runtime:                     2.1.0-preview2-26225-03
[02:54:28.434] ASP.NET Core:                2.1.0-preview2-30242

After

[02:55:24.440] RequestsPerSecond:           349936
[02:55:24.440] Latency on load (ms):        1.4
[02:55:24.440] Max CPU (%):                 84
[02:55:24.440] WorkingSet (MB):             368
[02:55:24.440] Startup Main (ms):           377
[02:55:24.440] First Request (ms):          228
[02:55:24.440] Latency (ms):                0.7
[02:55:24.440] Socket Errors:               0
[02:55:24.440] Bad Responses:               0
[02:55:24.440] SDK:                         2.2.0-preview1-007522
[02:55:24.440] Runtime:                     2.1.0-preview2-26225-03
[02:55:24.440] ASP.NET Core:                2.1.0-preview2-30242

@davidfowl davidfowl requested a review from halter73 March 6, 2018 10:59
@davidfowl
Copy link
Copy Markdown
Member Author

davidfowl commented Mar 6, 2018

Failures:

  • OSX: Broken with the known issue
  • on Windows we have a flaky libuv test (not windows specific):
dotnet.exe : [xUnit.net 00:00:09.1627717]     WriteAsyncWithTokenAfterCallWithoutIsCancelled(maxResponseBufferSize: 1024) [FAIL]
325At C:\Users\appveyor\.dotnet\buildtools\korebuild\2.1.0-preview2-15726\scripts\common.psm1:7 char:5
326+     & $_cmd @args
327+     ~~~~~~~~~~~~~
328    + CategoryInfo          : NotSpecified: ([xUnit.net 00:0...e: 1024) [FAIL]:String) [], RemoteException
329    + FullyQualifiedErrorId : NativeCommandError
330 
331[xUnit.net 00:00:09.1632381]       Assert.True() Failure
332[xUnit.net 00:00:09.1633962]       Expected: True
333[xUnit.net 00:00:09.1635380]       Actual:   False
334[xUnit.net 00:00:09.1645863]       Stack Trace:
335[xUnit.net 00:00:09.1658683]         C:\projects\kestrelhttpserver\test\Kestrel.Transport.Libuv.Tests\LibuvOutputConsumerTests.cs(562,0): at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Tests.LibuvOutputConsumerTests.<>c__DisplayClass16_0.<<WriteAsyncWithTokenAfterCallWithoutIsCancelled>b__0>d.MoveNext()
336[xUnit.net 00:00:09.1660616]         --- End of stack trace from previous location where exception was thrown ---
337[xUnit.net 00:00:09.1662406]         C:\projects\kestrelhttpserver\test\Kestrel.Transport.Libuv.Tests\LibuvOutputConsumerTests.cs(505,0): at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Tests.LibuvOutputConsumerTests.WriteAsyncWithTokenAfterCallWithoutIsCancelled(Int32 maxResponseBufferSize)
338[xUnit.net 00:00:09.1664034]         --- End of stack trace from previous location where exception was thrown ---
339Failed   WriteAsyncWithTokenAfterCallWithoutIsCancelled(maxResponseBufferSize: 1024)
340Error Message:
341 Assert.True() Failure
342Expected: True
343Actual:   False
344Stack Trace:
345   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Tests.LibuvOutputConsumerTests.<>c__DisplayClass16_0.<<WriteAsyncWithTokenAfterCallWithoutIsCancelled>b__0>d.MoveNext() in C:\projects\kestrelhttpserver\test\Kestrel.Transport.Libuv.Tests\LibuvOutputConsumerTests.cs:line 562
346--- End of stack trace from previous location where exception was thrown ---
347   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Tests.LibuvOutputConsumerTests.WriteAsyncWithTokenAfterCallWithoutIsCancelled(Int32 maxResponseBufferSize) in C:\projects\kestrelhttpserver\test\Kestrel.Transport.Libuv.Tests\LibuvOutputConsumerTests.cs:line 505
348--- End of stack trace from previous location where exception was thrown ---
349[xUnit.net 00:00:11.4722859]   Finished:    Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Tests
350[xUnit.net 00:00:00.0049864] xUnit.net VSTest Adapter v2.4.0-beta.1.build3945 (64-bit .NET Core 4.6.26222.07)
351

I believe this is one that @pakrym encountered when porting to the latest pipelines

  • Linux - Test run aborted without a message, this is pretty scary and I've seen it a few times now.

@davidfowl
Copy link
Copy Markdown
Member Author

Also the OSX logs had this:

Log Critical[0]: Unexpected exception in HttpConnection.ProcessRequestsAsync. System.ObjectDisposedException: Safe handle has been closed
   at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
   at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.Networking.LibuvFunctions.NativeMethods.uv_async_send(UvAsyncHandle handle)
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.Networking.LibuvFunctions.async_send(UvAsyncHandle handle) in /_/src/Kestrel.Transport.Libuv/Internal/Networking/LibuvFunctions.cs:line 179
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.LibuvThread.Post[T](Action`1 callback, T state) in /_/src/Kestrel.Transport.Libuv/Internal/LibuvThread.cs:line 188
   at System.IO.Pipelines.Pipe.CompleteWriter(Exception exception)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Adapter.Internal.AdaptedPipeline.<WriteOutputAsync>d__15.MoveNext() in /_/src/Kestrel.Core/Adapter/Internal/AdaptedPipeline.cs:line 106
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Adapter.Internal.AdaptedPipeline.<RunAsync>d__14.MoveNext() in /_/src/Kestrel.Core/Adapter/Internal/AdaptedPipeline.cs:line 44
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.HttpConnection.<ProcessRequestsAsync>d__47`1.MoveNext() in /_/src/Kestrel.Core/Internal/HttpConnection.cs:line 178
Log Critical[0]: ConnectionHandler.Execute() 0HLC3A9HCB63L System.Exception: Unexpected critical error. Log Critical[0]: Unexpected exception in HttpConnection.ProcessRequestsAsync. System.ObjectDisposedException: Safe handle has been closed
   at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
   at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.Networking.LibuvFunctions.NativeMethods.uv_async_send(UvAsyncHandle handle)
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.Networking.LibuvFunctions.async_send(UvAsyncHandle handle) in /_/src/Kestrel.Transport.Libuv/Internal/Networking/LibuvFunctions.cs:line 179
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.LibuvThread.Post[T](Action`1 callback, T state) in /_/src/Kestrel.Transport.Libuv/Internal/LibuvThread.cs:line 188
   at System.IO.Pipelines.Pipe.CompleteWriter(Exception exception)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Adapter.Internal.AdaptedPipeline.<WriteOutputAsync>d__15.MoveNext() in /_/src/Kestrel.Core/Adapter/Internal/AdaptedPipeline.cs:line 106
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Adapter.Internal.AdaptedPipeline.<RunAsync>d__14.MoveNext() in /_/src/Kestrel.Core/Adapter/Internal/AdaptedPipeline.cs:line 44
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.HttpConnection.<ProcessRequestsAsync>d__47`1.MoveNext() in /_/src/Kestrel.Core/Internal/HttpConnection.cs:line 178 ---> System.ObjectDisposedException: Safe handle has been closed
   at System.Runtime.InteropServices.SafeHandle.DangerousAddRef(Boolean& success)
   at System.StubHelpers.StubHelpers.SafeHandleAddRef(SafeHandle pHandle, Boolean& success)
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.Networking.LibuvFunctions.NativeMethods.uv_async_send(UvAsyncHandle handle)
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.Networking.LibuvFunctions.async_send(UvAsyncHandle handle) in /_/src/Kestrel.Transport.Libuv/Internal/Networking/LibuvFunctions.cs:line 179
   at Microsoft.AspNetCore.Server.Kestrel.Transport.Libuv.Internal.LibuvThread.Post[T](Action`1 callback, T state) in /_/src/Kestrel.Transport.Libuv/Internal/LibuvThread.cs:line 188
   at System.IO.Pipelines.Pipe.CompleteWriter(Exception exception)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Adapter.Internal.AdaptedPipeline.<WriteOutputAsync>d__15.MoveNext() in /_/src/Kestrel.Core/Adapter/Internal/AdaptedPipeline.cs:line 106
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Adapter.Internal.AdaptedPipeline.<RunAsync>d__14.MoveNext() in /_/src/Kestrel.Core/Adapter/Internal/AdaptedPipeline.cs:line 44
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.HttpConnection.<ProcessRequestsAsync>d__47`1.MoveNext() in /_/src/Kestrel.Core/Internal/HttpConnection.cs:line 178
   --- End of inner exception stack trace ---
   at Microsoft.AspNetCore.Testing.TestApplicationErrorLogger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter) in /_/test/shared/TestApplicationErrorLogger.cs:line 61
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.KestrelTrace.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter) in /_/src/Kestrel.Core/Internal/Infrastructure/KestrelTrace.cs:line 198
   at Microsoft.Extensions.Logging.LoggerExtensions.LogCritical(ILogger logger, EventId eventId, Exception exception, String message, Object[] args)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.HttpConnection.<ProcessRequestsAsync>d__47`1.MoveNext() in /_/src/Kestrel.Core/Internal/HttpConnection.cs:line 182
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.ConnectionHandler.<Execute>d__6.MoveNext() in 

@davidfowl davidfowl force-pushed the davidfowl/sockets-perf branch from 7f29d52 to c7044fe Compare March 6, 2018 18:03
var t = (Tuple<Action<object>, object>)s;
t.Item1(t.Item2);
},
Tuple.Create(action, state));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tuple?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now this is only used per connection so not a big deal

@benaadams
Copy link
Copy Markdown
Contributor

There is something wrong with this (in principle) :(

Numbers are good though :)

I think corefx needs a fix

_error = socketError;
_bytesTransfered = bytesTransferred;
Interlocked.Exchange(ref _callback, _callbackCompleted)?.Invoke();
var continuation = Interlocked.Exchange(ref _callback, _callbackCompleted);
Copy link
Copy Markdown
Member

@halter73 halter73 Mar 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this double scheduling true-async reads? If so, shouldn't this readerScheduler be switched to PipeScheduler.Inline for the socket transport to avoid double dispatching (or even triple dispatching by the time the data flows through the request body pipe as currently implemented).

@halter73
Copy link
Copy Markdown
Member

halter73 commented Mar 7, 2018

So I talked to @davidfowl about the purpose of double dispatching, and his theory is that without this change, a single socket could dominate the usage of the limited I/O threads for as long as there's data already available in the OS's input buffer by the time ReceiveAsync is called.

I figured if that's the case, certain connections must be starving others of resources in dev, and that would mean this change should reduce wrk's reported standard deviation for latency. It looks like @davidfowl is on to something, because the latency numbers for the Socket transport are immensely better with this PR.

Json Windows

Sockets

Before

[01:17:11.217] Starting 'Wrk' worker
[01:17:11.217] Measuring first request latency on http://10.0.0.110:5000/json
[01:17:11.219] 1.5123 ms
[01:17:11.219] Measuring subsequent requests latency
[01:17:11.227] 0.7173 ms
[01:17:11.227] wrk -H "Host: localhost" -H "Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7" -H "Connection: keep-alive" --latency -d 15 -c 256 --timeout 8 -t 32  http://10.0.0.110:5000/json
[01:17:11.290] Running 15s test @ http://10.0.0.110:5000/json
[01:17:11.290]   32 threads and 256 connections
[01:17:26.398]   Thread Stats   Avg      Stdev     Max   +/- Stdev
[01:17:26.408]     Latency     7.18ms   93.79ms   2.64s    99.32%
[01:17:26.408]     Req/Sec     9.29k   503.29    14.79k    85.89%
[01:17:26.408]   Latency Distribution
[01:17:26.408]      50%    0.85ms
[01:17:26.408]      75%    0.87ms
[01:17:26.408]      90%    0.90ms
[01:17:26.408]      99%   10.06ms
[01:17:26.408]   4439914 requests in 15.10s, 643.60MB read
[01:17:26.408] Requests/sec: 294054.23
[01:17:26.408] Transfer/sec:     42.63MB
[01:17:26.518] Failed to parse bad responses
[01:17:26.518] Failed to parse bad responses
[01:17:27.113] Deleting job [ID:258 Connections:256 Threads:32 Duration:15 Method:GET ServerUrl:http://10.0.0.110:5000/json Headers:{"Host":"localhost","Accept":"application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7","Connection":"keep-alive"}]

After

[01:18:44.580] Starting 'Wrk' worker
[01:18:44.580] Measuring first request latency on http://10.0.0.110:5000/json
[01:18:44.581] 1.0201 ms
[01:18:44.581] Measuring subsequent requests latency
[01:18:44.587] 0.6895 ms
[01:18:44.587] wrk -H "Host: localhost" -H "Accept: application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7" -H "Connection: keep-alive" --latency -d 15 -c 256 --timeout 8 -t 32  http://10.0.0.110:5000/json
[01:18:44.650] Running 15s test @ http://10.0.0.110:5000/json
[01:18:44.650]   32 threads and 256 connections
[01:18:59.752]   Thread Stats   Avg      Stdev     Max   +/- Stdev
[01:18:59.755]     Latency     1.36ms   14.06ms 865.28ms   99.56%
[01:18:59.755]     Req/Sec    11.07k   621.43    15.27k    89.95%
[01:18:59.755]   Latency Distribution
[01:18:59.755]      50%  706.00us
[01:18:59.755]      75%  741.00us
[01:18:59.755]      90%  786.00us
[01:18:59.756]      99%    0.99ms
[01:18:59.756]   5265427 requests in 15.10s, 763.27MB read
[01:18:59.756] Requests/sec: 348722.18
[01:18:59.756] Transfer/sec:     50.55MB
[01:18:59.887] Failed to parse bad responses
[01:18:59.887] Failed to parse bad responses
[01:19:00.452] Deleting job [ID:260 Connections:256 Threads:32 Duration:15 Method:GET ServerUrl:http://10.0.0.110:5000/json Headers:{"Host":"localhost","Accept":"application/json,text/html;q=0.9,application/xhtml+xml;q=0.9,application/xml;q=0.8,*/*;q=0.7","Connection":"keep-alive"}]

So not only did this chage reduce the standard deviation for latency from 93.79ms to 14.06ms, it also reduced the average latency from 7.18ms to 1.36ms.

The 99th latency percentile is even more impressive. The 50/75/90th percentiles are slightly better than before (note the difference between ms and us), but the 99th percentile went from 10.06ms to 0.99ms! It's interesting that the 99th percentile for latency is still lower than the average after than this change. It really shows that my stats prof was right about the mean being heavily impacted by outliers.

@pakrym
Copy link
Copy Markdown
Contributor

pakrym commented Mar 7, 2018

@davidfowl @halter73 very interesting find 👍
Do we need more IO threads then?

@davidfowl
Copy link
Copy Markdown
Member Author

davidfowl commented Mar 7, 2018

Do we need more IO threads then?

Bumping up IO threads doesn't solve the problem. We just need to efficiently get off the IO thread if the next read is synchronous.

@halter73 As an alternative, we could try dispatching if we're not on a thread pool thread and IO completes synchronously.

Edit: Thinking about it though, it's pretty similar to what's there today since the read loop is always going to be on an IO thread...

@vancem
Copy link
Copy Markdown

vancem commented Mar 8, 2018

cc: @vancem @geoffkizer

@davidfowl
Copy link
Copy Markdown
Member Author

@vancem we're going to be taking #2368 instead

- Run SocketAwaitable continuations asynchronously
- Run using the global thread pool queues instead
of using local queues.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants