This repository was archived by the owner on Dec 18, 2018. It is now read-only.

Add Multithread perf test#1222

Closed
benaadams wants to merge 4 commits into aspnet:dev from benaadams:multithreaded

@benaadams
Contributor

Results on 16 core (32 HT) Server

BenchmarkDotNet=v0.10.0
OS=Windows
Processor=?, ProcessorCount=32
Frequency=2343747 Hz, Resolution=426.6672 ns, Timer=TSC
Host Runtime=.NET Core 4.6.24628.01, Arch=64-bit [RyuJIT]
GC=Concurrent Server
dotnet cli version=1.0.0-preview2-1-003177
Job Runtime(s):
.NET Core 4.6.24410.01, Arch=64-bit [RyuJIT]

RemoveOutliers=False Runtime=Core Server=True
LaunchCount=3 RunStrategy=Throughput TargetCount=10
WarmupCount=5

Single threaded

                   Method |          Mean |      StdDev | Scaled |           RPS |
------------------------- |-------------- |------------ |------- |-------------- |
           ParsePlaintext |   993.3004 ns |  23.2932 ns |   1.00 |  1,006,744.82 |
  ParsePipelinedPlaintext |   752.0448 ns |  15.0722 ns |   0.76 |  1,329,708.02 |
          ParseLiveAspNet | 4,347.4022 ns |  81.1657 ns |   4.38 |    230,022.43 |
 ParsePipelinedLiveAspNet | 4,026.5624 ns |  60.3856 ns |   4.06 |    248,350.80 |
             ParseUnicode | 6,859.2000 ns | 147.5681 ns |   6.91 |    145,789.60 |
    ParseUnicodePipelined | 6,679.8011 ns | 210.6329 ns |   6.73 |    149,705.06 |

Multi threaded

                   Method |          Mean |      StdDev | Scaled |           RPS |
------------------------- |-------------- |------------ |------- |-------------- |
           ParsePlaintext |   188.0574 ns |   1.5506 ns |   1.00 |  5,317,526.25 |
  ParsePipelinedPlaintext |    67.4741 ns |   1.9620 ns |   0.36 | 14,820,495.27 |
          ParseLiveAspNet |   576.7441 ns |   3.2424 ns |   3.07 |  1,733,871.11 |
 ParsePipelinedLiveAspNet |   309.2693 ns |  11.8032 ns |   1.64 |  3,233,428.06 |
             ParseUnicode |   727.9797 ns |  23.8009 ns |   3.87 |  1,373,664.74 |
    ParseUnicodePipelined |   493.8310 ns |  21.8476 ns |   2.63 |  2,024,984.33 |

@benaadams
Contributor Author

@stephentoub is this the correct way to use Parallel.For for maximum throughput? The file is MultiThreadedRequestParsing.cs.

@stephentoub
Contributor

> is this the correct way

I expect that'll be about as good as you can get with it, assuming nothing else is taking up thread pool threads while this runs. There will be a small amount of overhead in getting started, but it shouldn't be measurable assuming enough work is done in the body.
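For readers without the PR's file to hand, the pattern being discussed can be sketched roughly as below. This is a minimal illustration and not the code in MultiThreadedRequestParsing.cs: `ParseOne`, `Run`, and the iteration counts are hypothetical stand-ins. It assumes one loop iteration per thread, with each iteration doing enough work in the body that the start-up overhead of `Parallel.For` is small by comparison.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelForSketch
{
    // Hypothetical stand-in for parsing one request; this is not Kestrel code.
    private static int ParseOne(byte[] request)
    {
        var sum = 0;
        for (var i = 0; i < request.Length; i++)
        {
            sum += request[i];
        }
        return sum;
    }

    // Runs iterationsPerThread parses on each of threadCount parallel workers
    // and returns the combined result so the work cannot be optimized away.
    public static long Run(byte[] request, int threadCount, int iterationsPerThread)
    {
        long total = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = threadCount };

        // One outer iteration per worker; the inner loop keeps the body
        // long-running so scheduling overhead stays negligible.
        Parallel.For(0, threadCount, options, worker =>
        {
            long local = 0;
            for (var i = 0; i < iterationsPerThread; i++)
            {
                local += ParseOne(request);
            }
            // Touch shared state once per worker, not once per parse.
            Interlocked.Add(ref total, local);
        });

        return total;
    }
}
```

Calling `ParallelForSketch.Run(requestBytes, Environment.ProcessorCount, 1000)` would mirror the `ThreadCount = Environment.ProcessorCount` choice in this PR. Accumulating into a local and publishing once per worker keeps the measurement from being dominated by contention on the shared counter.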

@benaadams
Contributor Author

4 Core desktop

BenchmarkDotNet=v0.10.0
OS=Windows
Processor=?, ProcessorCount=4
Frequency=2825619 Hz, Resolution=353.9048 ns, Timer=TSC
Host Runtime=.NET Core 4.6.24628.01, Arch=64-bit [RyuJIT]
GC=Concurrent Server
dotnet cli version=1.0.0-preview2-1-003155
Job Runtime(s):
.NET Core 4.6.24410.01, Arch=64-bit [RyuJIT]

RemoveOutliers=False Runtime=Core Server=True
LaunchCount=3 RunStrategy=Throughput TargetCount=10
WarmupCount=5

Single threaded

                   Method |          Mean |     StdDev | Scaled |          RPS |
------------------------- |-------------- |----------- |------- |------------- |
           ParsePlaintext |   793.1210 ns | 13.8271 ns |   1.00 | 1,260,841.62 |
  ParsePipelinedPlaintext |   626.8362 ns |  6.6949 ns |   0.79 | 1,595,313.01 |
          ParseLiveAspNet | 3,842.9023 ns | 46.0246 ns |   4.85 |   260,219.99 |
 ParsePipelinedLiveAspNet | 3,542.9391 ns | 44.6045 ns |   4.47 |   282,251.53 |
             ParseUnicode | 6,459.0080 ns | 33.4147 ns |   8.15 |   154,822.54 |
    ParseUnicodePipelined | 6,237.0520 ns | 67.3694 ns |   7.87 |   160,332.16 |

Multi threaded

                   Method |          Mean |     StdDev | Scaled |          RPS |
------------------------- |-------------- |----------- |------- |------------- |
           ParsePlaintext |   295.8569 ns |  3.4383 ns |   1.00 | 3,380,012.21 |
  ParsePipelinedPlaintext |   184.1479 ns |  4.0847 ns |   0.62 | 5,430,417.46 |
          ParseLiveAspNet | 1,238.1511 ns | 41.2725 ns |   4.19 |   807,655.84 |
 ParsePipelinedLiveAspNet | 1,039.3710 ns | 17.6234 ns |   3.51 |   962,120.36 |
             ParseUnicode | 1,985.8980 ns | 33.2729 ns |   6.71 |   503,550.54 |
    ParseUnicodePipelined | 1,825.3181 ns | 30.2562 ns |   6.17 |   547,849.71 |

benaadams force-pushed the multithreaded branch 2 times, most recently from b06a513 to ff8131c on November 22, 2016 05:45
"netcoreapp1.0": {
  "dependencies": {
    "Microsoft.NETCore.App": {
      "version": "1.0.1-*",

Member

Nice catch. Easy to miss things like that in major PRs.

Contributor Author

Wasn't sure it would work; BenchmarkDotNet currently doesn't like netcoreapp1.1

"DNT: 1\r\n" +
"Referer: http://stackoverflow.com/?tab=month\r\n" +
"Pragma: no-cache\r\n" +
"Cookie: prov=20629ccd-8b0f-e8ef-2935-cd26609fc0bc; __qca=P0-1591065732-1479167353442; _ga=GA1.2.1298898376.1479167354; _gat=1; sgt=id=9519gfde_3347_4762_8762_df51458c8ec2; acct=t=why-is-%e0%a5%a7%e0%a5%a8%e0%a5%a9-numeric&s=why-is-%e0%a5%a7%e0%a5%a8%e0%a5%a9-numeric\r\n\r\n";

Member

Nit: Share the request data in a static class in case we ever change any of it.

Contributor Author

Shared

private static readonly byte[] _unicodePipelinedRequests = Encoding.ASCII.GetBytes(string.Concat(Enumerable.Repeat(unicodeRequest, Pipelining)));
private static readonly byte[] _unicodeRequest = Encoding.ASCII.GetBytes(unicodeRequest);

private static readonly int ThreadCount = Environment.ProcessorCount;

Member

For convenience, can we print this?

Contributor Author

Added to Program.cs

@benaadams
Contributor Author

benaadams commented Dec 4, 2016

Updated BenchmarkDotNet to 0.10.1

@halter73 @CesarBS @pakrym worth the upgrade; it works with 1.1.0 and has the new memory diagnoser built in

New output (minus a few columns); note the last column

                   Method |          Mean | Scaled |          RPS | Allocated |
------------------------- |-------------- |------- |------------- |---------- |
           ParsePlaintext |   849.5240 ns |   1.00 | 1,177,129.73 |     116 B |
  ParsePipelinedPlaintext |   647.2198 ns |   0.76 | 1,545,070.11 |     104 B |
          ParseLiveAspNet | 3,918.9647 ns |   4.61 |   255,169.44 |   1.11 kB |
 ParsePipelinedLiveAspNet | 3,600.5674 ns |   4.24 |   277,734.00 |    1.1 kB |
             ParseUnicode | 6,751.1812 ns |   7.95 |   148,122.23 |   1.94 kB |
    ParseUnicodePipelined | 6,394.2506 ns |   7.53 |   156,390.49 |   1.93 kB |

public static readonly byte[] PlaintextPipelinedRequests = Encoding.ASCII.GetBytes(string.Concat(Enumerable.Repeat(plaintextRequest, Pipelining)));
public static readonly byte[] PlaintextRequest = Encoding.ASCII.GetBytes(plaintextRequest);

public static readonly byte[] LiveaspnentPipelinedRequests = Encoding.ASCII.GetBytes(string.Concat(Enumerable.Repeat(liveaspnetRequest, Pipelining)));

Contributor

nit: rename aspnetnt -> AspNet

public static readonly byte[] PlaintextRequest = Encoding.ASCII.GetBytes(plaintextRequest);

public static readonly byte[] LiveaspnentPipelinedRequests = Encoding.ASCII.GetBytes(string.Concat(Enumerable.Repeat(liveaspnetRequest, Pipelining)));
public static readonly byte[] LiveaspnentRequest = Encoding.ASCII.GetBytes(liveaspnetRequest);

Contributor

nit: rename aspnetnt -> AspNet


private const string plaintextRequest = "GET /plaintext HTTP/1.1\r\nHost: www.example.com\r\n\r\n";

private const string liveaspnetRequest = "GET https://live.asp.net/ HTTP/1.1\r\n" +

Contributor

uber nit: why aspnet and not AspNet?

Contributor Author

benaadams commented Dec 5, 2016

Just deleted the . from the domain name. I was going on the important words being Site, Request and Pipelined, and wasn't thinking too hard about it :)

@benaadams
Contributor Author

Renamed

@pakrym
Contributor

pakrym commented Dec 6, 2016

@benaadams which part of the Kestrel code does this benchmark exercise, compared with the single-threaded version?

@benaadams
Contributor Author

@pakrym same code; it just maxes it out across all cores rather than a single core. It should be embarrassingly parallel, though the increase I am seeing is less than N× for N cores, which may be interesting.
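As a rough worked check of that observation, the ParsePlaintext rows of the tables earlier in this thread give the figures below. It assumes the multi-threaded Mean is the effective time per request across all threads, which is consistent with the RPS column being approximately 10^9 divided by the Mean in nanoseconds. S_32 and S_4 are labels introduced here for the speedup on the 32-processor server and the 4-processor desktop.

```latex
\begin{align*}
\text{RPS} &\approx \frac{10^{9}\ \text{ns/s}}{\text{Mean (ns)}}, &
  \frac{10^{9}}{188.0574} &\approx 5.32 \times 10^{6} \\
S_{32} &= \frac{993.3004}{188.0574} \approx 5.28, &
  \frac{S_{32}}{32} &\approx 0.17 \\
S_{4} &= \frac{793.1210}{295.8569} \approx 2.68, &
  \frac{S_{4}}{4} &\approx 0.67
\end{align*}
```

The server's ProcessorCount of 32 counts hyper-threads. Measured against its 16 physical cores, the same speedup is roughly a third of linear.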

@halter73
Member

halter73 commented Dec 6, 2016

This doesn't test any kind of contention. It might be more useful to have a test that keeps the producer going during consumption to test contention acquiring SocketInput's _sync lock.

Maybe you could use the threadpool to dispatch calls to InsertData and ensure that there is enough data always in the buffer to prevent the consumer from ever catching up with the producer. By looking at the return values of TakeStartLine and TakeMessageHeaders it should be possible to see if the consumer ever stalls.
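The shape of such a test might look like the sketch below. It is only an illustration under loud assumptions: it does not use SocketInput, InsertData, TakeStartLine, or TakeMessageHeaders, whose exact signatures are not shown in this thread. A plain `Queue<int>` guarded by a lock stands in for the buffer and its `_sync` lock, and `ContentionSketch`, `Run`, `messages`, and `prefill` are hypothetical names. A thread pool task produces while the calling thread consumes, and every empty read is counted as a consumer stall.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ContentionSketch
{
    // Returns how many times the consumer found the buffer empty (a stall).
    public static int Run(int messages, int prefill)
    {
        var sync = new object();          // stand-in for the shared lock
        var buffer = new Queue<int>();    // stand-in for the input buffer
        var stalls = 0;

        // Prefill so the consumer starts behind the producer.
        var start = Math.Min(prefill, messages);
        for (var i = 0; i < start; i++)
        {
            buffer.Enqueue(i);
        }

        // Producer runs on the thread pool and contends for the same lock.
        var producer = Task.Run(() =>
        {
            for (var i = start; i < messages; i++)
            {
                lock (sync) { buffer.Enqueue(i); }
            }
        });

        var consumed = 0;
        while (consumed < messages)
        {
            bool got;
            lock (sync)
            {
                got = buffer.Count > 0;
                if (got) { buffer.Dequeue(); }
            }

            if (got)
            {
                consumed++;
            }
            else
            {
                stalls++;        // consumer caught up with the producer
                Thread.Yield();
            }
        }

        producer.Wait();
        return stalls;
    }
}
```

A non-zero return value means the consumer caught up with the producer at least once, so that run was not measuring sustained contention. Increasing `prefill` is one way to keep the consumer behind.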

@davidfowl davidfowl closed this Feb 18, 2017