Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

SocketAsyncEngine.Unix: improve performance of context lookup#36358

Merged
stephentoub merged 1 commit into
dotnet:masterfrom
tmds:socket_context_lookup
Apr 11, 2019
Merged

SocketAsyncEngine.Unix: improve performance of context lookup#36358
stephentoub merged 1 commit into
dotnet:masterfrom
tmds:socket_context_lookup

Conversation

@tmds
Copy link
Copy Markdown
Member

@tmds tmds commented Mar 26, 2019

No description provided.

@tmds
Copy link
Copy Markdown
Member Author

tmds commented Mar 26, 2019

@davidfowl @sebastienros can you benchmark this to see if it matters in scenarios you care about?

CC @stephentoub @geoffkizer

@tmds
Copy link
Copy Markdown
Member Author

tmds commented Mar 26, 2019

Linux x64_Release has infrastructure related failure:

/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error : ArgumentException: Provided Job List Uri is not accessible [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :    at Microsoft.DotNet.Helix.Client.HelixApi.HandleFailedRequest(RestApiException ex) in /_/src/Microsoft.DotNet.Helix/Client/CSharp/HelixApi.cs:line 29 [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :    at Microsoft.DotNet.Helix.Client.Job.NewInternalAsync(JobCreationRequest body, CancellationToken cancellationToken) in /_/src/Microsoft.DotNet.Helix/Client/CSharp/generated-code/Job.cs:line 259 [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :    at Microsoft.DotNet.Helix.Client.Job.NewAsync(JobCreationRequest body, CancellationToken cancellationToken) in /_/src/Microsoft.DotNet.Helix/Client/CSharp/generated-code/Job.cs:line 190 [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :    at Microsoft.DotNet.Helix.Client.HelixApi.RetryAsync[T](Func`1 function, Action`1 logRetry, Func`2 isRetryable) [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :    at Microsoft.DotNet.Helix.Client.JobDefinition.SendAsync(Action`1 log) in /_/src/Microsoft.DotNet.Helix/JobSender/JobDefinition.cs:line 206 [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :    at Microsoft.DotNet.Helix.Sdk.SendHelixJob.ExecuteCore() in /_/src/Microsoft.DotNet.Helix/Sdk/SendHelixJob.cs:line 234 [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :    at Microsoft.DotNet.Helix.Sdk.HelixTask.Execute() in /_/src/Microsoft.DotNet.Helix/Sdk/HelixTask.cs:line 43 [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error : RestApiException: The response contained an invalid status code 400 Bad Request [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :  [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error : Body: {"Message":"Provided Job List Uri is not accessible"} [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :  [/__w/1/s/eng/sendtohelix.proj]
/home/vsts_azpcontainer/.nuget/packages/microsoft.dotnet.helix.sdk/2.0.0-beta.19171.6/tools/Microsoft.DotNet.Helix.Sdk.MonoQueue.targets(51,5): error :  [/__w/1/s/eng/sendtohelix.proj]

@dotnet-bot test corefx-ci (Linux x64_Release) please

@tmds
Copy link
Copy Markdown
Member Author

tmds commented Mar 26, 2019

/azp run corefx-ci (Linux x64_Release)

@tmds
Copy link
Copy Markdown
Member Author

tmds commented Mar 26, 2019

/azp run corefx-ci

Comment thread src/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEngine.Unix.cs Outdated
try
{
bool shutdown = false;
SocketAsyncContext[] contexts = new SocketAsyncContext[EventBufferCount];
Copy link
Copy Markdown
Member

@stephentoub stephentoub Mar 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make _handleToContextMap a ConcurrentDictionary<IntPtr, SocketAsyncContext>? That would result in an allocation on every add, but adds are rare-ish, only when new sockets are added (right?), in which case there's already other allocation happening (e.g. the socket itself and all associated state), and we could avoid the lock on reads entirely, plus avoid this array allocation per event loop, and avoid needing to iterate through the events twice.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make a branch that implements this and we can benchmark both.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch uses ConcurrentDictionary: tmds@9cc7527. @stephentoub , you can add some review comments on that commit if you want.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I prefer the ConcurrentDictionary version, but we should see what perf looks like for both.

@sebastienros
Copy link
Copy Markdown
Member

@tmds Would you mind sharing the dlls with and without the changes you want to benchmark? If you send me an email or tweet I can create a shared folder for you to drop the files in.

@stephentoub
Copy link
Copy Markdown
Member

@sebastienros, @tmds, were you guys able to get any benchmarking done? Anything I can help with?

@tmds
Copy link
Copy Markdown
Member Author

tmds commented Mar 28, 2019

@sebastienros, @tmds, were you guys able to get any benchmarking done? Anything I can help with?

I sent dlls to Sébastien. I didn't get benchmark results yet.

@tmds
Copy link
Copy Markdown
Member Author

tmds commented Mar 28, 2019

I got these benchmark results from @sebastienros:

Description RPS CPU (%) Memory (MB) Avg. Latency (ms) Startup (ms) First Request (ms) Latency (ms) Ratio
Baseline 471,819 98 174 0.84 235 80.06 0.55 1.00
Lookup 473,732 98 174 0.84 231 81.65 0.75 1.00
Concurrent 472,634 98 175 0.89 230 79.38 0.65 1.00

That is: a 0.17% increase with the ConcurrentDictionary, and a 0.4% increase with Dictionary + lock.

[This benchmark is: Benchmarked Plaintext non-pipelined on Linux. Latest runtime, aspnet and sdk. All runs done 5 times, excluding highest lowest result for each run and average the 3 remaining ones.]

@stephentoub
Copy link
Copy Markdown
Member

That is: a 0.17% increase with the ConcurrentDictionary, and a 0.4% increase with Dictionary + lock.

Thanks. How representative of access patterns do we think this test is? If it's representative, the dictionary+lock seems fine. But if we expect there may be other access patterns, I'd be tempted to suggest the concurrent dictionary route, since it won't ever have contention on reads, whereas the dictionary+lock does, and with a global lock.

@tmds
Copy link
Copy Markdown
Member Author

tmds commented Apr 5, 2019

With these small gains, we can just go for the simpler implementation. I'm changing this to the ConcurrentDictionary version.

@tmds tmds force-pushed the socket_context_lookup branch from 36b7c23 to 9cc7527 Compare April 8, 2019 12:01
@tmds
Copy link
Copy Markdown
Member Author

tmds commented Apr 8, 2019

Changed to the ConcurrentDictionary implementation.

@tmds
Copy link
Copy Markdown
Member Author

tmds commented Apr 9, 2019

@stephentoub probably this is ok to merge?

@tmds
Copy link
Copy Markdown
Member Author

tmds commented Apr 11, 2019

@wfurt @davidsh if this looks good to you, can you merge this please?

@benaadams
Copy link
Copy Markdown
Member

Regular Dictionary with IntPtr key should be much faster than a ConcurrentDictionary, due to the optimizations it has around struct keys; then if the lock is changed from a static lock to an instance lock; it should generally be uncontended on the receive? (unlike the static lock)

@stephentoub
Copy link
Copy Markdown
Member

stephentoub commented Apr 16, 2019

Regular Dictionary with IntPtr key should be much faster than a ConcurrentDictionary, due to the optimizations it has around struct keys; then if the lock is changed from a static lock to an instance lock; it should generally be uncontended on the receive

Doesn't matter if there's contention; the lock still adds non-trivial overheads...

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Collections.Generic;
using System.Collections.Concurrent;

[InProcess]
[MemoryDiagnoser]
public class Test
{
    public static void Main()
    {
        BenchmarkRunner.Run<Test>();
    }

    public Test()
    {
        for (int i = 0; i < 1000; i++)
        {
            _cd.TryAdd((IntPtr)i, new object());
            _d.Add((IntPtr)i, new object());
        }
    }

    private ConcurrentDictionary<IntPtr, object> _cd = new ConcurrentDictionary<IntPtr, object>();
    private Dictionary<IntPtr, object> _d = new Dictionary<IntPtr, object>();

    [Benchmark]
    public bool Concurrent()
    {
        bool result = true;
        for (int i = 0; i < 1000; i++) result &= _cd.TryGetValue((IntPtr)i, out _);
        return result;
    }

    [Benchmark]
    public bool Locked()
    {
        bool result = true;
        for (int i = 0; i < 1000; i++) lock (_d) result &= _d.TryGetValue((IntPtr)i, out _);
        return result;
    }
}
     Method |     Mean |     Error |    StdDev | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
----------- |---------:|----------:|----------:|------------:|------------:|------------:|--------------------:|
 Concurrent | 10.75 us | 0.2015 us | 0.1885 us |           - |           - |           - |                   - |
     Locked | 24.87 us | 0.4959 us | 0.4870 us |           - |           - |           - |                   - |

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-System.Net.Sockets os-linux Linux OS (any supported distro)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants