Expose Interlocked.MemoryBarrierProcessWide #17512
Depends on dotnet/coreclr#10476
{
// Taking this lock on the same thread repeatedly is very fast because of it has no interlocked operations.
// Switching the thread where the lock is taken is expensive because of allocation and FlushProcessWriteBuffers.
class AsymmetricLock
internal bool Taken;
}
LockCookie _current = new LockCookie(-1);
// Returning LockCookie to call Exit on is the fastest implementation because of it works naturally with the RCU pattern.
Nit: because of it => because it
private LockCookie EnterSlow()
{
// Attempt to steal the ownership. Take a regular lock to make sure that only thread is trying to steal it at a time.
lock (this)
Nit: only thread => only one thread
@dotnet-bot test this please
// we do not need here. We really just need to make sure that the compiler won't reorder the read with the above write.
// RyuJIT won't reorder them today, but more advanced optimizers might.
//
if (Volatile.Read(ref _current) == entry)
Is this defeating the purpose a little bit? The point of sys_membar is that you do not need to match fences on the write side with fences on the read side.
I see that it is here to prevent reordering by the JIT, but perhaps any other [MethodImpl(noinline)] method would achieve the same without the implied fencing?

It depends on whether you are optimizing for x86/x64 or arm/arm64.
Volatile on x86/x64 does a regular load without an explicit barrier. It will be faster than the noinline variant.
Volatile on arm/arm64 has an explicit barrier. Then it is about measuring whether the explicit barrier is more expensive than the call overhead. I think they will be pretty close, but I do not have an arm/arm64 machine around to check it.

Considering this is a test, no barrier seems more appropriate.
A real implementation should take whatever is fastest, of course.

Ok, I have changed it to no barrier. Thanks for the feedback.
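
For readers following the thread, the two read variants discussed above can be sketched side by side. This is only an illustration under stated assumptions: the `OwnerCell` class and its member names are hypothetical and are not taken from the PR; only `Volatile.Read`, `MethodImplOptions.NoInlining`, and `Interlocked.MemoryBarrierProcessWide` are real APIs. It is not a complete lock, just the ownership check on the fast side and the process-wide fence on the slow side.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical sketch: the class and member names are illustrative, not from the PR.
public sealed class OwnerCell
{
    // Token identifying the current fast-path owner.
    private object _current = new object();

    // An opaque, non-inlined call. The JIT cannot move the read across the call,
    // and no hardware barrier is emitted on any architecture.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static T ReadNoInline<T>(ref T location) => location;

    // Variant 1: Volatile.Read. Per the discussion, a plain load on x86/x64,
    // but a load plus an explicit barrier on arm/arm64.
    public bool StillOwnerVolatile(object entry) =>
        ReferenceEquals(Volatile.Read(ref _current), entry);

    // Variant 2: the no-barrier read. Costs a call, but relies on the slow side's
    // process-wide fence instead of a matching fence here.
    public bool StillOwnerNoInline(object entry) =>
        ReferenceEquals(ReadNoInline(ref _current), entry);

    // Slow side: publish a new owner token, then issue the expensive
    // process-wide barrier so the fast side needs no fence of its own.
    public object Steal()
    {
        object next = new object();
        _current = next;
        Interlocked.MemoryBarrierProcessWide();
        return next;
    }
}
```

Which variant is faster on arm/arm64 would have to be measured, as noted in the thread; the sketch only shows that both reads return the same answer.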
LGTM
@dotnet-bot test this please
* Expose Interlocked.MemoryBarrierProcessWide Fixes https://github.com/dotnet/corefx/issues/16799
) (#29082) Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>