SocketAsyncEngine.Unix: Use eventfd(2) to signal shutdown#1538
Conversation
4c25f28 to
6e6c3ef
Compare
|
I don't understand the build error in the Linux x64 CI environment. eventfd is correctly found, so |
Since we don't really care what is being written to the pipe to wake up the event loop, we can substitute it for an eventfd(2) on Linux and save ourselves a file descriptor per SocketAsyncEngine. (For manycore systems, that's potentially dozens of file descriptors that aren't needed anymore.) (It's a tiny wee little bit more efficient to write to an eventfd than it is writing to a pipe, although in this case it doesn't matter that much.)
6e6c3ef to
4774829
Compare
|
It could be the return type of eventfd being converted from uint to int on return |
| // | ||
| // Since eventfd is a Linux-only feature, there's no need to define our own | ||
| // flag values: pass these values through as the Linux system call expects. | ||
| // |
There was a problem hiding this comment.
We like to treat the shims as platform-agnostic, and the values we pass in here be for the PAL rather than for a particular platform. If the values we define happen to be the same as for that platform, then the PAL implementation can avoid doing any casting and just pass them through, but that's up to the PAL.
What that actually means for your changes:
- Remove this comment
- Define these values in pal_io.h
_ Add static asserts that the PAL EFD_SEMAPHORE matches the target platform's EFD_SEMAPHORE
That will keep the PAL platform-agnostic from a managed code perspective while also not adding runtime overhead.
We do this elsewhere.
| int32_t flags); // 0 for defaults or PAL_O_CLOEXEC for close-on-exec | ||
|
|
||
| /** | ||
| * Creates an eventfd on Linux. Returns error everywhere else. |
There was a problem hiding this comment.
| * Creates an eventfd on Linux. Returns error everywhere else. | |
| * Creates an eventfd if available. Returns ENOTSUP error everywhere else. |
| * Returns 0 for success, -1 for failure. Sets errno on failure. | ||
| */ | ||
| DLLEXPORT int32_t SystemNative_EventFD(uint32_t initialValue, | ||
| int32_t flags); // Passes through to eventfd() without conversion |
There was a problem hiding this comment.
Going along with my previous comment:
| int32_t flags); // Passes through to eventfd() without conversion | |
| int32_t flags); |
| int bytesWritten = Interop.Sys.Write(_shutdownWritePipe, &b, 1); | ||
| if (bytesWritten != 1) | ||
| byte[] wakeThreadUp = {0, 0, 0, 0, 0, 0, 0, 1}; | ||
| fixed (byte *b = &wakeThreadUp[0]) |
There was a problem hiding this comment.
This allocates an 8-byte array on the heap. That isn't necessary. You can instead make it:
byte* wakeThreadUp = stackalloc byte[8] { 0, 0, 0, 0, 0, 0, 0, 1 };and not need the fixed at all.
There was a problem hiding this comment.
Thanks. Still working on the intuition to understand how C# code is generated/executed.
| // When using eventfd, read/write ends of the "pipe" have the same file | ||
| // descriptor. | ||
| // | ||
| Interop.Sys.Close((IntPtr)_shutdownReadPipe); |
There was a problem hiding this comment.
This would be better as:
if (_shutdownReadPipe == _shutdownWritePipe)
{
if (_shutdownReadPipe != -1)
{
...
}
}| if (_shutdownWritePipe != -1) | ||
| { | ||
| Interop.Sys.Close((IntPtr)_shutdownWritePipe); | ||
| } |
There was a problem hiding this comment.
Or better yet:
if (_shutdownReadPipe != -1)
{
Interop.Sys.Close((IntPtr)_shutdownReadPipe);
}
if (_shutdownReadPipe != _shutdownWritePipe && _shutdownWritePipe != -1)
{
Interop.Sys.Close((IntPtr)_shutdownWritePipe);
}| int bytesWritten = Interop.Sys.Write(_shutdownWritePipe, b, 8); | ||
| if (bytesWritten != 8) | ||
| { | ||
| throw new InternalException(bytesWritten); |
There was a problem hiding this comment.
This looks suspect. In general write isn't guaranteed to write everything you pass in, which is why it returns how many bytes it wrote. When we were writing 1 byte, this was fine, because it won't return 0 unless there was an error, but now that it's being called with 8, it could theoretically successfully return 1 through 7. Does something prevent that from being the case when the fd is a pipe or an eventfd, on all platforms we target now and in the future?
There was a problem hiding this comment.
I'm aware of that. However, for eventfd, this is always fine (it's always ready to accept a 64-bit int). When using pipes, it's very unlikely that the buffer wouldn't have at least 8 bytes free, especially since these pipes are used only during shutdown and this is the only thing that's ever written to them (to make it more robust on Linux the pipe could be opened with O_DIRECT, so read/writes are atomic, but this doesn't work under WSL1 and BSDs AFAICT). So this is fine.
There was a problem hiding this comment.
very unlikely
I don't think "very unlikely" is sufficient for code that we need to be extremely robust.
I suggest instead doing:
Debug.Assert(bytesWritten == 8);
if (bytesWritten < 1)
{
throw new InternalException(bytesWritten);
}or something like that.
There was a problem hiding this comment.
Pipe buffers are at least a page in all systems we care about. It's not very unlikely, it's very unlikely.
But, yeah, the proposed change is fine. Even if the assert is removed in release builds, this would fine: the amount of bytes written here (on pipes, where short writes are possible) doesn't really matter as it's only used to wake the thread blocked on poll().
|
We've been considering setting the number of engines to be 1 always instead of scaling with proccount (cc: @VSadov). If we do that, the benefits of this change would be significantly reduced, to saving 1 file descriptor per process rather than N file descriptors per process, and I'm not sure it'll be worth the additional code and complication it brings to the codebase. If we decide to continue scaling, seems fine. |
Agreed. If that's what's planned, I can just close this PR then. |
|
Yes. It is likely that the number of engines will be revised to 1 or some modest number close to that. |
@VSadov @stephentoub dotnet/corefx#36693 is where I previously looked at reducing count to 1. Performance was for some benchmarks a bit better. For fortunes with 65k connections there was a clear regression. With the current implementation, multiple engines are created only when there are at least 6 cores, and existing engines have 32 sockets in use. Probably these systems can spare a file descriptor. |
|
Alright, closing this then. Thanks, folks! |
|
@tmds - yes, @stephentoub pointed me to dotnet/corefx#36693 because I started making the same conclusions - 1 poller is nearly always enough. However, there is also the 65K connection scenario, so this is not so simple. |
Since we don't really care what is being written to the pipe to wake up the event loop, we can substitute it for an eventfd(2) on Linux and save ourselves a file descriptor per SocketAsyncEngine. (For manycore systems, that's potentially dozens of file descriptors that aren't needed anymore.)
(It's a tiny wee little bit more efficient to write to an eventfd than it is writing to a pipe, although in this case it doesn't matter that much as we're shutting down the engines.)
Also, I don't know C# very well, so the changes in
SocketAsyncEngine.RequestEventLoopShutdown()might not be correct.