-
-
Notifications
You must be signed in to change notification settings - Fork 33.9k
gh-81536: Avoids repeated GIL acquisition when reading/writing to SSL sockets #102214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
I believe this is a suitable implementation for #81536, though I haven't been able to construct a benchmark to demonstrate it. If someone is able to show it, that'd be great. Theoretically, since we're doing more work outside of the GIL, it should parallelise better. |
|
Have you tried the test in #25478? It's a very reliable test. |
|
Also suggest the test on #25478 it currently fails with this PR, possibly a recv_into test where the buffer is larger than the amount to be read (last read would be a retval == 0 but count > 0). It looks like PySSL_select with 0 timeout is always going to return SOCKET_IS_NONBLOCKING so it never reads more than 16KB. |
You're right. If I set the timeout to 1 then the test passes. I wonder if now I'll see some perf impact? Might have to figure out how to flow a proper non-blocking select() through that function. |
|
I can construct tests to show a big perf impact from the change now, so it's finally doing what was expected. Passes that other test case as well, so I think we're getting somewhere. |
|
@zooba how do you solve the problem that I describe in my PR for this problem? That OpenSSL can read EOF + data? From what I understand write does not need a loop since the openssl docs say that it will always write the whole buffer. |
|
Because of the earlier bug, I hadn't encountered the close issue :) So I guess I'm going to go learn more about how OpenSSL works to figure out a workaround. If OpenSSL wants to guarantee they write the whole buffer, they can skip returning the actual number of bytes written. But the loop is required to keep pushing the protocol along - it was there before, I just inverted it a bit. |
|
Diff for the test to show an issue with a timeout > 0, https://gist.github.com/mnightingale/7f626f95531b4d4e1bd5047fd0811ade results in "TimeoutError: The read operation timed out". So with a 128KB buffer, we can read the first 64KB then it will timeout because the echo server is only going to return 64KB. An attempt at resolving this based on #31492, https://gist.github.com/mnightingale/92bca633b019fd5b05886b18401b3ee7 it complicates the flow a little bit. |
|
I already fixed the timeout issue for myself, just haven't pushed the change yet (need to fix up sockstate). The problem I'm hitting now is OpenSSL is failing on the last (empty) read, but without setting a proper error code. Seems exactly the sort of thing that should be a bug on their side, but I'm not confident that I'm using it right to go and report it yet. There's also some kind of race condition going on, at least in the test I'm using ( |
Could it be the known bug listed on https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html#BUGS The same error state appears to happens without this PR. |
Yes, that'll be it. I've added some notes about it to not lose track. The problem with the FTP test seems to be the extra No idea right now if that can be dealt with or not. As it doesn't seem to come up in any other tests, I assume it's not a real common pattern, so perhaps we can make users enable a flag for "fast parallel TLS" if they want it? Which I would expect most will. |
|
And yes, I know that's exactly what the earlier posts say :) I'm having to learn OpenSSL for this, so need to go through things on my own and use my own words |
|
@zooba My PR for the faster-TLS-flag (#31492) was ignored for almost a year, so I hope if you implement something like it, it will be accepted. |
Sorry about that. I have the advantage as a core developer that I get to be responsible for my change forever, so it's more likely that the rest of the team will approve it. We are always more careful taking on code from contributors if we aren't sure we can (or want to) maintain it ourselves. I'm probably going to have to leave this on hold for a bit anyway. It will be some kind of API change to enable, and the benchmarks we've been running haven't shown improvement in real code that we care about. So even though it's theoretically better, I'm unlikely to push through the change with only my own support. |
|
Indeed the change is limited to a subset of people that want to receive or send more than 1 TLS packet (16Kb) at once and do so concurrently. For that use case the change is very substantial: we are seeing real world +25% throughput from users that use our patched recv. |
I don't think it's OpenSSL, it's Python's approach to wrapping sockets but leaving both in the public API. If an SSLSocket completely subsumed the socket, it would be something we could change, but because it's possible (encouraged!) for users to use both the raw socket and the wrapped SSL socket independently, it's much harder to make any changes to the API. It's definitely something I've criticised before, and a big part of why I'd happily see the |
Uh oh!
There was an error while loading. Please reload this page.