Deadlocks caused by openssl's handling of TLS 1.3 session tickets

We have a test that tries running `SSLStream` over an unbuffered transport, in part to confirm we aren't making assumptions about buffering. It turns out that openssl 1.1.1/TLS 1.3 does make assumptions about buffering, and the test hangs. Now what?

Specifically: we [create an unbuffered transport, then call `do_handshake` on both the client and server sides](https://github.com/python-trio/trio/blob/5a03bee3fbf1a5364a3584ef8150b04143dcecdb/trio/tests/test_ssl.py#L762-L771). The `client.do_handshake()` completes successfully. The `server.do_handshake()` hangs.

I believe what's happening is that when openssl 1.1.1 is doing TLS 1.3, the server automatically sends some session tickets after the handshake completes. Technically these aren't part of the handshake, so the `client.do_handshake()` call doesn't wait for them. But on the server side we don't know this; all we can see is that we called `SSL_do_handshake` and it spit out some data to send, so we assume that the handshake won't complete until we send that data.

I guess there are two scenarios that the designers imagined: (1) the client is trying to read application-level data, so it's reading from the transport, and it gets the session tickets as a bonus. (2) the client isn't trying to read application-level data, so it's not reading from the transport, but the transport has enough intrinsic buffering that the session tickets can just sit there forever without causing any problems. But in our test neither is true, so it breaks.

As far as I know, this shouldn't have too much real-world impact, since real transports almost always have some buffering. (Empirically the session tickets appear to be 510 bytes, which is smaller than pretty much any real transport buffer.) And there is a workaround available: modify `clogged_stream_maker` so the server does `await server.do_handshake(); await server.send_all(b"x")`, and the client does `await client.do_handshake(); await client.receive_some(1)`. Sending a bit of application-level data from server→client like this should flush out the session tickets and get back to the state we want for the rest of the tests.
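Spelled out, the workaround might look roughly like the sketch below. The two helper names are made up for illustration, and `server`/`client` are assumed to be the `SSLStream` objects wrapped around the two ends of the unbuffered transport:

```python
# Hypothetical helpers; only do_handshake / send_all / receive_some come
# from the SSLStream API, the function names are invented for this sketch.

async def server_handshake_and_flush(server):
    await server.do_handshake()
    # Any application-level data will do. One byte is enough, because the
    # session tickets queued by openssl go out ahead of it.
    await server.send_all(b"x")


async def client_handshake_and_drain(client):
    await client.do_handshake()
    # Reading the byte forces the client to pull from the transport, which
    # also consumes the session tickets that precede it.
    data = await client.receive_some(1)
    assert data == b"x"
```

Inside `clogged_stream_maker` the two coroutines would have to run concurrently (e.g. with `nursery.start_soon(...)` inside an `async with trio.open_nursery()` block), since on an unbuffered transport each side blocks until the other makes progress.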

Nonetheless, it's not ideal that openssl is making subtle assumptions about buffering like this. Should we do something?

Unfortunately, TLS 1.3's decision to move session tickets out of the handshake doesn't give us a lot of options here. We could disable session tickets, but that seems sub-optimal (and currently the `ssl` module doesn't expose a knob for this anyway).
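For what it's worth, Python 3.8 and later do have such a knob: `ssl.SSLContext.num_tickets`, which controls how many TLS 1.3 session tickets a server-side context sends. A minimal sketch, assuming a Python built against OpenSSL 1.1.1 or newer (certificate loading omitted):

```python
import ssl

# num_tickets can only be set on a server-side context.
server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

# 0 means "send no TLS 1.3 session tickets after the handshake".
# This has no effect on TLS 1.2 and earlier connections.
server_ctx.num_tickets = 0
```

This avoids the post-handshake write entirely, at the cost of giving up TLS 1.3 session resumption for connections made with this context.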

We could potentially delay sending the session tickets until the first time we send application-level data. That way `server.do_handshake()` wouldn't need to make any assumptions about buffering, and it's totally reasonable that `server.send_all(...)` has to wait for the peer to call `client.receive_some(...)`. But, there are two challenges:

* I'm not sure how to implement this. Specifically, it requires figuring out that the data openssl has dumped into its send buffer is the session tickets, as opposed to part of the handshake. Is there any reliable way to do this?

* There are some cases where session tickets are successfully delivered now, but wouldn't be if we implemented this. Specifically, this affects any connection where the server never sends any application-level data, *but* the client does read from the network. This sounds a little weird, but it could happen: e.g. any twisted or asyncio client will act like this; or, if you're using trio, you could intentionally implement a client like this exactly because you want to get session tickets.

So.... I'm not having any brilliant ideas here. I think for right now we'll just implement the workaround to get the test suite working, and then wait to see if this becomes an issue or not.

CC: @tiran – I don't think there's anything for you to do here in particular, but it seems like the kind of issue that you like to know about.
