TCP connection: detect and send RST #28817
Signed-off-by: Boteng Yao <boteng@google.com>
cc @yanavlasov
This is cool! A few questions, concerns, or things to document:
No, I think it's better how you have it now.
Hi @ggreenway, thanks for the comment!
I think it supports both Windows and POSIX systems with different error codes like
This is a good point, and I am still looking into how UNIX networking handles RST on a half-closed connection. My take right now is that at least one additional write or read operation is needed to learn about socket errors on a half-closed socket. There could be other error types; do you have any reference for this?
Yes, but I think incremental adoption would be reasonable.
Regarding the event handling, we didn't distinguish the close type before, but now we will have either
CI is finally green for this commit, and only CodeQL is pending. Adding more tests...
/retest
LGTM from me with a small nit. Pinging @ggreenway for non-Google review. /wait-any
ggreenway left a comment:
A couple nits, but LGTM overall
/wait
fyi @botengyao this broke us in Istio. I will try out #29616 to see if it fixes it.
I believe this does, will know for sure in a few days when it rolls through our pipelines.
Upstream Envoy sets SO_LINGER to 0 only when it detects local or remote reset. A remote close signaled on the HTTP level will lead to a local close on the Envoy connection level. Relates: envoyproxy/envoy#28817
Upstream Envoy sets SO_LINGER to 0 only when it detects local or remote reset. A remote close signaled on the HTTP level will lead to a local close on the Envoy connection level. This commit is to enable SO_LINGER socket option again, with minimum value as 1s. Relates: envoyproxy/envoy#28817
In summary
Related issue #2703
This PR adds support for detecting a RST from the peer and for sending a RST from Envoy, controlled by a flag.
This PR adds an enableTcpRstDetectAndSend method to the connection object; it is disabled by default. A filter or application can turn it on by calling enableTcpRstDetectAndSend(true), similar to how enableHalfClose() is used. The feature is also controlled by the runtime guard envoy_reloadable_features_detect_and_raise_rst_tcp_connection.
Detect RST:
The error code is obtained from the io_handle read or write (through a system call) and is then propagated back to the transport socket through IoResult. For example, IoResult RawBufferSocket::doRead(..) returns {action, bytes_written, false, err}, and the connection layer then reads the error code and acts on it.
In this PR I am mainly focusing on Linux.
We translated them to Envoy internal error: SOCKET_ERROR_CONNRESET
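The translation step can be illustrated with a minimal sketch. It is not the PR's code: the IoErrorCode enum, translateErrno, ReadResult, and isRemoteReset are hypothetical names invented here, and the sketch assumes ECONNRESET is among the errno values being translated (that is the errno a Linux read or write reports once the peer has sent a RST).

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

// Hypothetical stand-in for an internal error-code enum. Only the idea of a
// dedicated "connection reset" code comes from the PR description.
enum class IoErrorCode { NoError, Again, ConnectionReset, Unknown };

// Map a raw errno from a read()/write() system call to the internal code.
IoErrorCode translateErrno(int sys_errno) {
  assert(sys_errno >= 0); // errno values are non-negative
  switch (sys_errno) {
  case 0:
    return IoErrorCode::NoError;
  case EAGAIN:
    return IoErrorCode::Again;
  case ECONNRESET:
    return IoErrorCode::ConnectionReset;
  default:
    return IoErrorCode::Unknown;
  }
}

// Hypothetical result type that carries the error code next to the usual
// fields, so the layer above the transport socket can inspect it.
struct ReadResult {
  bool should_close;
  uint64_t bytes_processed;
  IoErrorCode err;
};

// What a connection layer might ask after a read: did the peer reset us?
bool isRemoteReset(const ReadResult& result) {
  return result.err == IoErrorCode::ConnectionReset;
}
```

With this shape, the connection layer only needs to check the err field of the result instead of treating every failed read as a generic close.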
This PR only adds support to raw_buffer_socket; other transport socket extensions still need changes.
Send RST:
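As background for the sending side, the sketch below shows the general POSIX technique of forcing a RST by enabling SO_LINGER with a zero timeout before close(), which matches the SO_LINGER behavior mentioned in the linked commits above. It is a standalone illustration, not Envoy code: closeWithRst and errnoSeenByPeerAfterRst are hypothetical helper names, and the loopback demo assumes a Linux host with a working 127.0.0.1 interface.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Close fd so that the kernel sends a RST instead of a FIN: SO_LINGER
// enabled with a zero timeout discards unsent data and resets the connection.
bool closeWithRst(int fd) {
  assert(fd >= 0);
  struct linger lin {};
  lin.l_onoff = 1;
  lin.l_linger = 0;
  const bool ok = ::setsockopt(fd, SOL_SOCKET, SO_LINGER, &lin, sizeof(lin)) == 0;
  ::close(fd);
  return ok;
}

// Loopback demo: one side resets, the other side reads. Returns the errno
// seen by the reading side (0 if read() did not fail), or -1 if setup failed.
int errnoSeenByPeerAfterRst() {
  const int listener = ::socket(AF_INET, SOCK_STREAM, 0);
  if (listener < 0) {
    return -1;
  }
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0; // let the kernel pick a free port
  socklen_t len = sizeof(addr);
  auto* sa = reinterpret_cast<sockaddr*>(&addr);

  const int client = ::socket(AF_INET, SOCK_STREAM, 0);
  if (client < 0 || ::bind(listener, sa, sizeof(addr)) != 0 ||
      ::listen(listener, 1) != 0 || ::getsockname(listener, sa, &len) != 0 ||
      ::connect(client, sa, sizeof(addr)) != 0) {
    if (client >= 0) {
      ::close(client);
    }
    ::close(listener);
    return -1;
  }
  const int server = ::accept(listener, nullptr, nullptr);
  ::close(listener);
  if (server < 0) {
    ::close(client);
    return -1;
  }

  closeWithRst(server); // the accepted side sends a RST

  char byte;
  const ssize_t n = ::read(client, &byte, 1); // blocks until the RST arrives
  const int seen = (n < 0) ? errno : 0;
  ::close(client);
  return seen;
}
```

On Linux the reading side is expected to fail with ECONNRESET, which is the signal the detection path relies on. A normal close() without the linger setting would instead produce a FIN, and the peer's read would return 0.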
Discussion:
Event handling: for Envoy to close the upstream or downstream by sending a RST when a RST is received from the peer, we need to extend the notification mechanism. Right now we are using the Network::ConnectionEvent enum class, and the current PR adds 2 more events: LocalReset and RemoteReset.
Right now a lot of filters or extensions only use LocalClose and RemoteClose to handle the close, e.g.,
If we reach a consensus that this is a good way to go, I will kick off a cleanup plan so that all the callback functions handle the events correctly.
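A small sketch of why that cleanup matters, under stated assumptions: the enum below is a hypothetical mirror of the event type (only the four close and reset names come from this description; Connected is added just to have a non-close event), and LegacyFilter, UpdatedFilter, isClosing, and isReset are invented for illustration.

```cpp
#include <cassert>

// Hypothetical mirror of the connection event enum with the two new events.
enum class ConnectionEvent { RemoteClose, LocalClose, RemoteReset, LocalReset, Connected };

// True for any event that ends the connection, graceful or not.
bool isClosing(ConnectionEvent e) {
  return e == ConnectionEvent::RemoteClose || e == ConnectionEvent::LocalClose ||
         e == ConnectionEvent::RemoteReset || e == ConnectionEvent::LocalReset;
}

// True only for the reset variants.
bool isReset(ConnectionEvent e) {
  return e == ConnectionEvent::RemoteReset || e == ConnectionEvent::LocalReset;
}

// A callback written before the new events existed: it checks only the two
// close events, so a reset goes unnoticed.
struct LegacyFilter {
  bool closed = false;
  void onEvent(ConnectionEvent e) {
    if (e == ConnectionEvent::RemoteClose || e == ConnectionEvent::LocalClose) {
      closed = true;
    }
  }
};

// A callback updated to treat resets as closes while remembering the type.
struct UpdatedFilter {
  bool closed = false;
  bool reset = false;
  void onEvent(ConnectionEvent e) {
    if (isClosing(e)) {
      closed = true;
      reset = isReset(e);
    }
  }
};
```

In this sketch a RemoteReset leaves LegacyFilter believing the connection is still open, while UpdatedFilter both closes and records that the close was a reset, which it could then propagate to the other side.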
Another option is to use a combined event by leveraging a bitfield, so that we can combine LocalClose and LocalReset together. We actually converted from the bitfield approach to the enum approach 5 years ago: #1358.
Future Plan
So far I have only verified sending and detection through an async TCP client with a raw_buffer socket. To adopt this feature incrementally, the further plan includes:
1. TcpProxy support for RST detection and for actively sending it to the peer, integration tests, and so on.
2. HttpConnectionManager changes to handle it for L7.
3. Adoption of the system error code by more transport sockets. I think the next big one is tls_transport_socket, which uses BoringSSL. Currently it handles all errors as SYSCALL_ERROR, and we need to distinguish the errno.
Let me know if this sounds reasonable, and thank you all for the review!
Commit Message:
Additional Description:
Risk Level: Medium