This repository was archived by the owner on Jan 20, 2026. It is now read-only.

Error if there's dupe connections #28

Merged
BrandonWeng merged 6 commits into main from bweng-disconnect
Dec 14, 2022


@BrandonWeng
Contributor

Describe your changes and provide context

Peers fall into an unrecoverable state when one of them tries to reconnect to a peer that still believes the original connection is open.

It appears that Peer A lost its connection to Peer B, but Peer B's peer mapping was never updated to reflect this, so when Peer A tries to connect to Peer B again, the attempt fails. If we therefore receive a duplicate request from a peer that is already marked as connected, we should error out the existing connection and also return an error, so that the peer's next retry can connect successfully.

Dec 12 15:02:02 ip-172-31-33-167 seid[16502]: 3:02PM ERR failed to accept connection err="peer \"4b336938f10433839baf727999e78965dcbae4d5\" is already connected" module=p2p op=incoming/accepted peer=4b336938f10433839baf727999e78965dcbae4d5
Dec 12 15:02:02 ip-172-31-33-167 seid[16502]: 3:02PM INF stopping service module=p2p peer={"Hostname":"3.22.225.111","NodeID":"4b336938f10433839baf727999e78965dcbae4d5","Path":"","Port":58322,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF starting service impl=MConnection module=p2p peer={"Hostname":"3.143.253.100","NodeID":"53dd9f3b8edf7e37ddd6ee152d0eec205181752b","Path":"","Port":26656,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF peer connected endpoint={} module=p2p peer=53dd9f3b8edf7e37ddd6ee152d0eec205181752b
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF Connection is closed @ recvRoutine (likely by the other side) conn={} module=p2p peer={"Hostname":"3.143.253.100","NodeID":"53dd9f3b8edf7e37ddd6ee152d0eec205181752b","Path":"","Port":26656,"Protocol":"mconn"}
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF stopping service module=p2p peer={"Hostname":"3.143.253.100","NodeID":"53dd9f3b8edf7e37ddd6ee152d0eec205181752b","Path":"","Port":26656,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF received peer update module=statesync peer=53dd9f3b8edf7e37ddd6ee152d0eec205181752b status=up
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF peer disconnected endpoint={} module=p2p peer=53dd9f3b8edf7e37ddd6ee152d0eec205181752b
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF received peer update module=statesync peer=53dd9f3b8edf7e37ddd6ee152d0eec205181752b status=down
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF starting service impl=MConnection module=p2p peer={"Hostname":"3.22.225.111","NodeID":"4b336938f10433839baf727999e78965dcbae4d5","Path":"","Port":58324,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM ERR failed to accept connection err="peer \"4b336938f10433839baf727999e78965dcbae4d5\" is already connected" module=p2p op=incoming/accepted peer=4b336938f10433839baf727999e78965dcbae4d5
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF stopping service module=p2p peer={"Hostname":"3.22.225.111","NodeID":"4b336938f10433839baf727999e78965dcbae4d5","Path":"","Port":58324,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF starting service impl=MConnection module=p2p peer={"Hostname":"3.143.253.100","NodeID":"53dd9f3b8edf7e37ddd6ee152d0eec205181752b","Path":"","Port":26656,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF peer connected endpoint={} module=p2p peer=53dd9f3b8edf7e37ddd6ee152d0eec205181752b
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF Connection is closed @ recvRoutine (likely by the other side) conn={} module=p2p peer={"Hostname":"3.143.253.100","NodeID":"53dd9f3b8edf7e37ddd6ee152d0eec205181752b","Path":"","Port":26656,"Protocol":"mconn"}
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF stopping service module=p2p peer={"Hostname":"3.143.253.100","NodeID":"53dd9f3b8edf7e37ddd6ee152d0eec205181752b","Path":"","Port":26656,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF received peer update module=statesync peer=53dd9f3b8edf7e37ddd6ee152d0eec205181752b status=up
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF peer disconnected endpoint={} module=p2p peer=53dd9f3b8edf7e37ddd6ee152d0eec205181752b
Dec 12 15:02:03 ip-172-31-33-167 seid[16502]: 3:02PM INF received peer update module=statesync peer=53dd9f3b8edf7e37ddd6ee152d0eec205181752b status=down
Dec 12 15:02:04 ip-172-31-33-167 seid[16502]: 3:02PM INF starting service impl=MConnection module=p2p peer={"Hostname":"3.22.225.111","NodeID":"4b336938f10433839baf727999e78965dcbae4d5","Path":"","Port":58326,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:04 ip-172-31-33-167 seid[16502]: 3:02PM ERR failed to accept connection err="peer \"4b336938f10433839baf727999e78965dcbae4d5\" is already connected" module=p2p op=incoming/accepted peer=4b336938f10433839baf727999e78965dcbae4d5
Dec 12 15:02:04 ip-172-31-33-167 seid[16502]: 3:02PM INF stopping service module=p2p peer={"Hostname":"3.22.225.111","NodeID":"4b336938f10433839baf727999e78965dcbae4d5","Path":"","Port":58326,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:05 ip-172-31-33-167 seid[16502]: 3:02PM INF starting service impl=MConnection module=p2p peer={"Hostname":"3.22.225.111","NodeID":"4b336938f10433839baf727999e78965dcbae4d5","Path":"","Port":58328,"Protocol":"mconn"} service=MConnection
Dec 12 15:02:05 ip-172-31-33-167 seid[16502]: 3:02PM ERR failed to accept connection err="peer \"4b336938f10433839baf727999e78965dcbae4d5\" is already connected" module=p2p op=incoming/accepted peer=4b336938f10433839baf727999e78965dcbae4d5
Dec 12 15:02:05 ip-172-31-33-167 seid[16502]: 3:02PM INF stopping service module=p2p peer={"Hostname":"3.22.225.111","NodeID":"4b336938f10433839baf727999e78965dcbae4d5","Path":"","Port":58328,"Protocol":"mconn"} service=MConnection

Testing performed to validate your change

Deployed this change to Peer B in a test cluster and confirmed that Peer A was then able to connect to Peer B again.

@BrandonWeng BrandonWeng marked this pull request as ready for review December 13, 2022 16:18
Collaborator

@philipsu522 philipsu522 left a comment

This looks good. I think eviction should be OK, since we will retry anyway. Additionally, the same logic already seems to be used pretty liberally when at max capacity; see https://github.com/sei-protocol/sei-tendermint/blob/main/internal/p2p/router.go#L406

@BrandonWeng BrandonWeng merged commit d86db70 into main Dec 14, 2022
@BrandonWeng BrandonWeng deleted the bweng-disconnect branch December 14, 2022 16:16
BrandonWeng added a commit that referenced this pull request Dec 27, 2022
* Error if there's dupe connections

* Go Routine

* evict

* Disconnect and error instead

* panic instead

* revert panic
Timwood0x10 pushed a commit to Timwood0x10/sei-tendermint that referenced this pull request Jun 7, 2023
[acl] Add commit access operation to represent end of message