
@tmakita (Contributor) commented Mar 16, 2017

When an LB backend is removed and the corresponding ipvs destination is
deleted, ipvs keeps dangling connections pointing to the destination if
those connections were established before the destination was removed.
Packets sent over such a connection are then dropped in the kernel,
because the connection no longer has a destination. This continues until
the connection expires (e.g. 60 seconds for the SYN_RECV (TCP initial)
state). In some cases this causes a TCP connection timeout, if the
connection is initiated between the container failure and the ipvs
destination deletion.

ipvs provides a parameter, "expire_nodest_conn", to reduce the downtime.
When the option is enabled, a stale connection expires immediately upon
receiving a packet on that connection.

Although this option is not suitable if flapping can happen, I think the
LB user should prevent flapping by waiting long enough before
determining that a container is unreachable.

Reference: https://www.kernel.org/doc/Documentation/networking/ipvs-sysctl.txt
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
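
For context, a minimal, self-contained Go sketch of what enabling this sysctl amounts to; the read-back verification at the end is purely illustrative and not part of the patch:

package main

import (
	"bytes"
	"io/ioutil"
	"log"
)

// Path of the ipvs sysctl discussed in this patch.
const expireNodestConn = "/proc/sys/net/ipv4/vs/expire_nodest_conn"

func main() {
	// Enable expire_nodest_conn so stale ipvs connections expire as soon
	// as a packet arrives for a removed destination.
	if err := ioutil.WriteFile(expireNodestConn, []byte("1\n"), 0644); err != nil {
		log.Fatalf("failed to write %s: %v", expireNodestConn, err)
	}

	// Illustrative read-back check, not part of the patch.
	val, err := ioutil.ReadFile(expireNodestConn)
	if err != nil {
		log.Fatalf("failed to read %s: %v", expireNodestConn, err)
	}
	if !bytes.Equal(bytes.TrimSpace(val), []byte("1")) {
		log.Fatalf("unexpected value for expire_nodest_conn: %q", val)
	}
}

(Requires Linux with ipvs loaded and root privileges to run.)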

@mavenugo (Contributor)

@tmakita thanks for the patch, and yes, I agree. The only flapping scenario I can think of is during HEALTHCHECK failures, but a HEALTHCHECK failure will also result in the endpoint being removed.
So, I think this is a safe fix.

service_linux.go (Outdated)
err = ioutil.WriteFile("/proc/sys/net/ipv4/vs/expire_nodest_conn", []byte{'1', '\n'}, 0644)
if err != nil {
	logrus.Errorf("Failed to write to /proc/sys/net/ipv4/vs/expire_nodest_conn: %v", err)
	os.Exit(8)
}
Contributor

This one should be os.Exit(9)

@tmakita (Contributor, Author)

Thanks for the feedback.
I couldn't find documentation for the meaning of the exit codes, but if they are assigned incrementally, it should be 10, shouldn't it? Since 9 is already used.

@aboch (Contributor) commented Mar 16, 2017

Thanks @tmakita

Small comment, otherwise LGTM

@tmakita force-pushed the reduce-lb-downtime branch from 1229132 to 9af59fd on March 17, 2017 01:35
@tmakita (Contributor, Author) commented Mar 17, 2017

Updated with exit code 10, presuming the codes are assigned incrementally.
Correct me if I misunderstood the meaning of the return value.
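
For clarity, the updated snippet would then read as follows (a sketch reconstructed from the review discussion above, not the exact diff):

err = ioutil.WriteFile("/proc/sys/net/ipv4/vs/expire_nodest_conn", []byte{'1', '\n'}, 0644)
if err != nil {
	logrus.Errorf("Failed to write to /proc/sys/net/ipv4/vs/expire_nodest_conn: %v", err)
	// Exit code bumped from 8 to 10, presuming codes are assigned incrementally.
	os.Exit(10)
}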

@aboch (Contributor) commented Mar 17, 2017

LGTM

@tmakita (Contributor, Author) commented Apr 11, 2017

@mavenugo @aboch
What is the status of this PR? Is anything left to do?

@mavenugo (Contributor)

@tmakita I think @sanimej wanted to test these changes in different scenarios to make sure they don't cause any regressions.

@tmakita (Contributor, Author) commented Apr 11, 2017

@mavenugo thx!

@thaJeztah (Member) commented Mar 12, 2020

Looks like this was taken care of by #2154, so let me close it, but feel free to comment if you think something is left to be done 👍

@thaJeztah closed this on Mar 12, 2020