Increase default for connect timeout #4228
|
Increasing the default to 30s makes sense to me. Would like to get a 👍 from another maintainer before merging though. |
|
I just checked, we've been running with a Anyhow,... just wanted to put a real-world data point in here. Timeouts are sometimes a sensitive, infrastructure-dependent issue, and 60s may be good for us but too much for others. Either way, I believe 10s is too small :) |
|
On large benchmark runs we have set the timeout somewhat arbitrarily to 100s. Increasing to 30 or even 60 would be fine with me. |
|
I am not sure if my issue is related to this, but I am seeing the following error about 2-3 seconds after attempting to connect: |
|
Just confirmed that I can connect if I change the distributed version to 2.30.0 for the Client only (without changing the scheduler). |
|
@latusaki I think your report is unrelated to the actual value of the timeout, which is what is discussed here. Can you open another ticket for this? At the very least, the exception message might be misleading, since the cause is probably something other than the timeout. In particular, the entire traceback would be interesting, since the exception cause should usually also be logged, and that should reveal the actual exception. |
|
Superseded by #5022 |
|
We bumped to 30s in #5022. Closing this |
Disclaimer: I don't have data to back this up, only a gut feeling.
We're still seeing connection timeouts (#4080), even after the removal of the fixed handshake timeouts in #4176. There we already discussed increasing the default timeout, since `connect` no longer just needs to accommodate the actual TCP (or whatever) connect but also the handshake.
What's made this worse is that, due to DNS instabilities, it was requested to split the connect up to favour multiple small attempts instead of one large attempt (#3104).
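To illustrate why the overall budget matters, here is a minimal sketch of the "multiple small attempts" pattern. It is not the actual distributed implementation; `connect_with_retries`, `connect_once`, `first_attempt`, and `backoff` are hypothetical names chosen for this example. It assumes `connect_once` covers both the transport connect and the handshake, so every attempt has to fit both into its slice of the total timeout.

```python
import asyncio
import random
import time


async def connect_with_retries(connect_once, total_timeout, first_attempt=0.5, backoff=1.5):
    """Call `connect_once()` repeatedly until it succeeds or the budget runs out.

    Hypothetical helper: each attempt gets a short timeout of its own,
    capped by whatever is left of `total_timeout` (in seconds).
    """
    deadline = time.monotonic() + total_timeout
    attempt_timeout = first_attempt
    last_error = None
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            # Budget exhausted: surface the last failure as the cause.
            raise OSError(f"Timed out trying to connect after {total_timeout}s") from last_error
        try:
            return await asyncio.wait_for(
                connect_once(), timeout=min(attempt_timeout, remaining)
            )
        except (asyncio.TimeoutError, OSError) as exc:
            last_error = exc
            # Give later attempts more time than earlier ones.
            attempt_timeout *= backoff
            # Short jittered pause before retrying, never past the deadline.
            remaining = deadline - time.monotonic()
            await asyncio.sleep(max(0.0, min(random.uniform(0, 0.05), remaining)))
```

With a small total budget such as 10s, only a few attempts fit before the deadline, and each of them must complete connect plus handshake, which is the reasoning behind raising the default.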
Putting everything together, I believe the initial default of 10s should be increased to counteract the added complexity of this operation. I don't have a good feel for an appropriate value here; in #4176 there was a suggestion to increase it to 30s, but I could also see higher values making sense.
cc @jcrist