Skip to content

Increase Zookeeper probe timeouts#220

Merged
lhotari merged 1 commit intoapache:masterfrom
lhotari:lh-increase-zk-probe-timeout
Jan 31, 2022
Merged

Increase Zookeeper probe timeouts#220
lhotari merged 1 commit intoapache:masterfrom
lhotari:lh-increase-zk-probe-timeout

Conversation

@lhotari
Copy link
Copy Markdown
Member

@lhotari lhotari commented Jan 31, 2022

Motivation

  • 5 seconds seems to be too short a probe timeout on a system with low resources, such as in CI

Modifications

  • change default timeout to 30 seconds (which matches the probe's periodSeconds parameter)

- 5 seconds seems to be a too short probe timeout on a system with low resources such as in CI
Copy link
Copy Markdown
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 seconds seems to be too short a probe timeout on a system with low resources, such as in CI

Were you seeing CI errors because of the low value? What are the consequences of changing this default? It seems to me that it'd let an unresponsive ZK receive traffic for 25 seconds longer, but zk might have some other mechanism that intercedes first. If it does increase the time to recovery, I am not sure that we should change the default for the chart. Can we update a CI values.yaml instead?

@lhotari
Copy link
Copy Markdown
Member Author

lhotari commented Jan 31, 2022

5 seconds seems to be too short a probe timeout on a system with low resources, such as in CI

Were you seeing CI errors because of the low value? What are the consequences of changing this default? It seems to me that it'd let an unresponsive ZK receive traffic for 25 seconds longer, but zk might have some other mechanism that intercedes first. If it does increase the time to recovery, I am not sure that we should change the default for the chart. Can we update a CI values.yaml instead?

There hasn't been a timeout for k8s <1.20 before #214 . It's better to set a safe default than have a very tight timeout.

@lhotari lhotari merged commit dd0e6d8 into apache:master Jan 31, 2022
Copy link
Copy Markdown
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lhotari - thanks for the explanation, LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants