
Set a high terminationGracePeriodSeconds on webhook #3596

Merged: knative-prow-robot merged 1 commit into knative:master from mattmoor:raise-termination-grace-period on Jul 15, 2020

mattmoor (Member) commented Jul 15, 2020

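When our webhook drains, it sleeps for `network.DefaultDrainTimeout` after failing readiness probes and before exiting (see [here](https://github.com/knative/pkg/blob/4419e613c133505ea5109380102765a7699b9bf8/webhook/webhook.go#L229-L234)); the sleep is configured to [this value](https://github.com/knative/pkg/blob/4419e613c133505ea5109380102765a7699b9bf8/network/network.go#L39-L43).

I suspect that the default `terminationGracePeriodSeconds` (of `30`) is already clipping this sleep (due to the coordination involved). I am also raising this value because we see a non-zero number of EOF messages when running chaos during our e2e testing.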

  • 🧽 Update or clean up current behavior

Related PR in serving: knative/serving#8640

Example of the sort of failure that I'm hoping this helps to address: #3565 (comment)

- 🐛 Fix bug
Extend the terminationGracePeriod to fix issues shutting down the webhook.

@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Jul 15, 2020
@knative-prow-robot knative-prow-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 15, 2020
vaikas (Contributor) commented Jul 15, 2020

I think this is a bug fix, so we should tag the release notes to indicate it's a bug fix?

vaikas (Contributor) commented Jul 15, 2020

/lgtm
/approve

/hold
Can you just update the release notes to say it's a bug fix? Then feel free to remove the hold.

@knative-prow-robot knative-prow-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Jul 15, 2020
knative-prow-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mattmoor, vaikas


Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mattmoor (Member, Author) commented

/hold cancel

@knative-prow-robot knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 15, 2020
@knative-prow-robot knative-prow-robot merged commit 88c5086 into knative:master Jul 15, 2020
@mattmoor mattmoor deleted the raise-termination-grace-period branch July 15, 2020 18:59