WIP bootkube.sh populate complete list of etcd endpoints during bootstrap by hexfusion · Pull Request #2998 · openshift/installer

hexfusion · 2020-01-28T02:18:44Z

This PR attempts to reduce the initial bootstrap complexity caused by populating only the bootstrap endpoint. Feeding the apiserver the entire endpoint list during bootstrap avoids the following scenario: cluster-etcd-operator completes scaling up to 4 members, and as a result the host-etcd service is adjusted to reflect all of the scaled etcd endpoints. Meanwhile, cluster-kube-apiserver-operator has not yet rolled out the new static pod assets in the correct revision. So when we reap the bootstrap node, we leave the apiserver with a single backend endpoint pointing at a bootstrap node that is no longer alive.
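As a rough illustration of the idea, the sketch below joins the bootstrap endpoint and the control-plane endpoints into one comma-separated list. It is not the code from this PR: the function name `build_etcd_endpoints`, the `etcd-N.<domain>` host naming, the example hostnames, and port 2379 are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper, not taken from bootkube.sh.
# Joins the bootstrap endpoint and N control-plane endpoints into a single
# comma-separated list.
# Assumed naming: etcd-<index>.<cluster domain>, client port 2379.
build_etcd_endpoints() {
    bootstrap_host="$1"   # host or IP of the bootstrap etcd member
    cluster_domain="$2"   # assumed DNS suffix for the control-plane members
    master_count="$3"     # number of control-plane etcd members

    # Start with the bootstrap member, then append each control-plane member.
    endpoints="https://${bootstrap_host}:2379"
    i=0
    while [ "$i" -lt "$master_count" ]; do
        endpoints="${endpoints},https://etcd-${i}.${cluster_domain}:2379"
        i=$((i + 1))
    done
    printf '%s\n' "$endpoints"
}

# Example: one bootstrap member plus three control-plane members (4 in total).
build_etcd_endpoints "bootstrap.example.com" "example.com" 3
```

With a list like this, reaping the bootstrap node removes one of four backends instead of the only one.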

With the previous version of the etcd client balancer, this would have proven overly disruptive, but the new balancer handles sub-connection round-robin failover very gracefully.
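To make the failover reasoning concrete, here is a deliberately simplified sketch of why a full endpoint list helps once the bootstrap member is gone. It is not the etcd client balancer: it picks the first healthy endpoint rather than doing round-robin over sub-connections, and `is_healthy`, `pick_endpoint`, `DOWN_ENDPOINTS`, and the hostnames are hypothetical names used only for this example.

```shell
#!/bin/sh
# Hypothetical health probe: an endpoint counts as down if it appears in the
# space-separated DOWN_ENDPOINTS variable. A real probe would contact etcd.
is_healthy() {
    case " ${DOWN_ENDPOINTS:-} " in
        *" $1 "*) return 1 ;;
        *) return 0 ;;
    esac
}

# Print the first healthy endpoint from a comma-separated list.
# Returns non-zero when no endpoint in the list is healthy.
pick_endpoint() {
    old_ifs="$IFS"
    IFS=','
    # Split the list on commas into the positional parameters.
    set -- $1
    IFS="$old_ifs"
    for ep in "$@"; do
        if is_healthy "$ep"; then
            printf '%s\n' "$ep"
            return 0
        fi
    done
    return 1
}

# Example: the bootstrap member has been reaped, but another member remains.
DOWN_ENDPOINTS="https://bootstrap.example.com:2379"
pick_endpoint "https://bootstrap.example.com:2379,https://etcd-0.example.com:2379"
```

If the list held only the bootstrap endpoint, `pick_endpoint` would return non-zero here, which mirrors the stranded apiserver described above.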

We consider this a short-term solution while we improve the timings and complexity around this process.

Requires openshift/cluster-etcd-operator#60

hexfusion · 2020-01-28T02:19:56Z

/test e2e-gcp

openshift-ci-robot · 2020-01-28T02:20:02Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign smarterclayton
You can assign the PR to them by writing /assign @smarterclayton in a comment when ready.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

hexfusion · 2020-01-28T02:20:23Z

/test e2e-azure

hexfusion · 2020-01-28T02:38:25Z

level=fatal msg="failed to fetch Terraform Variables: failed to fetch dependency of "Terraform Variables": failed to fetch dependency of "Bootstrap Ignition Config": failed to fetch dependency of "Master Machines": failed to generate asset "Platform Credentials Check": validate AWS credentials: checking install permissions: error simulating policy: Throttling: Rate exceeded\n\tstatus code: 400, request id: 6cbb243d-b72a-4332-99f9-4d0f4f16e829"

Rate-limit flake.

/test e2e-aws

hexfusion · 2020-01-28T03:21:35Z

/test e2e-gcp

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

hexfusion · 2020-01-28T04:25:29Z

/test e2e-gcp

hexfusion · 2020-01-28T04:55:04Z

/test e2e-aws-upgrade

hexfusion · 2020-01-28T05:44:47Z

/test e2e-gcp

One last try but it appears we will need openshift/cluster-etcd-operator#60

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

hexfusion · 2020-01-28T15:50:19Z

/test e2e-gcp

hexfusion · 2020-01-28T15:56:35Z

/test e2e-azure

hexfusion · 2020-01-28T17:44:58Z

After testing, we have decided to continue with openshift/cluster-etcd-operator#58 and will carry that work forward in the installer via #3005.

openshift-ci-robot · 2020-01-28T18:08:07Z

@hexfusion: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-aws-fips | `21aa688` | link | `/test e2e-aws-fips` |
| ci/prow/e2e-libvirt | `21aa688` | link | `/test e2e-libvirt` |
| ci/prow/e2e-gcp | `21aa688` | link | `/test e2e-gcp` |
| ci/prow/e2e-azure | `21aa688` | link | `/test e2e-azure` |
| ci/prow/e2e-aws-scaleup-rhel7 | `21aa688` | link | `/test e2e-aws-scaleup-rhel7` |

openshift-ci-robot added the do-not-merge/work-in-progress and size/L labels Jan 28, 2020

openshift-ci-robot requested review from jcpowermac and jstuever January 28, 2020 02:19

hexfusion force-pushed the fix-4.4-rhcos+ceo branch from 03041cd to 7525812 January 28, 2020 03:17

bootkube.sh populate complete list of etcd endpoints during bootstrap

4ba7073

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

hexfusion force-pushed the fix-4.4-rhcos+ceo branch from 7525812 to 4ba7073 January 28, 2020 03:23

hexfusion mentioned this pull request Jan 28, 2020

WIP pkg/operator/hostetcdendpointcontroller: disable host-etcd update duing cluster bootstrap openshift/cluster-etcd-operator#60

Closed

hexfusion force-pushed the fix-4.4-rhcos+ceo branch from 886e674 to da1ac09 January 28, 2020 14:01

openshift-ci-robot added the size/S label and removed the size/L label Jan 28, 2020

cmd/openshift-install: shift timeouts from api to bootstrap

21aa688

Signed-off-by: Sam Batschelet <sbatsche@redhat.com>

hexfusion force-pushed the fix-4.4-rhcos+ceo branch from da1ac09 to 21aa688 January 28, 2020 14:06

hexfusion closed this Jan 28, 2020

hexfusion wants to merge 2 commits into openshift:master from hexfusion:fix-4.4-rhcos+ceo

2 participants
