[FCOS] Control amount of replicas in etcd-quorum-guard deployment#1708

Merged
openshift-merge-robot merged 1 commit into openshift:fcos from vrutkovs:fcos-quorum-guard-scale on May 20, 2020

Conversation

@vrutkovs
Contributor

vrutkovs commented May 1, 2020

- What I did

  • Moved the etcd-quorum-guard deployment under MCO control (instead of CVO).
  • Updated the code that applies the MCD daemonset so that it also applies the etcd-quorum-guard deployment.
  • Made the number of replicas in the etcd-quorum-guard deployment configurable: if CEO's unsupported non-HA option is enabled, MCO creates the deployment with one replica.
  • etcd-quorum-guard now uses the pod image instead of the cli image.

This allows provisioning single-node clusters.
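
A minimal sketch of the replica decision described above, in Go. The function name, the boolean parameter, and the default of 3 replicas are assumptions for illustration; the description only states that one replica is used when CEO's unsupported non-HA option is enabled.

```go
package main

import "fmt"

// defaultQuorumGuardReplicas is an ASSUMED default (one guard pod per
// master in a conventional 3-master cluster); the PR text does not state it.
const defaultQuorumGuardReplicas int32 = 3

// quorumGuardReplicas is a hypothetical helper that returns the replica
// count for the etcd-quorum-guard deployment. nonHAEtcd stands in for
// CEO's unsupported non-HA option being enabled.
func quorumGuardReplicas(nonHAEtcd bool) int32 {
	if nonHAEtcd {
		// A single-node cluster can only ever schedule one guard pod.
		return 1
	}
	return defaultQuorumGuardReplicas
}

func main() {
	fmt.Println("non-HA:", quorumGuardReplicas(true))
	fmt.Println("default:", quorumGuardReplicas(false))
}
```

Keeping the decision in one small function makes the default explicit, so the deployment can still be rendered when the non-HA option is absent.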

- How to verify it
Create a cluster with 0 workers and 1 master, and apply CEO's manifest to allow fewer than 3 masters (openshift/cluster-etcd-operator#279).

TODO:

  • Check if cliImage can be avoided and infraImage can be reused instead

This is submitted for the fcos branch only, as master's implementation would depend on the scalable control plane enhancement.

/cc @LorbusChris

Fixes okd-project/okd#162

@openshift-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on May 1, 2020
@vrutkovs force-pushed the fcos-quorum-guard-scale branch 2 times, most recently from 92f0a31 to 90a7f31, on May 1, 2020 13:17
@vrutkovs force-pushed the fcos-quorum-guard-scale branch 2 times, most recently from aef70ca to 6930a61, on May 2, 2020 05:51
@LorbusChris
Contributor

LorbusChris commented May 5, 2020

I'd rather not introduce this now that we are so close to merging fcos with master, and instead wait for the implementation of openshift/enhancements#292

@vrutkovs
Contributor Author

vrutkovs commented May 5, 2020

That's for 4.4; the enhancement won't be backported there anyway, and we can branch fcos-4.4 later

@ashcrow
Member

ashcrow commented May 5, 2020

/cc'ing @cgwalters @dustymabe as well for review

@LorbusChris
Contributor

This is only needed for OKD CRC, right?

@vrutkovs
Contributor Author

vrutkovs commented May 5, 2020

That's not required for CRC, but with this PR you can set up a single-node cluster (still with a bootstrap node), and it would be a full-blown cluster with no operators disabled or ripped out. Such a cluster, of course, cannot be updated.

@LorbusChris
Contributor

Just noting that this will need to be handled differently in master from 4.6 on, as we won't have an fcos fork anymore by then. Otherwise we'll have to regress on this in OKD 4.6.
/lgtm

@openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) on May 7, 2020
@vrutkovs
Contributor Author

vrutkovs commented May 7, 2020

/refresh

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@vrutkovs
Contributor Author

/hold

level=fatal msg="failed to initialize the cluster: Cluster operator machine-config is reporting a failure: Failed to resync 0.0.1-2020-05-11-102411 because: failed to execute template: template: manifests/etcdquorumguard_deployment.yaml:54:24: executing \"manifests/etcdquorumguard_deployment.yaml\" at <.Images.KubeClientAgent>: can't evaluate field KubeClientAgent in type *operator.RenderConfigImages"
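
The failure above is the error Go's `text/template` package reports at execution time when a template references a struct field that does not exist on the value it is rendered against. The following is a minimal sketch that reproduces the same class of error; the type names, field names, and image string are hypothetical stand-ins, not the MCO's actual definitions.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"text/template"
)

// renderConfigImages is a hypothetical stand-in for the operator's image
// config. It deliberately has no KubeClientAgent field.
type renderConfigImages struct {
	MachineConfigOperator string
}

// renderConfig is the hypothetical value passed to the manifest template.
type renderConfig struct {
	Images *renderConfigImages
}

// render parses a manifest template and executes it against cfg,
// discarding the output and returning any error.
func render(manifest string, cfg renderConfig) error {
	tmpl, err := template.New("deployment.yaml").Parse(manifest)
	if err != nil {
		return err
	}
	return tmpl.Execute(io.Discard, cfg)
}

// isMissingField reports whether err is text/template's missing-field error.
func isMissingField(err error) bool {
	return err != nil && strings.Contains(err.Error(), "can't evaluate field")
}

func main() {
	cfg := renderConfig{Images: &renderConfigImages{MachineConfigOperator: "example/mco:latest"}}

	// Parsing succeeds; the unknown field is only detected during Execute.
	err := render("image: {{.Images.KubeClientAgent}}\n", cfg)
	fmt.Println("missing field:", isMissingField(err))

	// A field that exists on the struct renders without error.
	fmt.Println("known field error:", render("image: {{.Images.MachineConfigOperator}}\n", cfg))
}
```

Because the field lookup happens only during execution, a manifest that references a field missing from the config struct passes parsing and fails when the operator renders it, which matches the resync failure in the log.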

@openshift-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on May 11, 2020
@vrutkovs force-pushed the fcos-quorum-guard-scale branch from 6930a61 to 938995e on May 11, 2020 11:42
@openshift-ci-robot removed the lgtm label (indicates that a PR is ready to be merged) on May 11, 2020
@vrutkovs force-pushed the fcos-quorum-guard-scale branch 6 times, most recently from e306b91 to 197c397, on May 11, 2020 15:24
@openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) on May 18, 2020
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@vrutkovs
Contributor Author

/hold

PDB is not valid :(

@openshift-ci-robot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on May 19, 2020
MCO should apply the etcd-quorum-guard deployment instead of CVO.
It also controls the number of replicas in this deployment: it
scales to 1 replica if CEO's useUnsupportedUnsafeNonHANonProductionUnstableEtcd
option is enabled.

This allows creating single-node clusters.
@vrutkovs force-pushed the fcos-quorum-guard-scale branch from 5c04c20 to 884f76b on May 19, 2020 08:08
@openshift-ci-robot removed the lgtm label (indicates that a PR is ready to be merged) on May 19, 2020
@vrutkovs
Contributor Author

/retest

Storage flakes

@vrutkovs
Contributor Author

/test e2e-aws

@vrutkovs
Contributor Author

/hold

@vrutkovs
Contributor Author

/hold cancel

@LorbusChris ready for another review (added a default setting for etcd-quorum-guard replicas)

@openshift-ci-robot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) on May 20, 2020
@LorbusChris
Contributor

/lgtm

@openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) on May 20, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LorbusChris, vrutkovs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [LorbusChris,vrutkovs]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot merged commit 0e34bdf into openshift:fcos on May 20, 2020
Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.

6 participants