Skip to content

pkg/asset/ignition/bootstrap/cvoignore: Add group/kind/name(space) collision detection#6247

Merged
openshift-merge-robot merged 1 commit intoopenshift:masterfrom
wking:fail-on-manifest-conflicts
Sep 14, 2022
Merged

pkg/asset/ignition/bootstrap/cvoignore: Add group/kind/name(space) collision detection#6247
openshift-merge-robot merged 1 commit intoopenshift:masterfrom
wking:fail-on-manifest-conflicts

Conversation

@wking
Copy link
Copy Markdown
Member

@wking wking commented Aug 20, 2022

docs/design/resource_dep.svg shows the complicated hierarchy of installer assets, and it is likely that not all users are familiar with all the manifests that the installer generates. It's also possible that there is a complicated hierarchy on the caller's side, and not just a single actor, creating additional manifests to feed into a create-manifests flow. With all of these inputs, it seems possible that folks occasionally call the installer with directory contents that result in multiple manifests for a single resource, as determined by the (group, kind, namespace, name) tuples. This commit adds a check at the slightly-before-Ignition-serialization overrides asset, which was already in the business of iterating over the content and building ClusterVersion spec.overrides. Failing fast before we build Ignition configs should allow for relatively cheap recovery, vs. a user's cluster failing to come up, or coming up with a config that diverges from their intention because some other manifest ended up taking precedence over the resource that they'd been trying to control.

Not covered in this commit is deduping between manifests generated on the bootstrap machine (e.g. by the various 'render' calls to operator containers). But that can happen orthogonally, or not, and I think it's still worth having this cheap, pre-Ignition-config sanity check.

…llision detection

docs/design/resource_dep.svg shows the complicated hierarchy of
installer assets, and it is likely that not all users are familiar
with all the manifests that the installer generates.  It's also
possible that there is a complicated hierarchy on the caller's side,
and not just a single actor, creating additional manifests to feed
into a create-manifests flow.  With all of these inputs, it seems
possible that folks occasionally call the installer with directory
contents that result in multiple manifests for a single resource, as
determined by the (group, kind, namespace, name) tuples.  This commit
adds a check at the slightly-before-Ignition-serialization overrides
asset, which was already in the business of iterating over the content
and building ClusterVersion spec.overrides.  Failing fast before we
build Ignition configs should allow for relatively cheap recovery,
vs. a user's cluster failing to come up, or coming up with a config
that diverges from their intention because some other manifest ended
up taking precedence over the resource that they'd been trying to
control.

Not covered in this commit is deduping between manifests generated on
the bootstrap machine (e.g. by the various 'render' calls to operator
containers).  But that can happen orthogonally, or not, and I think
it's still worth having this cheap, pre-Ignition-config sanity check.
@wking
Copy link
Copy Markdown
Member Author

wking commented Aug 23, 2022

Lots of Init container ... not ready.

/retest

Copy link
Copy Markdown
Contributor

@patrickdillon patrickdillon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/assign @jhixson74
/retest

namespace := u.GetNamespace()
name := u.GetName()

key := fmt.Sprintf("%s |! %s |! %s |! %s", group, kind, namespace, name)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why the |! separator?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't spend much time poking around in allowed-characters for the four entries, and I wanted a delimiter that was unlikely to show up in any of them.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would perhaps be more robust to just use %q%q%q%q?

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Aug 25, 2022
@jhixson74
Copy link
Copy Markdown
Member

This looks straight forward to me, I just want to see the tests pass first ;-)

@jhixson74
Copy link
Copy Markdown
Member

/approve

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Aug 31, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jhixson74

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 31, 2022
@patrickdillon
Copy link
Copy Markdown
Contributor

/skip
/retest

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

/retest-required

Remaining retests: 0 against base HEAD 4d3796f and 2 for PR HEAD 1c9e68d in total

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

/retest-required

Remaining retests: 0 against base HEAD f0cdc99 and 1 for PR HEAD 1c9e68d in total

@patrickdillon
Copy link
Copy Markdown
Contributor

/skip

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

/retest-required

Remaining retests: 0 against base HEAD 618f468 and 0 for PR HEAD 1c9e68d in total

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Sep 12, 2022

@wking: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-ibmcloud 1c9e68d link false /test e2e-ibmcloud
ci/prow/e2e-libvirt 1c9e68d link false /test e2e-libvirt
ci/prow/e2e-ibmcloud-ovn 1c9e68d link false /test e2e-ibmcloud-ovn
ci/prow/okd-e2e-gcp-ovn-upgrade 1c9e68d link false /test okd-e2e-gcp-ovn-upgrade

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

/hold

Revision 1c9e68d was retested 3 times: holding

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 12, 2022
@wking
Copy link
Copy Markdown
Member Author

wking commented Sep 14, 2022

/retest
/hold cancel

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 14, 2022
@openshift-merge-robot openshift-merge-robot merged commit 19212e4 into openshift:master Sep 14, 2022
@wking wking deleted the fail-on-manifest-conflicts branch September 15, 2022 05:20
wking added a commit to wking/openshift-release that referenced this pull request Sep 15, 2022
@wking
Copy link
Copy Markdown
Member Author

wking commented Sep 15, 2022

Confirmed that this works as expected, with an error message that could be polished further, but which is already good enough to satisfy me ;)

zaneb added a commit to zaneb/assisted-service that referenced this pull request Sep 19, 2022
When generating the Ignition files, the installer already sets
schdulableMasters to true when there are no worker nodes (i.e. in the
SNO and compact cluster topologies. (See
openshift/installer#2004) Therefore it is
unnecessary to override it here (though it may be preferred to avoid a
warning log from the installer).

Since openshift/installer#6247, attempting to
override the schedulableMasters setting causes installation to fail,
because there are two manifests of the same type and name that conflict.

Since we don't need to set this override when the installer would
already do it, avoid doing so and triggering the error when the value is
determined by the number of hosts rather than explicitly set by the
user.

The conflict still needs to be resolved so that the user can enable
schedulableMasters, but this at least allows the SNO and compact
topologies to install OpenShift 4.12 again.

This partially reverts commit c45f369.
openshift-merge-robot pushed a commit to openshift/assisted-service that referenced this pull request Sep 20, 2022
When generating the Ignition files, the installer already sets
schdulableMasters to true when there are no worker nodes (i.e. in the
SNO and compact cluster topologies. (See
openshift/installer#2004) Therefore it is
unnecessary to override it here (though it may be preferred to avoid a
warning log from the installer).

Since openshift/installer#6247, attempting to
override the schedulableMasters setting causes installation to fail,
because there are two manifests of the same type and name that conflict.

Since we don't need to set this override when the installer would
already do it, avoid doing so and triggering the error when the value is
determined by the number of hosts rather than explicitly set by the
user.

The conflict still needs to be resolved so that the user can enable
schedulableMasters, but this at least allows the SNO and compact
topologies to install OpenShift 4.12 again.

This partially reverts commit c45f369.
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/assisted-service that referenced this pull request Sep 28, 2022
When generating the Ignition files, the installer already sets
schdulableMasters to true when there are no worker nodes (i.e. in the
SNO and compact cluster topologies. (See
openshift/installer#2004) Therefore it is
unnecessary to override it here (though it may be preferred to avoid a
warning log from the installer).

Since openshift/installer#6247, attempting to
override the schedulableMasters setting causes installation to fail,
because there are two manifests of the same type and name that conflict.

Since we don't need to set this override when the installer would
already do it, avoid doing so and triggering the error when the value is
determined by the number of hosts rather than explicitly set by the
user.

The conflict still needs to be resolved so that the user can enable
schedulableMasters, but this at least allows the SNO and compact
topologies to install OpenShift 4.12 again.

This partially reverts commit c45f369.
openshift-merge-robot pushed a commit to openshift/assisted-service that referenced this pull request Sep 29, 2022
When generating the Ignition files, the installer already sets
schdulableMasters to true when there are no worker nodes (i.e. in the
SNO and compact cluster topologies. (See
openshift/installer#2004) Therefore it is
unnecessary to override it here (though it may be preferred to avoid a
warning log from the installer).

Since openshift/installer#6247, attempting to
override the schedulableMasters setting causes installation to fail,
because there are two manifests of the same type and name that conflict.

Since we don't need to set this override when the installer would
already do it, avoid doing so and triggering the error when the value is
determined by the number of hosts rather than explicitly set by the
user.

The conflict still needs to be resolved so that the user can enable
schedulableMasters, but this at least allows the SNO and compact
topologies to install OpenShift 4.12 again.

This partially reverts commit c45f369.

Co-authored-by: Zane Bitter <zbitter@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants