Skip to content

Set clusterID label on machine#608

Merged
openshift-merge-robot merged 1 commit intoopenshift:masterfrom
alexander-demicev:clusterid
Jun 11, 2020
Merged

Set clusterID label on machine#608
openshift-merge-robot merged 1 commit intoopenshift:masterfrom
alexander-demicev:clusterid

Conversation

@alexander-demicev
Copy link
Copy Markdown
Contributor

Set clusterID label on the machine using controller. Currently, we require the user to set clusterID label manually, this PR introduces usage of infrastructure resource for setting label.

@enxebre
Copy link
Copy Markdown
Member

enxebre commented Jun 4, 2020

Thanks!
Please let's make sure to reference the upcoming revendor PRs here.
This will require updating docs to drop the requirement to manually add the label.

/approve

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enxebre

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 4, 2020
@JoelSpeed
Copy link
Copy Markdown
Contributor

Given we are introducing webhooks to validate provider specs for Machine/MachineSet, why don't we add this as a default/validation item in the webhook rather than in the controller? I would have thought the controller behaviour shouldn't be changing and it should still expect it to be set

@enxebre
Copy link
Copy Markdown
Member

enxebre commented Jun 4, 2020

Given we are introducing webhooks to validate provider specs for Machine/MachineSet, why don't we add this as a default/validation item in the webhook rather than in the controller? I would have thought the controller behaviour shouldn't be changing and it should still expect it to be set

The need for this label is mainly for AWS to be able use it to filter instances by name and clusterID when there's no an instanceID available yet https://github.com/openshift/cluster-api-provider-aws/blob/master/pkg/actuators/machine/reconciler.go#L358-L371

This is just a detail we as devs choose to implement this way, it's not a legitimately user input API field. The clusterID is by design implicit and available on the context and client that the machine is being created at any time. Therefore I don't think this should be exposed as user input at all or defaulted by a webhook at creation/updating time. The controller should be able to discover and pass the clusterID through with or without a webhook at the front. We could even eventually drop the label and use a different mechanism to pass the clusterID if we choose to and this should be transparent to machine creation validation/defaulting.

Copy link
Copy Markdown

@Danil-Grigorev Danil-Grigorev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just an improvement

Comment on lines +474 to +479
infra := &configv1.Infrastructure{}
infraName := client.ObjectKey{Name: globalInfrastuctureName}

if err := r.Client.Get(ctx, infraName, infra); err != nil {
return err
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be substituted with

func getInfrastructure(c runtimeclient.Reader) (*configv1.Infrastructure, error) {
if c == nil {
return nil, errors.New("no API reader -- will not fetch infrastructure config")
}
infra := &configv1.Infrastructure{}
infraName := runtimeclient.ObjectKey{Name: globalInfrastuctureName}
if err := c.Get(context.Background(), infraName, infra); err != nil {
return nil, err
}
return infra, nil
}

With added tests from pkg/controller/machine/machine_controller_test.go you'll essentially cover both.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to import a function from vsphere utils, but making a shared utils package in future makes sense. Anyway, it's not the scope of this PR


skipWaitForDeleteTimeoutSeconds = 60 * 5

globalInfrastuctureName = "cluster"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, VSphere util already exposes this variable:

globalInfrastuctureName = "cluster"

@alexander-demicev
Copy link
Copy Markdown
Contributor Author

/retest

enxebre added a commit to enxebre/machine-api-operator that referenced this pull request Jun 5, 2020
The need for this label is mainly for AWS to be able use it to filter instances by name and clusterID when there's no an instanceID available yet https://github.com/openshift/cluster-api-provider-aws/blob/master/pkg/actuators/machine/reconciler.go#L358-L371

This is just a detail we as devs choose to implement this way, it's not a legitimately user input API field. The clusterID is by design implicit and available on the context and client that the machine is being created at any time. Therefore I don't think this should be exposed as user input at all or defaulted by a webhook at creation/updating time. The controller should be able to discover and pass the clusterID through with or without a webhook at the front. We could even eventually drop the label and use a different mechanism to pass the clusterID if we choose to and this should be transparent to machine creation validation/defaulting.
The burden for the user to set this label will be fixed by openshift#608
@enxebre
Copy link
Copy Markdown
Member

enxebre commented Jun 10, 2020

/retest
PTAL @Danil-Grigorev

@Danil-Grigorev
Copy link
Copy Markdown

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 10, 2020
@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

7 similar comments
@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

@alexander-demichev: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-azure 75857f1 link /test e2e-azure
ci/prow/e2e-azure-operator 75857f1 link /test e2e-azure-operator
ci/prow/e2e-aws-scaleup-rhel7 75857f1 link /test e2e-aws-scaleup-rhel7

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 9a69f85 into openshift:master Jun 11, 2020
@alexander-demicev alexander-demicev deleted the clusterid branch June 11, 2020 10:42
Danil-Grigorev pushed a commit to Danil-Grigorev/cluster-api-provider-openstack that referenced this pull request Jul 14, 2020
- Add configv1 to scheme to automatically get clusterId from infrastructure resource introduced in PR: openshift/machine-api-operator#608
Danil-Grigorev pushed a commit to Danil-Grigorev/cluster-api-provider-baremetal that referenced this pull request Jul 14, 2020
- Add missing configv1.AddToScheme for getting ClusterId label from Infrastructure resource, added in: openshift/machine-api-operator#608
Danil-Grigorev pushed a commit to Danil-Grigorev/cluster-api-provider-baremetal that referenced this pull request Jul 14, 2020
Danil-Grigorev pushed a commit to Danil-Grigorev/cluster-api-provider-baremetal that referenced this pull request Jul 15, 2020
- Add missing configv1.AddToScheme for getting ClusterId label from Infrastructure resource, added in: openshift/machine-api-operator#608
Danil-Grigorev pushed a commit to Danil-Grigorev/cluster-api-provider-baremetal that referenced this pull request Jul 15, 2020
Danil-Grigorev pushed a commit to Danil-Grigorev/cluster-api-provider-baremetal that referenced this pull request Jul 15, 2020
enxebre added a commit to enxebre/machine-api-operator that referenced this pull request Jul 16, 2020
…nd machineSet via webhook

With openshift#608 we dropped the burden from the user to set the clusterID label on machines.
As elaborated in openshift#608 (comment) the motivation is that this is an implementation detail that users shouldn't care about.

However as the labels are used by machineSet to determine ownership, the change introduced above might result in edge scenarios where the machineSet and machine label has a different value. This would result in machines going orphan and the machineSet recreating new instances. Bad.

Therefore we choose now to remove the burden from users by enforcing the label value via webhhooks and keeping the old behaviour in the backend to avoid any chance of breaking existing environments where bad input might have been set as in https://bugzilla.redhat.com/show_bug.cgi?id=1857175.
enxebre added a commit to enxebre/machine-api-operator that referenced this pull request Jul 16, 2020
…nd machineSet via webhook

With openshift#608 we dropped the burden from the user to set the clusterID label on machines.
As elaborated in openshift#608 (comment) the motivation is that this is an implementation detail that users shouldn't care about.

However as the labels are used by machineSet to determine ownership, the change introduced above might result in edge scenarios where the machineSet and machine label has a different value. This would result in machines going orphan and the machineSet recreating new instances. Bad.

Therefore we choose now to remove the burden from users by enforcing the label value via webhhooks and keeping the old behaviour in the backend to avoid any chance of breaking existing environments where bad input might have been set as in https://bugzilla.redhat.com/show_bug.cgi?id=1857175.
enxebre added a commit to enxebre/machine-api-operator that referenced this pull request Jul 16, 2020
…nd machineSet via webhook

With openshift#608 we dropped the burden from the user to set the clusterID label on machines.
As elaborated in openshift#608 (comment) the motivation is that this is an implementation detail that users shouldn't care about.

However as the labels are used by machineSet to determine ownership, the change introduced above might result in edge scenarios where the machineSet and machine label has a different value. This would result in machines going orphan and the machineSet recreating new instances. Bad.

Therefore we choose now to remove the burden from users by enforcing the label value via webhhooks and keeping the old behaviour in the backend to avoid any chance of breaking existing environments where bad input might have been set as in https://bugzilla.redhat.com/show_bug.cgi?id=1857175.
enxebre added a commit to enxebre/machine-api-operator that referenced this pull request Jul 16, 2020
…nd machineSet via webhook

With openshift#608 we dropped the burden from the user to set the clusterID label on machines.
As elaborated in openshift#608 (comment) the motivation is that this is an implementation detail that users shouldn't care about.

However as the labels are used by machineSet to determine ownership, the change introduced above might result in edge scenarios where the machineSet and machine label has a different value. This would result in machines going orphan and the machineSet recreating new instances. Bad.

Therefore we choose now to remove the burden from users by enforcing the label value via webhhooks and keeping the old behaviour in the backend to avoid any chance of breaking existing environments where bad input might have been set as in https://bugzilla.redhat.com/show_bug.cgi?id=1857175.
enxebre added a commit to enxebre/machine-api-operator that referenced this pull request Jul 16, 2020
…nd machineSet via webhook

With openshift#608 we dropped the burden from the user to set the clusterID label on machines.
As elaborated in openshift#608 (comment) the motivation is that this is an implementation detail that users shouldn't care about.

However as the labels are used by machineSet to determine ownership, the change introduced above might result in edge scenarios where the machineSet and machine label has a different value. This would result in machines going orphan and the machineSet recreating new instances. Bad.

Therefore we choose now to remove the burden from users by enforcing the label value via webhhooks and keeping the old behaviour in the backend to avoid any chance of breaking existing environments where bad input might have been set as in https://bugzilla.redhat.com/show_bug.cgi?id=1857175.
enxebre added a commit to enxebre/machine-api-operator that referenced this pull request Jul 27, 2020
This tries to fix the following scenario:
We set ms.Spec.Selector.MatchLabels[MachineClusterIDLabel] if it's not present. It's not present and we set it to the correct value. If there happens to be a bad label in `ms.Spec.Template.Labels` this would result in a miss match.

Follow for
openshift#608,
openshift#644 and
openshift#653.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants