Skip to content

OCPEDGE-649: check if CloudCredentialManager present#4188

Closed
qJkee wants to merge 5 commits into
openshift:masterfrom
qJkee:OCPVE-628
Closed

OCPEDGE-649: check if CloudCredentialManager present#4188
qJkee wants to merge 5 commits into
openshift:masterfrom
qJkee:OCPVE-628

Conversation

@qJkee
Copy link
Copy Markdown
Contributor

@qJkee qJkee commented Feb 14, 2024

Add check if CCMO is present within the cluster. If not, then do not pass the flag to kubelet related to the cloud providers

Here I try to read the install config from the cluster or from the disc and detect enabled capabilities.
If CloudController capability is disabled(cloud controller manager operator is not installed to the cluster) we return an empty value passed to the --cloud-provider kubelet flag. We pass external only in case if cloud provider is actually external to support manual installation of providers

This is a new PR in order to keep actual context. The old one is #3999

qJkee added 4 commits February 9, 2024 15:59
Add check if CCMO is present within the cluster.
If not, then do not pass the flag --cloud-provider
to the kubelet
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 14, 2024
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Feb 14, 2024

@qJkee: This pull request references OCPEDGE-649 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

Details

In response to this:

Add check if CCMO is present within the cluster. If not, then do not pass the flag to kubelet related to the cloud providers

Here I try to read the install config from the cluster or from the disc and detect enabled capabilities.
If CloudController capability is disabled(cloud controller manager operator is not installed to the cluster) we return an empty value passed to the --cloud-provider kubelet flag. We pass external only in case if cloud provider is actually external to support manual installation of providers

This is a new PR in order to keep actual context. The old one is #3999

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@qJkee
Copy link
Copy Markdown
Contributor Author

qJkee commented Feb 14, 2024

/cc @cdoern

@openshift-ci openshift-ci Bot requested review from cdoern, djoshy and mtrmac February 14, 2024 17:36
Copy link
Copy Markdown
Contributor

@cdoern cdoern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. I caught a few things looking over this briefly. I will give this another look next week. I agree with your overall choices and defaults in most of these scenarios!

Comment thread go.mod
github.com/stretchr/testify v1.8.4
github.com/vincent-petithory/dataurl v1.0.0
golang.org/x/net v0.18.0
golang.org/x/net v0.19.0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might have asked this, but any reason for these bumps?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comes from openshift/api

Comment thread go.mod

replace k8s.io/kube-openapi => github.com/openshift/kube-openapi v0.0.0-20230816122517-ffc8f001abb0

replace github.com/openshift/api => github.com/qjkee/api v0.0.0-20240209124943-dfcecbd06e01
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, we would like this merged before we can merge this cc @JoelSpeed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a temp replace. Once this PR is approved I'll merge api PR and bump it there

external, err := isTopologyExternal(conf.manifestDir)
if err != nil {
klog.Infoln("failed to check if topology is external", err)
} else if external {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense. just hypershift, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this check is used to detect if we are in hypershift environment

cfg.configClientSet = cl
}

cv, err := cfg.configClientSet.ConfigV1().ClusterVersions().Get(context.Background(), "version", metav1.GetOptions{})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use a lister here?

return false, fmt.Errorf("failed to read cluster config, %w", err)
}

obji, err := runtime.Decode(kscheme.Codecs.UniversalDecoder(corev1.SchemeGroupVersion), f)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting. can we not just unmarshal into a configmap and if we fail, error?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this approach used in multiple places, so I decided to with it as well

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Feb 16, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cdoern, qJkee

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 16, 2024
@qJkee
Copy link
Copy Markdown
Contributor Author

qJkee commented Feb 17, 2024

/hold for QE testing

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 17, 2024
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Feb 27, 2024

@qJkee: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-gcp-op dbb040e link true /test e2e-gcp-op
ci/prow/okd-scos-e2e-aws-ovn dbb040e link false /test okd-scos-e2e-aws-ovn
ci/prow/e2e-azure-ovn-upgrade-out-of-change dbb040e link false /test e2e-azure-ovn-upgrade-out-of-change

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Copy link
Copy Markdown
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this PR is actually even required.

The --cloud-provider flag is currently set to the "" value for all cases where the capability can be disabled, which is the intention of this PR as well.

For platform None, Baremetal, and when the platofrm is External and CloudControllerManager.State is None, the library go cloud provider functions already behave as intended.

Since these configurations are all currently immutable, I don't see a need to add this complexity within the MCO.

return disabled, err
}

disabled, err = checkInstallConfigForCCMODisabled(conf)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cluster version should be the source of truth, why is this required? If the clusterversion checks fail I would not expect a fallback


// checkInstallConfigForCCMODisabled reads install config from the filesystem
// and checks if CloudController capability disabled
func checkInstallConfigForCCMODisabled(cfg config) (bool, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not check install config, it's mutable, where the APIs themselves are not, why are we including this fallback?

if err != nil {
klog.Errorln("failed to check if cloud provider external", err)
} else if external {
if cfg.CloudControllerDisabled && isCCMNone(cfg.Infra.Status.PlatformStatus) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latter half of this check is already done in library-go, it's impossible to reach this code on platforms where the cloud controller manager may be disabled.

@qJkee qJkee closed this Mar 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants