OCPEDGE-649: check if CloudCredentialManager present #4188
Add a check for whether CCMO is present in the cluster. If it is not, do not pass the --cloud-provider flag to the kubelet.
@qJkee: This pull request references OCPEDGE-649 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.
/cc @cdoern
cdoern
left a comment
looks good. I caught a few things looking over this briefly. I will give this another look next week. I agree with your overall choices and defaults in most of these scenarios!
github.com/stretchr/testify v1.8.4
github.com/vincent-petithory/dataurl v1.0.0
golang.org/x/net v0.18.0
golang.org/x/net v0.19.0
might have asked this, but any reason for these bumps?
I think this comes from openshift/api
replace k8s.io/kube-openapi => github.com/openshift/kube-openapi v0.0.0-20230816122517-ffc8f001abb0

replace github.com/openshift/api => github.com/qjkee/api v0.0.0-20240209124943-dfcecbd06e01
again, we would like the api change merged before we can merge this. cc @JoelSpeed
This is just a temporary replace. Once this PR is approved, I'll merge the api PR and bump it here.
external, err := isTopologyExternal(conf.manifestDir)
if err != nil {
	klog.Infoln("failed to check if topology is external", err)
} else if external {
makes sense. just hypershift, right?
Yes, this check is used to detect whether we are in a hypershift environment
	cfg.configClientSet = cl
}

cv, err := cfg.configClientSet.ConfigV1().ClusterVersions().Get(context.Background(), "version", metav1.GetOptions{})

	return false, fmt.Errorf("failed to read cluster config, %w", err)
}

obji, err := runtime.Decode(kscheme.Codecs.UniversalDecoder(corev1.SchemeGroupVersion), f)
interesting. can we not just unmarshal into a configmap and if we fail, error?
this approach is used in multiple places, so I decided to go with it as well
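For readers following this thread, the alternative raised in the question (unmarshal straight into a config map and return an error on failure) might look roughly like the sketch below. It is not the code from this PR: the `configMap` struct is a hypothetical stdlib-only stand-in for `corev1.ConfigMap`, the `installConfigFromConfigMap` name is invented for illustration, JSON input is assumed instead of the YAML manifests on disk, and the `install-config` data key is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// configMap is a hypothetical, minimal stand-in for corev1.ConfigMap.
type configMap struct {
	Kind string            `json:"kind"`
	Data map[string]string `json:"data"`
}

// installConfigFromConfigMap unmarshals raw JSON into a configMap and
// returns the value stored under the assumed "install-config" key.
func installConfigFromConfigMap(raw []byte) (string, error) {
	var cm configMap
	if err := json.Unmarshal(raw, &cm); err != nil {
		return "", fmt.Errorf("failed to unmarshal config map: %w", err)
	}
	// json.Unmarshal accepts any JSON object, so check the kind explicitly.
	if cm.Kind != "ConfigMap" {
		return "", fmt.Errorf("expected kind ConfigMap, got %q", cm.Kind)
	}
	ic, ok := cm.Data["install-config"]
	if !ok {
		return "", errors.New(`config map has no "install-config" key`)
	}
	return ic, nil
}

func main() {
	raw := []byte(`{"kind":"ConfigMap","data":{"install-config":"platform: {}"}}`)
	ic, err := installConfigFromConfigMap(raw)
	fmt.Println(ic, err)
}
```

Note that a plain unmarshal does not reject objects of another kind on its own, which is why the sketch checks `Kind` explicitly; a scheme-based decoder performs that type check as part of decoding.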
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cdoern, qJkee
/hold for QE testing
@qJkee: The following tests failed, say
JoelSpeed
left a comment
I'm not sure this PR is actually even required.
The --cloud-provider flag is currently set to the "" value for all cases where the capability can be disabled, which is the intention of this PR as well.
For platform None, Baremetal, and when the platform is External and CloudControllerManager.State is None, the library-go cloud provider functions already behave as intended.
Since these configurations are all currently immutable, I don't see a need to add this complexity within the MCO.
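The behaviour described in this comment can be sketched roughly as follows. This is a simplified illustration, not the library-go implementation: `PlatformType`, `CCMState`, their constants, and `cloudProviderFlag` are hypothetical stand-ins, and the default branch (every other platform gets `external`) is an assumption made only to keep the example complete.

```go
package main

import "fmt"

// PlatformType and CCMState are hypothetical stand-ins for the
// corresponding openshift/api types; they are not the real definitions.
type PlatformType string
type CCMState string

const (
	PlatformNone      PlatformType = "None"
	PlatformBareMetal PlatformType = "BareMetal"
	PlatformExternal  PlatformType = "External"
	PlatformAWS       PlatformType = "AWS"

	CCMStateNone     CCMState = "None"
	CCMStateExternal CCMState = "External"
)

// cloudProviderFlag returns the value for the kubelet --cloud-provider flag.
func cloudProviderFlag(p PlatformType, ccm CCMState) string {
	switch p {
	case PlatformNone, PlatformBareMetal:
		// No cloud controller manager on these platforms.
		return ""
	case PlatformExternal:
		// Only an explicitly external CCM state enables the flag.
		if ccm == CCMStateExternal {
			return "external"
		}
		return ""
	default:
		// Assumption: all remaining platforms use an external provider.
		return "external"
	}
}

func main() {
	fmt.Printf("%q\n", cloudProviderFlag(PlatformNone, CCMStateNone))
	fmt.Printf("%q\n", cloudProviderFlag(PlatformExternal, CCMStateNone))
	fmt.Printf("%q\n", cloudProviderFlag(PlatformExternal, CCMStateExternal))
}
```

Under this reading, every combination on which the capability can be disabled already yields an empty flag value, which is the point the comment makes.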
	return disabled, err
}

disabled, err = checkInstallConfigForCCMODisabled(conf)
The cluster version should be the source of truth, why is this required? If the clusterversion checks fail I would not expect a fallback
// checkInstallConfigForCCMODisabled reads install config from the filesystem
// and checks if CloudController capability disabled
func checkInstallConfigForCCMODisabled(cfg config) (bool, error) {
We should not check the install config: it's mutable, whereas the APIs themselves are not. Why are we including this fallback?
if err != nil {
	klog.Errorln("failed to check if cloud provider external", err)
} else if external {
	if cfg.CloudControllerDisabled && isCCMNone(cfg.Infra.Status.PlatformStatus) {
The latter half of this check is already done in library-go, it's impossible to reach this code on platforms where the cloud controller manager may be disabled.
Add a check for whether CCMO is present in the cluster. If it is not, do not pass the cloud-provider-related flag to the kubelet.
Here I try to read the install config from the cluster or from disk and detect the enabled capabilities.
If the CloudController capability is disabled (the cloud controller manager operator is not installed in the cluster), we return an empty value for the --cloud-provider kubelet flag. We pass external only if the cloud provider is actually external, to support manual installation of providers.
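One possible reading of this logic is sketched below. It is an illustration only, not the code in this PR: the capability name `CloudControllerManager`, the helper `hasCapability`, the function `kubeletCloudProvider`, and its parameters are assumptions, and the value used when the capability is enabled is passed in unchanged because the description does not alter that path.

```go
package main

import "fmt"

// ccmCapability is the capability name assumed for this sketch.
const ccmCapability = "CloudControllerManager"

// hasCapability reports whether name appears in the enabled capability list.
func hasCapability(enabled []string, name string) bool {
	for _, c := range enabled {
		if c == name {
			return true
		}
	}
	return false
}

// kubeletCloudProvider returns the value for the kubelet --cloud-provider flag.
// existing is whatever value would be used today when the capability is enabled;
// providerIsExternal marks a manually installed external cloud provider.
func kubeletCloudProvider(enabled []string, providerIsExternal bool, existing string) string {
	if hasCapability(enabled, ccmCapability) {
		// Capability enabled: keep the existing behaviour untouched.
		return existing
	}
	if providerIsExternal {
		// Capability disabled, but a provider was installed manually.
		return "external"
	}
	// Capability disabled and no external provider: pass an empty value.
	return ""
}

func main() {
	fmt.Printf("%q\n", kubeletCloudProvider([]string{"Console"}, false, "external"))
	fmt.Printf("%q\n", kubeletCloudProvider([]string{"Console"}, true, "external"))
	fmt.Printf("%q\n", kubeletCloudProvider([]string{ccmCapability}, false, "external"))
}
```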
This is a new PR in order to keep actual context. The old one is #3999