Problem installing openshift 4.12 SNO cluster using console.redhat.com #6904

@alanconway

Version

Not using openshift-install, using the console.redhat.com assisted installer web UI. My draft cluster is
https://console.redhat.com/openshift/details/s/2MIQiz1aQGpK27xsS2xLSrLhZu9#overview
I can leave it in place if that helps.

Platform:

Baremetal

What happened?

  • Followed UI instructions to create a bare-metal SNO cluster, no extra operators.
  • Host booted and joined successfully; it appears on the "Host Discovery" list but with "Disconnected" status.
  • Cannot log in to cluster:
  • "The connection to the server oauth-openshift.apps.snoflake.my.test was refused - did you specify the right host or port?"
  • Host journald shows symptoms similar to "Openshift 4.4 Install: Workers not seen by cluster" (#3711). Some repeated log lines are replaced with ... for easier reading:
-- Logs begin at Wed 2023-01-11 15:08:41 EST. --
Feb 27 08:44:35 oscar7 hyperkube[4304]: E0227 08:44:35.706202    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:35 oscar7 hyperkube[4304]: E0227 08:44:35.807343    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
...
Feb 27 08:44:36 oscar7 hyperkube[4304]: I0227 08:44:36.166196    4304 csi_plugin.go:1063] Failed to contact API server when waiting for CSINode publishing: csinodes.storage.k8s.io "oscar7" is forbidden: User "system:anonymous" cannot get resource "csinodes" in API group "storage.k8s.io" at the cluster scope
Feb 27 08:44:36 oscar7 hyperkube[4304]: E0227 08:44:36.210364    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:36 oscar7 sudo[4863]:     core : TTY=pts/0 ; PWD=/var/home/core ; USER=root ; COMMAND=/bin/journalctl -f
Feb 27 08:44:36 oscar7 sudo[4863]: pam_systemd(sudo:session): Cannot create session: Already running in a session or user slice
Feb 27 08:44:36 oscar7 sudo[4863]: pam_unix(sudo:session): session opened for user root by core(uid=0)
Feb 27 08:44:36 oscar7 hyperkube[4304]: E0227 08:44:36.310944    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:36 oscar7 hyperkube[4304]: E0227 08:44:36.411787    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
...
Feb 27 08:44:37 oscar7 hyperkube[4304]: I0227 08:44:37.165752    4304 csi_plugin.go:1063] Failed to contact API server when waiting for CSINode publishing: csinodes.storage.k8s.io "oscar7" is forbidden: User "system:anonymous" cannot get resource "csinodes" in API group "storage.k8s.io" at the cluster scope
Feb 27 08:44:37 oscar7 hyperkube[4304]: E0227 08:44:37.218078    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:37 oscar7 hyperkube[4304]: E0227 08:44:37.319020    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
...
Feb 27 08:44:38 oscar7 hyperkube[4304]: I0227 08:44:38.164980    4304 csi_plugin.go:1063] Failed to contact API server when waiting for CSINode publishing: csinodes.storage.k8s.io "oscar7" is forbidden: User "system:anonymous" cannot get resource "csinodes" in API group "storage.k8s.io" at the cluster scope
Feb 27 08:44:38 oscar7 hyperkube[4304]: E0227 08:44:38.224918    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:38 oscar7 hyperkube[4304]: E0227 08:44:38.325324    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
...
Feb 27 08:44:38 oscar7 hyperkube[4304]: E0227 08:44:38.649325    4304 controller.go:144] failed to ensure lease exists, will retry in 7s, error: leases.coordination.k8s.io "oscar7" is forbidden: User "system:anonymous" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease"
...
Feb 27 08:44:39 oscar7 hyperkube[4304]: E0227 08:44:39.130463    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:39 oscar7 hyperkube[4304]: I0227 08:44:39.163573    4304 csi_plugin.go:1063] Failed to contact API server when waiting for CSINode publishing: csinodes.storage.k8s.io "oscar7" is forbidden: User "system:anonymous" cannot get resource "csinodes" in API group "storage.k8s.io" at the cluster scope
Feb 27 08:44:39 oscar7 hyperkube[4304]: E0227 08:44:39.231024    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:39 oscar7 hyperkube[4304]: I0227 08:44:39.295808    4304 kubelet_node_status.go:376] "Setting node annotation to enable volume controller attach/detach"
Feb 27 08:44:39 oscar7 hyperkube[4304]: I0227 08:44:39.296763    4304 kubelet_node_status.go:590] "Recording event message for node" node="oscar7" event="NodeHasSufficientMemory"
Feb 27 08:44:39 oscar7 hyperkube[4304]: I0227 08:44:39.296784    4304 kubelet_node_status.go:590] "Recording event message for node" node="oscar7" event="NodeHasNoDiskPressure"
Feb 27 08:44:39 oscar7 hyperkube[4304]: I0227 08:44:39.296792    4304 kubelet_node_status.go:590] "Recording event message for node" node="oscar7" event="NodeHasSufficientPID"
Feb 27 08:44:39 oscar7 hyperkube[4304]: I0227 08:44:39.296807    4304 kubelet_node_status.go:72] "Attempting to register node" node="oscar7"
Feb 27 08:44:39 oscar7 hyperkube[4304]: E0227 08:44:39.298969    4304 kubelet_node_status.go:94] "Unable to register node with API server" err="nodes is forbidden: User \"system:anonymous\" cannot create resource \"nodes\" in API group \"\" at the cluster scope" node="oscar7"
Feb 27 08:44:39 oscar7 hyperkube[4304]: E0227 08:44:39.331677    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
...
Feb 27 08:44:40 oscar7 hyperkube[4304]: I0227 08:44:40.164975    4304 csi_plugin.go:1063] Failed to contact API server when waiting for CSINode publishing: csinodes.storage.k8s.io "oscar7" is forbidden: User "system:anonymous" cannot get resource "csinodes" in API group "storage.k8s.io" at the cluster scope
Feb 27 08:44:40 oscar7 hyperkube[4304]: E0227 08:44:40.239502    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:40 oscar7 hyperkube[4304]: E0227 08:44:40.257779    4304 transport.go:112] "No valid client certificate is found but the server is not responsive. A restart may be necessary to retrieve new initial credentials." lastCertificateAvailabilityTime="2023-02-27 08:06:30.110631999 -0500 EST m=+0.065007060" shutdownThreshold="5m0s"
Feb 27 08:44:40 oscar7 hyperkube[4304]: E0227 08:44:40.340120    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:40 oscar7 hyperkube[4304]: E0227 08:44:40.441086    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:40 oscar7 hyperkube[4304]: E0227 08:44:40.541707    4304 kubelet.go:2447] "Error getting node" err="node \"oscar7\" not found"
Feb 27 08:44:40 oscar7 hyperkube[4304]: E0227 08:44:40.554521    4304 eviction_manager.go:254] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"oscar7\" not found"
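
The repetition in the excerpt above can be tallied with a short script, which makes it easier to see that every API call is rejected as "system:anonymous". This is a minimal sketch and not part of the original report: the function names `summarize` and `anonymous_denials` are hypothetical, and the regular expression assumes the klog line layout shown in the excerpt.

```python
import re
from collections import Counter

# Matches the klog part of a journald line, for example:
#   E0227 08:44:35.706202    4304 kubelet.go:2447] "Error getting node" ...
# Groups: severity letter, source file, message text.
KLOG = re.compile(r'\b([IWEF])\d{4} [\d:.]+\s+\d+ ([\w.]+):\d+\] (.*)$')


def summarize(lines):
    """Count kubelet log lines by (severity, source file, message head)."""
    counts = Counter()
    for line in lines:
        match = KLOG.search(line)
        if not match:
            continue  # sudo/pam lines, "..." markers, and similar
        severity, source, message = match.groups()
        # Keep only the first 60 characters so repeated messages group together.
        counts[(severity, source, message[:60])] += 1
    return counts


def anonymous_denials(lines):
    """Return lines where the API server rejected the caller as system:anonymous."""
    return [l for l in lines if "system:anonymous" in l and "forbidden" in l]


if __name__ == "__main__":
    import sys

    # Usage (assumed): sudo journalctl -u kubelet | python3 this_script.py
    log_lines = [l.rstrip("\n") for l in sys.stdin]
    for key, n in summarize(log_lines).most_common(10):
        print(n, *key)
    print("anonymous denials:", len(anonymous_denials(log_lines)))
```

Together with the transport.go message "No valid client certificate is found", a high count of anonymous denials suggests the kubelet is reaching the API server without a client certificate, rather than failing to reach it at all.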

Restarted the host several times: no change in symptoms and no change in "Disconnected" status on console.redhat.com.

What you expected to happen?

Cluster starts up.

How to reproduce it (as minimally and precisely as possible)?

Create an SNO bare-metal cluster on console.redhat.com.

I don't know whether something in my setup is involved. I've done this successfully with OpenShift 4.10 and 4.11 on the same host I'm using now, and I'm not aware of anything different in my setup.

Anything else we need to know?

There is a workaround on #3711: remove the bootstrap host and resolve outstanding CSRs. I can't apply that because:

  • It's an SNO cluster so the bootstrap, master and worker host are the same host.
  • I can't log into the cluster to check CSRs.
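
For reference, the CSR half of that workaround might look like the sketch below on a cluster whose API is reachable, which is not the case in this report. It assumes the `oc` client is installed and that a working admin kubeconfig exists at a path you supply; the function names `pending_csr_names` and `approve_pending` are hypothetical. It treats a CSR as pending when its status has no conditions.

```python
import json
import subprocess


def pending_csr_names(csr_list):
    """Names of CSRs with no status conditions, i.e. neither approved nor denied."""
    return [
        item["metadata"]["name"]
        for item in csr_list.get("items", [])
        if not item.get("status", {}).get("conditions")
    ]


def approve_pending(kubeconfig):
    """List CSRs with `oc`, approve the pending ones, and return their names.

    Assumption: `kubeconfig` points at credentials the API server accepts.
    """
    base = ["oc", "--kubeconfig", kubeconfig]
    out = subprocess.run(
        base + ["get", "csr", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    names = pending_csr_names(json.loads(out))
    if names:
        subprocess.run(base + ["adm", "certificate", "approve", *names], check=True)
    return names
```

Approving every pending CSR without inspecting it is only reasonable on a private test cluster; elsewhere, review each request before approving.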

References

#3711
