operator controller-manager pod stuck in ImagePullBackOff #3691

@jmrodri

Bug Report

What did you do?
I tried to run my operator with operator-sdk run packagemanifests, but the command failed with a timeout.

$ operator-sdk run packagemanifests packagemanifests  --version 0.0.1 --namespace test-operator2
INFO[0000] Running operator from directory packagemanifests 
INFO[0000] Creating catalog source                      
INFO[0000]   Creating CatalogSource "test-operator2/test-operator-ocs" 
INFO[0000] Creating test-operator registry              
INFO[0000]   Creating ConfigMap "test-operator2/test-operator-registry-manifests-package" 
INFO[0000]   Creating ConfigMap "test-operator2/test-operator-registry-manifests-0-0-1" 
INFO[0000]   Creating Deployment "test-operator2/test-operator-registry-server" 
INFO[0000]   Creating Service "test-operator2/test-operator-registry-server" 
INFO[0000] Waiting for Deployment "test-operator2/test-operator-registry-server" rollout to complete 
INFO[0000] Waiting for Deployment "test-operator2/test-operator-registry-server" to rollout: waiting for deployment spec update to be observed 
INFO[0001]   Waiting for Deployment "test-operator2/test-operator-registry-server" to rollout: 0 of 1 updated replicas are available 
INFO[0002]   Deployment "test-operator2/test-operator-registry-server" successfully rolled out 
INFO[0002] Creating resources                           
INFO[0002]   Creating Subscription "test-operator2/test-operator-v0-0-1-sub" 
INFO[0002]   Creating OperatorGroup "test-operator2/operator-sdk-og" 
INFO[0002] Waiting for ClusterServiceVersion "test-operator2/test-operator.v0.0.1" to reach 'Succeeded' phase 
INFO[0002]   Waiting for ClusterServiceVersion "test-operator2/test-operator.v0.0.1" to appear 
INFO[0009]   Found ClusterServiceVersion "test-operator2/test-operator.v0.0.1" phase: Pending 
INFO[0012]   Found ClusterServiceVersion "test-operator2/test-operator.v0.0.1" phase: Installing 
FATA[0120] Failed to run operator: error waiting for CSV to install: timed out waiting for the condition 

Here is the state of the namespace:

NAME                                                    READY     STATUS             RESTARTS   AGE
pod/test-operator-controller-manager-6d7fdff74f-hzs7t   1/2       ImagePullBackOff   0          59m
pod/test-operator-registry-server-65c44c8f66-ff9jf      1/1       Running            0          93m

NAME                                    TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
service/test-operator-registry-server   ClusterIP   10.96.204.54   <none>        50051/TCP   93m

NAME                                               READY     UP-TO-DATE   AVAILABLE   AGE
deployment.apps/test-operator-controller-manager   0/1       1            0           93m
deployment.apps/test-operator-registry-server      1/1       1            1           93m

NAME                                                          DESIRED   CURRENT   READY     AGE
replicaset.apps/test-operator-controller-manager-6d7fdff74f   1         1         0         93m
replicaset.apps/test-operator-registry-server-65c44c8f66      1         1         1         93m
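
The listing above can be narrowed to the pods that are not fully ready. The following is a minimal sketch, not part of the original report: not_ready_pods is a hypothetical helper name, and it assumes the column layout printed by kubectl get pods or kubectl get all.

```shell
# Print NAME, READY and STATUS for every pod whose READY column (e.g. 1/2)
# shows fewer ready containers than expected.
# Reads `kubectl get pods` or `kubectl get all` style rows on stdin.
# Rows for other resource kinds (deployment.apps/..., service/...) and the
# header row are skipped.
not_ready_pods() {
  awk '($1 ~ /^pod\// || $1 !~ /\//) && $2 ~ /^[0-9]+\/[0-9]+$/ {
         split($2, r, "/")
         if (r[1] != r[2]) print $1, $2, $3
       }'
}

# Example usage against a live cluster (namespace taken from this report):
#   kubectl get pods -n test-operator2 | not_ready_pods
```

Once the pod is identified, kubectl describe pod <pod-name> -n test-operator2 shows its Events section, which normally names the image that failed to pull.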

Images in the kind cluster:

$ docker exec -it kind-control-plane crictl images
IMAGE                                                     TAG                 IMAGE ID            SIZE
quay.io/operator-framework/olm                            <none>              1cb5c6e6a8401       87.2MB
docker.io/kindest/kindnetd                                0.5.4               2186a1a396deb       113MB
docker.io/library/controller                              latest              7f692a61fe82c       47.8MB
docker.io/rancher/local-path-provisioner                  v0.0.12             db10073a6f829       42MB
gcr.io/kubebuilder/kube-rbac-proxy                        v0.5.0              7d94526bba66b       19.8MB
k8s.gcr.io/coredns                                        1.6.7               67da37a9a360e       43.9MB
k8s.gcr.io/debian-base                                    v2.0.0              9bd6154724425       53.9MB
k8s.gcr.io/etcd                                           3.4.3-0             303ce5db0e90d       290MB
k8s.gcr.io/kube-apiserver                                 v1.18.2             7df05884b1e25       147MB
k8s.gcr.io/kube-controller-manager                        v1.18.2             31fd71c85722f       133MB
k8s.gcr.io/kube-proxy                                     v1.18.2             312d3d1cb6c72       133MB
k8s.gcr.io/kube-scheduler                                 v1.18.2             121edc8356c58       113MB
k8s.gcr.io/pause                                          3.2                 80d28bedfe5de       686kB
quay.io/openshift/origin-operator-registry                latest              a294fb2571655       182MB
quay.io/operator-framework/upstream-community-operators   latest              12d79fdef2c5e       28.5MB
quay.io/redhat-developer/openshift-jenkins-operator       0.4.1-ed5de33       4e730a22c0ec1       185MB
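
The controller image appears in the list above as docker.io/library/controller:latest. Whether that is related to the ImagePullBackOff is not confirmed in this report, but one thing that may be worth ruling out is the default pull policy: when a container spec leaves imagePullPolicy unset, Kubernetes defaults it to Always for a latest or missing tag and to IfNotPresent otherwise, and Always makes the kubelet contact the registry even if the image is already on the node. The sketch below mirrors that rule; default_pull_policy is a hypothetical helper, not a kubectl or operator-sdk command.

```shell
# Report the pull policy Kubernetes would default to for an image reference
# when the container spec does not set imagePullPolicy explicitly.
# Assumption: the argument is a plain image reference such as
# "controller:latest" or "gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0".
default_pull_policy() {
  ref=$1
  # References pinned by digest default to IfNotPresent.
  case $ref in
    *@*) echo IfNotPresent; return ;;
  esac
  # Look for a tag only in the last path component, so a registry port
  # (localhost:5000/controller) is not mistaken for a tag.
  name=${ref##*/}
  case $name in
    *:*) tag=${name##*:} ;;
    *)   tag=latest ;;
  esac
  if [ "$tag" = latest ]; then
    echo Always
  else
    echo IfNotPresent
  fi
}

default_pull_policy controller:latest
default_pull_policy gcr.io/kubebuilder/kube-rbac-proxy:v0.5.0
```

If the policy does turn out to be Always for an image that exists only inside the kind node, setting imagePullPolicy explicitly or using a non-latest tag are the usual ways to avoid the registry lookup.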

What did you expect to see?
Either a working deployment or a helpful message about what to look for.

What did you see instead? Under which circumstances?
A fatal timeout error while waiting for the CSV to install:

FATA[0120] Failed to run operator: error waiting for CSV to install: timed out waiting for the condition 

Environment

  • operator-sdk version:
$ operator-sdk version
operator-sdk version: "v1.0.0-alpha.2-16-g8bca0209", commit: "8bca020931136d196ae5f4135c1d70738a501135", kubernetes version: "v1.18.2", go version: "go1.13.11 linux/amd64", GOOS: "linux", GOARCH: "amd64"
  • go version:
go version go1.13.11 linux/amd64
  • Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-30T20:19:45Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:
$ kind version
kind v0.8.1 go1.13.11 linux/amd64
  • Are you writing your operator in ansible, helm, or go?
    Go

Possible Solution

Additional context
The test operator project I'm using: https://github.com/jmrodri/test-operator
