
Revision stays in ContainerMissing condition forever after a temporary failure of digest resolution #15466

@maschmid

/area reconciler

What version of Knative?

1.14

Expected Behavior

After a temporary error in digest resolution sets the ContainerHealthy condition to False with reason ContainerMissing, the ContainerHealthy condition should become True once digest resolution eventually succeeds.
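
To make the expected transition concrete, here is a minimal sketch in Go. It is not the Knative reconciler: `Condition` and `onDigestResolved` are hypothetical names invented for illustration, and the sketch only models the ContainerHealthy condition (recomputing Ready from its dependent conditions is out of scope).

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a Revision status
// condition. It is NOT the Knative API type.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// onDigestResolved models the expected behavior (an assumption about the
// desired outcome, not Knative's code): once digest resolution succeeds, a
// ContainerHealthy condition that is False because of ContainerMissing
// becomes True. All other conditions are returned unchanged.
func onDigestResolved(conds []Condition) []Condition {
	out := make([]Condition, len(conds))
	copy(out, conds)
	for i := range out {
		c := &out[i]
		if c.Type == "ContainerHealthy" && c.Status == "False" && c.Reason == "ContainerMissing" {
			// Clear the stale error left over from the earlier failure.
			c.Status, c.Reason, c.Message = "True", "", ""
		}
	}
	return out
}

func main() {
	before := []Condition{{
		Type:    "ContainerHealthy",
		Status:  "False",
		Reason:  "ContainerMissing",
		Message: "failed to resolve image to digest",
	}}
	after := onDigestResolved(before)
	fmt.Printf("%+v\n", after[0])
}
```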

Actual Behavior

After a temporary error in digest resolution, when the digest resolution is eventually successful, the Revision stays in this inconsistent broken state:

status:
  actualReplicas: 1
  conditions:
  - lastTransitionTime: "2024-08-12T22:30:16Z"
    severity: Info
    status: "True"
    type: Active
  - lastTransitionTime: "2024-08-12T22:28:04Z"
    message: 'Unable to fetch image "image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp":
      failed to resolve image to digest: GET https://image-registry.openshift-image-registry.svc:5000/openshift/token?scope=repository%3Aocf-qe-images%2Freceiverhttp%3Apull&service=:
      unexpected status code 401 Unauthorized'
    reason: ContainerMissing
    status: "False"
    type: ContainerHealthy
  - lastTransitionTime: "2024-08-12T22:28:04Z"
    message: 'Unable to fetch image "image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp":
      failed to resolve image to digest: GET https://image-registry.openshift-image-registry.svc:5000/openshift/token?scope=repository%3Aocf-qe-images%2Freceiverhttp%3Apull&service=:
      unexpected status code 401 Unauthorized'
    reason: ContainerMissing
    status: "False"
    type: Ready
  - lastTransitionTime: "2024-08-12T22:30:12Z"
    status: "True"
    type: ResourcesAvailable
  containerStatuses:
  - imageDigest: image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp@sha256:e915478407c5c882346c4fc72078007fd2511d9e1796345db1873facafddf836
    name: user-container
  desiredReplicas: 1
  observedGeneration: 1

Notice that containerStatuses shows the resolved image digest and the deployments are Ready (ResourcesAvailable is True), yet ContainerHealthy is still False with the original digest resolution error.
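
The inconsistency described above can be expressed as a small check. This is a hedged sketch only: the `Condition` and `ContainerStatus` types and the `staleContainerMissing` helper are hypothetical, simplified stand-ins for the fields in the status dump, not Knative API types.

```go
package main

import "fmt"

// Condition and ContainerStatus are simplified, hypothetical stand-ins for
// the fields shown in the status dump. They are NOT the Knative API types.
type Condition struct {
	Type   string
	Status string
	Reason string
}

type ContainerStatus struct {
	Name        string
	ImageDigest string
}

// staleContainerMissing reports whether ContainerHealthy is still False with
// reason ContainerMissing even though every container already has a resolved
// image digest, which is the inconsistent state reported in this issue.
func staleContainerMissing(conds []Condition, statuses []ContainerStatus) bool {
	if len(statuses) == 0 {
		return false // nothing resolved yet, so the condition is not stale
	}
	for _, s := range statuses {
		if s.ImageDigest == "" {
			return false // at least one digest is genuinely unresolved
		}
	}
	for _, c := range conds {
		if c.Type == "ContainerHealthy" && c.Status == "False" && c.Reason == "ContainerMissing" {
			return true
		}
	}
	return false
}

func main() {
	// Values copied from the status reported in this issue.
	conds := []Condition{
		{Type: "Active", Status: "True"},
		{Type: "ContainerHealthy", Status: "False", Reason: "ContainerMissing"},
		{Type: "Ready", Status: "False", Reason: "ContainerMissing"},
		{Type: "ResourcesAvailable", Status: "True"},
	}
	statuses := []ContainerStatus{{
		Name:        "user-container",
		ImageDigest: "image-registry.openshift-image-registry.svc:5000/ocf-qe-images/receiverhttp@sha256:e915478407c5c882346c4fc72078007fd2511d9e1796345db1873facafddf836",
	}}
	fmt.Println("stale ContainerMissing:", staleContainerMissing(conds, statuses))
}
```

A check of this shape could be run against Revision statuses collected during a long-running test to flag affected Revisions, since no deterministic reproducer is known.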

Steps to Reproduce the Problem

There is currently no reproducer; the problem was noticed during a long-running test.

Labels

kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.