/area autoscale
### What version of Knative?
1.10 (but exists in older versions as well)

### Expected Behavior
When an old revision is unreachable, no more pods should be created for it.

### Actual Behavior
New (and many) pods are created if the old revision has a large minScale.
### Steps to Reproduce the Problem
1. `kn quickstart kind` (the issue also happens outside of KinD)
2. `kn service create testsvc --image ghcr.io/knative/helloworld-go:latest --scale 9`
3. Start watching the pods with `kubectl` or a tool like `k9s`
4. `kn service update testsvc --scale 1`
After the second revision is ready, it starts to terminate the old pods, but also brings up many additional pods for the old revision. Depending on how long the pods are stuck in Terminating, you can end up with more than 60 pods for the old revision.
--
The problem is the IsReady function of the ServerlessService, which returns false whenever the generation in metadata differs from the observed generation in status.
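The generation-sensitive check can be sketched with minimal stand-in types (the real ServerlessService and its condition handling live in Knative's networking/apis packages; the names below are illustrative only):

```go
package main

import "fmt"

// Stand-in types for illustration; the real ServerlessService has many more fields.
type SKSStatus struct {
	ObservedGeneration int64
	ReadyCondition     bool // stand-in for the Ready condition being True
}

type ServerlessService struct {
	Generation int64 // metadata.generation
	Status     SKSStatus
}

// isReady mirrors the described behavior: it reports false whenever
// metadata.generation differs from status.observedGeneration, even if
// the Ready condition itself is still True.
func isReady(sks ServerlessService) bool {
	return sks.Generation == sks.Status.ObservedGeneration &&
		sks.Status.ReadyCondition
}

func main() {
	// A spec update bumps metadata.generation before the controller has
	// reconciled, so observedGeneration briefly lags behind.
	stale := ServerlessService{Generation: 2, Status: SKSStatus{ObservedGeneration: 1, ReadyCondition: true}}
	fmt.Println(isReady(stale)) // false: the generation mismatch alone flips readiness
}
```

This is the window in which a perfectly healthy SKS is reported as not ready.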
Because the SKS is temporarily not ready, the PodAutoscaler is set to SKSReady=Unknown here.
The PodAutoscaler then goes Ready=False because SKSReady is part of Ready here.
The revision inherits the PodAutoscaler status and sets Active=Unknown here.
This causes the PodAutoscaler's Reachability to be set to Unknown here.
Once the PodAutoscaler is no longer marked as unreachable, the scaling bounds are set back to the min value from the annotation here. For an unreachable revision, it would otherwise always use min = 0 so that scale-down can happen. But now the bound is raised to 9 again.
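The effect of the reachability flip on the lower scale bound can be sketched as follows. This is a hypothetical simplification of the bounds logic described above, not the real Knative API; the names are illustrative:

```go
package main

import "fmt"

// Illustrative stand-in for the PodAutoscaler's reachability states.
type Reachability int

const (
	ReachabilityUnknown Reachability = iota
	ReachabilityReachable
	ReachabilityUnreachable
)

// minScale returns the lower scale bound: an unreachable revision is always
// allowed to drop to 0, but both Reachable and Unknown fall back to the
// minScale annotation, which is how the stale revision jumps back to 9.
func minScale(reach Reachability, annotationMin int) int {
	if reach == ReachabilityUnreachable {
		return 0
	}
	return annotationMin
}

func main() {
	fmt.Println(minScale(ReachabilityUnreachable, 9)) // 0: old revision may scale down
	fmt.Println(minScale(ReachabilityUnknown, 9))     // 9: flapping to Unknown resurrects the pods
}
```

So a single transient Unknown is enough to re-apply the large minScale to a revision that was already being drained.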
I will open a PR in a few minutes that changes IsReady of the ServerlessService to only check the condition. This fixes the problem, and I have not seen any negative impact. However, it undoes the work of "IsReady should take ObservedGeneration into account" #8004, which sounds to me as if it resolved an abstract issue, whereas the one here is real. An alternative would be to leave IsReady unchanged and look at the condition directly in kpa.go.
So I need somebody with experience to carefully assess whether there are side effects.
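For comparison, the proposed condition-only variant can be sketched with the same kind of minimal stand-in types (illustrative names, not the real Knative API):

```go
package main

import "fmt"

// Stand-in types for illustration only.
type SKSStatus struct {
	ObservedGeneration int64
	ReadyCondition     bool // stand-in for the Ready condition being True
}

type ServerlessService struct {
	Generation int64 // metadata.generation
	Status     SKSStatus
}

// isReadyConditionOnly sketches the proposed change: readiness checks only the
// Ready condition and ignores whether observedGeneration has caught up with
// metadata.generation.
func isReadyConditionOnly(sks ServerlessService) bool {
	return sks.Status.ReadyCondition
}

func main() {
	// The same stale object that previously flipped to "not ready" now stays
	// ready, so the PodAutoscaler never passes through SKSReady=Unknown.
	stale := ServerlessService{Generation: 2, Status: SKSStatus{ObservedGeneration: 1, ReadyCondition: true}}
	fmt.Println(isReadyConditionOnly(stale)) // true
}
```

The trade-off to assess is exactly what #8004 was guarding against: whether any caller relies on IsReady implying that the status reflects the latest spec.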