Last Pod Scaledown Timeout

Currently we have the `scale-to-zero-grace-period` flag, but its original purpose was to avoid scaling to 0 before the network is reprogrammed. Over time we have spent quite some effort to make sure this number can be reduced, or in some cases even made 0 (and it is 0 if the revision has already been backed by the activator for that many seconds, as was requested before).

Unfortunately, the documentation of the flag led people to believe that it is a guaranteed time to keep the last pod around after we would want to scale to 0 (i.e. after 0 requests over the `stable-window`).
One way to achieve that is to increase the stable window, but that makes the autoscaler sluggish.

I have heard from users that they do in fact want pods to hang around for longer periods of time after `stable-window` seconds of 0 requests (e.g. if pod startup is very expensive).

So the suggestion is to introduce a knob for that, something like `last-pod-retention-time`, which would keep the last pod around for at least that long while preserving the functionality that `scale-to-zero-grace-period` provides.

Then the scale to 0 time would be `max(inferred-scale-to-zero-grace, last-pod-retention-time)`.


## In what area(s)?
/area API
/area autoscale


/cc @ahmetb @mattmoor @markusthoemmes 

