Last Pod Scaledown Timeout #7596

@vagababov

Description

Currently we have the scale-to-zero-grace-period flag, but its original purpose was to avoid scaling to 0 before the network is reprogrammed. Over time we have spent quite some effort making sure this number can be reduced, or even made 0 in some cases (and it is, if the revision has been backed by the activator for that many seconds, as requested before).

Unfortunately, the documentation of the flag led people to believe that it is a guaranteed time the last pod hangs around after we'd want to scale to 0 (i.e. 0 requests over the stable-window).
One way to achieve that is to increase the stable window, but that leads to sluggish autoscaler performance.

I have heard from users that they in fact want pods to hang around for longer periods of time after stable-window seconds of 0 requests (e.g. if your pod startup is very expensive).

So the suggestion is to introduce a knob for that, something like last-pod-retention-time, which would do exactly that while preserving the functionality that scale-to-zero-grace-period provides.

Then the scale to 0 time would be max(inferred-scale-to-zero-grace, last-pod-retention-time).

In what area(s)?

/area API
/area autoscale

/cc @ahmetb @mattmoor @markusthoemmes

Metadata

Labels

area/API (API objects and controllers)
area/autoscale
kind/feature (Well-understood/specified features, ready for coding.)
