Closed
6 changes: 3 additions & 3 deletions pkg/autoscaler/README.md
@@ -93,15 +93,15 @@ In Stable Mode the Autoscaler adjusts the size of the Deployment to achieve the

#### Panic Mode

-The Autoscaler evaluates its metrics every 2 seconds. In addition to the 60-second window, it also keeps a 6-second window (the panic window). If the 6-second average concurrency reaches 2 times the desired average, then the Autoscaler transitions into Panic Mode. In Panic Mode the Autoscaler bases all its decisions on the 6-second window, which makes it much more responsive to sudden increases in traffic. Every 2 seconds it adjusts the size of the Deployment to achieve the stable, desired average (or a maximum of 10 times the current observed Pod count, whichever is smaller). To prevent rapid fluctuations in the Pod count, the Autoscaler will only increase Deployment size during Panic Mode, never decrease. 60 seconds after the last Panic Mode increase to the Deployment size, the Autoscaler transistions back to Stable Mode and begins evaluating the 60-second windows again.
+The Autoscaler evaluates its metrics every 2 seconds. In addition to the 60-second window, it also keeps a 6-second window (the panic window). If the 6-second average concurrency reaches 2 times the desired average, then the Autoscaler transitions into Panic Mode. In Panic Mode the Autoscaler bases all its decisions on the 6-second window, which makes it much more responsive to sudden increases in traffic. Every 2 seconds it adjusts the size of the Deployment to achieve the stable, desired average (or a maximum of 10 times the current observed Pod count, whichever is smaller). To prevent rapid fluctuations in the Pod count, the Autoscaler will only increase Deployment size during Panic Mode, never decrease. 60 seconds after the last Panic Mode increase to the Deployment size, the Autoscaler transitions back to Stable Mode and begins evaluating the 60-second windows again.

#### Deactivation

-When the Autoscaler has observed an average concurrency per pod of 0.0 for some time ([#305](https://github.com/knative/serving/issues/305)), it will transistion the Revision into the Reserve state. This causes the Deployment and the Autoscaler to be turned down (or scaled to 0) and routes all traffic for the Revision to the Activator.
+When the Autoscaler has observed an average concurrency per pod of 0.0 for some time ([#305](https://github.com/knative/serving/issues/305)), it will transition the Revision into the Reserve state. This causes the Deployment and the Autoscaler to be turned down (or scaled to 0) and routes all traffic for the Revision to the Activator.

### Activator

-The Activator is a single multi-tenant component that catches traffic for all Reserve Revisions. It is responsible for activating the Revisions and then proxying the caught requests to the appropriate Pods. It woud be preferable to have a hook in Istio to do this so we can get rid of the Activator (see [Design Goal #3](#design-goals)). When the Activator gets a request for a Reserve Revision, it calls the Knative Serving control plane to transistion the Revision to an Active state. It will take a few seconds for all the resources to be provisioned, so more requests might arrive at the Activator in the meantime. The Activator establishes a watch for Pods belonging to the target Revision. Once the first Pod comes up, all enqueued requests are proxied to that Pod. Concurrently, the Knative Serving control plane will update the Istio route rules to take the Activator back out of the serving path.
+The Activator is a single multi-tenant component that catches traffic for all Reserve Revisions. It is responsible for activating the Revisions and then proxying the caught requests to the appropriate Pods. It would be preferable to have a hook in Istio to do this so we can get rid of the Activator (see [Design Goal #3](#design-goals)). When the Activator gets a request for a Reserve Revision, it calls the Knative Serving control plane to transition the Revision to an Active state. It will take a few seconds for all the resources to be provisioned, so more requests might arrive at the Activator in the meantime. The Activator establishes a watch for Pods belonging to the target Revision. Once the first Pod comes up, all enqueued requests are proxied to that Pod. Concurrently, the Knative Serving control plane will update the Istio route rules to take the Activator back out of the serving path.
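
The Activator's hold-then-proxy behaviour described above can be sketched in Go as a small in-memory model. This is only an illustration under stated assumptions: the `activator` type, `newActivator`, `handle`, and `podReady` are hypothetical names, the control-plane call and the Pod watch are replaced by plain bookkeeping, and "proxying" is represented by a string pairing a request ID with a Pod address.

```go
package main

import "fmt"

// activator is a hypothetical in-memory stand-in for the Activator.
type activator struct {
	pending       map[string][]string // Revision name -> IDs of requests held for it
	activateCalls []string            // Revisions for which activation was requested
}

func newActivator() *activator {
	return &activator{pending: make(map[string][]string)}
}

// handle holds a request for a Reserve Revision. Only the first request for a
// Revision triggers activation; later ones just join the queue.
func (a *activator) handle(revision, requestID string) {
	if _, waiting := a.pending[revision]; !waiting {
		// Placeholder for asking the control plane to make the Revision Active.
		a.activateCalls = append(a.activateCalls, revision)
	}
	a.pending[revision] = append(a.pending[revision], requestID)
}

// podReady models the watch reporting the first Pod of a Revision. Every held
// request is sent to that Pod, in arrival order, and the queue is cleared.
func (a *activator) podReady(revision, podAddr string) []string {
	proxied := make([]string, 0, len(a.pending[revision]))
	for _, id := range a.pending[revision] {
		proxied = append(proxied, id+" -> "+podAddr)
	}
	delete(a.pending, revision)
	return proxied
}

func main() {
	a := newActivator()
	a.handle("rev-1", "req-a")
	a.handle("rev-1", "req-b") // arrives while resources are still being provisioned
	fmt.Println(a.activateCalls)
	fmt.Println(a.podReady("rev-1", "10.0.0.7:8080"))
}
```

A real implementation would hold open HTTP connections and handle concurrency; the sketch only shows the ordering of activation, queueing, and release.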

## Slow Brain Implementation
