From 8ac0149c4874f007e26371522823cc1cbb64299f Mon Sep 17 00:00:00 2001 From: abrennan Date: Wed, 22 May 2019 10:41:26 +0100 Subject: [PATCH 1/5] Added autoscaling and scale-to-zero docs --- docs/README.md | 2 +- docs/serving/README.md | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/README.md b/docs/README.md index 3e13deae0f1..82325bb891d 100644 --- a/docs/README.md +++ b/docs/README.md @@ -12,7 +12,7 @@ focus on solving mundane but difficult tasks such as: - [Deploying a container](./install/getting-started-knative-app.md) - [Orchestrating source-to-URL workflows on Kubernetes](./serving/samples/source-to-url-go/) - [Routing and managing traffic with blue/green deployment](./serving/samples/blue-green-deployment.md) -- [Scaling automatically and sizing workloads based on demand](./serving/samples/autoscale-go/) +- [Scaling automatically and sizing workloads based on demand](./serving/autoscaling.md) - [Binding running services to eventing ecosystems](./eventing/samples/kubernetes-event-source/) Developers on Knative can use familiar idioms, languages, and frameworks to diff --git a/docs/serving/README.md b/docs/serving/README.md index bafed9f9a8b..ea4fd750983 100644 --- a/docs/serving/README.md +++ b/docs/serving/README.md @@ -34,6 +34,7 @@ serverless workload behaves on the cluster: The `revision.serving.knative.dev` resource is a point-in-time snapshot of the code and configuration for each modification made to the workload. Revisions are immutable objects and can be retained for as long as useful. +Knative Serving Revisions can be automatically scaled up and down according to incoming traffic. See [Autoscaling](./autoscaling.md) for more information. 
![Diagram that displays how the Serving resources coordinate with each other.](https://github.com/knative/serving/raw/master/docs/spec/images/object_model.png) From 70770338e0b0ab6e4d6012ae4e9d215e4ddfc67b Mon Sep 17 00:00:00 2001 From: abrennan Date: Wed, 22 May 2019 10:42:06 +0100 Subject: [PATCH 2/5] Adding autoscaling and scale-to-zero docs --- docs/serving/autoscaling.md | 114 ++++++++++++++++++++++++++++++++++++ 1 file changed, 114 insertions(+) create mode 100644 docs/serving/autoscaling.md diff --git a/docs/serving/autoscaling.md b/docs/serving/autoscaling.md new file mode 100644 index 00000000000..2cdc60ea197 --- /dev/null +++ b/docs/serving/autoscaling.md @@ -0,0 +1,114 @@ + +# Autoscaling + +Since Knative v0.2, per revision autoscalers have been replaced by a single shared autoscaler. This is, by default, the Knative Pod Autoscaler (KPA), which provides fast, request based autoscaling capabilities out of the box. + +## Configuring Knative Pod Autoscaler + +To modify the autoscaler configuration, you must modify a Kubernetes ConfigMap called `config-autoscaler` in the `knative-serving` namespace. + +You can view the default contents of this ConfigMap using the following command. + +`kubectl -n knative-serving get cm config-autoscaler` + +### Example of default ConfigMap + +``` +apiVersion: v1 +kind: ConfigMap +metadata: + name: config-autoscaler + namespace: knative-serving +data: + container-concurrency-target-default: 100 + container-concurrency-target-percentage: 1.0 + enable-scale-to-zero: true + enable-vertical-pod-autoscaling: false + max-scale-up-rate: 10 + panic-window: 6s + scale-to-zero-grace-period: 30s + stable-window: 60s + tick-interval: 2s + ``` + +## Configuring scale to zero + +To correctly configure autoscaling to zero for revisions, you must modify the following parameters in the ConfigMap. 
+ +### scale-to-zero-grace-period + +`scale-to-zero-grace-period` specifies the time an inactive revision is left running before it is scaled to zero (min: 30s). + +``` +scale-to-zero-grace-period: 30s +``` + +### stable-window + +When operating in stable mode, the autoscaler operates on the average concurrency over the stable window. +``` +stable-window: 60s +``` + +`stable-window` can also be configured in the Revision template as an annotation. + +``` +autoscaling.knative.dev/window: 60s +``` + +### enable-scale-to-zero + +Ensure that `enable-scale-to-zero` is set to `true`. + +### Termination period + +The termination period is the time that the pod takes to shut down after the last request is finished. The termination period of the pod is equal to the sum of the values of the `stable-window` and `scale-to-zero-grace-period` parameters. In this example, the termination period is 90s (60s + 30s). + +## Configuring concurrency + +Concurrency for autoscaling can be configured using the following methods. + +### target + +`target` defines how many concurrent requests are desired at a given time (soft limit) and is the recommended configuration for autoscaling in Knative. + +The default value for the concurrency target is specified in the ConfigMap as `100`. +``` +container-concurrency-target-default: 100 +``` +This value can be configured by adding or modifying the `autoscaling.knative.dev/target` annotation value in the Revision template. + +``` +autoscaling.knative.dev/target: 50 +``` + +### containerConcurrency + +**NOTE:** `containerConcurrency` should only be used if there is a clear need to limit how many requests reach the app at a given time. Using `containerConcurrency` is only advised if the application needs to have an enforced constraint of concurrency. + +`containerConcurrency` limits the number of concurrent requests allowed into the application at a given time (hard limit), and is configured in the Revision template. 
+ +``` +containerConcurrency: 0 | 1 | 2-N +``` +- A `containerConcurrency` value of `1` will guarantee that only one request is handled at a time by a given instance of the Revision container. +- A value of `2` or more will limit request concurrency to that value. +- A value of `0` means the system should decide. + +If there is no `/target` annotation, the autoscaler is configured as if `/target` == `containerConcurrency`. + +## Configuring CPU based autoscaling + +**NOTE:** You can configure Knative autoscaling to work with either the default KPA or a CPU based metric, i.e. Horizontal Pod Autoscaler (HPA), however scale-to-zero capabilities are only supported for KPA. + +You can configure Knative to use CPU based autoscaling instead of the default request based metric by adding or modifying the `autoscaling.knative.dev/class` and `autoscaling.knative.dev/metric` values as annotations in the Revision template. + +``` +autoscaling.knative.dev/metric: cpu +autoscaling.knative.dev/class: hpa.autoscaling.knative.dev +``` + +## Additional resources + +- [Go autoscaling sample](https://knative.dev/docs/serving/samples/autoscale-go/index.html) +- Knative v0.3 Autoscaling  - A Love Story [blog post](https://medium.com/knative/knative-v0-3-autoscaling-a-love-story-d6954279a67a) From ba27766c7ac73247d2515793d14fb1da11f7c335 Mon Sep 17 00:00:00 2001 From: abrennan Date: Fri, 31 May 2019 11:27:23 +0100 Subject: [PATCH 3/5] Minor changes to address review comments --- docs/README.md | 2 +- docs/serving/README.md | 2 +- ...utoscaling.md => configuring-the-autoscaler.md} | 14 +++++++++----- 3 files changed, 11 insertions(+), 7 deletions(-) rename docs/serving/{autoscaling.md => configuring-the-autoscaler.md} (94%) diff --git a/docs/README.md b/docs/README.md index 82325bb891d..91b8c16679e 100644 --- a/docs/README.md +++ b/docs/README.md @@ -12,7 +12,7 @@ focus on solving mundane but difficult tasks such as: - [Deploying a container](./install/getting-started-knative-app.md) - 
[Orchestrating source-to-URL workflows on Kubernetes](./serving/samples/source-to-url-go/) - [Routing and managing traffic with blue/green deployment](./serving/samples/blue-green-deployment.md) -- [Scaling automatically and sizing workloads based on demand](./serving/autoscaling.md) +- [Scaling automatically and sizing workloads based on demand](./serving/configuring-the-autoscaler.md) - [Binding running services to eventing ecosystems](./eventing/samples/kubernetes-event-source/) Developers on Knative can use familiar idioms, languages, and frameworks to diff --git a/docs/serving/README.md b/docs/serving/README.md index ea4fd750983..20b0353ae58 100644 --- a/docs/serving/README.md +++ b/docs/serving/README.md @@ -34,7 +34,7 @@ serverless workload behaves on the cluster: The `revision.serving.knative.dev` resource is a point-in-time snapshot of the code and configuration for each modification made to the workload. Revisions are immutable objects and can be retained for as long as useful. -Knative Serving Revisions can be automatically scaled up and down according to incoming traffic. See [Autoscaling](./autoscaling.md) for more information. + Knative Serving Revisions can be automatically scaled up and down according to incoming traffic. See [Autoscaling](./autoscaling.md) for more information. 
![Diagram that displays how the Serving resources coordinate with each other.](https://github.com/knative/serving/raw/master/docs/spec/images/object_model.png) diff --git a/docs/serving/autoscaling.md b/docs/serving/configuring-the-autoscaler.md similarity index 94% rename from docs/serving/autoscaling.md rename to docs/serving/configuring-the-autoscaler.md index 2cdc60ea197..14bca3ac1bb 100644 --- a/docs/serving/autoscaling.md +++ b/docs/serving/configuring-the-autoscaler.md @@ -1,7 +1,11 @@ - -# Autoscaling +# Configuring the Autoscaler +--- +title: "Configuring the Autoscaler" +weight: 10 +type: "docs" +--- -Since Knative v0.2, per revision autoscalers have been replaced by a single shared autoscaler. This is, by default, the Knative Pod Autoscaler (KPA), which provides fast, request based autoscaling capabilities out of the box. +Since Knative v0.2, per revision autoscalers have been replaced by a single shared autoscaler. This is, by default, the Knative Pod Autoscaler (KPA), which provides fast, request-based autoscaling capabilities out of the box. ## Configuring Knative Pod Autoscaler @@ -97,7 +101,7 @@ containerConcurrency: 0 | 1 | 2-N If there is no `/target` annotation, the autoscaler is configured as if `/target` == `containerConcurrency`. -## Configuring CPU based autoscaling +## Configuring CPU-based autoscaling **NOTE:** You can configure Knative autoscaling to work with either the default KPA or a CPU based metric, i.e. Horizontal Pod Autoscaler (HPA), however scale-to-zero capabilities are only supported for KPA. 
@@ -111,4 +115,4 @@ autoscaling.knative.dev/class: hpa.autoscaling.knative.dev ## Additional resources - [Go autoscaling sample](https://knative.dev/docs/serving/samples/autoscale-go/index.html) -- Knative v0.3 Autoscaling  - A Love Story [blog post](https://medium.com/knative/knative-v0-3-autoscaling-a-love-story-d6954279a67a) +- [Knative v0.3 Autoscaling  - A Love Story blog post](https://medium.com/knative/knative-v0-3-autoscaling-a-love-story-d6954279a67a) From bda757246cfd55eb0cc86d178cbd884d92821b96 Mon Sep 17 00:00:00 2001 From: abrennan Date: Fri, 31 May 2019 11:30:32 +0100 Subject: [PATCH 4/5] updated link in README --- docs/serving/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/serving/README.md b/docs/serving/README.md index 20b0353ae58..f660a4751af 100644 --- a/docs/serving/README.md +++ b/docs/serving/README.md @@ -34,7 +34,7 @@ serverless workload behaves on the cluster: The `revision.serving.knative.dev` resource is a point-in-time snapshot of the code and configuration for each modification made to the workload. Revisions are immutable objects and can be retained for as long as useful. - Knative Serving Revisions can be automatically scaled up and down according to incoming traffic. See [Autoscaling](./autoscaling.md) for more information. + Knative Serving Revisions can be automatically scaled up and down according to incoming traffic. See [Configuring the Autoscaler](./configuring-the-autoscaler.md) for more information. 
![Diagram that displays how the Serving resources coordinate with each other.](https://github.com/knative/serving/raw/master/docs/spec/images/object_model.png) From 9ad92c863068b8d356221038eb39411266ced1d5 Mon Sep 17 00:00:00 2001 From: abrennan Date: Fri, 31 May 2019 22:16:45 +0100 Subject: [PATCH 5/5] removed unnecessary heading --- docs/serving/configuring-the-autoscaler.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/serving/configuring-the-autoscaler.md b/docs/serving/configuring-the-autoscaler.md index 14bca3ac1bb..e79d8440ecb 100644 --- a/docs/serving/configuring-the-autoscaler.md +++ b/docs/serving/configuring-the-autoscaler.md @@ -1,4 +1,3 @@ -# Configuring the Autoscaler --- title: "Configuring the Autoscaler" weight: 10