Dynamic auto scale Kinesis-Stream ingest tasks #10985
@@ -141,6 +141,116 @@ A sample supervisor spec is shown below:
|`awsAssumedRoleArn`|String|The AWS assumed role to use for additional permissions.|no|
|`awsExternalId`|String|The AWS external id to use for additional permissions.|no|
|`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. See below for details.|no|
|`autoScalerConfig`|Object|Defines auto scaling behavior for Kinesis ingest tasks. See [Task Autoscaler Properties](#task-autoscaler-properties).|no (default == null)|

### Task Autoscaler Properties

> Note that Task AutoScaler is currently designated as experimental.

| Property | Description | Required |
| ------------- | ------------- | ------------- |
| `enableTaskAutoScaler` | Whether to enable the autoscaler. When false or absent, Druid disables the autoscaler even if `autoScalerConfig` is not null. | no (default == false) |
| `taskCountMax` | Maximum number of Kinesis ingestion tasks. Must be greater than or equal to `taskCountMin`. If greater than `{numKinesisShards}`, the maximum number of reading tasks is `{numKinesisShards}` and `taskCountMax` is ignored. | yes |
| `taskCountMin` | Minimum number of Kinesis ingestion tasks. When the autoscaler is enabled, Druid ignores the value of `taskCount` in `IOConfig` and uses `taskCountMin` as the initial number of tasks to launch. | yes |
| `minTriggerScaleActionFrequencyMillis` | Minimum time interval, in milliseconds, between two scale actions. | no (default == 600000) |
| `autoScalerStrategy` | The autoscaling algorithm. Only `lagBased` is currently supported. See [Lag Based AutoScaler Strategy Related Properties](#lag-based-autoscaler-strategy-related-properties) for details. | no (default == `lagBased`) |
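As a rough illustration of the `taskCountMin`/`taskCountMax` rules above, the clamping behavior can be sketched as follows. This is a hypothetical helper written for this doc, not Druid's implementation; the function and argument names are invented:

```python
# Hypothetical sketch of how taskCountMin/taskCountMax bound the number of
# ingestion tasks, per the rules in the table above. Not Druid's code.

def clamp_task_count(desired: int, task_count_min: int, task_count_max: int,
                     num_kinesis_shards: int) -> int:
    """Clamp a desired task count to the configured and physical limits."""
    if task_count_max < task_count_min:
        raise ValueError("taskCountMax must be >= taskCountMin")
    # One reading task per shard is the most that can do useful work,
    # so the shard count caps taskCountMax.
    effective_max = min(task_count_max, num_kinesis_shards)
    return max(task_count_min, min(desired, effective_max))

print(clamp_task_count(desired=8, task_count_min=2, task_count_max=6,
                       num_kinesis_shards=4))  # -> 4 (capped by shard count)
```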

### Lag Based AutoScaler Strategy Related Properties

The Kinesis indexing service reports lag as milliseconds of time behind the head of the stream, rather than as a message count, which is what the Kafka indexing service uses.

| Property | Description | Required |
| ------------- | ------------- | ------------- |
| `lagCollectionIntervalMillis` | The interval, in milliseconds, at which lag points are collected. | no (default == 30000) |
| `lagCollectionRangeMillis` | The total time window of lag collection. Used together with `lagCollectionIntervalMillis`: over the most recent `lagCollectionRangeMillis`, a lag point is collected every `lagCollectionIntervalMillis`. | no (default == 600000) |
| `scaleOutThreshold` | The lag threshold, in milliseconds, that counts toward a scale out action. See `triggerScaleOutFractionThreshold`. | no (default == 6000000) |
| `triggerScaleOutFractionThreshold` | If the fraction of collected lag points that exceed `scaleOutThreshold` is greater than `triggerScaleOutFractionThreshold`, the autoscaler triggers a scale out action. | no (default == 0.3) |
| `scaleInThreshold` | The lag threshold, in milliseconds, that counts toward a scale in action. See `triggerScaleInFractionThreshold`. | no (default == 1000000) |
| `triggerScaleInFractionThreshold` | If the fraction of collected lag points that fall below `scaleInThreshold` is greater than `triggerScaleInFractionThreshold`, the autoscaler triggers a scale in action. | no (default == 0.9) |
| `scaleActionStartDelayMillis` | Number of milliseconds to delay after the supervisor starts before the first scale logic check. | no (default == 300000) |
| `scaleActionPeriodMillis` | Frequency, in milliseconds, at which to check whether a scale action is triggered. | no (default == 60000) |
| `scaleInStep` | Number of tasks to remove at a time when scaling in. | no (default == 1) |
| `scaleOutStep` | Number of tasks to add at a time when scaling out. | no (default == 2) |
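The threshold and fraction properties combine into a simple decision rule over the collected lag points (each point being one lag measurement in milliseconds). The sketch below is a hypothetical illustration of that rule, with parameter names mirroring the config properties; it is not Druid's implementation:

```python
# Hypothetical sketch of the lagBased decision rule described above.
# Names mirror the config properties; this is not Druid's code.

def scale_action(lag_points_ms, scale_out_threshold, trigger_out_fraction,
                 scale_in_threshold, trigger_in_fraction):
    """Return 'out', 'in', or None for one window of collected lag points."""
    n = len(lag_points_ms)
    if n == 0:
        return None
    beyond = sum(1 for lag in lag_points_ms if lag > scale_out_threshold) / n
    below = sum(1 for lag in lag_points_ms if lag < scale_in_threshold) / n
    if beyond >= trigger_out_fraction:
        return "out"  # add scaleOutStep tasks, bounded by taskCountMax
    if below >= trigger_in_fraction:
        return "in"   # remove scaleInStep tasks, bounded by taskCountMin
    return None

# 4 of 10 points (40%) exceed the scale-out threshold, over the 0.3 trigger:
lags = [7_000_000] * 4 + [100_000] * 6
print(scale_action(lags, 6_000_000, 0.3, 1_000_000, 0.9))  # -> out
```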

The following example demonstrates a supervisor spec with `lagBased` autoScaler enabled:

```json
{
  "type": "kinesis",
  "dataSchema": {
    "dataSource": "metrics-kinesis",
    "timestampSpec": {
      "column": "timestamp",
      "format": "auto"
    },
    "dimensionsSpec": {
      "dimensions": [],
      "dimensionExclusions": [
        "timestamp",
        "value"
      ]
    },
    "metricsSpec": [
      {
        "name": "count",
        "type": "count"
      },
      {
        "name": "value_sum",
        "fieldName": "value",
        "type": "doubleSum"
      },
      {
        "name": "value_min",
        "fieldName": "value",
        "type": "doubleMin"
      },
      {
        "name": "value_max",
        "fieldName": "value",
        "type": "doubleMax"
      }
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "NONE"
    }
  },
  "ioConfig": {
    "stream": "metrics",
    "autoScalerConfig": {
      "enableTaskAutoScaler": true,
      "taskCountMax": 6,
      "taskCountMin": 2,
      "minTriggerScaleActionFrequencyMillis": 600000,
      "autoScalerStrategy": "lagBased",
      "lagCollectionIntervalMillis": 30000,
      "lagCollectionRangeMillis": 600000,
      "scaleOutThreshold": 600000,
      "triggerScaleOutFractionThreshold": 0.3,
      "scaleInThreshold": 100000,
      "triggerScaleInFractionThreshold": 0.9,
      "scaleActionStartDelayMillis": 300000,
      "scaleActionPeriodMillis": 60000,
      "scaleInStep": 1,
      "scaleOutStep": 2
    },
    "inputFormat": {
      "type": "json"
    },
    "endpoint": "kinesis.us-east-1.amazonaws.com",
    "taskCount": 1,
    "replicas": 1,
    "taskDuration": "PT1H",
    "recordsPerFetch": 2000,
    "fetchDelayMillis": 1000
  },
  "tuningConfig": {
    "type": "kinesis",
    "maxRowsPerSegment": 5000000
  }
}
```
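Before submitting a spec like the one above, a small sanity check over the autoscaler fields can catch obvious mistakes early. The helper below is hypothetical, not part of Druid, and the `scaleInThreshold <= scaleOutThreshold` rule is an assumed sanity check rather than a documented Druid constraint:

```python
import json

# Hypothetical pre-submission check for an autoScalerConfig fragment,
# mirroring the constraints documented above. Not part of Druid.

def validate_autoscaler_config(cfg: dict) -> None:
    if not cfg.get("enableTaskAutoScaler", False):
        return  # autoscaler disabled or absent; nothing to check
    if cfg["taskCountMax"] < cfg["taskCountMin"]:
        raise ValueError("taskCountMax must be >= taskCountMin")
    # Assumed sanity rule: scaling in should trigger at lower lag than scaling out.
    if cfg.get("scaleInThreshold", 1_000_000) > cfg.get("scaleOutThreshold", 6_000_000):
        raise ValueError("scaleInThreshold should not exceed scaleOutThreshold")

cfg = json.loads("""{"enableTaskAutoScaler": true,
                     "taskCountMax": 6, "taskCountMin": 2,
                     "scaleOutThreshold": 600000, "scaleInThreshold": 100000}""")
validate_autoscaler_config(cfg)  # passes silently
```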

#### Specifying data format