-
Notifications
You must be signed in to change notification settings - Fork 50
[SPARK-54789] Support label selector based filter on resources to reconcile #439
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @peter-toth for review - many thanks in advance! |
docs/config_properties.md
Outdated
| | spark.kubernetes.operator.reconciler.retry.maxIntervalSeconds | Integer | -1 | false | Max interval(in seconds) of retries on unhandled controller errors. Set to non-positive for unlimited. | | ||
| | spark.kubernetes.operator.reconciler.terminationTimeoutSeconds | Integer | 30 | false | Grace period for operator shutdown before reconciliation threads are killed. | | ||
| | spark.kubernetes.operator.reconciler.trimStateTransitionHistoryEnabled | Boolean | true | true | When enabled, operator would trim state transition history when a new attempt starts, keeping previous attempt summary only. | | ||
| | spark.kubernetes.operator.reconciliationSelectors | String | | false | Comma-separated label selector(s) that the operator would be reconciling for Spark resources. If not set or set to empty string, operator would reconcile all watched Spark Apps and Clusters. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reuse the existing namespace, spark.kubernetes.operator.reconciler, like this? And, selector might be better because we use spark-app-selector and spark.kubernetes.node.selector.* before, @jiangzho .
- spark.kubernetes.operator.reconciliationSelectors
+ spark.kubernetes.operator.reconciler.selectorThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good idea! I've renamed that as spark.kubernetes.operator.reconciler.selectors , plural just to indicate the value can include multiple selectors
| - name: JAVA_VERSION | ||
| value: "17" | ||
| - name: IMAGE | ||
| value: "apache/spark:4.1.0-scala-java17" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use apache/spark:4.1.0-scala because we are currently advertising Java 21 over 17.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated, thanks!
| - name: SPARK_VERSION | ||
| value: "4.1.0" | ||
| - name: JAVA_VERSION | ||
| value: "17" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. Java 21 please.
| spark.kubernetes.executor.request.cores: "0.5" | ||
| runtimeVersions: | ||
| sparkVersion: ($V_SPARK_VERSION) | ||
| jdkVersion: ($V_JAVA_VERSION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we remove this because technically jdkVersion didn't provide a meaningful effect in the community?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually yes - we shall not care more about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any reference for your definition of selectors, @jiangzho ?
that's a good idea! I've renamed that as spark.kubernetes.operator.reconciler.selectors , plural just to indicate the value can include multiple selectors
AFAIK, in K8s, it's still a single selector to match all targets. For example, everything is a single selector, fieldSelector, labelSelector, nodeSelector, and selector. We don't use in a plural form, selectors, to specific a single set of target.
selector:
matchLabels:
component: redis
matchExpressions:
- { key: tier, operator: In, values: [cache] }
- { key: environment, operator: NotIn, values: [dev] }
If you really want to use a plural form, I'd like to recommend to use matchLabels for the config name.
- spark.kubernetes.operator.reconciler.selectors
+ spark.kubernetes.operator.reconciler.matchLabels
|
|
||
| operatorConfiguration: | ||
| spark-operator.properties: |+ | ||
| spark.kubernetes.operator.reconciler.selectors=foo=bar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need more test coverage to specify multiple labels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack - updated the test case such that
- Operator is configured with comma-separated key values
- Validate resources is not being reconciled if
- no label matches the given selector string
- resource has only one matching label, but not all
- resource has given label, but the value does not match
- Resource is reconciled only when matching all required label selectors
| spark.kubernetes.container.image: ($IMAGE) | ||
| spark.kubernetes.authenticate.driver.serviceAccountName: "spark" | ||
| spark.kubernetes.driver.request.cores: "0.5" | ||
| spark.kubernetes.executor.request.cores: "0.5" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the above two lines because those are irrelevant to this PR.
|
It seems that the following is not addressed yet. In addition, please rebase once more in order to resolve the conflicts. |
…oncile ### What changes were proposed in this pull request? This PR adds a new property `spark.kubernetes.operator.reconciliationSelectors`. If set to non empty value, operator would only reconcile custom resources that match the given selector filter. By default, operator would still reconcile all resources. ### Why are the changes needed? Introducing optional label based selectors allows operator to reconcile only a targeted subset of custom resources when configured explicitly. It's a common case for user to leverage label for isolation on resources. This can be a result of various reason, including setting-up multiple operator per environment / group in shared clusters, or for progressive adoption upon new versions. This feature would make operator fits better for prod level multi-tenant environments. ### Does this PR introduce any user-facing change? New property becomes available while behavior is the same as before by default. ### How was this patch tested? Existing e2e validates the default behavior. Added new scenario to test selector functionality. ### Was this patch authored or co-authored using generative AI tooling? No
…nf, and update test case
|
Sorry overlooked previous comment - agreed that one selector shall applied to all resources given
from https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors I've updated the naming back to singular |
|
Please update the PR description accordingly.
|
docs/config_properties.md
Outdated
| | spark.kubernetes.operator.reconciler.clusterStatusListenerClassNames | String | | false | Comma-separated names of SparkClusterStatusListener class implementations | | ||
| | spark.kubernetes.operator.reconciler.foregroundRequestTimeoutSeconds | Long | 60 | true | Timeout (in seconds) for requests made to API server. This applies only to foreground requests. | | ||
| | spark.kubernetes.operator.reconciler.intervalSeconds | Long | 120 | true | Interval (in seconds, non-negative) to reconcile Spark applications. Note that reconciliation is always expected to be triggered when app spec / status is updated. This interval controls the reconcile behavior of operator reconciliation even when there's no update on SparkApplication, e.g. to determine whether a hanging app needs to be proactively terminated. Thus this is recommended to set to above 2 minutes to avoid unnecessary no-op reconciliation. | | ||
| | spark.kubernetes.operator.reconciler.labelSelector | String | | false | Comma-separated label selector(s) that the operator would be reconciling for Spark resources. If not set or set to empty string, operator would reconcile all watched Spark Apps and Clusters. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comma-separated label selector(s) sounds a little weird. Could you revise the sentence.
| .key("spark.kubernetes.operator.reconciler.labelSelector") | ||
| .enableDynamicOverride(false) | ||
| .description( | ||
| "Comma-separated label selector(s) that the operator would be reconciling for " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto. We need to revise the phrase, Comma-separated label selector(s).
csviri
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Checked CI failures for previous commit, but seems they are not directly related to this PR |
|
cc @dongjoon-hyun - May I get another round of review for this please ? Thanks a lot |
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @jiangzho and all.
Merged to main.
What changes were proposed in this pull request?
This PR adds a new property
spark.kubernetes.operator.reconciler.labelSelector. If set to non empty value, operator would only reconcile custom resources that match the given selector filter. By default, operator would still reconcile all resources.Why are the changes needed?
Introducing optional label based selectors allows operator to reconcile only a targeted subset of custom resources when configured explicitly.
It's a common case for user to leverage label for isolation on resources. This can be a result of various reason, including setting-up multiple operator per environment / group in shared clusters, or for progressive adoption upon new versions. This feature would make operator fits better for prod level multi-tenant environments.
Does this PR introduce any user-facing change?
New property becomes available while behavior is the same as before by default.
How was this patch tested?
Existing e2e validates the default behavior. Added new scenario to test selector functionality.
Was this patch authored or co-authored using generative AI tooling?
No