From eb5933721075b8679ac557b9221d5eb22f97b676 Mon Sep 17 00:00:00 2001
From: Yikun Jiang
Date: Wed, 16 Mar 2022 12:06:41 +0800
Subject: [PATCH 1/2] Add doc for custom sche

---
 docs/running-on-kubernetes.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md
index a5da80a68d32d..81ce9a50be4ae 100644
--- a/docs/running-on-kubernetes.md
+++ b/docs/running-on-kubernetes.md
@@ -1722,6 +1722,25 @@ spec:
     image: will-be-overwritten
 ```
 
+#### Customized Kubernetes Schedulers for Spark on Kubernetes
+
+Spark allows users to specify a customized scheduler as Spark on Kubernetes scheduler.
+
+1. Specify scheduler name.
+
+   Users can specify customized scheduler using spark.kubernetes.scheduler.name or
+   spark.kubernetes.{driver/executor}.scheduler.name configuration.
+
+2. Specify scheduler related configurations.
+
+   Users can use [Pod template](#pod-template), existing configurations to specify label (spark.kubernetes.{driver,executor}.label.*), annotations style (spark.kubernetes.{driver/executor}.annotation.*) scheduler hints.
+
+3. Specify scheduler feature step.
+
+   Users may also consider to use spark.kubernetes.{driver/executor}.pod.featureSteps to support more complex requirements and more centralized scheduler hints configure, included but not limited to:
+   - Creating a scheduler needed additional Kubernetes custom resource for driver/executor scheduling.
+   - Setting scheduler hints according to configuration or existing Pod info dynamically.
+
 ### Stage Level Scheduling Overview
 Stage level scheduling is supported on Kubernetes when dynamic allocation is enabled. This also requires spark.dynamicAllocation.shuffleTracking.enabled to be enabled since Kubernetes doesn't support an external shuffle service at this time. The order in which containers for different profiles is requested from Kubernetes is not guaranteed.
 Note that since dynamic allocation on Kubernetes requires the shuffle tracking feature, this means that executors from previous stages that used a different ResourceProfile may not idle timeout due to having shuffle data on them. This could result in using more cluster resources and in the worst case if there are no remaining resources on the Kubernetes cluster then Spark could potentially hang. You may consider looking at config spark.dynamicAllocation.shuffleTracking.timeout to set a timeout, but that could result in data having to be recomputed if the shuffle data is really needed.

From 6aadfeaed5238eb6ec4b9fce382b713f11cd965d Mon Sep 17 00:00:00 2001
From: Yikun Jiang
Date: Wed, 16 Mar 2022 15:33:04 +0800
Subject: [PATCH 2/2] Apply suggestions from martin

Co-authored-by: Martin Grigorov
---
 docs/running-on-kubernetes.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/docs/running-on-kubernetes.md b/docs/running-on-kubernetes.md
index 81ce9a50be4ae..aea7043fd8312 100644
--- a/docs/running-on-kubernetes.md
+++ b/docs/running-on-kubernetes.md
@@ -1724,22 +1724,22 @@ spec:
 
 #### Customized Kubernetes Schedulers for Spark on Kubernetes
 
-Spark allows users to specify a customized scheduler as Spark on Kubernetes scheduler.
+Spark allows users to specify a custom Kubernetes scheduler.
 
 1. Specify scheduler name.
 
-   Users can specify customized scheduler using spark.kubernetes.scheduler.name or
+   Users can specify a custom scheduler using spark.kubernetes.scheduler.name or
    spark.kubernetes.{driver/executor}.scheduler.name configuration.
 
 2. Specify scheduler related configurations.
 
-   Users can use [Pod template](#pod-template), existing configurations to specify label (spark.kubernetes.{driver,executor}.label.*), annotations style (spark.kubernetes.{driver/executor}.annotation.*) scheduler hints.
+   To configure the custom scheduler, the user can use [Pod templates](#pod-template), add labels (spark.kubernetes.{driver,executor}.label.*) and/or annotations (spark.kubernetes.{driver/executor}.annotation.*).
 
 3. Specify scheduler feature step.
 
-   Users may also consider to use spark.kubernetes.{driver/executor}.pod.featureSteps to support more complex requirements and more centralized scheduler hints configure, included but not limited to:
-   - Creating a scheduler needed additional Kubernetes custom resource for driver/executor scheduling.
-   - Setting scheduler hints according to configuration or existing Pod info dynamically.
+   Users may also consider using spark.kubernetes.{driver/executor}.pod.featureSteps to support more complex requirements, including but not limited to:
+   - Creating additional Kubernetes custom resources for driver/executor scheduling.
+   - Setting scheduler hints according to configuration or existing Pod info dynamically.
 
 ### Stage Level Scheduling Overview
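
For reference, the three steps documented in this series (scheduler name, label/annotation hints, feature steps) could be combined roughly as in the sketch below. It only assembles `--conf` arguments for `spark-submit` and does not call Spark itself. The helper names `custom_scheduler_conf` and `to_submit_args`, the scheduler name `my-scheduler`, the `queue` label, and the class `org.example.MySchedulerFeatureStep` are all hypothetical. Joining several feature step class names with commas is an assumption about how the `featureSteps` value is parsed.

```python
from typing import Dict, List, Optional


def custom_scheduler_conf(
    scheduler_name: str,
    labels: Optional[Dict[str, str]] = None,
    annotations: Optional[Dict[str, str]] = None,
    feature_steps: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Build Spark conf entries that point driver and executor pods at a custom scheduler."""
    # Step 1: one scheduler name shared by driver and executor pods.
    conf = {"spark.kubernetes.scheduler.name": scheduler_name}
    for role in ("driver", "executor"):
        # Step 2: scheduler hints expressed as pod labels and annotations.
        for key, value in (labels or {}).items():
            conf[f"spark.kubernetes.{role}.label.{key}"] = value
        for key, value in (annotations or {}).items():
            conf[f"spark.kubernetes.{role}.annotation.{key}"] = value
        # Step 3: optional feature step classes (comma-joined; an assumption).
        if feature_steps:
            conf[f"spark.kubernetes.{role}.pod.featureSteps"] = ",".join(feature_steps)
    return conf


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Flatten conf entries into spark-submit style ["--conf", "key=value", ...] arguments."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # All values below are placeholders, not defaults shipped with Spark.
    example = custom_scheduler_conf(
        "my-scheduler",
        labels={"queue": "default"},
        feature_steps=["org.example.MySchedulerFeatureStep"],
    )
    print(" ".join(to_submit_args(example)))
```

When driver and executor need different schedulers, the per-role keys spark.kubernetes.{driver/executor}.scheduler.name named in the patch would replace the shared key. The sketch leaves that case out for brevity.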