-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-40887][K8S] Allow Spark on K8s to integrate w/ Log Service #38357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Can one of the admins verify this patch? |
|
+CC @tgravescs Since you have more context on this from yarn pov than I do |
|
@tgravescs this one is the real solution based on the feedback of #38205, would you please take a look? And should I extend the solution to Yarn or not? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like logically duplicated in both getDriverLogUrls and getDriverAttributes except the variable names. Could you try to refactor more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are few differences, e.g. toLowerCase and toUpperCase, this follows CoarseGrainedExecutorBackend#extractLogUrls and CoarseGrainedExecutorBackend#extractAttributes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of creating a new class KubernetesCoarseGrainedExecutorBackend, can we do the following simply?
new CoarseGrainedExecutorBackend(rpcEnv, arguments.driverUrl, execId,
arguments.bindAddress, arguments.hostname, arguments.cores,
- env, arguments.resourcesFileOpt, resourceProfile)
+ env, arguments.resourcesFileOpt, resourceProfile) {
+ override def getDriverAttributes: Option[Map[String, String]] = Some(
+ super.getDriverAttributes.getOrElse(Map.empty) ++ Map(
+ "APP_ID" -> System.getenv(ENV_APPLICATION_ID),
+ "KUBERNETES_NAMESPACE" -> conf.get(KUBERNETES_NAMESPACE),
+ "KUBERNETES_POD_NAME" -> System.getenv(ENV_DRIVER_POD_NAME)))
+ }Then, we can remove the following file from this PR.
- resource-managers/kubernetes/core/src/main/scala/org/apache/spark/executor/KubernetesCoarseGrainedExecutorBackend.scala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had better put ths line should be before ENV_DRIVER_BIND_ADDRESS.
BTW, Could you spin off KUBERNETES_POD_NAME and SPARK_DRIVER_POD_NAME addition to a new PR, @pan3793 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, opened #39160 for this.
|
Hi, @pan3793 . I'm interested in this PR for Apache Spark 3.4.0. Although we have one month until
|
|
Thanks @dongjoon-hyun, updated the code based on your comments, and sorry for late reply, because my team is suffering from COVID-19 😢 |
|
Update PR state. Currently, the PR is stuck at "Is it good to let 3rd-party log service use POD NAME to access Driver log?" In #39160 (review), @dongjoon-hyun left a concern
Yes, it's an exception, but I think it may be the right direction. Because:
Would like to hear more thoughts from the community, cc @holdenk @mridulm @yaooqinn @Yikun @attilapiros @jzhuge @LuciferYang @cxzl25, thanks! [1] https://github.com/apple/batch-processing-gateway/blob/main/src/main/java/com/apple/spark/rest/ApplicationGetLogRest.java#L327 |
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
Provide a flexible way on K8s for Driver and Executor by using env vars to configure external log service links(pattern) and attributes, on both live Spark UI and SHS.
The full design doc is https://docs.google.com/document/d/1MfB39LD4B4Rp7MDRxZbMKMbdNSe6V6mBmMQ-gkCnM-0/edit?usp=sharing
APP_IDKUBENETES_POD_NAMEKUBENETES_NAMESPACESPARK_DRIVER_LOG_URL_SPARK_DRIVER_ATTRIBUTE_SPARK_LOG_URL_SPARK_EXECUTOR_ATTRIBUTE_Always do log URLs replacement for Driver before sending SparkListenerApplicationStart into the LiveListenerBus, so that the Driver could have the log URL replacement ability on live UI, as Executor does.
Always do log URLs replacement for Executor,
spark.ui.custom.executor.log.urlis provided, as-is;Why are the changes needed?
Currently, there is no out-of-box log solution for Spark on K8s.
For Spark on Yarn case, Spark provides stdout/stderr log links on Spark UI for the Driver and each Executor which redirects to the Yarn log pages, but for the resource manager which does not provide the out-of-box log services, like K8s, Spark has no log links on Spark UI.
Does this PR introduce any user-facing change?
Yes, users who deploy Spark on K8s could add custom log links in the Spark UI by configurations.
How was this patch tested?
Local mode
K8s mode
An online Spark on K8s application using Kibana as the external log service w/ the following configurations.