
Conversation

@pan3793 (Member) commented Dec 21, 2022

What changes were proposed in this pull request?

This PR proposes to expose SPARK_DRIVER_POD_NAME as an environment variable in the Driver Pod.
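For illustration, one standard way to populate such a variable is the Kubernetes downward API. A minimal sketch of the resulting Pod spec fragment follows; the variable name SPARK_DRIVER_POD_NAME is the one proposed in this PR, while the pod/container names and image are placeholders, not taken from the actual implementation:

```yaml
# Sketch only: expose the pod's own name to its container via the
# downward API (fieldRef is standard Kubernetes; names are placeholders).
apiVersion: v1
kind: Pod
metadata:
  name: spark-pi-driver
spec:
  containers:
    - name: spark-kubernetes-driver
      image: spark:latest
      env:
        - name: SPARK_DRIVER_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
```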

Why are the changes needed?

This is the first part of SPARK-40887; the pod name can serve as a criterion for fetching logs from an external log service.

Does this PR introduce any user-facing change?

Yes, a new environment variable, SPARK_DRIVER_POD_NAME, becomes available in the Driver Pod.

How was this patch tested?

UT added.

@pan3793 (Member Author) commented Dec 21, 2022

cc @dongjoon-hyun

@LuciferYang (Contributor)

How many subtasks will SPARK-40887 be divided into? If there are more, would you mind making SPARK-40887 an umbrella ticket and this one a subtask?

@pan3793 (Member Author) commented Dec 22, 2022

Maybe two or three. I plan to split the independent parts out of SPARK-40887 and rebase it once the split PRs get merged. I'm not sure an umbrella ticket is worthwhile, since it's not a very big change.

@LuciferYang (Contributor)

If there are only two or three, that's OK.

@AmplabJenkins

Can one of the admins verify this patch?

@dongjoon-hyun (Member) left a comment


While reviewing this design again in detail, I have a concern. Currently, Apache Spark uses a K8s Service entity, via DriverServiceFeatureStep, to access the Spark driver pod in a K8s environment. The proposed design is a kind of exception. Do you think you can revise this log service design to use the Driver Service instead?

@pan3793 (Member Author) commented Dec 29, 2022

@dongjoon-hyun The design does not force using either POD_NAME or SVC_NAME as the criterion for accessing driver/executor logs; it depends entirely on how the external log service aggregates logs.

To allow identifying Spark driver logs by service name, we just need to expose it as an attribute in KubernetesClusterSchedulerBackend#getDriverAttributes; then it can be used in the log URL pattern.

So here we need to expose the common attributes so that users can use them as criteria to fetch logs from the external log service.

As I said in the proposal, I think the following attributes are generic:

  • APP_ID
  • KUBERNETES_POD_NAME
  • KUBERNETES_NAMESPACE

We can expose DRIVER_SVC_NAME as well, but we should also keep KUBERNETES_POD_NAME.
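To show how such attributes could be consumed, here is a minimal sketch (not Spark's actual implementation) of substituting attribute placeholders into a custom log URL pattern, in the {{NAME}} style Spark uses for custom executor log URLs. The attribute names follow the list above; the URL is a made-up example:

```python
import re

def build_log_url(pattern: str, attrs: dict) -> str:
    # Replace each {{NAME}} placeholder with its attribute value; leave
    # unknown placeholders untouched so misconfigurations stay visible.
    return re.sub(
        r"\{\{([A-Z_]+)\}\}",
        lambda m: attrs.get(m.group(1), m.group(0)),
        pattern,
    )

# Hypothetical attribute values for a driver pod.
attrs = {
    "APP_ID": "spark-abc123",
    "KUBERNETES_POD_NAME": "spark-pi-driver",
    "KUBERNETES_NAMESPACE": "default",
}
print(build_log_url(
    "https://logs.example.com/{{KUBERNETES_NAMESPACE}}/{{KUBERNETES_POD_NAME}}?app={{APP_ID}}",
    attrs,
))
# → https://logs.example.com/default/spark-pi-driver?app=spark-abc123
```

With this shape, an external log service that aggregates by pod only needs KUBERNETES_POD_NAME (plus namespace) in the pattern, while a service that aggregates differently can ignore it.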

@pan3793 (Member Author) commented Mar 13, 2023

Cross-referencing comments from #40392 (comment):

Your PR tried to add SPARK_DRIVER_POD_NAME to Driver Pod to expose it to 3rd party pods.

... to use Driver Service instead

Is the service name a kind of official API that allows 3rd-party components to access the Spark Driver in K8s? If yes, what about executors?

My vision is to expose both driver and executor logs to the log service in a unified way, and aggregating logs by Pod is the most straightforward approach, just as YARN does by container. So my first candidate is the Pod name; the second is the Pod IP.

@dongjoon-hyun I do understand that we should be careful about adding each env variable, configuration, etc. If you think the Pod IP is acceptable, that is sufficient for now: we can get the Driver Pod IP from the env SPARK_DRIVER_BIND_ADDRESS and the executor Pod IP from the env SPARK_EXECUTOR_POD_IP.
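For illustration, a tiny sketch of reading those env vars from inside a container; the variable names are the ones mentioned above, and the localhost fallback is hypothetical, useful only when running outside Kubernetes:

```python
import os

# Read the pod IPs from the env vars discussed above. The "127.0.0.1"
# fallback is a hypothetical default for running outside Kubernetes.
driver_ip = os.environ.get("SPARK_DRIVER_BIND_ADDRESS", "127.0.0.1")
executor_ip = os.environ.get("SPARK_EXECUTOR_POD_IP", "127.0.0.1")
print(driver_ip, executor_ip)
```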

@pan3793 (Member Author) commented Mar 14, 2023

@pan3793 (Member Author) commented Mar 24, 2023

@github-actions bot commented Jul 3, 2023

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
