-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-37687][K8S] Removes the compile time dependency on the OkHttp http client #48446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The Kubernetes client supports plugging in different http clients via the Java service provider interface. Since we have a dependency on OkHttp it is not possible to plug in other http clients.
| "Kubernetes client config: " + | ||
| new ObjectMapper().writerWithDefaultPrettyPrinter().writeValueAsString(config)) | ||
| new KubernetesClientBuilder() | ||
| .withHttpClientFactory(factoryWithCustomDispatcher) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The custom dispatcher uses daemon threads. Not setting this custom dispatcher implies that we will now have non-daemon threads in the http client.
cc: @Yikun if you could please take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The custom dispatcher uses daemon threads.
that's the exact purpose of overwriting the HttpClientFactory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I thought so. How about adding something like this?
val clientFactory = HttpClientUtils.getHttpClientFactory
val clientBuilder = new KubernetesClientBuilder().withConfig(config)
clientFactory match {
case _: OkHttpClientFactory =>
val dispatcher = new Dispatcher(
ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
val factory = new OkHttpClientFactory() {
override def additionalConfig(builder: OkHttpClient.Builder): Unit = {
builder.dispatcher(dispatcher)
}
}
clientBuilder.withHttpClientFactory(factory).build()
case _ => clientBuilder.build()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not against that, after making this change, how do you configure the K8s client builder to use another HTTP client factory without touching the code?
BTW, please restore unnecessary code changes from your patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the job jar, you would include a dependency on another http client, for eg.
<dependency>
<groupId>io.fabric8</groupId>
<artifactId>kubernetes-httpclient-jdk</artifactId>
</dependency>
These dependency jars contain the file META-INF/services/io.fabric8.kubernetes.client.http.HttpClient$Factory containing the name of the class implementing the provider, which the K8s client builder should pick up.
The job jar could also include META-INF/services/io.fabric8.kubernetes.client.http.HttpClient$Factory with the name of the class we want to use.
I'll restore the changes - somehow running /dev/scalafmt introduced those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then how do you choose the exact one you want to use if there are multi impls under the classpath?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can set priority in the provider configuration file:
1: <class1>
2: <class2>
The code in fabric8 appears to be honoring that priority.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense, thanks for sharing the information
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @pan3793 minor correction to what I said earlier. The priority comes from the factory itself. So if there are multiple options, we could extend the factory we want and assign priority, like this:
class MyHttpClientFactory extends JdkHttpClientFactory {
override def priority(): Int = -1
}
and then add MyHttpClientFactory to the configuration file.
Regardless, with this change it should be possible to choose an implementation without touching code.
| val clientCertFile = sparkConf | ||
| .getOption(s"$kubernetesAuthConfPrefix.$CLIENT_CERT_FILE_CONF_SUFFIX") | ||
| // TODO(SPARK-37687): clean up direct usage of OkHttpClient, see also: | ||
| // https://github.com/fabric8io/kubernetes-client/issues/3547 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this comment still valid
|
Hi @dongjoon-hyun @yaooqinn could you please take a look? |
|
Let me close this PR since SPARK-37687 is resolved via |
|
Thank you for making this PR before. It's more natural to follow the upstream |
Sure, yes I agree solving upstream is better. |
What changes were proposed in this pull request?
Removes the dependency on the OkHttp http client for the kubernetes client.
Why are the changes needed?
The Kubernetes client supports plugging in different http clients via the Java service provider interface. Since we have a dependency on OkHttp it is not possible to plug in other http clients.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added a unit test
Was this patch authored or co-authored using generative AI tooling?
No