Upgrade the fabric client to support newer versions of k8s #13804
churromorales wants to merge 16 commits into apache:master
Conversation
…ty to override peon monitors
|`druid.indexer.runner.javaOptsArray`|`JsonArray`|Java options for the task.|`-Xmx1g`|No|
|`druid.indexer.runner.labels`|`JsonObject`|Additional labels you want to add to the peon pod.|`{}`|No|
|`druid.indexer.runner.annotations`|`JsonObject`|Additional annotations you want to add to the peon pod.|`{}`|No|
|`druid.indexer.runner.peonMonitors`|`JsonArray`|Overrides `druid.monitoring.monitors`. Use this when you have monitors configured on the overlord and do not want the peons to inherit them.|`[]`|No|
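As an illustration, a hypothetical overlord `runtime.properties` fragment using these properties might look like the following. The label and annotation values are made up; `org.apache.druid.java.util.metrics.JvmMonitor` is one of Druid's standard monitors, shown here only as an example override.

```properties
druid.indexer.runner.labels={"team": "data-platform"}
druid.indexer.runner.annotations={"owner": "oncall@example.com"}
# Peons report only JVM metrics instead of inheriting the overlord's
# druid.monitoring.monitors value.
druid.indexer.runner.peonMonitors=["org.apache.druid.java.util.metrics.JvmMonitor"]
```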
Nit: It would be useful to mention this property in the extension-specific docs as well.
```java
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

class LogWatchInputStreamTest
```
Check notice (Code scanning / CodeQL): Unused classes and interfaces
```diff
 .waitUntilCondition(
-    x -> x != null && x.getStatus() != null && x.getStatus().getActive() == null,
+    x -> x != null && x.getStatus() != null && x.getStatus().getActive() == null
+        && (x.getStatus().getFailed() != null || x.getStatus().getSucceeded() != null),
```
This was a tricky one. For some background: the overlord launches a task, waits for it to complete, and then returns the status. In k8s < 1.25 that just meant: once the pod is not active, go grab the status.
The contract has changed a bit, because 1.25 introduced finalizers:
kubernetes/kubernetes#110948
We noticed that when we ran tasks on a 1.25 k8s Druid cluster, they would complete fine but be marked as failures.
We printed the job status and saw that the job was not active, but neither success nor failure was set; instead there was this field:
`uncountedTerminatedPods=UncountedTerminatedPods(failed=[], succeeded=[e916cbf9-467a-45f3-86a7-3767145d6384], additionalProperties={})`
which, from the docs:
> UncountedTerminatedPods holds the UIDs of Pods that have terminated but the job controller hasn't yet accounted for in the status counters. The job controller creates pods with a finalizer. When a pod terminates (succeeded or failed), the controller does three steps to account for it in the job status: (1) Add the pod UID to the arrays in this field. (2) Remove the pod finalizer. (3) Remove the pod UID from the arrays while increasing the corresponding counter. This field is beta-level. The job controller only makes use of this field when the feature gate JobTrackingWithFinalizers is enabled (enabled by default). Old jobs might not be tracked using this field, in which case the field remains null.

So now the job goes from a state where it is not active, to having uncountedTerminatedPods, to finally having a status with success or failure set. I will push up a one-line fix to make this work, but for those of you working with the 1.25 version of k8s, I'm sure you will be affected as well.
Basically, add another check to wait on. Right now we wait for this:

```java
// block until
job.getStatus() != null && job.getStatus().getActive() == null
// then return
return job.getStatus().getSucceeded() != null
```

So the change has to become:

```java
// block until
job.getStatus() != null && job.getStatus().getActive() == null
    && (job.getStatus().getFailed() != null || job.getStatus().getSucceeded() != null)
// then return
return job.getStatus().getSucceeded() != null
```

This should keep things backwards compatible and working in all versions of k8s.
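The behavior of the old versus new condition can be sketched without the Kubernetes client at all. The `JobStatus` class below is a hypothetical, minimal stand-in for the generated Job status model (not the real fabric8 type); it exists only to show how the combined condition behaves across the three phases a job passes through on k8s 1.25+.

```java
// Hypothetical minimal stand-in for the Kubernetes JobStatus model,
// used only to illustrate the completion checks discussed above.
class JobStatus {
    Integer active;     // null once no pods are running
    Integer succeeded;  // set only after finalizers are removed and counters updated
    Integer failed;

    JobStatus(Integer active, Integer succeeded, Integer failed) {
        this.active = active;
        this.succeeded = succeeded;
        this.failed = failed;
    }
}

public class JobCompletionCheck {
    // Old condition: only waits for the job to stop being active.
    static boolean oldCondition(JobStatus s) {
        return s != null && s.active == null;
    }

    // New condition: also waits for a terminal counter, so the intermediate
    // "uncountedTerminatedPods" phase on k8s 1.25+ no longer looks terminal.
    static boolean newCondition(JobStatus s) {
        return s != null && s.active == null
            && (s.failed != null || s.succeeded != null);
    }

    public static void main(String[] args) {
        // On 1.25+, after the pod exits the job is briefly inactive with
        // neither succeeded nor failed set (pods still "uncounted").
        JobStatus uncounted = new JobStatus(null, null, null);
        JobStatus done = new JobStatus(null, 1, null);

        System.out.println(oldCondition(uncounted)); // true: fires too early
        System.out.println(newCondition(uncounted)); // false: keeps waiting
        System.out.println(newCondition(done));      // true: terminal state
    }
}
```

Since the old condition also returns true for the terminal state, waiting on the new condition changes nothing on pre-1.25 clusters, which is why the fix stays backwards compatible.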
So for the older k8s versions, we can expect that when `job.getStatus().getActive()` is null, one of the success or failure fields must be set. Question: do we need to check the `getActive()` result at all? Can we just block until the job has either succeeded or failed?
That could work. Want me to change it? The version above definitely works; I don't have access to many different k8s environments, and someone in the community helped me test it on 1.25. I could definitely change this, but I unfortunately have no way of testing it out.
Please make this change, we can test it out on a v1.25 cluster
@nlippis I believe you can test by taking this commit, making that small change, and applying it onto the 25 branch. If it works for you, I'll resubmit this with the change.
@abhishekagarwal87 what do you think? This should bring the client up to date as well as fix things for those Druid users running on k8s 1.25+ that were having issues.
```java
{
  @Test
  void makingCodeCoverageHappy()
```
Hmm. This is a bit of a bummer. Is there no way to avoid this?
So I had tests for a class, then I upgraded a client library and had to change some things, but the functions were all tested before. I would expect that if the tests pass I'd be good, since the behavior was already tested. But it turns out that if you have to make any changes to adapt to the new library APIs, code coverage complains. It is unfortunate here, because I didn't change any behavior.
aren't there any tests covering that path?
I added a getter to a class, and the code coverage tool complained that I added new code without a test. So I added a test for the getter. That was new code, so I guess it's okay to add a test if that's what Druid wants, although I feel tests like this are not really that useful and make the whole suite slower in general.
```java
import static org.junit.jupiter.api.Assertions.assertEquals;

// must have a kind / minikube cluster installed and the image pushed to your repository
@Disabled
```
We are using K3S now. Do you still see the same issues?
K3S? This was for anyone who has an existing k8s cluster; they can just run this test and point it there. The k8s integration tests are definitely not flexible enough for me to test mm-less without rewriting a lot of them.
how was this running before?
I had a kind cluster running locally; it was in the original patch. For anyone who wants to test with a local k8s cluster, I left this test in the codebase. If you want me to remove it, I'll be happy to get rid of it. I find it useful when I make changes, since I can't integration test this with the way the current druid-it tests are set up.
ektravel
left a comment
Left one suggestion for the doc.
Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
@churromorales - can you address the build failures?
techdocsmith
left a comment
Small stylistic change for doc.
jwitko
left a comment
Very much needed changes. Thank you for making them.
Co-authored-by: Charles Smith <techdocsmith@gmail.com>
…ests caused by a merge conflict
@abhishekagarwal87 you good with this now? I explained about the test coverage in an earlier comment.
Do you also need to make this change?

```diff
--- a/extensions-contrib/kubernetes-overlord-extensions/src/main/java/org/apache/druid/k8s/overlord/KubernetesTaskRunner.java
+++ b/extensions-contrib/kubernetes-overlord-extensions/src/main/java/org/apache/druid/k8s/overlord/KubernetesTaskRunner.java
@@ -180,7 +180,7 @@ public class KubernetesTaskRunner implements TaskLogStreamer, TaskRunner
       completedPhase = monitorJob(peonPod, k8sTaskId);
     } else {
       Job job = existingJob.get();
-      if (job.getStatus().getActive() == null) {
+      if (job.getStatus() != null && job.getStatus().getActive() == null && (job.getStatus().getFailed() != null || job.getStatus().getSucceeded() != null)) {
         if (job.getStatus().getSucceeded() != null) {
           completedPhase = new JobResponse(job, PeonPhase.SUCCEEDED);
         } else {
```

I believe that is the crux of the change. I explain why here: #13804 (comment) Or let me know if I misunderstood you.
@abhishekagarwal87 Are there any additional concerns with this PR apart from the code coverage failure? From the discussion it looks like there isn't really a way to make the coverage tool pass without adding non-meaningful tests.
> Or let me know if I misunderstood you.

Sorry I wasn't clear. There are two spots in the code that have that same success/failure logic; I only see one in this PR, after the job is monitored. But I also see one before the job is monitored, just after it is spawned in KubernetesTaskRunner. If the logic is not necessary at that point, please feel free to ignore me.
@a2l007 - Sorry if I am missing something, but if CC is failing anyway, there is no need to add
It is a good question. I think the way I coded it was not very clear, I apologize for that. Let's look at this case: startup overlord … The conditional block you are referring to handles that case. It first does a check to see if the task is not active, then checks the state. If it is still running it does the … Does that help make sense of how things are working? These are good questions, and I want to make sure the code is doing the correct thing.
@churromorales - Thank you for resolving the conflicts. There are build failures. Can you take a look?
@abhishekagarwal87 I pushed up the fix; there were some conflicts due to something else getting merged before this.
@abhishekagarwal87 The failing tests are from the other PR that was merged. I don't have cycles to look into and fix whatever is wrong, and we honestly forked our Druid, so whether this goes upstream is not a big deal. I think the issue is that the other PR is using all the old libraries, and it can't deserialize the pod template anymore. So if you want to update this feature to work with newer k8s versions, that will have to be sorted out at some point, or you can just close off this PR and worry about upgrades when the time comes. As it stands, mm-less won't work in newer versions of k8s, so I'll leave it up to the maintainers to decide what they want to do.
I understand @churromorales. Would you be open to giving write access on your branch to someone who wants to take this PR forward?
Closing since these changes were added to #14028
This PR contains 2 items.
We found that when we used the `TaskCountStatsMonitor` in the overlord config, the peon tasks would not start, because they inherit the monitors from the parent process, which used to be the Middle Manager (and would never have had that monitor originally). So now you can override this value with the following config: `druid.indexer.runner.peonMonitors`