Skip to content

Fix ForkingTaskRunnerTest#16323

Merged
kfaraz merged 2 commits intoapache:masterfrom
kfaraz:fix_fork_runner_test
Apr 24, 2024
Merged

Fix ForkingTaskRunnerTest#16323
kfaraz merged 2 commits intoapache:masterfrom
kfaraz:fix_fork_runner_test

Conversation

@kfaraz
Copy link
Copy Markdown
Contributor

@kfaraz kfaraz commented Apr 23, 2024

ForkingTaskRunner uses static fields to track task counts which are reported in metrics.
This causes some tests like ForkingTaskRunnerTest.testInvalidTaskContextJavaOptsArray() to give false positives if the variables had been set by a preceding test.

Changes:

  • Use non-static fields to track task counts in ForkingTaskRunner
  • Update assertions in tests to ensure that the tests are idempotent.

@IgorBerman
Copy link
Copy Markdown
Contributor

@kfaraz don't you think there should be
failedTaskCount .incrementAndGet();
near LOGGER.info(t, "Exception caught during execution"); at wrapping try-catch?
I mean when we pass bad arguments the IllegalArgumentException is thrown but the code never increments failedTaskCount

@IgorBerman
Copy link
Copy Markdown
Contributor

btw, those counters used in metrics reporting, so removing static will cause change in those metrics... i mean from test perspective static causing races, but if we looking at metric level and WorkerTaskCountStatsProvider I can understand why it's static since it's "Proides task / task count status at the level of individual worker nodes"

@IgorBerman
Copy link
Copy Markdown
Contributor

if testing framework makes sure that each test runs one after another and not in parallel, other option would be to reset static variables in some visible for test method.
Additional option would be to use some interface that a) decouples from ForkingTaskRunnerTest & WorkerTaskCountStatsProvider and b) "collects" reports from ForkingTaskRunnerTest. for tests it will 1 implementation, race free.
for production it can wrap those static variables and WorkerTaskCountStatsProvider will take those reports from this interface

@kfaraz
Copy link
Copy Markdown
Contributor Author

kfaraz commented Apr 24, 2024

but if we looking at metric level and WorkerTaskCountStatsProvider I can understand why it's static since it's "Proides task / task count status at the level of individual worker nodes"

@IgorBerman , I am not sure I follow. How would this be affected if we make the fields non-static?
ForkingTaskRunner is used only by MiddleManagers and each MiddleManager would have only a single instance of it. Keeping the fields static doesn't serve any purpose, afaict.

@kfaraz
Copy link
Copy Markdown
Contributor Author

kfaraz commented Apr 24, 2024

don't you think there should be
failedTaskCount .incrementAndGet();
near LOGGER.info(t, "Exception caught during execution"); at wrapping try-catch?

Yes, I wanted to do this but I decided to do it later as I need to look at all the exceptions and if it would be better to not throw an exception at all and simply return a TaskStatus.failure() in those cases. But I need to test it properly before making those changes.

The changes in this PR are more straightforward and just meant to get the build right without altering any behaviour.

@IgorBerman
Copy link
Copy Markdown
Contributor

oh, ok, if it's singleton then it will work and no need in static. I missed this part

@kfaraz kfaraz changed the title Fix fork runner test Fix ForkingTaskRunnerTest Apr 24, 2024
+ " must be an array of strings.")
);
Assert.assertEquals(1L, (long) forkingTaskRunner.getWorkerFailedTaskCount());
Assert.assertEquals(0L, (long) forkingTaskRunner.getWorkerFailedTaskCount());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are both failed and successful counts 0?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ideally the failed count should be 1, but it is currently not being tracked.
We can fix up the code later as mentioned here: #16323 (comment).

Copy link
Copy Markdown
Contributor

@AmatyaAvadhanula AmatyaAvadhanula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, @kfaraz. LGTM

@kfaraz kfaraz merged commit 1dabb02 into apache:master Apr 24, 2024
@kfaraz kfaraz deleted the fix_fork_runner_test branch April 24, 2024 08:35
@kfaraz
Copy link
Copy Markdown
Contributor Author

kfaraz commented Apr 24, 2024

Thanks for the review, @AmatyaAvadhanula , @IgorBerman !

@adarshsanjeev adarshsanjeev added this to the 30.0.0 milestone May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants