MLE-26865, MLE-26867: Fixes for issues identified by Polaris #581
SiddharthMaheshB merged 2 commits into marklogic:develop
Conversation
Commits:
- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- …ss spurious wakeup vulnerability (Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>)
Pull request overview
This PR addresses two Polaris findings in the local-threading/autoscaling path: replacing wait()/notify() synchronization used when starting MultithreadedMapper runners with a CountDownLatch, and guarding an autoscaling path that could divide by zero when there are no active tasks.
Changes:
- Replace `pool.wait()`/`notify()` coordination during `LocalMapTask` submission with a per-task `CountDownLatch`.
- Add a guard in `ThreadManager.ThreadPoller` to exit early when `getActiveTaskCounts()` returns 0.
- Wire latch propagation from `LocalMapTask` into `MultithreadedMapper` so runner creation signals the submitter thread.
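The wait()/notify() replacement described above can be sketched as follows. This is a minimal illustration of the pattern, not the actual mlcp code: the class and method names (`LatchHandoff`, `submitAndAwait`) are hypothetical stand-ins for the `LocalMapTask`/`MultithreadedMapper` hand-off.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LatchHandoff {

    // The submitter blocks on a per-task latch instead of pool.wait(),
    // so a spurious wakeup cannot let it proceed before the runner
    // threads have actually been created.
    static String submitAndAwait() throws InterruptedException {
        CountDownLatch runnersCreated = new CountDownLatch(1);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        StringBuilder order = new StringBuilder();

        pool.submit(() -> {
            // ... create runner threads here ...
            order.append("runners-created;");
            runnersCreated.countDown(); // replaces threadPool.notify()
        });

        runnersCreated.await();         // replaces the pool.wait() loop
        order.append("submitter-resumed");
        pool.shutdown();
        return order.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(submitAndAwait());
    }
}
```

Unlike `wait()`, `await()` on a counted-down latch returns immediately even if the signal fired before the waiter arrived, which removes both the lost-notification and spurious-wakeup hazards of the old scheme.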
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/main/java/com/marklogic/contentpump/ThreadManager.java` | Switches task-submission coordination to `CountDownLatch.await()` and adds an early-return guard for `activeTaskCounts == 0` in the poller. |
| `src/main/java/com/marklogic/contentpump/MultithreadedMapper.java` | Adds latch field/setter and counts down after creating runners (replacing `threadPool.notify()`). |
| `src/main/java/com/marklogic/contentpump/LocalJobRunner.java` | Adds a per-task latch, exposes it, passes it into `MultithreadedMapper`, and counts it down on task failure. |
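The early-return guard in the poller can be sketched like this. The sketch assumes the poller computes a per-task quotient from `getActiveTaskCounts()`; the class and method names (`PollerGuard`, `averageLoad`) are illustrative, not the actual `ThreadManager.ThreadPoller` code.

```java
public class PollerGuard {

    // With zero active tasks, any per-task average (total / activeTaskCounts)
    // would throw ArithmeticException, so the poll cycle bails out early.
    static int averageLoad(int totalLoad, int activeTaskCounts) {
        if (activeTaskCounts == 0) {
            return 0; // early return: nothing to autoscale, avoid divide by zero
        }
        return totalLoad / activeTaskCounts;
    }

    public static void main(String[] args) {
        System.out.println(averageLoad(12, 4)); // prints 3
        System.out.println(averageLoad(12, 0)); // prints 0, no exception
    }
}
```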
NeoSaber left a comment
The changes in this PR look fine to me, so I am approving it.
However, I am concerned about the issue Copilot noticed in ThreadManager. The fix it suggested looks wrong to me, but its analysis of how randomIndexes.get(i) is used does seem accurate based on what I see in the code. It is a little hard to believe an issue like that isn't crashing frequently, so something else might be going on. A new bug ticket should be filed so the issue can be investigated more deeply and fixed if it turns out to be a real problem.
I have created a bug for it: https://progresssoftware.atlassian.net/browse/MLE-28260; I will work on it later.
This PR contains fixes for two issues identified through Polaris scans: the wait()/notify() coordination during task submission (replaced with a per-task CountDownLatch) and a potential divide-by-zero in the autoscaling path when there are no active tasks.
I ran the unit tests and 06mlcp; both pass with no failures other than the expected ones.