Skip to content

Remove stray reference to fix OOM while merging sketches#13475

Merged
cryptoe merged 10 commits intoapache:masterfrom
adarshsanjeev:sketch-merge-memory-fix
Dec 8, 2022
Merged

Remove stray reference to fix OOM while merging sketches#13475
cryptoe merged 10 commits intoapache:masterfrom
adarshsanjeev:sketch-merge-memory-fix

Conversation

@adarshsanjeev
Copy link
Copy Markdown
Contributor

@adarshsanjeev adarshsanjeev commented Dec 1, 2022

This fixes a potential OOM issue introduced in #13205. While fetching sketches, a reference to the future sketch is kept to cancel them, in case one of the fetches fail. However, this results in a reference to the sketch even after the merge is successful. This PR changes this to a set of futures from which completed futures are removed so that these sketches can be garbage collected.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@abhishekagarwal87 abhishekagarwal87 added this to the 25.0 milestone Dec 1, 2022
Copy link
Copy Markdown
Member

@rohangarg rohangarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fix UT testable? Or if there already exists a UT for this, can you please point me to it? Further, please run the functional test on this change which led to the discovery of this bug.

}));
});

partitionFuture.whenComplete((result, exception) -> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in a corner case, it could happen that partitionFuture finishes before all futures are queued up in the executor. To avoid that one simple solution could be to have a CountDownLatch with value as workerCount. The whenComplete would then wait for latch to become 0 before cancelling all the futures.
Maybe we could also have a better solution with cancellation - will update on this thread if I find any utility.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a failsafe condition or would this be required? From my understanding, the successful completion of partitionFuture depends on the completion of the underlying futures, and in case of unsuccessful completion, we shouldn't have any issue with prematurely cancelling any underlying future. I might have missed something though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also have a question in addition to Laksh's point, since we are attaching this callback in the same thread after queueing up the futures, would this trigger without all the futures present anyway?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agree that since attaching this callback is in the same thread after queuing - this shouldn't trigger without futures. Although, that raises the question that if the future finishes before this callback is attached, will this callback execute while attaching it? if so, will that callback execution happen on the same thread attaching it?

Copy link
Copy Markdown
Member

@rohangarg rohangarg Dec 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is it necessary to cancel the pending/running futures? It would be incase we'll be retrying the stats collection stage - otherwise if we're going to fail the whole job, it is not necessary to fail them

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the future that is added as a callback should still get executed if the completable future is already finished executing, though the javadoc doesn't guarantee which thread would be executing it if an executor is also not passed as an argument.

It is also not a hard necessity that we cancel every pending future but it depends on number of workers which could be up to a 1000, which would waste a lot of time, but correctness would still be present.

Also, with regard to retrying, part of this will be updated with the worker retry PR, to only fetch the statistics which are needed even if a worker restarts, so after that we would be retrying the stats collection.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked the code, I think it would be the thread which is attaching the callback which runs it incase the future is already finished.
Oh ok - thanks for the context! My point was more towards the fact that if we're going to fail the whole job and throw the whole executor on any failure, then we could skip the whole cancellation as well - but from what you told seems like it is not the case in the retry logic.

@kfaraz kfaraz added Bug Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Dec 2, 2022
@adarshsanjeev
Copy link
Copy Markdown
Contributor Author

There are unit tests in WorkerSketchFetcherTest which test if tasks would be cancelled, but there aren't any tests to see if any references to large objects like key stats are leaked. I suppose we could test something like that with WeakReference?

I have reproduced the bug and functionally tested this out locally, and these changes seems to fix the bug.

latch.await();
return snapshotListenableFuture;
return Futures.immediateFuture(mock(ClusterByStatisticsSnapshot.class));
}).when(workerClient).fetchClusterByStatisticsSnapshot(any(), any(), anyInt());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. shouldn't it be eq("1") in the first argument, since below we're adding a different mock for eq("2") ?
  2. Can we return a custom future from here which is never done and also captures the cancel invocation to increment the cancel counter - that could remove the need for spying executor service and creating new testing constructors for fetcher?
  3. Why are we doing latch.await() in every thread and using sleep in the main test thread? Didn't understand that part

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the answers to be more concise and clear instead of layering mocks.

I didn't quite understand the second point, if the executorService future is cancelled, it does not call cancel on the custom future, since it is unaware of it entirely. It instead throws an InterruptedException if the task has already started (which is find and handled properly in our code), however, this our custom mock would not know that it was interrupted.

The sleep in the main thread was to resolve a rare test failure where the asserts were happening before the futures could be cancelled, causing the test to fail prematurely.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes I missed that this method is for a part of the computation and not the whole computation itself.
Although the same concept can be still applied and we can make the computation in this mocking long timed and wait for an interruption arisen from the cancel. I tried the following approach locally : c536a3f
The goal of the above is to minimize mocking/spying from the attributes of the class being UT-ed - but I'm ok with the any of the approaches for now since this change is planned for release 25.0

@rohangarg
Copy link
Copy Markdown
Member

There are unit tests in WorkerSketchFetcherTest which test if tasks would be cancelled, but there aren't any tests to see if any references to large objects like key stats are leaked. I suppose we could test something like that with WeakReference?
I have reproduced the bug and functionally tested this out locally, and these changes seems to fix the bug.

I think doing the functional test is ok for now, maybe in future one way to test it could be to generate multiple small sketches in a test and write them to file. Then we can setup the fetcher for those sketches and check if the fetcher runs stably - it will automatically OOM incase it ends up taking a lot of memory.

Copy link
Copy Markdown
Member

@rohangarg rohangarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the CI passes

@cryptoe
Copy link
Copy Markdown
Contributor

cryptoe commented Dec 8, 2022

UT's and IT's passed for MSQ. As the worker retry PR #13353 is redoing WorkerSketchFetcher, I guess its okay to make an exception for less code coverage on workerSketchFetcher.

@cryptoe cryptoe merged commit fbf76ad into apache:master Dec 8, 2022
adarshsanjeev added a commit to adarshsanjeev/druid that referenced this pull request Dec 8, 2022
* Remove stray reference to fix OOM while merging sketches

* Update future to add result from executor service

* Update tests and address review comments

* Address review comments

* Moved mock

* Close threadpool on teardown

* Remove worker task cancel
cryptoe pushed a commit that referenced this pull request Dec 8, 2022
…3529)

* Remove stray reference to fix OOM while merging sketches

* Update future to add result from executor service

* Update tests and address review comments

* Address review comments

* Moved mock

* Close threadpool on teardown

* Remove worker task cancel
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 Bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants