Batch segment retrieval from the metadata store #15305
Merged
abhishekrb19 merged 13 commits into apache:master on Nov 6, 2023
zachjsh reviewed on Nov 3, 2023
CaseyPan pushed a commit to CaseyPan/druid that referenced this pull request on Nov 17, 2023
* Add a unit test that fails when used segments with too many intervals are retrieved.
  - This is a failing test case that needs to be ignored.
* Batch the intervals (use 100 as it's consistent with batching in other places).
* move the filtering inside the batch
* Account for limit cross the batch splits.
* Adjustments
* Fixup and add tests
* small refactor
* add more tests.
* remove wrapper.
* Minor edits
* assert out of range
Retrieval of used segments fails when there are too many intervals (about 120 with Derby). The same failure can occur when retrieving unused segments with many intervals.

Previously, the number of intervals in the SQL query was not capped, and the query was further bloated because we add a grouped `OR` clause per interval to handle the eternity interval. This PR splits the `SELECT` query into multiple batches, with 100 intervals per batch. This is similar to the batching strategy that caps the maximum number of segments announced at once.

Core changes:
A fix localized to kill tasks, which originally exposed this bug, is available in #15306. We'd still want this change in the server separately, as the issue can occur more broadly.
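
To illustrate the batching idea described above, here is a minimal, self-contained sketch. It is not the code from this PR: the class `IntervalBatchingSketch`, the methods `partition` and `retrieveInBatches`, and the `queryOneBatch` callback are hypothetical names, and the callback stands in for whatever runs one `SELECT` against the metadata store. Only the batch size of 100 and the need to carry the limit across batches come from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical sketch of interval batching; not the actual Druid implementation.
final class IntervalBatchingSketch {
    // Batch size taken from the PR description (100 intervals per batch).
    static final int MAX_INTERVALS_PER_BATCH = 100;

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Runs one query per batch of intervals and stops once `limit` results are collected.
    // Pass Integer.MAX_VALUE for "no limit". The callback receives the batch and the
    // number of results still allowed, so it can apply that bound in its own query.
    // Assumption: an empty interval list issues no query here; real code would have
    // to decide what an empty list means.
    static <I, S> List<S> retrieveInBatches(
            List<I> intervals,
            int limit,
            BiFunction<List<I>, Integer, List<S>> queryOneBatch) {
        List<S> results = new ArrayList<>();
        for (List<I> batch : partition(intervals, MAX_INTERVALS_PER_BATCH)) {
            int remaining = limit - results.size();
            if (remaining <= 0) {
                break; // the limit was reached in an earlier batch
            }
            List<S> batchResult = queryOneBatch.apply(batch, remaining);
            // Defensive truncation in case the callback ignores `remaining`.
            int take = Math.min(remaining, batchResult.size());
            results.addAll(batchResult.subList(0, take));
        }
        return results;
    }
}
```

With 250 intervals this sketch would issue three queries (100, 100, and 50 intervals), and with a limit of 120 it would take 100 results from the first batch and at most 20 from the second. It does not deduplicate results that match intervals in more than one batch; whether that matters depends on how the per-batch query is written.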
This PR has: