CircularList round-robin iterator for the KillUnusedSegments duty#16719
abhishekrb19 merged 12 commits into apache:master
Conversation
Currently there's a fairness problem in the KillUnusedSegments duty,
where the duty consistently selects the same set of datasources as discovered
from the metadata store or dynamic config params. This is a problem especially
when there are multiple datasources with unused segments. In a medium to large cluster,
we can increase the task slots to increase the likelihood of broader coverage, but that
alone does not guarantee fairness. This patch adds a simple
round-robin iterator to select datasources and has the following properties:
1. Starts with an initial random cursor position in an ordered list of candidates.
2. Consecutive {@code next()} iterations from {@link #getIterator()} are guaranteed to be deterministic
unless the set of candidates changes when {@link #updateCandidates(Set)} is called.
3. Guarantees that no duplicate candidates are returned in two consecutive {@code next()} iterations.
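The three properties above can be sketched as follows. This is an illustrative sketch only: the class and method names are hypothetical and do not mirror the actual Druid `CircularList` API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

class RoundRobinCandidates<T extends Comparable<T>>
{
  private List<T> candidates;
  private int cursor;

  RoundRobinCandidates(Set<T> initial, Random random)
  {
    // Property 1: keep candidates in a deterministic sorted order,
    // but start at a random cursor position.
    this.candidates = new ArrayList<>(new TreeSet<>(initial));
    this.cursor = candidates.isEmpty() ? 0 : random.nextInt(candidates.size());
  }

  // Property 2: iteration stays deterministic unless the candidate
  // set actually changes, in which case the ordered list is rebuilt.
  void updateCandidates(Set<T> newCandidates)
  {
    if (!newCandidates.equals(new TreeSet<>(candidates))) {
      candidates = new ArrayList<>(new TreeSet<>(newCandidates));
      cursor = candidates.isEmpty() ? 0 : cursor % candidates.size();
    }
  }

  // Property 3: consecutive next() calls never return the same element
  // when there is more than one candidate, since the cursor always advances.
  Iterator<T> getIterator()
  {
    return new Iterator<T>()
    {
      @Override
      public boolean hasNext()
      {
        return !candidates.isEmpty();
      }

      @Override
      public T next()
      {
        final T value = candidates.get(cursor);
        cursor = (cursor + 1) % candidates.size();
        return value;
      }
    };
  }
}
```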
1. Clarify javadocs on the ordered list. Also flesh out the details a bit more.
2. Rename the test hooks to make intent clearer and fix typo.
3. Add NotThreadSafe annotation.
4. Remove one potentially noisy log that's in the path of iteration.
kfaraz
left a comment
Overall, makes sense to me. Left suggestions about the data structure used.
final Set<String> remainingDatasourcesToKill = new HashSet<>(dataSourcesToKill);
final Set<String> datasourcesKilled = new HashSet<>();
It is redundant to maintain both these sets. Let's just keep datasourcesKilled, we can build the other from it as it is only needed once at the end.
Yeah, that's fair. remainingDatasourcesToKill is used as a termination condition for the iteration. We can remove datasourcesKilled because it can be easily computed at the end for logging.
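The simplified loop discussed in this thread might look like the following hedged sketch, where only the remaining set drives termination and the killed set is derived at the end for logging. The class and method names here are hypothetical, not the actual duty code.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class KillRoundSketch
{
  // Iterate the round-robin datasource iterator until every datasource has
  // been visited once or the kill task slots run out. Only "remaining" is
  // tracked; the killed set is computed afterwards instead of being
  // maintained as a second, redundant set.
  static Set<String> runKillRound(Set<String> dataSourcesToKill, Iterator<String> roundRobin, int availableKillSlots)
  {
    final Set<String> remaining = new HashSet<>(dataSourcesToKill);
    while (!remaining.isEmpty() && availableKillSlots > 0 && roundRobin.hasNext()) {
      final String dataSource = roundRobin.next();
      if (remaining.remove(dataSource)) {
        // A real duty would submit a kill task for dataSource here.
        availableKillSlots--;
      }
    }
    // Datasources killed = original candidates minus whatever remains.
    final Set<String> killed = new HashSet<>(dataSourcesToKill);
    killed.removeAll(remaining);
    return killed;
  }
}
```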
… condition. Remove redundant comments. Remove redundant variable tracking.
kfaraz
left a comment
Thanks a lot for incorporating the feedback, @abhishekrb19 !!
The implementation looks good to me.
I have left some suggestions, none of which are blockers for this PR.
/**
 * Set up multiple datasources {@link #DS1}, {@link #DS2} and {@link #DS3} with unused segments with 2 kill task
 * slots. Running the kill duty each time should pick at least one unique datasource in a round-robin manner.
 */
Style:
I generally prefer leaving one-line comments within the test method where relevant instead of javadocs (unless the test is too complicated). Ideally, the test name and the test code itself should be descriptive enough to clarify what is being verified.
Problem
Currently, there is a fairness problem in the `KillUnusedSegments` duty where the duty consistently selects the same set of datasources as discovered from the metadata store or dynamic config parameters. This is particularly problematic when there is a steady stream of unused segments created by retention rules and only a few kill task slots.

Fix

To address the fairness problem, this patch introduces a sorted `CircularList` that provides a round-robin iterator for selecting datasources to kill. The kill duty also tracks the previously killed datasource to avoid selecting consecutive duplicate datasources, provided there are other datasources available. This optimization is especially beneficial for smaller clusters where task slots are limited by default. When the set of datasources changes during a kill run, the circular list instance is refreshed; otherwise, the previous state is resumed.

A few other heuristic, greedy-based approaches were also considered. The problem with those approaches is that computing and determining the heuristics can be expensive and non-trivial in the scope of the kill duty. Also, these greedy approaches won't necessarily address the fairness problem at hand in all scenarios.
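The "avoid consecutive duplicates" part of the fix described above can be sketched as a small helper. This is a hypothetical illustration, not the actual duty method.

```java
import java.util.Iterator;

class DatasourceSelectionSketch
{
  // Skip the datasource killed in the previous run when more than one
  // candidate exists, so consecutive runs vary their selection.
  static String selectNext(Iterator<String> roundRobin, String previouslyKilled, int numCandidates)
  {
    String candidate = roundRobin.next();
    if (numCandidates > 1 && candidate.equals(previouslyKilled)) {
      // The round-robin iterator cycles, so the next element is a
      // different datasource whenever numCandidates > 1.
      candidate = roundRobin.next();
    }
    return candidate;
  }
}
```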
Miscellaneous changes
This patch also includes a few annotation fixes and refactoring/debug logging improvements.
Future work
Refactor `RoundRobinServerSelector.CircularServerList` to use the common `CircularList` implementation.

Release note

The `KillUnusedSegments` coordinator duty now selects datasources in a round-robin manner during each run, ensuring varied selection instead of repeatedly choosing the same set of datasources.

This PR has: