Kill tasks honor the buffer period of unused segments #15710
Merged
abhishekrb19 merged 19 commits into apache:master on Jan 19, 2024
- The coordinator duty KillUnusedSegments determines an umbrella interval for each datasource to determine the kill interval. There can be multiple unused segments in an umbrella interval with different used_status_last_updated timestamps. For example, consider an unused segment that is 30 days old and one that is 1 hour old. Currently the kill task after the 30-day mark would kill both the unused segments and not retain the 1-hour old one.
- However, when a kill task is instantiated with this umbrella interval, it’d kill all the unused segments regardless of the last updated timestamp. We need kill tasks and RetrieveUnusedSegmentsAction to honor the bufferPeriod to avoid killing unused segments in the kill interval prematurely.
f98d291 to f3f6236
Assert.assertEquals(task.getId(), taskQuery.getId());
Assert.assertEquals(task.getDataSource(), taskQuery.getDataSource());
Assert.assertEquals(task.getInterval(), taskQuery.getInterval());
Assert.assertNull(taskQuery.getMarkAsUnused());
Check notice
Code scanning / CodeQL
Deprecated method or constructor invocation
f3f6236 to ce0ae56
zachjsh
reviewed
Jan 17, 2024
zachjsh
approved these changes
Jan 18, 2024
This is consistent with the column name `used_status_last_updated`.
kfaraz
reviewed
Jan 18, 2024
Contributor
kfaraz
left a comment
Left initial feedback, haven't gone through a few files yet.
Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
…data storage coordinator interface. Removes the following interface methods in favor of a new method added: - retrieveUnusedSegmentsForInterval(String, Interval) - retrieveUnusedSegmentsForInterval(String, Interval, Integer)
7996dd6 to 0995ea0
jon-wei
approved these changes
Jan 18, 2024
kfaraz
approved these changes
Jan 19, 2024
Contributor
kfaraz
left a comment
Thanks for the changes, @abhishekrb19 !
LakshSingla
pushed a commit
to LakshSingla/druid
that referenced
this pull request
Jan 31, 2024
* Kill tasks should honor the buffer period of unused segments.
  - The coordinator duty KillUnusedSegments determines an umbrella interval for each datasource to determine the kill interval. There can be multiple unused segments in an umbrella interval with different used_status_last_updated timestamps. For example, consider an unused segment that is 30 days old and one that is 1 hour old. Currently the kill task after the 30-day mark would kill both the unused segments and not retain the 1-hour old one.
  - However, when a kill task is instantiated with this umbrella interval, it’d kill all the unused segments regardless of the last updated timestamp. We need kill tasks and RetrieveUnusedSegmentsAction to honor the bufferPeriod to avoid killing unused segments in the kill interval prematurely.
* Clarify default behavior in docs.
* test comments
* fix canDutyRun()
* small updates.
* checkstyle
* forbidden api fix
* doc fix, unused import, codeql scan error, and cleanup logs.
* Address review comments
* Rename maxUsedFlagLastUpdatedTime to maxUsedStatusLastUpdatedTime. This is consistent with the column name `used_status_last_updated`.
* Apply suggestions from code review. Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
* Make period Duration type
* Remove older variants of runKillTask() in OverlordClient interface
* Test can now run without waiting for canDutyRun().
* Remove previous variants of retrieveUnusedSegments from internal metadata storage coordinator interface. Removes the following interface methods in favor of a new method added:
  - retrieveUnusedSegmentsForInterval(String, Interval)
  - retrieveUnusedSegmentsForInterval(String, Interval, Integer)
* Chain stream operations
* cleanup
* Pass in the lastUpdatedTime to markUnused test function and remove sleep.

Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
abhishekagarwal87
pushed a commit
that referenced
this pull request
Jan 31, 2024
Problem Description
With #12599, operators can configure `druid.coordinator.kill.bufferPeriod`, allowing the `KillUnusedSegments` duty to utilize the buffer period to find candidate unused segments.
The coordinator duty `KillUnusedSegments` determines an umbrella interval for each datasource to determine the kill interval. The umbrella interval can be a widened interval and there can be multiple unused segments in this interval with different `used_status_last_updated` timestamps. For example, consider two unused segments in the interval `[2023-01-01/2023-01-05]` that were marked as unused at different times, one that is 30 days old and another that is 1 hour old.
However, when a kill task is instantiated with this umbrella interval, it’d kill all the unused segments found in the interval regardless of the last updated timestamp. In the above example, the kill task after the 30-day mark would kill both the unused segments and not retain the 1-hour old one.
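The premature kill described above can be reproduced with a small, self-contained sketch. Everything in it is hypothetical and for illustration only: the class `IntervalOnlyKillSketch`, the `UnusedSegment` stand-in, and the segment ids are not Druid classes, and `java.time` is used here only to keep the example dependency-free. It shows that selecting by interval alone picks up both segments, including the one marked unused an hour ago.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified model for illustration; these are not Druid's classes.
class IntervalOnlyKillSketch {

    // Stand-in for an unused segment: its data interval and when it was marked unused.
    static final class UnusedSegment {
        final String id;
        final Instant start;
        final Instant end;
        final Instant usedStatusLastUpdated;

        UnusedSegment(String id, Instant start, Instant end, Instant usedStatusLastUpdated) {
            this.id = id;
            this.start = start;
            this.end = end;
            this.usedStatusLastUpdated = usedStatusLastUpdated;
        }
    }

    // Two unused segments inside 2023-01-01/2023-01-05, marked unused 30 days and 1 hour before `now`.
    static List<UnusedSegment> sampleSegments(Instant now) {
        return List.of(
            new UnusedSegment("segment-30d-old",
                Instant.parse("2023-01-01T00:00:00Z"), Instant.parse("2023-01-02T00:00:00Z"),
                now.minus(Duration.ofDays(30))),
            new UnusedSegment("segment-1h-old",
                Instant.parse("2023-01-04T00:00:00Z"), Instant.parse("2023-01-05T00:00:00Z"),
                now.minus(Duration.ofHours(1))));
    }

    // Selects every segment contained in [start, end]; the time it was marked unused is ignored.
    static List<String> killByIntervalOnly(List<UnusedSegment> segments, Instant start, Instant end) {
        return segments.stream()
            .filter(s -> !s.start.isBefore(start) && !s.end.isAfter(end))
            .map(s -> s.id)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<String> killed = killByIntervalOnly(
            sampleSegments(now),
            Instant.parse("2023-01-01T00:00:00Z"),
            Instant.parse("2023-01-05T00:00:00Z"));
        // Both ids are printed: the 1-hour-old segment is not retained.
        System.out.println(killed);
    }
}
```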
Fix
Pass in the `maxUsedStatusLastUpdatedTime` to the kill task and the unused segments retrieve task action.
Adjust the metadata SQL query to account for `used_status_last_updated <= maxUsedStatusLastUpdatedTime` when `maxUsedStatusLastUpdatedTime` is non-null.
Release note
Kill tasks accept an optional parameter `maxUsedStatusLastUpdatedTime`. When set to a date time, the kill task considers segments in the specified `interval` marked as unused no later than this time. The default behavior is to kill all unused segments in the interval regardless of the time when segments were marked as unused.
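The semantics of the cutoff can be sketched in plain Java as an in-memory filter. This is a minimal illustration, not the PR's implementation: the real change applies the condition in the metadata SQL query, and the class `BufferPeriodFilterSketch`, the `UnusedSegment` stand-in, and the derivation of the cutoff as "now minus buffer period" are assumptions made for this example. A `null` cutoff reproduces the default behavior of selecting every unused segment.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the cutoff semantics; not Druid's actual classes or query code.
class BufferPeriodFilterSketch {

    // Stand-in for an unused segment row: an id and its used_status_last_updated value.
    static final class UnusedSegment {
        final String id;
        final Instant usedStatusLastUpdated;

        UnusedSegment(String id, Instant usedStatusLastUpdated) {
            this.id = id;
            this.usedStatusLastUpdated = usedStatusLastUpdated;
        }
    }

    // Two unused segments, marked unused 30 days and 1 hour before `now`.
    static List<UnusedSegment> sampleSegments(Instant now) {
        return List.of(
            new UnusedSegment("segment-30d-old", now.minus(Duration.ofDays(30))),
            new UnusedSegment("segment-1h-old", now.minus(Duration.ofHours(1))));
    }

    // Mirrors `used_status_last_updated <= maxUsedStatusLastUpdatedTime`, applied only when the
    // cutoff is non-null; a null cutoff keeps every unused segment eligible.
    static List<String> eligibleForKill(List<UnusedSegment> segments, Instant maxUsedStatusLastUpdatedTime) {
        return segments.stream()
            .filter(s -> maxUsedStatusLastUpdatedTime == null
                || !s.usedStatusLastUpdated.isAfter(maxUsedStatusLastUpdatedTime))
            .map(s -> s.id)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-19T00:00:00Z");
        // Assumption for this sketch: the cutoff is the current time minus a 30-day buffer period.
        Instant cutoff = now.minus(Duration.ofDays(30));

        System.out.println(eligibleForKill(sampleSegments(now), cutoff)); // [segment-30d-old]
        System.out.println(eligibleForKill(sampleSegments(now), null));   // [segment-30d-old, segment-1h-old]
    }
}
```

With the cutoff set, only the segment marked unused at or before the cutoff is selected, so the 1-hour-old segment is retained until its own buffer period elapses.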