Conversation
suneet-s left a comment:
I have not had a chance to look through the tests yet, but this is some early feedback.
| |`druid.coordinator.kill.pendingSegments.on`|Boolean flag for whether or not the Coordinator cleans up old entries in the `pendingSegments` table of the metadata store. If set to true, the Coordinator checks the created time of the most recently completed task. If that doesn't exist, it finds the created time of the earliest running/pending/waiting task. Once the created time is found, then for all dataSources not in the `killPendingSegmentsSkipList` (see [Dynamic configuration](#dynamic-configuration)), the Coordinator asks the Overlord to clean up the entries 1 day or more older than the found created time in the `pendingSegments` table. This is done periodically based on the `druid.coordinator.period.indexingPeriod` specified.|true|
| |`druid.coordinator.kill.on`|Boolean flag for whether or not the Coordinator should submit kill tasks for unused segments, that is, permanently delete them from the metadata store and deep storage. If set to true, then for all whitelisted dataSources (or optionally all), the Coordinator submits tasks periodically based on the `period` specified. A whitelist can be set via the dynamic configuration `killDataSourceWhitelist` described later.<br /><br />When `druid.coordinator.kill.on` is true, segments become eligible for permanent deletion once their data intervals are older than `druid.coordinator.kill.durationToRetain` relative to the current time. If a segment's data interval is already older than this threshold when it is marked unused, it is eligible for permanent deletion immediately after being marked unused.|false|
| |`druid.coordinator.kill.period`|How often to send kill tasks to the indexing service. Value must be greater than `druid.coordinator.period.indexingPeriod`. Only applies if kill is turned on.|P1D (1 Day)|
| |`druid.coordinator.kill.period`|How often to send kill tasks to the indexing service. Value must be greater than or equal to `druid.coordinator.period.indexingPeriod` if set. Only applies if kill is turned on.|NULL (indexing period is used)|
This change should be release noted. Can you update the PR description to make sure it gets picked up in the release notes, please?
| return new KillUnusedSegments(segmentsMetadataManager, overlordClient, config);
| } else {
|   if (killUnusedSegmentsDutyFromCustomGroups.size() > 1) {
|     log.warn(
Can you throw an exception here instead? This is a misconfiguration, and it would be better to fail startup so the operator can fix the config issue.
In the exception message, can you add the names of the custom groups where the duty is configured, to make it easy for the operator to fix the config issue?
| /**
|  * Used to keep track of the last interval end time that was killed for each
|  * datasource.
|  */
| private final Map<String, DateTime> datasourceToLastKillIntervalEnd;
I'm not sure this approach is the best way. If the leader changes, the in-memory map is essentially wiped out when a new leader comes online, so it is possible to re-schedule an interval that is already in the process of being killed. So, two questions on this:
- What is the impact if the same interval is scheduled to be killed twice? Can we test / validate that nothing bad happens? If the task just waits on a lock and then succeeds as there is no data to delete, I think that is reasonable.
- Is it possible to get the list of intervals that are being killed from the kill task payloads? Would that put a lot of stress on the master nodes if we polled the task payloads?
Testing revealed that 1. is what happens, so I thought it was not the end of the world that we still have the potential of submitting duplicate tasks.
| @@ -144,6 +144,12 @@ Optional<Iterable<DataSegment>> iterateAllUsedNonOvershadowedSegmentsForDatasour
|  */
| List<Interval> getUnusedSegmentIntervals(String dataSource, DateTime maxEndTime, int limit);
Is there a reason not to get rid of this function from the interface? It looks like there are no callers other than the KillUnusedSegmentsDuty
suneet-s left a comment:
> Personally, I think #12599 is a must have before enabling continuous auto kill. Otherwise one tiny mistake in your loadRule can result in data being permanently deleted from s3 right away.
I think this patch is actually just providing the ability to schedule kill tasks continuously. By default, kill tasks run every indexing period, which is every 30 minutes. I agree #12599 is good to have, but I do not think it needs to block this change. Hopefully both will be merged soon enough that the order of merging these changes won't matter too much.
kfaraz left a comment:
As discussed with @zachjsh , I am not sure I see the appeal of making KillUnusedSegments a custom duty. The only motivation seems to be the ability to have very low scheduling periods for kill tasks.
My primary argument against the current approach is the concept of having the kill duty be continuous while also needing to specify a period for it. If we want it to be truly continuous, users should only be required to enable/disable kill and not have to specify any period for it.
As a compromise, we could just tie kill tasks to the indexingPeriod (typically 30 to 60 min) for the time being, as that is frequent enough for the purposes of cleaning up unused segments. In the future, if we want even more frequent kill tasks, we could simply tie them to the coordinator period itself (typically 1 min), which might be a little too aggressive, but it's a thought.
cc: @suneet-s
| final long currentTimeMillis = System.currentTimeMillis();
| if (lastKillTime + period > currentTimeMillis) {
| if (null != period && (lastKillTime + period > currentTimeMillis)) {
I think we have already validated that period is not null in the constructor.
| this.period = null != config.getCoordinatorKillPeriod() ? config.getCoordinatorKillPeriod().getMillis() : null;
| Preconditions.checkArgument(
|   this.period > config.getCoordinatorIndexingPeriod().getMillis(),
|   null == this.period || this.period > config.getCoordinatorIndexingPeriod().getMillis(),
Null handling and validation can be done in the DruidCoordinatorConfig itself.
We are now checking against the duty period, which is not always the same as the indexing period, so I think this validation needs to be done here.
Co-authored-by: Suneet Saldanha <suneet@apache.org> Co-authored-by: Kashif Faraz <kashif.faraz@gmail.com>
ektravel left a comment:
Reviewed changes to docs/configuration/index.md.
Docs LGTM
|   int limit,
|   DateTime maxUsedFlagLastUpdatedTime
| );
| DateTime maxUsedFlagLastUpdatedTime);
Nit: style
| DateTime maxUsedFlagLastUpdatedTime);
| DateTime maxUsedFlagLastUpdatedTime
| );
Description

This change enables the `KillUnusedSegments` coordinator duty to be scheduled continuously. Things that prevented this, or made it difficult before, were the following:

1. If scheduled at a fast enough rate, the duty would find the same intervals to kill for the same datasources while kill tasks submitted for those same datasources and intervals were already underway, thus wasting task slots on duplicated work.
2. The task resources used by auto kill were previously unbounded. Each duty run period, if unused segments were found for any datasource, a kill task would be submitted to kill them.
This PR solves both of these issues:
1. The duty keeps track of the end time of the last interval found when killing unused segments for each datasource, in an in-memory map. The end time for each datasource, if present, is used as the lower bound on the start time when searching for unused intervals for that same datasource. On each duty run, we remove from this map any datasource keys that no longer match datasources in the system, or in the whitelist. We also remove a datasource's entry if the datasource is found to have no unused segments, which happens when we fail to find an interval that includes unused segments. Removing the datasource entry from the map allows searching for unused segments in that datasource from the beginning of time once again.
2. The unbounded task resource usage can be mitigated with the coordinator dynamic config added as part of ba957a9.
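The per-datasource bookkeeping described in (1) can be sketched roughly as follows. The class and method names here are illustrative only, not the actual Druid implementation (which also uses Joda-Time `DateTime` rather than `java.time.Instant`):

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of tracking the last killed interval end per datasource.
public class LastKillTracker {
    private final Map<String, Instant> datasourceToLastKillIntervalEnd = new HashMap<>();

    // Lower bound to use when searching for unused intervals of this datasource.
    // Falls back to the beginning of time when nothing has been killed yet.
    public Instant searchStartFor(String datasource, Instant beginningOfTime) {
        return datasourceToLastKillIntervalEnd.getOrDefault(datasource, beginningOfTime);
    }

    // Record the end of the interval just submitted in a kill task.
    public void recordKill(String datasource, Instant intervalEnd) {
        datasourceToLastKillIntervalEnd.put(datasource, intervalEnd);
    }

    // Each duty run: drop datasources that no longer exist or aren't whitelisted,
    // and drop datasources with no unused segments, so the next search for them
    // starts from the beginning of time again.
    public void prune(Set<String> activeDatasources, Set<String> datasourcesWithNoUnusedSegments) {
        datasourceToLastKillIntervalEnd.keySet().retainAll(activeDatasources);
        datasourceToLastKillIntervalEnd.keySet().removeAll(datasourcesWithNoUnusedSegments);
    }
}
```

Note that, as discussed in the review above, this map is in-memory only, so it is wiped on Coordinator leader change and duplicate kill tasks remain possible.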
Operators can configure continuous auto kill by providing coordinator runtime properties similar to the following:
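A minimal sketch of such runtime properties, using the property names documented in the table above; the values are illustrative, not recommendations:

```properties
# Enable auto kill of unused segments.
druid.coordinator.kill.on=true
# With this change, the kill period may be as low as the indexing period.
druid.coordinator.kill.period=PT1800S
druid.coordinator.period.indexingPeriod=PT1800S
# Only segments whose data intervals are older than this are eligible for kill.
druid.coordinator.kill.durationToRetain=P30D
```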
And providing sensible limits to kill task slot usage via coordinator dynamic properties.
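As a hedged illustration, the dynamic configuration limiting kill task slots might look like the following; the field names `killTaskSlotRatio` and `maxKillTaskSlots` are an assumption based on the dynamic config referenced above (ba957a9), not confirmed by this PR:

```json
{
  "killTaskSlotRatio": 0.1,
  "maxKillTaskSlots": 5
}
```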
Release Note

- `druid.coordinator.kill.period` can now be as low as `druid.coordinator.period.indexingPeriod`. It was earlier required to be strictly greater than the `indexingPeriod`.
- The duty now tracks the end of the interval of the last submitted `kill` task for a given datasource to avoid submitting duplicate kill tasks.

This PR has: