Automatic pendingSegments cleanup#5149

Merged
jon-wei merged 11 commits into apache:master from jihoonson:pending-segments on Dec 20, 2017
@jihoonson
Contributor

@jihoonson jihoonson commented Dec 11, 2017

With this patch, the coordinator periodically deletes pendingSegments from the pendingSegments table. You can test by adding druid.coordinator.kill.pendingSegments.on=true to your coordinator configuration file.

Additionally, I added TaskStatusPlus, which contains taskId, createdTime, queueInsertionTime, taskState, runningDuration, and taskLocation. We currently have four similar classes (TaskStatus, TaskRunnerWorkItem, TaskResponseObject in OverlordResource, and TaskResponseObject for integration tests), but each is designed for use in specific classes. I would love to add a single class to represent all of them and remove the others, but that would require too many code changes to do in this PR. So, in this PR:

  • TaskResponseObjects are replaced by TaskStatusPlus and moved to druid-api. This is to use the same class when coordinators call overlord APIs.
  • The Status enum in TaskStatus is extracted as a separate TaskState enumeration and moved to druid-api.
  • TaskLocation is moved to druid-api.
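For readers unfamiliar with the new class, the shape described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class name `TaskStatusPlusSketch`, the use of `java.time` types, and representing the location as a plain `host:port` string are all assumptions made to keep the example self-contained.

```java
import java.time.Duration;
import java.time.Instant;

// Stand-in for the TaskState enumeration extracted from TaskStatus (values assumed for this sketch).
enum TaskState { RUNNING, SUCCESS, FAILED }

// Hypothetical, simplified value object carrying the six fields listed in the PR description.
final class TaskStatusPlusSketch {
  final String taskId;
  final Instant createdTime;
  final Instant queueInsertionTime;
  final TaskState taskState;
  final Duration runningDuration;  // how long the task ran; may be null while it is still running
  final String taskLocation;       // "host:port" string standing in for TaskLocation

  TaskStatusPlusSketch(
      String taskId,
      Instant createdTime,
      Instant queueInsertionTime,
      TaskState taskState,
      Duration runningDuration,
      String taskLocation
  ) {
    this.taskId = taskId;
    this.createdTime = createdTime;
    this.queueInsertionTime = queueInsertionTime;
    this.taskState = taskState;
    this.runningDuration = runningDuration;
    this.taskLocation = taskLocation;
  }

  // A task counts as complete in this sketch once it is no longer RUNNING.
  boolean isComplete() {
    return taskState != TaskState.RUNNING;
  }
}
```

Keeping one such class in a shared module is what lets both the coordinator and the overlord deserialize the same payload.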

Contributor

@jon-wei jon-wei left a comment

Did an initial review, will revisit this later today

}
}

private List<TaskStatus> getRecentlyFinishedTaskSTatusesSince(long start, Ordering<TaskStuff> createdDateDesc)
Contributor

There's an extra capitalized T:

getRecentlyFinishedTaskSTatusesSince > getRecentlyFinishedTaskStatusesSince

Contributor Author

Thanks. Fixed.

for (ImmutableDruidDataSource dataSource : params.getDataSources()) {
  if (!params.getCoordinatorDynamicConfig().getKillPendingSegmentsSkipList().contains(dataSource.getName())) {
    log.info(
        "Kill pendingSegments created until [%s] for dataSource[%s]",
Contributor

The string format arguments are in reverse order here
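To illustrate the point, here is a small sketch of the intended argument order. It uses `String.format` as a stand-in for the logger call, and the class name, method name, and sample values are made up for the example.

```java
// Hypothetical helper showing the argument order that matches the format string.
final class LogFormatOrder {
  static String message(String endTime, String dataSource) {
    // First %s is the end time, second %s is the dataSource name.
    return String.format(
        "Kill pendingSegments created until [%s] for dataSource[%s]",
        endTime,
        dataSource
    );
  }

  public static void main(String[] args) {
    System.out.println(message("2017-12-10T00:00:00Z", "wikipedia"));
  }
}
```

Passing the two arguments the other way round still compiles, because both are formatted with `%s`, which is why this kind of mistake is easy to miss.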

Contributor Author

Thanks. Fixed.

DruidCoordinatorSegmentKiller.class
).addConditionBinding(
"druid.coordinator.kill.pendingSegments.on",
predicate -> Objects.equals(predicate, "true"),
Contributor

Maybe use Predicates.equalTo("true") to be consistent with the other bindings here

Contributor Author

Changed.

}

@JsonProperty
public Set<String> getKillPendingSegmentsSkipList()
Contributor

Can you add a comment that explains this is used to prevent automatic pending segment deletion for some datasources?

I was a bit confused initially and thought the skip list was referring to this: https://en.wikipedia.org/wiki/Skip_list
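In other words, the "skip list" here is just a set of dataSource names to exclude. A minimal sketch of that behavior, with a hypothetical helper class and plain strings standing in for dataSource objects:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper: dataSources named in the skip list are exempt from automatic cleanup.
final class KillPendingSegmentsSkipListSketch {
  static List<String> dataSourcesToClean(List<String> allDataSources, Set<String> skipList) {
    List<String> result = new ArrayList<>();
    for (String dataSource : allDataSources) {
      if (!skipList.contains(dataSource)) {
        result.add(dataSource);
      }
    }
    return result;
  }
}
```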

Contributor Author

Added.


Preconditions.checkArgument(
    !deleteInterval.overlaps(activeTaskInterval),
    "Cannot delete pendingSegments because there is at least one running task created at %s",
Contributor

hm, I wonder if throwing an exception here is too aggressive, if I understand correctly:

  • Since tasks can take a varying amount of time to complete, it seems like it wouldn't be too uncommon to have a situation where there are active tasks, created before the most recently completed task, that are still running
  • Would it be better to just return 0 here without an exception, or maybe use minCreatedDateOfActiveTasks as the end of the pending segment deletion interval?

Contributor Author

Good point. I changed it to gather all incomplete tasks plus the last complete task and find the earliest createdTime. This causes fewer exceptions when the coordinator kills pending segments.

I think throwing an exception here is fine now because it should occur only rarely: 1) if there is a bug in DruidCoordinatorCleanupPendingSegments, or 2) if a human calls the overlord API directly with a wrong interval.
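The check being discussed can be sketched roughly as below. This is an illustration under stated assumptions rather than the patch's code: it uses `java.time.Instant` instead of Joda-Time, models the active-task interval as open-ended (from the earliest createdTime onward), and the class and method names are hypothetical.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical overlord-side guard: refuse to delete pendingSegments that an active task may still use.
final class PendingSegmentDeleteCheck {
  static Optional<Instant> earliestCreatedTime(List<Instant> activeTaskCreatedTimes) {
    return activeTaskCreatedTimes.stream().min(Comparator.naturalOrder());
  }

  static void checkDeletable(Instant deleteStart, Instant deleteEnd, List<Instant> activeTaskCreatedTimes) {
    Optional<Instant> earliest = earliestCreatedTime(activeTaskCreatedTimes);
    // The half-open interval [deleteStart, deleteEnd) overlaps [earliest, +infinity)
    // exactly when deleteEnd is after earliest.
    if (earliest.isPresent() && deleteEnd.isAfter(earliest.get())) {
      throw new IllegalArgumentException(
          String.format(
              "Cannot delete pendingSegments because there is at least one running task created at %s",
              earliest.get()
          )
      );
    }
  }
}
```

With the caller choosing an end time no later than the earliest createdTime, the exception path is reached only by a buggy caller or a hand-written request, which matches the reasoning in the reply.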

Contributor

@gianm gianm left a comment

I also believe that #5161 is exacerbated by this patch (as it makes it possible to reissue a previously-pending segment identifier, which could cause a similar issue to the race from #5161). But I think that's fine, since the fix to #5161 should involve overwriting, which would fix both problems.

Interval tryInterval,
Interval rowInterval,
boolean logOnFail,
boolean skipSegmentLineageCheck
Contributor

skipSegmentLineageCheck can be removed from IndexerMetadataStorageConnector's allocatePendingSegment too.

Contributor Author

Would you elaborate on this? I guess you mean IndexerMetadataStorageCoordinator, but skipSegmentLineageCheck is used in IndexerSQLMetadataStorageCoordinator.allocatePendingSegment().

}
}

private List<TaskStatus> getRecentlyFinishedTaskSTatusesSince(long start, Ordering<TaskStuff> createdDateDesc)
Contributor

Capitalization is a bit weird on TaskSTatuses

Contributor Author

Fixed.

}
}
);
final List<TaskStatusPlus> completeTasks = recentlyFinishedTasks
Contributor

Please retain the comment // Would be nice to include the real created date, but the TaskStorage API doesn't yet allow it. (or even make it possible!)

Contributor Author

Added the comment. I'm thinking of refactoring some classes related to TaskStatus. The APIs to get createdDate and queueInsertionTime will be added when that's done!

@Produces(MediaType.APPLICATION_JSON)
public Response killPendingSegments(
    @PathParam("dataSource") String dataSource,
    @QueryParam("interval") String deleteIntervalString,
Contributor

Would it work to make this an Interval rather than String? We do have a jackson deserializer set up from json strings to Druid Intervals.

Contributor Author

The javadoc of QueryParam says:

The type T of the annotated parameter, field or property must either:

  1. Be a primitive type
  2. Have a constructor that accepts a single String argument
  3. Have a static method named valueOf or fromString that accepts a single String argument (see, for example, Integer.valueOf(String))
  4. Be List, Set or SortedSet, where T satisfies 2 or 3 above. The resulting collection is read-only.

I guess the deleteInterval should be passed in the HTTP message body rather than as a QueryParam or PathParam (PathParam has requirements similar to QueryParam's) to make this type-safe. If so, the current String parameter looks better to me because it's simple.
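For context, a type can satisfy requirement 3 above by exposing a static `valueOf(String)` method. The sketch below shows what such a wrapper might look like; it is purely hypothetical (the class name `IsoIntervalParam` and the use of `java.time.Instant` are assumptions) and is not something this PR adds, since the String parameter was kept. A real parameter type would also need to be public.

```java
import java.time.Instant;

// Hypothetical wrapper type that a @QueryParam could bind to via its static valueOf(String).
final class IsoIntervalParam {
  final Instant start;
  final Instant end;

  private IsoIntervalParam(Instant start, Instant end) {
    this.start = start;
    this.end = end;
  }

  // Parses "start/end", where both sides are ISO-8601 instants such as 2017-01-01T00:00:00Z.
  public static IsoIntervalParam valueOf(String value) {
    String[] parts = value.split("/");
    if (parts.length != 2) {
      throw new IllegalArgumentException("Expected start/end but got: " + value);
    }
    Instant start = Instant.parse(parts[0]);
    Instant end = Instant.parse(parts[1]);
    if (end.isBefore(start)) {
      throw new IllegalArgumentException("Interval end is before start: " + value);
    }
    return new IsoIntervalParam(start, end);
  }
}
```

The extra class is the cost of this approach, which is the trade-off the reply weighs against simply accepting a String and parsing it in the method body.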

.map(task -> taskStorageQueryAdapter.getCreatedTime(task.getId()))
.min(Comparator.naturalOrder());

final Interval activeTaskInterval = new Interval(
Contributor

I think we should include some kind of buffer zone here to account for the fact that clocks are not guaranteed to be in sync. I'm not sure how much grace period is needed but probably somewhere between 10 minutes and 24 hours. What do you think?

Contributor Author

I guess you mean we need a buffer when the coordinator kills pending segments. Or do you mean the buffer is needed when IndexerMetadataStorageAdapter checks whether the given interval overlaps the createdDate of any running tasks?
The former sounds good. I added a buffer of 24 hours, and now the endDate of the interval for killing pendingSegments is decided by min(createdDates of running/pending/waiting/complete tasks, DateTimes.nowUtc()).
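One way to read the rule above is sketched below. It assumes the 24-hour buffer is subtracted from the minimum timestamp, which the reply does not spell out, and it uses `java.time` with hypothetical class, method, and constant names.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical coordinator-side computation of the end of the pendingSegments cleanup interval.
final class PendingSegmentsCleanupEnd {
  // Grace period for clock skew between nodes (assumed to be applied to the minimum).
  static final Duration KEEP_OFFSET = Duration.ofDays(1);

  static Instant cleanupEnd(List<Instant> taskCreatedTimes, Instant now) {
    List<Instant> candidates = new ArrayList<>(taskCreatedTimes);
    // Adding the current time keeps the list non-empty even when no tasks exist yet.
    candidates.add(now);
    return Collections.min(candidates).minus(KEEP_OFFSET);
  }
}
```

With no tasks at all, this yields `now` minus one day, so only pendingSegments older than the buffer would be eligible for deletion.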

);

if (!authResult.isAllowed()) {
throw new ForbiddenException(authResult.toString());
Contributor

authResult.getMessage() would make more sense I think. Also, now that I look at AuthResult, it seems that setMessage is never called and so it could be removed and message made final. It's unrelated to this patch but still a nice cleanup.

Contributor Author

Fixed.

@jihoonson
Contributor Author

> I also believe that #5161 is exacerbated by this patch (as it makes it possible to reissue a previously-pending segment identifier, which could cause a similar issue to the race from #5161). But I think that's fine, since the fix to #5161 should involve overwriting, which would fix both problems.

@gianm Good point. I agree.

@jon-wei
Contributor

jon-wei commented Dec 20, 2017

LGTM, can you fix the CI errors?

@pdeva
Contributor

pdeva commented Mar 31, 2018

note that this configuration is not documented in the 0.12.0 docs

@jihoonson
Contributor Author

jihoonson commented Apr 2, 2018

Thank you for the report! Raised #5563.

}
createdTimes.sort(Comparators.naturalNullsFirst());

// There should be at least one createdTime because the current time is added to the 'createdTimes' list if there
Contributor

Is this comment accurate? I don't see where it's implemented, which suggests that this helper can crash in a brand new cluster.
