-
Notifications
You must be signed in to change notification settings - Fork 3.8k
overlord helpers framework and tasklog auto cleanup #3677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -29,6 +29,7 @@ | |
| import org.apache.commons.io.FileUtils; | ||
|
|
||
| import java.io.File; | ||
| import java.io.FileFilter; | ||
| import java.io.IOException; | ||
| import java.io.InputStream; | ||
|
|
||
|
|
@@ -89,4 +90,38 @@ public void killAll() throws IOException | |
| log.info("Deleting all task logs from local dir [%s].", config.getDirectory().getAbsolutePath()); | ||
| FileUtils.deleteDirectory(config.getDirectory()); | ||
| } | ||
|
|
||
| @Override | ||
| public void killOlderThan(final long timestamp) throws IOException | ||
| { | ||
| File taskLogDir = config.getDirectory(); | ||
| if (taskLogDir.exists()) { | ||
|
|
||
| if (!taskLogDir.isDirectory()) { | ||
| throw new IOException(String.format("taskLogDir [%s] must be a directory.", taskLogDir)); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For future reference |
||
| } | ||
|
|
||
| File[] files = taskLogDir.listFiles( | ||
| new FileFilter() | ||
| { | ||
| @Override | ||
| public boolean accept(File f) | ||
| { | ||
| return f.lastModified() < timestamp; | ||
| } | ||
| } | ||
| ); | ||
|
|
||
| for (File file : files) { | ||
| log.info("Deleting local task log [%s].", file.getAbsolutePath()); | ||
| FileUtils.forceDelete(file); | ||
|
|
||
| if (Thread.currentThread().isInterrupted()) { | ||
| throw new IOException( | ||
| new InterruptedException("Thread interrupted. Couldn't delete all tasklogs.") | ||
| ); | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| /* | ||
| * Licensed to Metamarkets Group Inc. (Metamarkets) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. Metamarkets licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package io.druid.indexing.overlord.helpers; | ||
|
|
||
| import java.util.concurrent.ScheduledExecutorService; | ||
|
|
||
| /** | ||
| */ | ||
| public interface OverlordHelper | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While you are in this code, u should also consider merging overlord and coordinator :P
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. hmmm, actually we can. I believe we have had the desire for quite some time and it was discussed in the dev sync meeting too and agreed upon. I would imagine introducing a CliMaster (a "master" node) that binds/start everything that coordinator and overlords do. also, have only one "leader election" for the "leader master" instead of the two.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @himanshug Yeah that would be pretty great, we don't have to do it as part of this PR though, but I think it would be a really useful to have
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah, will do that in separate PR. let us keep this one as is. |
||
| { | ||
| boolean isEnabled(); | ||
| void schedule(ScheduledExecutorService exec); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i think the method here should be run()
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the way i structured things are that helpers schedule themselves on the executor supplied by manager. this is done so that all helpers can have their independent schedule and can potentially run in parallel. |
||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,91 @@ | ||
| /* | ||
| * Licensed to Metamarkets Group Inc. (Metamarkets) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. Metamarkets licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package io.druid.indexing.overlord.helpers; | ||
|
|
||
| import com.google.inject.Inject; | ||
| import io.druid.java.util.common.concurrent.ScheduledExecutorFactory; | ||
| import io.druid.java.util.common.lifecycle.LifecycleStart; | ||
| import io.druid.java.util.common.lifecycle.LifecycleStop; | ||
| import io.druid.java.util.common.logger.Logger; | ||
|
|
||
| import java.util.Set; | ||
| import java.util.concurrent.ScheduledExecutorService; | ||
|
|
||
| /** | ||
| */ | ||
| public class OverlordHelperManager | ||
| { | ||
| private static final Logger log = new Logger(OverlordHelperManager.class); | ||
|
|
||
| private final ScheduledExecutorFactory execFactory; | ||
| private final Set<OverlordHelper> helpers; | ||
|
|
||
| private volatile ScheduledExecutorService exec; | ||
| private final Object startStopLock = new Object(); | ||
| private volatile boolean started = false; | ||
|
|
||
| @Inject | ||
| public OverlordHelperManager( | ||
| ScheduledExecutorFactory scheduledExecutorFactory, | ||
| Set<OverlordHelper> helpers) | ||
| { | ||
| this.execFactory = scheduledExecutorFactory; | ||
| this.helpers = helpers; | ||
| } | ||
|
|
||
| @LifecycleStart | ||
| public void start() | ||
| { | ||
| synchronized (startStopLock) { | ||
| if (!started) { | ||
| log.info("OverlordHelperManager is starting."); | ||
|
|
||
| for (OverlordHelper helper : helpers) { | ||
| if (helper.isEnabled()) { | ||
| if (exec == null) { | ||
| exec = execFactory.create(1, "Overlord-Helper-Manager-Exec--%d"); | ||
| } | ||
| helper.schedule(exec); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this logic is basically just repeating what the coordinator is doing high level comment, can we have this as a coordinator helper?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah, it is similar. but, i did it on overlord because tasks and tasklogs are owned by overlord and their retention should also be handled at the overlord.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the executor is having issues and rejecting execution, what is the intended behavior here?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. that would be a fatal situation, will fail the lifecycle start and consequently prohibit druid process startup.... which is intended behavior in this case.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. cool |
||
| } | ||
| } | ||
| started = true; | ||
|
|
||
| log.info("OverlordHelperManager is started."); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| @LifecycleStop | ||
| public void stop() | ||
| { | ||
| synchronized (startStopLock) { | ||
| if (started) { | ||
| log.info("OverlordHelperManager is stopping."); | ||
| if (exec != null) { | ||
| exec.shutdownNow(); | ||
| exec = null; | ||
| } | ||
| started = false; | ||
|
|
||
| log.info("OverlordHelperManager is stopped."); | ||
| } | ||
| } | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest
&& taskLogDir.isDirectory()There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, throwing an IOException if it is not a directory would be a reasonable alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done