-
Notifications
You must be signed in to change notification settings - Fork 3.8k
stop supervisor in case of metadata failures #3456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,6 +21,7 @@ | |
|
|
||
| import com.google.common.base.Optional; | ||
| import com.google.common.base.Preconditions; | ||
| import com.google.common.base.Throwables; | ||
| import com.google.inject.Inject; | ||
| import com.metamx.common.Pair; | ||
| import com.metamx.common.lifecycle.LifecycleStart; | ||
|
|
@@ -140,7 +141,7 @@ public Optional<SupervisorReport> getSupervisorStatus(String id) | |
|
|
||
| /** | ||
| * Stops a supervisor with a given id and then removes it from the list. | ||
| * <p> | ||
| * <p/> | ||
| * Caller should have acquired [lock] before invoking this method to avoid contention with other threads that may be | ||
| * starting and stopping supervisors. | ||
| * | ||
|
|
@@ -164,7 +165,7 @@ private boolean possiblyStopAndRemoveSupervisorInternal(String id, boolean write | |
|
|
||
| /** | ||
| * Creates a supervisor from the provided spec and starts it if there is not already a supervisor with that id. | ||
| * <p> | ||
| * <p/> | ||
| * Caller should have acquired [lock] before invoking this method to avoid contention with other threads that may be | ||
| * starting and stopping supervisors. | ||
| * | ||
|
|
@@ -177,13 +178,23 @@ private boolean createAndStartSupervisorInternal(SupervisorSpec spec, boolean pe | |
| return false; | ||
| } | ||
|
|
||
| Supervisor supervisor = spec.createSupervisor(); | ||
| supervisor.start(); // try starting the supervisor first so we don't persist a bad spec | ||
|
|
||
| if (persistSpec) { | ||
| metadataSupervisorManager.insert(id, spec); | ||
| } | ||
|
|
||
| Supervisor supervisor = null; | ||
| try { | ||
| supervisor = spec.createSupervisor(); | ||
| supervisor.start(); | ||
| } | ||
| catch (Exception e) { | ||
| // Supervisor creation or start failed write tombstone only when trying to start a new supervisor | ||
| if (persistSpec) { | ||
| metadataSupervisorManager.insert(id, new NoopSupervisorSpec()); | ||
| } | ||
| Throwables.propagate(e); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. wondering is this needs to have a retry for some kind of transitory exception or just fail the task. I think with metadata storage we have in place some retry mechanism.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it is ok for now .. i don't think we have retry mechanism at least for this insert operation |
||
| } | ||
|
|
||
| supervisors.put(id, Pair.of(supervisor, spec)); | ||
| return true; | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does supervisor.start() failure guarantee that things are not left in a corrupted state and supervisor.stop(true) does not need to be called?
@dclim ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@himanshug to be safe, we should probably make a call to consumer.close() in KafkaSupervisor.start() if the Kafka consumer was instantiated but an exception was thrown afterwards; @pjain1 do you mind adding this?