We had the following exception occur in all overlords at the same time.
Failed to lead: {class=io.druid.indexing.overlord.TaskMaster, exceptionType=class java.lang.reflect.InvocationTargetException, exceptionMessage=null}
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:350)
at com.metamx.common.lifecycle.Lifecycle.start(Lifecycle.java:259)
at io.druid.indexing.overlord.TaskMaster$1.takeLeadership(TaskMaster.java:141)
at org.apache.curator.framework.recipes.leader.LeaderSelector$WrappedListener.takeLeadership(LeaderSelector.java:534)
at org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:399)
at org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:441)
at org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:64)
at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:245)
at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:239)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@1925ca4 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@6928fd46[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 151]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
at com.metamx.common.concurrent.ScheduledExecutors.scheduleAtFixedRate(ScheduledExecutors.java:159)
at com.metamx.common.concurrent.ScheduledExecutors.scheduleAtFixedRate(ScheduledExecutors.java:135)
at com.metamx.common.concurrent.ScheduledExecutors.scheduleAtFixedRate(ScheduledExecutors.java:121)
at io.druid.indexing.overlord.autoscaling.AbstractWorkerResourceManagementStrategy.startManagement(AbstractWorkerResourceManagementStrategy.java:63)
at io.druid.indexing.overlord.autoscaling.AbstractWorkerResourceManagementStrategy.startManagement(AbstractWorkerResourceManagementStrategy.java:34)
at io.druid.indexing.overlord.RemoteTaskRunner.start(RemoteTaskRunner.java:312)
... 19 more
and
Failed to lead: {class=io.druid.indexing.overlord.TaskMaster, exceptionType=class java.lang.reflect.InvocationTargetException, exceptionMessage=null}
sun.reflect.GeneratedMethodAccessor155.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:350)
at com.metamx.common.lifecycle.Lifecycle.start(Lifecycle.java:259)
at io.druid.indexing.overlord.TaskMaster$1.takeLeadership(TaskMaster.java:141)
at org.apache.curator.framework.recipes.leader.LeaderSelector$WrappedListener.takeLeadership(LeaderSelector.java:534)
at org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:399)
at org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:441)
at org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:64)
at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:245)
at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:239)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@2928a0b1 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@6928fd46[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 151]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
at com.metamx.common.concurrent.ScheduledExecutors.scheduleAtFixedRate(ScheduledExecutors.java:159)
at com.metamx.common.concurrent.ScheduledExecutors.scheduleAtFixedRate(ScheduledExecutors.java:135)
at com.metamx.common.concurrent.ScheduledExecutors.scheduleAtFixedRate(ScheduledExecutors.java:121)
at io.druid.indexing.overlord.autoscaling.AbstractWorkerResourceManagementStrategy.startManagement(AbstractWorkerResourceManagementStrategy.java:63)
at io.druid.indexing.overlord.autoscaling.AbstractWorkerResourceManagementStrategy.startManagement(AbstractWorkerResourceManagementStrategy.java:34)
at io.druid.indexing.overlord.RemoteTaskRunner.start(RemoteTaskRunner.java:312)
... 18 more
Somehow the overlords ended up in this state and kept repeating this failure. We had to restart all of them to recover.
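One detail in the traces may help narrow this down: both failures name the same executor instance (`ScheduledThreadPoolExecutor@6928fd46`), and it is already `Terminated` with 151 completed tasks. That is consistent with an executor being shut down once and then reused on a later start, although I have not confirmed this against the Druid source. Below is a minimal sketch of that general JDK behavior. The `Manager` class and its `start()`/`stop()` methods are hypothetical stand-ins, not Druid code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TerminatedExecutorDemo {

    // Hypothetical component (not Druid code) that creates its executor once
    // and reuses it across every start()/stop() cycle.
    static class Manager {
        private final ScheduledExecutorService exec = Executors.newScheduledThreadPool(1);

        void start() {
            // Throws RejectedExecutionException if exec has already been shut down.
            exec.scheduleAtFixedRate(() -> { }, 0, 1, TimeUnit.SECONDS);
        }

        void stop() {
            // Once shut down, an executor can never accept tasks again.
            exec.shutdownNow();
        }
    }

    // Returns true if starting again after a stop is rejected by the executor.
    static boolean secondStartIsRejected() {
        Manager m = new Manager();
        m.start();
        m.stop();
        try {
            m.start();
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second start rejected: " + secondStartIsRejected());
    }
}
```

If something like this is what happens here, every later attempt to start on the same instance would fail the same way until the process is restarted, which matches what we saw.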