When a Historical node fails to load a segment on the first attempt, it may not try to load that segment again until the LRU cache entry is invalidated, or until the streaming index task fails because it exceeds completionTimeout.
2020-12-07T06:49:17,343 ERROR [Coordinator-Exec--0] org.apache.druid.server.coordinator.HttpLoadQueuePeon - Server[http://druid-dev-8-historical-0.druid-dev-8-historical.druid-dev-8.svc.cluster.local:8083] Failed segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12] request[SegmentChangeRequestLoad] with cause [Stopping load queue peon.].
...
2020-12-07T06:52:49,509 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.rules.LoadRule - Assigning 'primary' for segment [xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12] to server [druid-dev-8-historical-0.druid-dev-8-historical.druid-dev-8.svc.cluster.local:8083] in tier [_default_tier]
....
2020-12-07T06:52:53,515 ERROR [Master-PeonExec--0] org.apache.druid.server.coordinator.HttpLoadQueuePeon - Server[http://druid-dev-8-historical-0.druid-dev-8-historical.druid-dev-8.svc.cluster.local:8083] Failed segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12] request[SegmentChangeRequestLoad] with cause [Exception loading segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12]].
...
2020-12-07T06:53:24,647 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.rules.LoadRule - Assigning 'primary' for segment [xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12] to server [druid-dev-8-historical-0.druid-dev-8-historical.druid-dev-8.svc.cluster.local:8083] in tier [_default_tier]
...
2020-12-07T06:53:24,652 ERROR [Master-PeonExec--0] org.apache.druid.server.coordinator.HttpLoadQueuePeon - Server[http://druid-dev-8-historical-0.druid-dev-8-historical.druid-dev-8.svc.cluster.local:8083] Failed segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12] request[SegmentChangeRequestLoad] with cause [Exception loading segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12]].
...
2020-12-07T06:53:59,732 INFO [Coordinator-Exec--0] org.apache.druid.server.coordinator.rules.LoadRule - Assigning 'primary' for segment [xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12] to server [druid-dev-8-historical-0.druid-dev-8-historical.druid-dev-8.svc.cluster.local:8083] in tier [_default_tier]
...
2020-12-07T06:53:59,737 ERROR [Master-PeonExec--0] org.apache.druid.server.coordinator.HttpLoadQueuePeon - Server[http://druid-dev-8-historical-0.druid-dev-8-historical.druid-dev-8.svc.cluster.local:8083] Failed segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12] request[SegmentChangeRequestLoad] with cause [Exception loading segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12]].
...
2020-12-07T06:52:53,393 INFO [SimpleDataSegmentChangeHandler-0] org.apache.druid.storage.s3.S3DataSegmentPuller - Loaded 67610584 bytes from [CloudObjectLocation{bucket='pqm-druid-dev', path='rtstorage/segments/xxxx__load__segment__test/2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z/2020-12-07T05:39:35.003Z/13/affbed9a-c609-42f7-9c6a-6089ef5efac5/index.zip'}] to [/var/druid/segment-cache/xxxx__load__segment__test/2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z/2020-12-07T05:39:35.003Z/13]
2020-12-07T06:52:53,437 INFO [SimpleDataSegmentChangeHandler-0] org.apache.druid.server.coordination.BatchDataSegmentAnnouncer - Announcing segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_13] at existing path[/druid/segments/druid-dev-8-historical-0.druid-dev-8-historical.druid-dev-8.svc.cluster.local:8083/druid-dev-8-historical-0.druid-dev-8-historical.druid-dev-8.svc.cluster.local:8083_historical__default_tier_2020-12-07T06:52:52.295Z_f39ed4961cac496898fdbcacb6e922ed1693]
2020-12-07T06:52:53,447 INFO [SimpleDataSegmentChangeHandler-1] org.apache.druid.server.coordination.SegmentLoadDropHandler - Loading segment xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12
2020-12-07T06:52:53,507 WARN [SimpleDataSegmentChangeHandler-1] org.apache.druid.server.coordination.BatchDataSegmentAnnouncer - No path to unannounce segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12]
2020-12-07T06:52:53,507 INFO [SimpleDataSegmentChangeHandler-1] org.apache.druid.server.SegmentManager - Told to delete a queryable on dataSource[xxxx__load__segment__test] for interval[2020-12-07T03:00:00.000Z/2020-12-07T04:00:00.000Z] and version[2020-12-07T05:39:35.003Z] that I don't have.
2020-12-07T06:52:53,507 INFO [SimpleDataSegmentChangeHandler-1] org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager - Deleting directory[/var/druid/segment-cache/xxxx__load__segment__test/2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z/2020-12-07T05:39:35.003Z/12]
2020-12-07T06:52:53,509 WARN [SimpleDataSegmentChangeHandler-1] org.apache.druid.segment.loading.StorageLocation - SegmentDir[/var/druid/segment-cache/xxxx__load__segment__test/2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z/2020-12-07T05:39:35.003Z/12] is not found under this location[/var/druid/segment-cache]
2020-12-07T06:52:53,509 WARN [SimpleDataSegmentChangeHandler-1] org.apache.druid.server.coordination.SegmentLoadDropHandler - Unable to delete segmentInfoCacheFile[/var/druid/segment-cache/info_dir/xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12]
2020-12-07T06:52:53,512 ERROR [SimpleDataSegmentChangeHandler-1] org.apache.druid.server.coordination.SegmentLoadDropHandler - Failed to load segment for dataSource: xxxxx
org.apache.druid.segment.loading.SegmentLoadingException: Exception loading segment[xxxx__load__segment__test_2020-12-07T03:00:00.000Z_2020-12-07T04:00:00.000Z_2020-12-07T05:39:35.003Z_12]
at org.apache.druid.server.coordination.SegmentLoadDropHandler.loadSegment(SegmentLoadDropHandler.java:263) ~[druid-server-0.17.1.jar:0.17.1]
at org.apache.druid.server.coordination.SegmentLoadDropHandler.addSegment(SegmentLoadDropHandler.java:307) ~[druid-server-0.17.1.jar:0.17.1]
at org.apache.druid.server.coordination.SegmentLoadDropHandler$1.lambda$addSegment$1(SegmentLoadDropHandler.java:513) ~[druid-server-0.17.1.jar:0.17.1]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_221]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_221]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_221]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_221]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_221]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_221]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_221]
Caused by: java.lang.NullPointerException
at org.apache.druid.common.utils.SerializerUtils.readString(SerializerUtils.java:61) ~[druid-core-0.17.1.jar:0.17.1]
at org.apache.druid.segment.IndexIO$V9IndexLoader.deserializeColumn(IndexIO.java:677) ~[druid-processing-0.17.1.jar:0.17.1]
at org.apache.druid.segment.IndexIO$V9IndexLoader.load(IndexIO.java:617) ~[druid-processing-0.17.1.jar:0.17.1]
at org.apache.druid.segment.IndexIO.loadIndex(IndexIO.java:194) ~[druid-processing-0.17.1.jar:0.17.1]
at org.apache.druid.segment.loading.MMappedQueryableSegmentizerFactory.factorize(MMappedQueryableSegmentizerFactory.java:48) ~[druid-processing-0.17.1.jar:0.17.1]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:150) ~[druid-server-0.17.1.jar:0.17.1]
at org.apache.druid.server.SegmentManager.getAdapter(SegmentManager.java:198) ~[druid-server-0.17.1.jar:0.17.1]
at org.apache.druid.server.SegmentManager.loadSegment(SegmentManager.java:157) ~[druid-server-0.17.1.jar:0.17.1]
at org.apache.druid.server.coordination.SegmentLoadDropHandler.loadSegment(SegmentLoadDropHandler.java:259) ~[druid-server-0.17.1.jar:0.17.1]
... 9 more
2020-12-07T06:52:53,518 INFO [SimpleDataSegmentChangeHandler-0] org.apache.druid.server.coordination.SegmentLoadDropHandler - Loading segment xxxx__load__segment__test_2020-12-07T02:00:00.000Z_2020-12-07T03:00:00.000Z_2020-12-07T02:16:46.090Z_17
2020-12-07T06:52:53,519 INFO [SimpleDataSegmentChangeHandler-0] org.apache.druid.storage.s3.S3DataSegmentPuller - Pulling index at path[CloudObjectLocation{bucket='pqm-druid-dev', path='rtstorage/segments/xxxx__load__segment__test/2020-12-07T02:00:00.000Z_2020-12-07T03:00:00.000Z/2020-12-07T02:16:46.090Z/17/587cf37e-73ca-4628-8c65-d90e290b65fc/index.zip'}] to outDir[/var/druid/segment-cache/xxxx__load__segment__test/2020-12-07T02:00:00.000Z_2020-12-07T03:00:00.000Z/2020-12-07T02:16:46.090Z/17]
2020-12-07T05:54:06,004 INFO [[index_kafka_xxxx__load__segment__test_ed12482207579a5_mkdnhpfh]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Dropped segment[xxxx__load__segment__test_2020-12-07T02:00:00.000Z_2020-12-07T03:00:00.000Z_2020-12-07T02:16:46.090Z_28].
2020-12-07T05:55:05,951 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T05:56:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T05:57:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T05:58:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T05:59:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:00:05,947 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:01:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:02:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:03:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:04:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:05:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:06:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:07:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:08:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:09:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:10:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:11:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:12:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:13:05,949 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:14:05,949 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:15:05,950 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:16:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:17:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:18:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:19:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:20:05,948 INFO [coordinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2020-12-07T02:00:00.000Z/2020-12-07T03:00:00.000Z, version='2020-12-07T02:16:46.090Z', partitionNumber=17}]]
2020-12-07T06:20:27,386 INFO [parent-monitor-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Triggering JVM shutdown.
2020-12-07T06:20:27,387 INFO [Thread-125] org.apache.druid.cli.CliPeon - Running shutdown hook
2020-12-07T06:20:27,387 INFO [Thread-125] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [ANNOUNCEMENTS]
2020-12-07T06:20:27,388 INFO [Thread-125] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/announcements/druid-dev-8-middle-manager-medium-0.druid-dev-8-middle-manager-medium.druid-dev-8.svc.cluster.local:8100]
2020-12-07T06:20:27,398 INFO [Thread-125] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/segments/druid-dev-8-middle-manager-medium-0.druid-dev-8-middle-manager-medium.druid-dev-8.svc.cluster.local:8100/druid-dev-8-middle-manager-medium-0.druid-dev-8-middle-manager-medium.druid-dev-8.svc.cluster.local:8100_indexer-executor__default_tier_2020-12-07T04:50:06.819Z_6a488817791a4d8498ae15fedafe66dd0]
2020-12-07T06:20:27,400 INFO [Thread-125] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/listeners/lookups/__default/http:druid-dev-8-middle-manager-medium-0.druid-dev-8-middle-manager-medium.druid-dev-8.svc.cluster.local:8100]
2020-12-07T06:20:27,401 INFO [Thread-125] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/internal-discovery/PEON/druid-dev-8-middle-manager-medium-0.druid-dev-8-middle-manager-medium.druid-dev-8.svc.cluster.local:8100]
2020-12-07T06:20:27,403 INFO [Thread-125] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [SERVER]
2020-12-07T06:20:27,407 INFO [Thread-125] org.eclipse.jetty.server.AbstractConnector - Stopped
Affected Version
All, using druid.coordinator.loadqueuepeon.type=http
Description
When a Historical node fails to load a segment on the first attempt, it may not try to load that segment again until the LRU cache entry is invalidated, or until the streaming index task fails because it exceeds completionTimeout.
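The Coordinator logs are the HttpLoadQueuePeon and LoadRule entries shown above.
The Historical logs are the SegmentLoadDropHandler entries and the stack trace shown above.
The Kafka ingest task logs are the CoordinatorBasedSegmentHandoffNotifier entries shown above.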
The task keeps logging "Still waiting for Handoff for Segments" and then fails.
Here is what happens:
The Historical is downloading and unzipping a segment when it crashes, leaving the segment files on disk damaged.
The Historical restarts (with lazy loading on start disabled).
The Historical tries to load that segment again, but fails because the segment is damaged.
The Coordinator keeps assigning this segment to the same Historical again and again.
The Historical always responds with the load failure stored in its LRU cache and never actually retries the load.
The ingest task hangs waiting for handoff and fails after completionTimeout.
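The loop described in the steps above can be illustrated with a small, self-contained sketch. This is not Druid's actual code: the class LoadStatusCacheSketch, the Handler type, the retryFailed flag and the failsOnce loader are all hypothetical names made up for illustration. It only assumes what the report states, namely that the Historical answers repeated load requests from an LRU cache of earlier results. With retryFailed set to false, a cached failure is returned forever and the loader is never invoked again; with retryFailed set to true, only successes are treated as final, so the next request from the Coordinator triggers a real retry.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

public class LoadStatusCacheSketch {
    enum Status { SUCCESS, FAILED }

    /** Minimal LRU map: access-ordered LinkedHashMap that evicts the eldest entry. */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxSize;

        LruCache(int maxSize) {
            super(16, 0.75f, true);
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxSize;
        }
    }

    /** Hypothetical stand-in for the Historical-side load request handler. */
    static final class Handler {
        private final Map<String, Status> cache = new LruCache<>(100);
        private final Predicate<String> loader; // returns true when the load succeeds
        private final boolean retryFailed;      // hypothetical switch for this sketch
        int loadAttempts = 0;

        Handler(Predicate<String> loader, boolean retryFailed) {
            this.loader = loader;
            this.retryFailed = retryFailed;
        }

        Status handleLoadRequest(String segmentId) {
            Status cached = cache.get(segmentId);
            // A cached success is always final. A cached failure is final only
            // in the "sticky" mode, which mirrors the behavior in this report.
            if (cached == Status.SUCCESS || (cached == Status.FAILED && !retryFailed)) {
                return cached;
            }
            loadAttempts++;
            Status result = loader.test(segmentId) ? Status.SUCCESS : Status.FAILED;
            cache.put(segmentId, result);
            return result;
        }
    }

    /** Loader that fails on its first call (damaged files) and succeeds afterwards. */
    static Predicate<String> failsOnce() {
        AtomicInteger calls = new AtomicInteger();
        return segmentId -> calls.incrementAndGet() > 1;
    }

    public static void main(String[] args) {
        Handler sticky = new Handler(failsOnce(), false);
        Handler retrying = new Handler(failsOnce(), true);
        for (int i = 1; i <= 3; i++) {
            System.out.println("request " + i
                + " sticky=" + sticky.handleLoadRequest("segment_12")
                + " retrying=" + retrying.handleLoadRequest("segment_12"));
        }
        System.out.println("sticky attempts=" + sticky.loadAttempts
            + " retrying attempts=" + retrying.loadAttempts);
    }
}
```

In the sticky mode the Coordinator can reassign the segment as often as it likes and the outcome does not change until the cache entry is evicted, which matches the repeated "Exception loading segment" responses a few milliseconds after each assignment in the Coordinator logs.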