From c819d764bea4a0c796a15965edec6bca23f6b676 Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Thu, 8 Apr 2021 16:40:43 -0700 Subject: [PATCH 01/12] add docs --- docs/configuration/index.md | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/docs/configuration/index.md b/docs/configuration/index.md index 5a91a772b861..7fe4fb52860a 100644 --- a/docs/configuration/index.md +++ b/docs/configuration/index.md @@ -339,6 +339,17 @@ Coordinator and Overlord log changes to lookups, segment load/drop rules, dynami |`druid.audit.manager.auditHistoryMillis`|Default duration for querying audit history.|1 week| |`druid.audit.manager.includePayloadAsDimensionInMetric`|Boolean flag on whether to add `payload` column in service metric.|false| +You can also configure the Overlord to automatically retain the audit logs (in metadata storage tables) only for last `x` milliseconds and/or last `y` number of audit logs by configuring following additional properties. +Note that both `druid.indexer.audit.kill.durationToRetain` and `druid.indexer.audit.kill.sizeToRetain` can be enabled at the same time to enforce both limits. + +|Property|Description|Required?|Default| +|--------|-----------|---------|-------| +|`druid.indexer.audit.kill.enabled`|Boolean value for whether to enable deletion of old audit logs. If set to true, Overlord will periodically remove audit logs from the audit table entries in metadata storage. |No |false| +|`druid.indexer.audit.kill.durationToRetain`| In milliseconds, audit logs to be retained created in last x milliseconds. |Yes if `druid.indexer.audit.kill.enabled` is true and `druid.indexer.audit.kill.sizeToRetain` is not set. | None| +|`druid.indexer.audit.kill.sizeToRetain`| Maximum number of audit logs to be retained. Druid will delete audit logs starting from oldest once the number of task logs exceed this value. |Yes if `druid.indexer.audit.kill.enabled` is true and `druid.indexer.audit.kill.durationToRetain` is not set. |None| +|`druid.indexer.audit.kill.initialDelay`| Number of milliseconds after Overlord start when first auto kill is run. |No |random value less than 300000 (5 mins)| +|`druid.indexer.audit.kill.delay`| Number of milliseconds of delay between successive executions of auto kill run. |No |21600000 (6 hours)| + ### Enabling Metrics Druid processes periodically emit metrics and different metrics monitors can be included. Each process can overwrite the default list of monitors. @@ -545,15 +556,17 @@ If you are running the indexing service in remote mode, the task logs must be st |--------|-----------|-------| |`druid.indexer.logs.type`|Choices:noop, s3, azure, google, hdfs, file. Where to store task logs|file| -You can also configure the Overlord to automatically retain the task logs in log directory and entries in task-related metadata storage tables only for last x milliseconds by configuring following additional properties. -Caution: Automatic log file deletion typically works based on log file modification timestamp on the backing store, so large clock skews between druid processes and backing store nodes might result in unintended behavior. - -|Property|Description|Default| -|--------|-----------|-------| -|`druid.indexer.logs.kill.enabled`|Boolean value for whether to enable deletion of old task logs. If set to true, Overlord will submit kill tasks periodically based on `druid.indexer.logs.kill.delay` specified, which will delete task logs from the log directory as well as tasks and tasklogs table entries in metadata storage except for tasks created in the last `druid.indexer.logs.kill.durationToRetain` period. |false| -|`druid.indexer.logs.kill.durationToRetain`| Required if kill is enabled. In milliseconds, task logs and entries in task-related metadata storage tables to be retained created in last x milliseconds. |None| -|`druid.indexer.logs.kill.initialDelay`| Optional. Number of milliseconds after Overlord start when first auto kill is run. |random value less than 300000 (5 mins)| -|`druid.indexer.logs.kill.delay`|Optional. Number of milliseconds of delay between successive executions of auto kill run. |21600000 (6 hours)| +You can also configure the Overlord to automatically retain the task logs in log directory and entries in task-related metadata storage tables only for last `x` milliseconds and/or last `y` number of task logs by configuring following additional properties. +Caution: Automatic log file deletion with `druid.indexer.logs.kill.durationToRetain` typically works based on log file modification timestamp on the backing store, so large clock skews between druid processes and backing store nodes might result in unintended behavior. +Note that both `druid.indexer.logs.kill.durationToRetain` and `druid.indexer.logs.kill.sizeToRetain` can be enabled at the same time to enforce both limits. + +|Property|Description|Required?|Default| +|--------|-----------|---------|-------| +|`druid.indexer.logs.kill.enabled`|Boolean value for whether to enable deletion of old task logs. If set to true, Overlord will submit kill tasks periodically to delete task logs from the log directory as well as tasks and tasklogs table entries in metadata storage. |No |false| +|`druid.indexer.logs.kill.durationToRetain`| In milliseconds, task logs and entries in task-related metadata storage tables to be retained created in last x milliseconds. |Yes if `druid.indexer.logs.kill.enabled` is true and `druid.indexer.logs.kill.sizeToRetain` is not set. | None| +|`druid.indexer.logs.kill.sizeToRetain`| Maximum number of task logs and entries in task-related metadata storage tables to be retained. Druid will delete task logs starting from oldest once the number of task logs exceed this value. |Yes if `druid.indexer.logs.kill.enabled` is true and `druid.indexer.logs.kill.durationToRetain` is not set. |None| +|`druid.indexer.logs.kill.initialDelay`| Number of milliseconds after Overlord start when first auto kill is run. |No |random value less than 300000 (5 mins)| +|`druid.indexer.logs.kill.delay`| Number of milliseconds of delay between successive executions of auto kill run. |No |21600000 (6 hours)| #### File Task Logs From 34128f7cef6250ca88d941b9ecb905bf809976b8 Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Mon, 12 Apr 2021 23:33:35 -0700 Subject: [PATCH 02/12] add impl --- .../org/apache/druid/audit/AuditManager.java | 8 ++ docs/configuration/index.md | 38 ++++----- docs/operations/metrics.md | 2 + ...oordinatorMetadataStoreManagementDuty.java | 36 +++++++++ .../druid/server/audit/SQLAuditManager.java | 19 +++++ .../server/coordinator/DruidCoordinator.java | 27 ++++++- .../coordinator/DruidCoordinatorConfig.java | 12 +++ .../server/coordinator/KillAuditLog.java | 81 +++++++++++++++++++ .../CuratorDruidCoordinatorTest.java | 3 + .../coordinator/DruidCoordinatorTest.java | 3 + .../coordinator/HttpLoadQueuePeonTest.java | 1 + .../server/coordinator/LoadQueuePeonTest.java | 3 + .../coordinator/LoadQueuePeonTester.java | 1 + .../TestDruidCoordinatorConfig.java | 27 +++++++ .../duty/KillUnusedSegmentsTest.java | 1 + .../org/apache/druid/cli/CliCoordinator.java | 26 ++++-- 16 files changed, 260 insertions(+), 28 deletions(-) create mode 100644 server/src/main/java/org/apache/druid/guice/annotations/CoordinatorMetadataStoreManagementDuty.java create mode 100644 server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java diff --git a/core/src/main/java/org/apache/druid/audit/AuditManager.java b/core/src/main/java/org/apache/druid/audit/AuditManager.java index 6389350fea03..5e325fe9ba3e 100644 --- a/core/src/main/java/org/apache/druid/audit/AuditManager.java +++ b/core/src/main/java/org/apache/druid/audit/AuditManager.java @@ -80,4 +80,12 @@ public interface AuditManager * @return list of AuditEntries satisfying the passed parameters */ List fetchAuditHistory(String type, int limit); + + /** + * Remove audit logs created older than the given timestamp. + * + * @param timestamp timestamp in milliseconds + * @return number of audit logs removed + */ + int removeAuditLogsOlderThan(long timestamp); } diff --git a/docs/configuration/index.md b/docs/configuration/index.md index 7fe4fb52860a..50965b4c5860 100644 --- a/docs/configuration/index.md +++ b/docs/configuration/index.md @@ -339,17 +339,6 @@ Coordinator and Overlord log changes to lookups, segment load/drop rules, dynami |`druid.audit.manager.auditHistoryMillis`|Default duration for querying audit history.|1 week| |`druid.audit.manager.includePayloadAsDimensionInMetric`|Boolean flag on whether to add `payload` column in service metric.|false| -You can also configure the Overlord to automatically retain the audit logs (in metadata storage tables) only for last `x` milliseconds and/or last `y` number of audit logs by configuring following additional properties. -Note that both `druid.indexer.audit.kill.durationToRetain` and `druid.indexer.audit.kill.sizeToRetain` can be enabled at the same time to enforce both limits. - -|Property|Description|Required?|Default| -|--------|-----------|---------|-------| -|`druid.indexer.audit.kill.enabled`|Boolean value for whether to enable deletion of old audit logs. If set to true, Overlord will periodically remove audit logs from the audit table entries in metadata storage. |No |false| -|`druid.indexer.audit.kill.durationToRetain`| In milliseconds, audit logs to be retained created in last x milliseconds. |Yes if `druid.indexer.audit.kill.enabled` is true and `druid.indexer.audit.kill.sizeToRetain` is not set. | None| -|`druid.indexer.audit.kill.sizeToRetain`| Maximum number of audit logs to be retained. Druid will delete audit logs starting from oldest once the number of task logs exceed this value. |Yes if `druid.indexer.audit.kill.enabled` is true and `druid.indexer.audit.kill.durationToRetain` is not set. |None| -|`druid.indexer.audit.kill.initialDelay`| Number of milliseconds after Overlord start when first auto kill is run. |No |random value less than 300000 (5 mins)| -|`druid.indexer.audit.kill.delay`| Number of milliseconds of delay between successive executions of auto kill run. |No |21600000 (6 hours)| - ### Enabling Metrics Druid processes periodically emit metrics and different metrics monitors can be included. Each process can overwrite the default list of monitors. @@ -556,17 +545,15 @@ If you are running the indexing service in remote mode, the task logs must be st |--------|-----------|-------| |`druid.indexer.logs.type`|Choices:noop, s3, azure, google, hdfs, file. Where to store task logs|file| -You can also configure the Overlord to automatically retain the task logs in log directory and entries in task-related metadata storage tables only for last `x` milliseconds and/or last `y` number of task logs by configuring following additional properties. -Caution: Automatic log file deletion with `druid.indexer.logs.kill.durationToRetain` typically works based on log file modification timestamp on the backing store, so large clock skews between druid processes and backing store nodes might result in unintended behavior. -Note that both `druid.indexer.logs.kill.durationToRetain` and `druid.indexer.logs.kill.sizeToRetain` can be enabled at the same time to enforce both limits. +You can also configure the Overlord to automatically retain the task logs in log directory and entries in task-related metadata storage tables only for last x milliseconds by configuring following additional properties. +Caution: Automatic log file deletion typically works based on log file modification timestamp on the backing store, so large clock skews between druid processes and backing store nodes might result in unintended behavior. -|Property|Description|Required?|Default| -|--------|-----------|---------|-------| -|`druid.indexer.logs.kill.enabled`|Boolean value for whether to enable deletion of old task logs. If set to true, Overlord will submit kill tasks periodically to delete task logs from the log directory as well as tasks and tasklogs table entries in metadata storage. |No |false| -|`druid.indexer.logs.kill.durationToRetain`| In milliseconds, task logs and entries in task-related metadata storage tables to be retained created in last x milliseconds. |Yes if `druid.indexer.logs.kill.enabled` is true and `druid.indexer.logs.kill.sizeToRetain` is not set. | None| -|`druid.indexer.logs.kill.sizeToRetain`| Maximum number of task logs and entries in task-related metadata storage tables to be retained. Druid will delete task logs starting from oldest once the number of task logs exceed this value. |Yes if `druid.indexer.logs.kill.enabled` is true and `druid.indexer.logs.kill.durationToRetain` is not set. |None| -|`druid.indexer.logs.kill.initialDelay`| Number of milliseconds after Overlord start when first auto kill is run. |No |random value less than 300000 (5 mins)| -|`druid.indexer.logs.kill.delay`| Number of milliseconds of delay between successive executions of auto kill run. |No |21600000 (6 hours)| +|Property|Description|Default| +|--------|-----------|-------| +|`druid.indexer.logs.kill.enabled`|Boolean value for whether to enable deletion of old task logs. If set to true, Overlord will submit kill tasks periodically based on `druid.indexer.logs.kill.delay` specified, which will delete task logs from the log directory as well as tasks and tasklogs table entries in metadata storage except for tasks created in the last `druid.indexer.logs.kill.durationToRetain` period. |false| +|`druid.indexer.logs.kill.durationToRetain`| Required if kill is enabled. In milliseconds, task logs and entries in task-related metadata storage tables to be retained created in last x milliseconds. |None| +|`druid.indexer.logs.kill.initialDelay`| Optional. Number of milliseconds after Overlord start when first auto kill is run. |random value less than 300000 (5 mins)| +|`druid.indexer.logs.kill.delay`|Optional. Number of milliseconds of delay between successive executions of auto kill run. |21600000 (6 hours)| #### File Task Logs @@ -734,6 +721,15 @@ These Coordinator static configurations can be defined in the `coordinator/runti |`druid.coordinator.asOverlord.enabled`|Boolean value for whether this Coordinator process should act like an Overlord as well. This configuration allows users to simplify a druid cluster by not having to deploy any standalone Overlord processes. If set to true, then Overlord console is available at `http://coordinator-host:port/console.html` and be sure to set `druid.coordinator.asOverlord.overlordService` also. See next.|false| |`druid.coordinator.asOverlord.overlordService`| Required, if `druid.coordinator.asOverlord.enabled` is `true`. This must be same value as `druid.service` on standalone Overlord processes and `druid.selectors.indexing.serviceName` on Middle Managers.|NULL| +##### Metadata Management + +|Property|Description|Required?|Default| +|--------|-----------|---------|-------| +|`druid.coordinator.period.metadataStoreManagementPeriod`|How often to run metadata management tasks. |No |PT3600S (1 hour)| +|`druid.coordinator.kill.audit.on`| Boolean value for whether to enable automatic deletion of audit logs. If set to true, Coordinator will periodically remove audit logs from the audit table entries in metadata storage.| No | False| +|`druid.coordinator.kill.audit.period`| How often to do automatic deletion of audit logs. Value must be greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| +|`druid.coordinator.kill.audit.durationToRetain`| In milliseconds, audit logs to be retained created in last x milliseconds. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| + ##### Segment Management |Property|Possible Values|Description|Default| |--------|---------------|-----------|-------| diff --git a/docs/operations/metrics.md b/docs/operations/metrics.md index 9a1d2c7e4983..7d377484620a 100644 --- a/docs/operations/metrics.md +++ b/docs/operations/metrics.md @@ -256,6 +256,8 @@ These metrics are for the Druid Coordinator and are reset each time the Coordina |`interval/skipCompact/count`|Total number of intervals of this datasource that are skipped (not eligible for auto compaction) by the auto compaction.|datasource.|Varies.| |`coordinator/time`|Approximate Coordinator duty runtime in milliseconds. The duty dimension is the string alias of the Duty that is being run.|duty.|Varies.| |`coordinator/global/time`|Approximate runtime of a full coordination cycle in milliseconds. The `dutyGroup` dimension indicates what type of coordination this run was. i.e. Historical Management vs Indexing|`dutyGroup`|Varies.| +|`metadata/kill/audit/count`|Total number of audit logs deleted from metadata store audit table.| |Varies.| + If `emitBalancingStats` is set to `true` in the Coordinator [dynamic configuration](../configuration/index.md#dynamic-configuration), then [log entries](../configuration/logging.md) for class `org.apache.druid.server.coordinator.duty.EmitClusterStatsAndMetrics` will have extra information on balancing diff --git a/server/src/main/java/org/apache/druid/guice/annotations/CoordinatorMetadataStoreManagementDuty.java b/server/src/main/java/org/apache/druid/guice/annotations/CoordinatorMetadataStoreManagementDuty.java new file mode 100644 index 000000000000..22474f040c09 --- /dev/null +++ b/server/src/main/java/org/apache/druid/guice/annotations/CoordinatorMetadataStoreManagementDuty.java @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.guice.annotations; + +import com.google.inject.BindingAnnotation; + +import java.lang.annotation.ElementType; +import java.lang.annotation.Retention; +import java.lang.annotation.RetentionPolicy; +import java.lang.annotation.Target; + +/** + */ +@BindingAnnotation +@Target({ElementType.FIELD, ElementType.PARAMETER, ElementType.METHOD}) +@Retention(RetentionPolicy.RUNTIME) +public @interface CoordinatorMetadataStoreManagementDuty +{ +} \ No newline at end of file diff --git a/server/src/main/java/org/apache/druid/server/audit/SQLAuditManager.java b/server/src/main/java/org/apache/druid/server/audit/SQLAuditManager.java index 9ea53c670e1a..b85d71aec30f 100644 --- a/server/src/main/java/org/apache/druid/server/audit/SQLAuditManager.java +++ b/server/src/main/java/org/apache/druid/server/audit/SQLAuditManager.java @@ -39,6 +39,7 @@ import org.skife.jdbi.v2.Handle; import org.skife.jdbi.v2.IDBI; import org.skife.jdbi.v2.Query; +import org.skife.jdbi.v2.Update; import org.skife.jdbi.v2.tweak.HandleCallback; import java.io.IOException; @@ -206,6 +207,24 @@ public List fetchAuditHistory(final String type, int limit) return fetchAuditHistoryLastEntries(null, type, limit); } + @Override + public int removeAuditLogsOlderThan(final long timestamp) + { + DateTime dateTime = DateTimes.utc(timestamp); + return dbi.withHandle( + handle -> { + Update sql = handle.createStatement( + StringUtils.format( + "DELETE FROM %s WHERE created_date < :date_time", + getAuditTable() + ) + ); + return sql.bind("date_time", dateTime.toString()) + .execute(); + } + ); + } + private List fetchAuditHistoryLastEntries(final String key, final String type, int limit) throws IllegalArgumentException { diff --git a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java index 58af776863d3..3b4245f6939e 100644 --- a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java +++ b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java @@ -47,6 +47,7 @@ import org.apache.druid.discovery.DruidLeaderSelector; import org.apache.druid.guice.ManageLifecycle; import org.apache.druid.guice.annotations.CoordinatorIndexingServiceDuty; +import org.apache.druid.guice.annotations.CoordinatorMetadataStoreManagementDuty; import org.apache.druid.guice.annotations.Self; import org.apache.druid.java.util.common.DateTimes; import org.apache.druid.java.util.common.IAE; @@ -149,6 +150,7 @@ public class DruidCoordinator private final ServiceAnnouncer serviceAnnouncer; private final DruidNode self; private final Set indexingServiceDuties; + private final Set metadataStoreManagementDuties; private final BalancerStrategyFactory factory; private final LookupCoordinatorManager lookupCoordinatorManager; private final DruidLeaderSelector coordLeaderSelector; @@ -162,6 +164,7 @@ public class DruidCoordinator private ListeningExecutorService balancerExec; private static final String HISTORICAL_MANAGEMENT_DUTIES_DUTY_GROUP = "HistoricalManagementDuties"; + private static final String METADATA_STORE_MANAGEMENT_DUTIES_DUTY_GROUP = "MetadataStoreManagementDuties"; private static final String INDEXING_SERVICE_DUTIES_DUTY_GROUP = "IndexingServiceDuties"; private static final String COMPACT_SEGMENTS_DUTIES_DUTY_GROUP = "CompactSegmentsDuties"; @@ -180,7 +183,8 @@ public DruidCoordinator( LoadQueueTaskMaster taskMaster, ServiceAnnouncer serviceAnnouncer, @Self DruidNode self, - @CoordinatorIndexingServiceDuty Set indexingServiceDuties, + @CoordinatorIndexingServiceDuty Set metadataStoreManagementDuties, + @CoordinatorMetadataStoreManagementDuty Set indexingServiceDuties, BalancerStrategyFactory factory, LookupCoordinatorManager lookupCoordinatorManager, @Coordinator DruidLeaderSelector coordLeaderSelector, @@ -204,6 +208,7 @@ public DruidCoordinator( self, new ConcurrentHashMap<>(), indexingServiceDuties, + metadataStoreManagementDuties, factory, lookupCoordinatorManager, coordLeaderSelector, @@ -228,6 +233,7 @@ public DruidCoordinator( DruidNode self, ConcurrentMap loadQueuePeonMap, Set indexingServiceDuties, + Set metadataStoreManagementDuties, BalancerStrategyFactory factory, LookupCoordinatorManager lookupCoordinatorManager, DruidLeaderSelector coordLeaderSelector, @@ -253,6 +259,7 @@ public DruidCoordinator( this.serviceAnnouncer = serviceAnnouncer; this.self = self; this.indexingServiceDuties = indexingServiceDuties; + this.metadataStoreManagementDuties = metadataStoreManagementDuties; this.exec = scheduledExecutorFactory.create(1, "Coordinator-Exec--%d"); @@ -617,6 +624,12 @@ private void becomeLeader() ) ); } + dutiesRunnables.add( + Pair.of( + new DutiesRunnable(makeMetadataStoreManagementDuties(), startingLeaderCounter, METADATA_STORE_MANAGEMENT_DUTIES_DUTY_GROUP), + config.getCoordinatorMetadataStoreManagementPeriod() + ) + ); for (final Pair dutiesRunnable : dutiesRunnables) { // CompactSegmentsDuty can takes a non trival amount of time to complete. @@ -702,6 +715,18 @@ private List makeIndexingServiceDuties() return ImmutableList.copyOf(duties); } + private List makeMetadataStoreManagementDuties() + { + List duties = new ArrayList<>(); + duties.addAll(metadataStoreManagementDuties); + + log.debug( + "Done making metadata store management duties %s", + duties.stream().map(duty -> duty.getClass().getName()).collect(Collectors.toList()) + ); + return ImmutableList.copyOf(duties); + } + private List makeCompactSegmentsDuty() { return ImmutableList.of(compactSegments); diff --git a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorConfig.java b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorConfig.java index 130d7e88d840..ec76a098fb46 100644 --- a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorConfig.java +++ b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorConfig.java @@ -39,6 +39,10 @@ public abstract class DruidCoordinatorConfig @Default("PT1800s") public abstract Duration getCoordinatorIndexingPeriod(); + @Config("druid.coordinator.period.metadataStoreManagementPeriod") + @Default("PT3600s") + public abstract Duration getCoordinatorMetadataStoreManagementPeriod(); + @Config("druid.coordinator.kill.period") @Default("P1D") public abstract Duration getCoordinatorKillPeriod(); @@ -51,6 +55,14 @@ public abstract class DruidCoordinatorConfig @Default("0") public abstract int getCoordinatorKillMaxSegments(); + @Config("druid.coordinator.kill.audit.period") + @Default("P1D") + public abstract Duration getCoordinatorAuditKillPeriod(); + + @Config("druid.coordinator.kill.audit.durationToRetain") + @Default("PT-1s") + public abstract Duration getCoordinatorAuditKillDurationToRetain(); + @Config("druid.coordinator.load.timeout") public Duration getLoadTimeoutDelay() { diff --git a/server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java b/server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java new file mode 100644 index 000000000000..43bd8f986e0f --- /dev/null +++ b/server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.server.coordinator; + +import com.google.common.base.Preconditions; +import com.google.inject.Inject; +import org.apache.druid.audit.AuditManager; +import org.apache.druid.java.util.common.logger.Logger; +import org.apache.druid.java.util.emitter.service.ServiceEmitter; +import org.apache.druid.java.util.emitter.service.ServiceMetricEvent; +import org.apache.druid.server.audit.SQLAuditManager; +import org.apache.druid.server.coordinator.duty.CompactSegments; +import org.apache.druid.server.coordinator.duty.CoordinatorDuty; + +public class KillAuditLog implements CoordinatorDuty +{ + private static final Logger log = new Logger(KillAuditLog.class); + + private final long period; + private final long retainDuration; + private long lastKillTime = 0; + + private final AuditManager auditManager; + + @Inject + public KillAuditLog( + SQLAuditManager auditManager, + DruidCoordinatorConfig config + ) + { + this.period = config.getCoordinatorAuditKillPeriod().getMillis(); + Preconditions.checkArgument( + this.period > config.getCoordinatorMetadataStoreManagementPeriod().getMillis(), + "coordinator audit kill period must be greater than druid.coordinator.period.metadataStoreManagementPeriod" + ); + this.retainDuration = config.getCoordinatorAuditKillDurationToRetain().getMillis(); + Preconditions.checkArgument(this.retainDuration >= 0, "coordinator audit kill retainDuration must be >= 0"); + log.info( + "Audit Kill Task scheduling enabled with period [%s], retainDuration [%s]", + this.period, + this.retainDuration + ); + this.auditManager = auditManager; + } + + @Override + public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) + { + if ((lastKillTime + period) < System.currentTimeMillis()) { + lastKillTime = System.currentTimeMillis(); + + long timestamp = System.currentTimeMillis() - retainDuration; + int auditRemoved = auditManager.removeAuditLogsOlderThan(timestamp); + ServiceEmitter emitter = params.getEmitter(); + emitter.emit( + new ServiceMetricEvent.Builder().build( + "metadata/kill/audit/count", + auditRemoved + ) + ); + } + return params; + } +} diff --git a/server/src/test/java/org/apache/druid/server/coordinator/CuratorDruidCoordinatorTest.java b/server/src/test/java/org/apache/druid/server/coordinator/CuratorDruidCoordinatorTest.java index dd382e31a65f..1de54a217c40 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/CuratorDruidCoordinatorTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/CuratorDruidCoordinatorTest.java @@ -169,6 +169,7 @@ public void setUp() throws Exception new Duration(COORDINATOR_PERIOD), null, null, + null, new Duration(COORDINATOR_PERIOD), null, 10, @@ -247,6 +248,7 @@ public void unannounce(DruidNode node) druidNode, loadManagementPeons, null, + null, new CostBalancerStrategyFactory(), EasyMock.createNiceMock(LookupCoordinatorManager.class), new TestDruidLeaderSelector(), @@ -546,6 +548,7 @@ public void unannounce(DruidNode node) druidNode, loadManagementPeons, null, + null, new CostBalancerStrategyFactory(), EasyMock.createNiceMock(LookupCoordinatorManager.class), new TestDruidLeaderSelector(), diff --git a/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java b/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java index 31bfeeccc77a..97e00332ea02 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java @@ -141,6 +141,7 @@ public void setUp() throws Exception new Duration(COORDINATOR_PERIOD), null, null, + null, new Duration(COORDINATOR_PERIOD), null, 10, @@ -212,6 +213,7 @@ public void unannounce(DruidNode node) druidNode, loadManagementPeons, null, + null, new CostBalancerStrategyFactory(), EasyMock.createNiceMock(LookupCoordinatorManager.class), new TestDruidLeaderSelector(), @@ -706,6 +708,7 @@ public void testBalancerThreadNumber() null, null, null, + null, ZkEnablementConfig.ENABLED ); diff --git a/server/src/test/java/org/apache/druid/server/coordinator/HttpLoadQueuePeonTest.java b/server/src/test/java/org/apache/druid/server/coordinator/HttpLoadQueuePeonTest.java index 10d7ba24c40e..056b39895cef 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/HttpLoadQueuePeonTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/HttpLoadQueuePeonTest.java @@ -80,6 +80,7 @@ public class HttpLoadQueuePeonTest null, null, null, + null, 10, Duration.ZERO ) diff --git a/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTest.java b/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTest.java index 4a2a25f682c2..ee03e6af4125 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTest.java @@ -95,6 +95,7 @@ public void testMultipleLoadDropSegments() throws Exception null, null, null, + null, 10, Duration.millis(0) ) @@ -287,6 +288,7 @@ public void testFailAssignForNonTimeoutFailures() throws Exception null, null, null, + null, new Duration(1), null, null, @@ -339,6 +341,7 @@ public void testFailAssignForLoadDropTimeout() throws Exception null, null, null, + null, new Duration(1), null, null, diff --git a/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTester.java b/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTester.java index fb9287320dca..048feefa34a5 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTester.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTester.java @@ -41,6 +41,7 @@ public LoadQueuePeonTester() null, null, null, + null, new Duration(1), null, null, diff --git a/server/src/test/java/org/apache/druid/server/coordinator/TestDruidCoordinatorConfig.java b/server/src/test/java/org/apache/druid/server/coordinator/TestDruidCoordinatorConfig.java index fef244a62961..135f8d0557e5 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/TestDruidCoordinatorConfig.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/TestDruidCoordinatorConfig.java @@ -27,9 +27,12 @@ public class TestDruidCoordinatorConfig extends DruidCoordinatorConfig private final Duration coordinatorStartDelay; private final Duration coordinatorPeriod; private final Duration coordinatorIndexingPeriod; + private final Duration metadataStoreManagementPeriod; private final Duration loadTimeoutDelay; private final Duration coordinatorKillPeriod; private final Duration coordinatorKillDurationToRetain; + private final Duration coordinatorAuditKillPeriod; + private final Duration coordinatorAuditKillDurationToRetain; private final Duration getLoadQueuePeonRepeatDelay; private final int coordinatorKillMaxSegments; @@ -37,9 +40,12 @@ public TestDruidCoordinatorConfig( Duration coordinatorStartDelay, Duration coordinatorPeriod, Duration coordinatorIndexingPeriod, + Duration metadataStoreManagementPeriod, Duration loadTimeoutDelay, Duration coordinatorKillPeriod, Duration coordinatorKillDurationToRetain, + Duration coordinatorAuditKillPeriod, + Duration coordinatorAuditKillDurationToRetain, int coordinatorKillMaxSegments, Duration getLoadQueuePeonRepeatDelay ) @@ -47,9 +53,12 @@ public TestDruidCoordinatorConfig( this.coordinatorStartDelay = coordinatorStartDelay; this.coordinatorPeriod = coordinatorPeriod; this.coordinatorIndexingPeriod = coordinatorIndexingPeriod; + this.metadataStoreManagementPeriod = metadataStoreManagementPeriod; this.loadTimeoutDelay = loadTimeoutDelay; this.coordinatorKillPeriod = coordinatorKillPeriod; this.coordinatorKillDurationToRetain = coordinatorKillDurationToRetain; + this.coordinatorAuditKillPeriod = coordinatorAuditKillPeriod; + this.coordinatorAuditKillDurationToRetain = coordinatorAuditKillDurationToRetain; this.coordinatorKillMaxSegments = coordinatorKillMaxSegments; this.getLoadQueuePeonRepeatDelay = getLoadQueuePeonRepeatDelay; } @@ -72,6 +81,12 @@ public Duration getCoordinatorIndexingPeriod() return coordinatorIndexingPeriod; } + @Override + public Duration getCoordinatorMetadataStoreManagementPeriod() + { + return metadataStoreManagementPeriod; + } + @Override public Duration getCoordinatorKillPeriod() { @@ -84,6 +99,18 @@ public Duration getCoordinatorKillDurationToRetain() return coordinatorKillDurationToRetain; } + @Override + public Duration getCoordinatorAuditKillPeriod() + { + return coordinatorAuditKillPeriod; + } + + @Override + public Duration getCoordinatorAuditKillDurationToRetain() + { + return coordinatorAuditKillDurationToRetain; + } + @Override public int getCoordinatorKillMaxSegments() { diff --git a/server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java b/server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java index 38aa78e4467c..fe29f445afee 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java @@ -105,6 +105,7 @@ private void testFindIntervalForKill(List segmentIntervals, Interval e null, null, Duration.parse("PT76400S"), + null, new Duration(1), Duration.parse("PT86400S"), Duration.parse("PT86400S"), diff --git a/services/src/main/java/org/apache/druid/cli/CliCoordinator.java b/services/src/main/java/org/apache/druid/cli/CliCoordinator.java index 1876a28836ec..8941e7f72081 100644 --- a/services/src/main/java/org/apache/druid/cli/CliCoordinator.java +++ b/services/src/main/java/org/apache/druid/cli/CliCoordinator.java @@ -46,6 +46,7 @@ import org.apache.druid.guice.LifecycleModule; import org.apache.druid.guice.ManageLifecycle; import org.apache.druid.guice.annotations.CoordinatorIndexingServiceDuty; +import org.apache.druid.guice.annotations.CoordinatorMetadataStoreManagementDuty; import org.apache.druid.guice.annotations.EscalatedGlobal; import org.apache.druid.guice.http.JettyHttpClientModule; import org.apache.druid.java.util.common.concurrent.Execs; @@ -68,6 +69,7 @@ import org.apache.druid.server.coordinator.CachingCostBalancerStrategyConfig; import org.apache.druid.server.coordinator.DruidCoordinator; import org.apache.druid.server.coordinator.DruidCoordinatorConfig; +import org.apache.druid.server.coordinator.KillAuditLog; import org.apache.druid.server.coordinator.KillStalePendingSegments; import org.apache.druid.server.coordinator.LoadQueueTaskMaster; import org.apache.druid.server.coordinator.duty.CoordinatorDuty; @@ -214,14 +216,14 @@ public void configure(Binder binder) LifecycleModule.register(binder, Server.class); LifecycleModule.register(binder, DataSourcesResource.class); - final ConditionalMultibind conditionalMultibind = ConditionalMultibind.create( + // Binding for Set of indexing service coordinator Ddty + final ConditionalMultibind conditionalIndexingServiceDutyMultibind = ConditionalMultibind.create( properties, binder, CoordinatorDuty.class, CoordinatorIndexingServiceDuty.class ); - - if (conditionalMultibind.matchCondition("druid.coordinator.merge.on", Predicates.equalTo("true"))) { + if (conditionalIndexingServiceDutyMultibind.matchCondition("druid.coordinator.merge.on", Predicates.equalTo("true"))) { throw new UnsupportedOperationException( "'druid.coordinator.merge.on' is not supported anymore. " + "Please consider using Coordinator's automatic compaction instead. " @@ -230,19 +232,31 @@ public void configure(Binder binder) + "for more details about compaction." ); } - - conditionalMultibind.addConditionBinding( + conditionalIndexingServiceDutyMultibind.addConditionBinding( "druid.coordinator.kill.on", Predicates.equalTo("true"), KillUnusedSegments.class ); - conditionalMultibind.addConditionBinding( + conditionalIndexingServiceDutyMultibind.addConditionBinding( "druid.coordinator.kill.pendingSegments.on", "true", Predicates.equalTo("true"), KillStalePendingSegments.class ); + // Binding for Set of metadata store management coordinator Ddty + final ConditionalMultibind conditionalMetadataStoreManagementDutyMultibind = ConditionalMultibind.create( + properties, + binder, + CoordinatorDuty.class, + CoordinatorMetadataStoreManagementDuty.class + ); + conditionalMetadataStoreManagementDutyMultibind.addConditionBinding( + "druid.coordinator.kill.audit.on", + Predicates.equalTo("true"), + KillAuditLog.class + ); + bindNodeRoleAndAnnouncer( binder, Coordinator.class, From ca2d632628420427f789fc74a2f3a668241ad89b Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Mon, 12 Apr 2021 23:34:27 -0700 Subject: [PATCH 03/12] fix checkstyle --- .../annotations/CoordinatorMetadataStoreManagementDuty.java | 2 +- .../java/org/apache/druid/server/coordinator/KillAuditLog.java | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/server/src/main/java/org/apache/druid/guice/annotations/CoordinatorMetadataStoreManagementDuty.java b/server/src/main/java/org/apache/druid/guice/annotations/CoordinatorMetadataStoreManagementDuty.java index 22474f040c09..47cde2b3394c 100644 --- a/server/src/main/java/org/apache/druid/guice/annotations/CoordinatorMetadataStoreManagementDuty.java +++ b/server/src/main/java/org/apache/druid/guice/annotations/CoordinatorMetadataStoreManagementDuty.java @@ -33,4 +33,4 @@ @Retention(RetentionPolicy.RUNTIME) public @interface CoordinatorMetadataStoreManagementDuty { -} \ No newline at end of file +} diff --git a/server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java b/server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java index 43bd8f986e0f..707e519ebcb4 100644 --- a/server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java +++ b/server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java @@ -26,7 +26,6 @@ import org.apache.druid.java.util.emitter.service.ServiceEmitter; import org.apache.druid.java.util.emitter.service.ServiceMetricEvent; import org.apache.druid.server.audit.SQLAuditManager; -import org.apache.druid.server.coordinator.duty.CompactSegments; import org.apache.druid.server.coordinator.duty.CoordinatorDuty; public class KillAuditLog implements CoordinatorDuty From 8176c6467a2ba3a5770018da4c67c38d5b262f39 Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Mon, 12 Apr 2021 23:55:30 -0700 Subject: [PATCH 04/12] fix test --- .../server/coordinator/CuratorDruidCoordinatorTest.java | 2 ++ .../druid/server/coordinator/DruidCoordinatorTest.java | 2 ++ .../druid/server/coordinator/HttpLoadQueuePeonTest.java | 2 ++ .../apache/druid/server/coordinator/LoadQueuePeonTest.java | 6 ++++++ .../druid/server/coordinator/LoadQueuePeonTester.java | 2 ++ .../server/coordinator/duty/KillUnusedSegmentsTest.java | 2 ++ 6 files changed, 16 insertions(+) diff --git a/server/src/test/java/org/apache/druid/server/coordinator/CuratorDruidCoordinatorTest.java b/server/src/test/java/org/apache/druid/server/coordinator/CuratorDruidCoordinatorTest.java index 1de54a217c40..a8deb667dd05 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/CuratorDruidCoordinatorTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/CuratorDruidCoordinatorTest.java @@ -172,6 +172,8 @@ public void setUp() throws Exception null, new Duration(COORDINATOR_PERIOD), null, + null, + null, 10, new Duration("PT0s") ); diff --git a/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java b/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java index 97e00332ea02..82eab3c96b1a 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java @@ -144,6 +144,8 @@ public void setUp() throws Exception null, new Duration(COORDINATOR_PERIOD), null, + null, + null, 10, new Duration("PT0s") ); diff --git a/server/src/test/java/org/apache/druid/server/coordinator/HttpLoadQueuePeonTest.java b/server/src/test/java/org/apache/druid/server/coordinator/HttpLoadQueuePeonTest.java index 056b39895cef..7b07d7ffa61c 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/HttpLoadQueuePeonTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/HttpLoadQueuePeonTest.java @@ -81,6 +81,8 @@ public class HttpLoadQueuePeonTest null, null, null, + null, + null, 10, Duration.ZERO ) diff --git a/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTest.java b/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTest.java index ee03e6af4125..24e68063b6a7 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTest.java @@ -96,6 +96,8 @@ public void testMultipleLoadDropSegments() throws Exception null, null, null, + null, + null, 10, Duration.millis(0) ) @@ -292,6 +294,8 @@ public void testFailAssignForNonTimeoutFailures() throws Exception new Duration(1), null, null, + null, + null, 10, new Duration("PT1s") ) @@ -345,6 +349,8 @@ public void testFailAssignForLoadDropTimeout() throws Exception new Duration(1), null, null, + null, + null, 10, new Duration("PT1s") ) diff --git a/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTester.java b/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTester.java index 048feefa34a5..5185452779c3 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTester.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/LoadQueuePeonTester.java @@ -45,6 +45,8 @@ public LoadQueuePeonTester() new Duration(1), null, null, + null, + null, 10, new Duration("PT1s") ) diff --git a/server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java b/server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java index fe29f445afee..cd766660dfbe 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/duty/KillUnusedSegmentsTest.java @@ -109,6 +109,8 @@ private void testFindIntervalForKill(List segmentIntervals, Interval e new Duration(1), Duration.parse("PT86400S"), Duration.parse("PT86400S"), + null, + null, 1000, Duration.ZERO ) From b298b18898413ce46502a51a542cb2e508e5ef2f Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Tue, 13 Apr 2021 01:21:25 -0700 Subject: [PATCH 05/12] add test --- .../server/coordinator/DruidCoordinator.java | 8 +- .../coordinator/{ => duty}/KillAuditLog.java | 13 +- .../server/audit/SQLAuditManagerTest.java | 71 +++++++++ .../coordinator/DruidCoordinatorTest.java | 2 +- .../coordinator/duty/KillAuditLogTest.java | 139 ++++++++++++++++++ .../org/apache/druid/cli/CliCoordinator.java | 2 +- 6 files changed, 223 insertions(+), 12 deletions(-) rename server/src/main/java/org/apache/druid/server/coordinator/{ => duty}/KillAuditLog.java (84%) create mode 100644 server/src/test/java/org/apache/druid/server/coordinator/duty/KillAuditLogTest.java diff --git a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java index 3b4245f6939e..97064d3b0381 100644 --- a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java +++ b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java @@ -183,8 +183,8 @@ public DruidCoordinator( LoadQueueTaskMaster taskMaster, ServiceAnnouncer serviceAnnouncer, @Self DruidNode self, - @CoordinatorIndexingServiceDuty Set metadataStoreManagementDuties, - @CoordinatorMetadataStoreManagementDuty Set indexingServiceDuties, + @CoordinatorMetadataStoreManagementDuty Set metadataStoreManagementDuties, + @CoordinatorIndexingServiceDuty Set indexingServiceDuties, BalancerStrategyFactory factory, LookupCoordinatorManager lookupCoordinatorManager, @Coordinator DruidLeaderSelector coordLeaderSelector, @@ -708,7 +708,7 @@ private List makeIndexingServiceDuties() // CompactSegmentsDuty should be the last duty as it can take a long time to complete duties.addAll(makeCompactSegmentsDuty()); - log.debug( + log.info( "Done making indexing service duties %s", duties.stream().map(duty -> duty.getClass().getName()).collect(Collectors.toList()) ); @@ -720,7 +720,7 @@ private List makeMetadataStoreManagementDuties() List duties = new ArrayList<>(); duties.addAll(metadataStoreManagementDuties); - log.debug( + log.info( "Done making metadata store management duties %s", duties.stream().map(duty -> duty.getClass().getName()).collect(Collectors.toList()) ); diff --git a/server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java b/server/src/main/java/org/apache/druid/server/coordinator/duty/KillAuditLog.java similarity index 84% rename from server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java rename to server/src/main/java/org/apache/druid/server/coordinator/duty/KillAuditLog.java index 707e519ebcb4..265cbc91d97e 100644 --- a/server/src/main/java/org/apache/druid/server/coordinator/KillAuditLog.java +++ b/server/src/main/java/org/apache/druid/server/coordinator/duty/KillAuditLog.java @@ -17,7 +17,7 @@ * under the License. */ -package org.apache.druid.server.coordinator; +package org.apache.druid.server.coordinator.duty; import com.google.common.base.Preconditions; import com.google.inject.Inject; @@ -25,8 +25,8 @@ import org.apache.druid.java.util.common.logger.Logger; import org.apache.druid.java.util.emitter.service.ServiceEmitter; import org.apache.druid.java.util.emitter.service.ServiceMetricEvent; -import org.apache.druid.server.audit.SQLAuditManager; -import org.apache.druid.server.coordinator.duty.CoordinatorDuty; +import org.apache.druid.server.coordinator.DruidCoordinatorConfig; +import org.apache.druid.server.coordinator.DruidCoordinatorRuntimeParams; public class KillAuditLog implements CoordinatorDuty { @@ -40,14 +40,14 @@ public class KillAuditLog implements CoordinatorDuty @Inject public KillAuditLog( - SQLAuditManager auditManager, + AuditManager auditManager, DruidCoordinatorConfig config ) { this.period = config.getCoordinatorAuditKillPeriod().getMillis(); Preconditions.checkArgument( - this.period > config.getCoordinatorMetadataStoreManagementPeriod().getMillis(), - "coordinator audit kill period must be greater than druid.coordinator.period.metadataStoreManagementPeriod" + this.period >= config.getCoordinatorMetadataStoreManagementPeriod().getMillis(), + "coordinator audit kill period must be >= druid.coordinator.period.metadataStoreManagementPeriod" ); this.retainDuration = config.getCoordinatorAuditKillDurationToRetain().getMillis(); Preconditions.checkArgument(this.retainDuration >= 0, "coordinator audit kill retainDuration must be >= 0"); @@ -63,6 +63,7 @@ public KillAuditLog( public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) { if ((lastKillTime + period) < System.currentTimeMillis()) { + log.info("Running KillAuditLog duty"); lastKillTime = System.currentTimeMillis(); long timestamp = System.currentTimeMillis() - retainDuration; diff --git a/server/src/test/java/org/apache/druid/server/audit/SQLAuditManagerTest.java b/server/src/test/java/org/apache/druid/server/audit/SQLAuditManagerTest.java index 13dc10ab316f..0a6bf8963888 100644 --- a/server/src/test/java/org/apache/druid/server/audit/SQLAuditManagerTest.java +++ b/server/src/test/java/org/apache/druid/server/audit/SQLAuditManagerTest.java @@ -210,6 +210,77 @@ public void testFetchAuditHistoryByKeyAndTypeWithLimit() Assert.assertEquals(entry1, auditEntries.get(0)); } + @Test(timeout = 60_000L) + public void testRemoveAuditLogsOlderThanWithEntryOlderThanTime() throws IOException + { + AuditEntry entry = new AuditEntry( + "testKey", + "testType", + new AuditInfo( + "testAuthor", + "testComment", + "127.0.0.1" + ), + "testPayload", + DateTimes.of("2013-01-01T00:00:00Z") + ); + auditManager.doAudit(entry); + byte[] payload = connector.lookup( + derbyConnectorRule.metadataTablesConfigSupplier().get().getAuditTable(), + "audit_key", + "payload", + "testKey" + ); + AuditEntry dbEntry = mapper.readValue(payload, AuditEntry.class); + Assert.assertEquals(entry, dbEntry); + // Do delete + auditManager.removeAuditLogsOlderThan(System.currentTimeMillis()); + // Verify the delete + payload = connector.lookup( + derbyConnectorRule.metadataTablesConfigSupplier().get().getAuditTable(), + "audit_key", + "payload", + "testKey" + ); + Assert.assertNull(payload); + } + + @Test(timeout = 60_000L) + public void testRemoveAuditLogsOlderThanWithEntryNotOlderThanTime() throws IOException + { + AuditEntry entry = new AuditEntry( + "testKey", + "testType", + new AuditInfo( + "testAuthor", + "testComment", + "127.0.0.1" + ), + "testPayload", + DateTimes.of("2013-01-01T00:00:00Z") + ); + auditManager.doAudit(entry); + byte[] payload = connector.lookup( + derbyConnectorRule.metadataTablesConfigSupplier().get().getAuditTable(), + "audit_key", + "payload", + "testKey" + ); + AuditEntry dbEntry = mapper.readValue(payload, AuditEntry.class); + Assert.assertEquals(entry, dbEntry); + // Do delete + auditManager.removeAuditLogsOlderThan(DateTimes.of("2012-01-01T00:00:00Z").getMillis()); + // Verify that entry was not delete + payload = connector.lookup( + derbyConnectorRule.metadataTablesConfigSupplier().get().getAuditTable(), + "audit_key", + "payload", + "testKey" + ); + dbEntry = mapper.readValue(payload, AuditEntry.class); + Assert.assertEquals(entry, dbEntry); + } + @Test(timeout = 60_000L) public void testFetchAuditHistoryByTypeWithLimit() { diff --git a/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java b/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java index 82eab3c96b1a..26fb4471629b 100644 --- a/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java +++ b/server/src/test/java/org/apache/druid/server/coordinator/DruidCoordinatorTest.java @@ -215,7 +215,7 @@ public void unannounce(DruidNode node) druidNode, loadManagementPeons, null, - null, + new HashSet<>(), new CostBalancerStrategyFactory(), EasyMock.createNiceMock(LookupCoordinatorManager.class), new TestDruidLeaderSelector(), diff --git a/server/src/test/java/org/apache/druid/server/coordinator/duty/KillAuditLogTest.java b/server/src/test/java/org/apache/druid/server/coordinator/duty/KillAuditLogTest.java new file mode 100644 index 000000000000..b0f273cba47d --- /dev/null +++ b/server/src/test/java/org/apache/druid/server/coordinator/duty/KillAuditLogTest.java @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.server.coordinator.duty; + +import org.apache.druid.audit.AuditManager; +import org.apache.druid.java.util.emitter.service.ServiceEmitter; +import org.apache.druid.java.util.emitter.service.ServiceEventBuilder; +import org.apache.druid.server.coordinator.DruidCoordinatorRuntimeParams; +import org.apache.druid.server.coordinator.TestDruidCoordinatorConfig; +import org.joda.time.Duration; +import org.junit.Rule; +import org.junit.Test; +import org.junit.rules.ExpectedException; +import org.junit.runner.RunWith; +import org.mockito.ArgumentMatchers; +import org.mockito.Mock; +import org.mockito.Mockito; +import org.mockito.junit.MockitoJUnitRunner; + +@RunWith(MockitoJUnitRunner.class) +public class KillAuditLogTest +{ + @Mock + private AuditManager mockAuditManager; + + @Mock + private DruidCoordinatorRuntimeParams mockDruidCoordinatorRuntimeParams; + + @Mock + private ServiceEmitter mockServiceEmitter; + + @Rule + public ExpectedException exception = ExpectedException.none(); + + private KillAuditLog killAuditLog; + + @Test + public void testRunSkipIfLastRunLessThanPeriod() + { + TestDruidCoordinatorConfig druidCoordinatorConfig = new TestDruidCoordinatorConfig( + null, + null, + null, + new Duration("PT5S"), + null, + null, + null, + new Duration(Long.MAX_VALUE), + new Duration("PT1S"), + 10, + null + ); + killAuditLog = new KillAuditLog(mockAuditManager, druidCoordinatorConfig); + killAuditLog.run(mockDruidCoordinatorRuntimeParams); + Mockito.verifyZeroInteractions(mockAuditManager); + } + + @Test + public void testRunNotSkipIfLastRunMoreThanPeriod() + { + Mockito.when(mockDruidCoordinatorRuntimeParams.getEmitter()).thenReturn(mockServiceEmitter); + TestDruidCoordinatorConfig druidCoordinatorConfig = new TestDruidCoordinatorConfig( + null, + null, + null, + new Duration("PT5S"), + null, + null, + null, + new Duration("PT6S"), + new Duration("PT1S"), + 10, + null + ); + killAuditLog = new KillAuditLog(mockAuditManager, druidCoordinatorConfig); + killAuditLog.run(mockDruidCoordinatorRuntimeParams); + Mockito.verify(mockAuditManager).removeAuditLogsOlderThan(ArgumentMatchers.anyLong()); + Mockito.verify(mockServiceEmitter).emit(ArgumentMatchers.any(ServiceEventBuilder.class)); + } + + @Test + public void testConstructorFailIfInvalidPeriod() + { + TestDruidCoordinatorConfig druidCoordinatorConfig = new TestDruidCoordinatorConfig( + null, + null, + null, + new Duration("PT5S"), + null, + null, + null, + new Duration("PT3S"), + new Duration("PT1S"), + 10, + null + ); + exception.expect(IllegalArgumentException.class); + exception.expectMessage("coordinator audit kill period must be >= druid.coordinator.period.metadataStoreManagementPeriod"); + killAuditLog = new KillAuditLog(mockAuditManager, druidCoordinatorConfig); + } + + @Test + public void testConstructorFailIfInvalidRetainDuration() + { + TestDruidCoordinatorConfig druidCoordinatorConfig = new TestDruidCoordinatorConfig( + null, + null, + null, + new Duration("PT5S"), + null, + null, + null, + new Duration("PT6S"), + new Duration("PT-1S"), + 10, + null + ); + exception.expect(IllegalArgumentException.class); + exception.expectMessage("coordinator audit kill retainDuration must be >= 0"); + killAuditLog = new KillAuditLog(mockAuditManager, druidCoordinatorConfig); + } +} diff --git a/services/src/main/java/org/apache/druid/cli/CliCoordinator.java b/services/src/main/java/org/apache/druid/cli/CliCoordinator.java index 8941e7f72081..17f16a3f4c6d 100644 --- a/services/src/main/java/org/apache/druid/cli/CliCoordinator.java +++ b/services/src/main/java/org/apache/druid/cli/CliCoordinator.java @@ -69,7 +69,7 @@ import org.apache.druid.server.coordinator.CachingCostBalancerStrategyConfig; import org.apache.druid.server.coordinator.DruidCoordinator; import org.apache.druid.server.coordinator.DruidCoordinatorConfig; -import org.apache.druid.server.coordinator.KillAuditLog; +import org.apache.druid.server.coordinator.duty.KillAuditLog; import org.apache.druid.server.coordinator.KillStalePendingSegments; import org.apache.druid.server.coordinator.LoadQueueTaskMaster; import org.apache.druid.server.coordinator.duty.CoordinatorDuty; From 55786301d97878f79616af8c6696aeab89ce9b1c Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Tue, 13 Apr 2021 01:23:36 -0700 Subject: [PATCH 06/12] fix checkstyle --- services/src/main/java/org/apache/druid/cli/CliCoordinator.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/services/src/main/java/org/apache/druid/cli/CliCoordinator.java b/services/src/main/java/org/apache/druid/cli/CliCoordinator.java index 17f16a3f4c6d..9e3c65b3cf1f 100644 --- a/services/src/main/java/org/apache/druid/cli/CliCoordinator.java +++ b/services/src/main/java/org/apache/druid/cli/CliCoordinator.java @@ -69,10 +69,10 @@ import org.apache.druid.server.coordinator.CachingCostBalancerStrategyConfig; import org.apache.druid.server.coordinator.DruidCoordinator; import org.apache.druid.server.coordinator.DruidCoordinatorConfig; -import org.apache.druid.server.coordinator.duty.KillAuditLog; import org.apache.druid.server.coordinator.KillStalePendingSegments; import org.apache.druid.server.coordinator.LoadQueueTaskMaster; import org.apache.druid.server.coordinator.duty.CoordinatorDuty; +import org.apache.druid.server.coordinator.duty.KillAuditLog; import org.apache.druid.server.coordinator.duty.KillUnusedSegments; import org.apache.druid.server.http.ClusterResource; import org.apache.druid.server.http.CompactionResource; From 971b6607c7342eeb5ea90238ef5f244af171c712 Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Tue, 13 Apr 2021 12:46:43 -0700 Subject: [PATCH 07/12] fix checkstyle --- .../apache/druid/server/coordinator/DruidCoordinator.java | 5 +++-- website/.spelling | 1 + 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java index 97064d3b0381..f43823dcabbf 100644 --- a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java +++ b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java @@ -717,8 +717,9 @@ private List makeIndexingServiceDuties() private List makeMetadataStoreManagementDuties() { - List duties = new ArrayList<>(); - duties.addAll(metadataStoreManagementDuties); + List duties = ImmutableList.builder() + .addAll(metadataStoreManagementDuties) + .build(); log.info( "Done making metadata store management duties %s", diff --git a/website/.spelling b/website/.spelling index 017a1a91c125..2b981983df2b 100644 --- a/website/.spelling +++ b/website/.spelling @@ -1671,6 +1671,7 @@ PT1S PT24H PT300S PT30S +PT3600S PT5M PT5S PT60S From 362097eede9e3495bce88b2b9bf8405ee25b7bbb Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Wed, 14 Apr 2021 12:58:07 -0700 Subject: [PATCH 08/12] fix test --- .../server/audit/SQLAuditManagerTest.java | 74 ++++++++++--------- 1 file changed, 40 insertions(+), 34 deletions(-) diff --git a/server/src/test/java/org/apache/druid/server/audit/SQLAuditManagerTest.java b/server/src/test/java/org/apache/druid/server/audit/SQLAuditManagerTest.java index d8437735a693..d336e84e6f1c 100644 --- a/server/src/test/java/org/apache/druid/server/audit/SQLAuditManagerTest.java +++ b/server/src/test/java/org/apache/druid/server/audit/SQLAuditManagerTest.java @@ -128,15 +128,15 @@ public void testAuditEntrySerde() throws IOException public void testAuditMetricEventBuilderConfig() { AuditEntry entry = new AuditEntry( - "testKey", - "testType", - new AuditInfo( - "testAuthor", - "testComment", - "127.0.0.1" - ), - "testPayload", - DateTimes.of("2013-01-01T00:00:00Z") + "testKey", + "testType", + new AuditInfo( + "testAuthor", + "testComment", + "127.0.0.1" + ), + "testPayload", + DateTimes.of("2013-01-01T00:00:00Z") ); SQLAuditManager auditManagerWithPayloadAsDimension = new SQLAuditManager( @@ -259,18 +259,16 @@ public void testFetchAuditHistoryByKeyAndTypeWithLimit() @Test(timeout = 60_000L) public void testRemoveAuditLogsOlderThanWithEntryOlderThanTime() throws IOException { - AuditEntry entry = new AuditEntry( - "testKey", - "testType", - new AuditInfo( - "testAuthor", - "testComment", - "127.0.0.1" - ), - "testPayload", - DateTimes.of("2013-01-01T00:00:00Z") + String entry1Key = "testKey"; + String entry1Type = "testType"; + AuditInfo entry1AuditInfo = new AuditInfo( + "testAuthor", + "testComment", + "127.0.0.1" ); - auditManager.doAudit(entry); + String entry1Payload = "testPayload"; + + auditManager.doAudit(entry1Key, entry1Type, entry1AuditInfo, entry1Payload, stringConfigSerde); byte[] payload = connector.lookup( derbyConnectorRule.metadataTablesConfigSupplier().get().getAuditTable(), "audit_key", @@ -278,7 +276,11 @@ public void testRemoveAuditLogsOlderThanWithEntryOlderThanTime() throws IOExcept "testKey" ); AuditEntry dbEntry = mapper.readValue(payload, AuditEntry.class); - Assert.assertEquals(entry, dbEntry); + Assert.assertEquals(entry1Key, dbEntry.getKey()); + Assert.assertEquals(entry1Payload, dbEntry.getPayload()); + Assert.assertEquals(entry1Type, dbEntry.getType()); + Assert.assertEquals(entry1AuditInfo, dbEntry.getAuditInfo()); + // Do delete auditManager.removeAuditLogsOlderThan(System.currentTimeMillis()); // Verify the delete @@ -294,18 +296,16 @@ public void testRemoveAuditLogsOlderThanWithEntryOlderThanTime() throws IOExcept @Test(timeout = 60_000L) public void testRemoveAuditLogsOlderThanWithEntryNotOlderThanTime() throws IOException { - AuditEntry entry = new AuditEntry( - "testKey", - "testType", - new AuditInfo( - "testAuthor", - "testComment", - "127.0.0.1" - ), - "testPayload", - DateTimes.of("2013-01-01T00:00:00Z") + String entry1Key = "testKey"; + String entry1Type = "testType"; + AuditInfo entry1AuditInfo = new AuditInfo( + "testAuthor", + "testComment", + "127.0.0.1" ); - auditManager.doAudit(entry); + String entry1Payload = "testPayload"; + + auditManager.doAudit(entry1Key, entry1Type, entry1AuditInfo, entry1Payload, stringConfigSerde); byte[] payload = connector.lookup( derbyConnectorRule.metadataTablesConfigSupplier().get().getAuditTable(), "audit_key", @@ -313,7 +313,10 @@ public void testRemoveAuditLogsOlderThanWithEntryNotOlderThanTime() throws IOExc "testKey" ); AuditEntry dbEntry = mapper.readValue(payload, AuditEntry.class); - Assert.assertEquals(entry, dbEntry); + Assert.assertEquals(entry1Key, dbEntry.getKey()); + Assert.assertEquals(entry1Payload, dbEntry.getPayload()); + Assert.assertEquals(entry1Type, dbEntry.getType()); + Assert.assertEquals(entry1AuditInfo, dbEntry.getAuditInfo()); // Do delete auditManager.removeAuditLogsOlderThan(DateTimes.of("2012-01-01T00:00:00Z").getMillis()); // Verify that entry was not delete @@ -324,7 +327,10 @@ public void testRemoveAuditLogsOlderThanWithEntryNotOlderThanTime() throws IOExc "testKey" ); dbEntry = mapper.readValue(payload, AuditEntry.class); - Assert.assertEquals(entry, dbEntry); + Assert.assertEquals(entry1Key, dbEntry.getKey()); + Assert.assertEquals(entry1Payload, dbEntry.getPayload()); + Assert.assertEquals(entry1Type, dbEntry.getType()); + Assert.assertEquals(entry1AuditInfo, dbEntry.getAuditInfo()); } @Test(timeout = 60_000L) From 55271c92e374fb41bb7dbb57af9df83f735a9682 Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Fri, 16 Apr 2021 15:12:24 -0700 Subject: [PATCH 09/12] Address comments --- docs/configuration/index.md | 6 +++--- .../apache/druid/server/coordinator/DruidCoordinator.java | 4 ++-- .../druid/server/coordinator/DruidCoordinatorConfig.java | 2 +- .../apache/druid/server/coordinator/duty/KillAuditLog.java | 4 ++-- 4 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/configuration/index.md b/docs/configuration/index.md index 5f9d891db565..2351b970af79 100644 --- a/docs/configuration/index.md +++ b/docs/configuration/index.md @@ -746,10 +746,10 @@ These Coordinator static configurations can be defined in the `coordinator/runti |Property|Description|Required?|Default| |--------|-----------|---------|-------| -|`druid.coordinator.period.metadataStoreManagementPeriod`|How often to run metadata management tasks. |No |PT3600S (1 hour)| +|`druid.coordinator.period.metadataStoreManagementPeriod`|How often to run metadata management tasks in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. |No | PT1H| |`druid.coordinator.kill.audit.on`| Boolean value for whether to enable automatic deletion of audit logs. If set to true, Coordinator will periodically remove audit logs from the audit table entries in metadata storage.| No | False| -|`druid.coordinator.kill.audit.period`| How often to do automatic deletion of audit logs. Value must be greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| -|`druid.coordinator.kill.audit.durationToRetain`| In milliseconds, audit logs to be retained created in last x milliseconds. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| +|`druid.coordinator.kill.audit.period`| How often to do automatic deletion of audit logs in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| +|`druid.coordinator.kill.audit.durationToRetain`| Duration of audit logs to be retained from created time in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| ##### Segment Management |Property|Possible Values|Description|Default| diff --git a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java index 6e2c0cabe01a..7da4fe6b3000 100644 --- a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java +++ b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinator.java @@ -756,7 +756,7 @@ private List makeIndexingServiceDuties() // CompactSegmentsDuty should be the last duty as it can take a long time to complete duties.addAll(makeCompactSegmentsDuty()); - log.info( + log.debug( "Done making indexing service duties %s", duties.stream().map(duty -> duty.getClass().getName()).collect(Collectors.toList()) ); @@ -769,7 +769,7 @@ private List makeMetadataStoreManagementDuties() .addAll(metadataStoreManagementDuties) .build(); - log.info( + log.debug( "Done making metadata store management duties %s", duties.stream().map(duty -> duty.getClass().getName()).collect(Collectors.toList()) ); diff --git a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorConfig.java b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorConfig.java index ec76a098fb46..933b974d3ad0 100644 --- a/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorConfig.java +++ b/server/src/main/java/org/apache/druid/server/coordinator/DruidCoordinatorConfig.java @@ -40,7 +40,7 @@ public abstract class DruidCoordinatorConfig public abstract Duration getCoordinatorIndexingPeriod(); @Config("druid.coordinator.period.metadataStoreManagementPeriod") - @Default("PT3600s") + @Default("PT1H") public abstract Duration getCoordinatorMetadataStoreManagementPeriod(); @Config("druid.coordinator.kill.period") diff --git a/server/src/main/java/org/apache/druid/server/coordinator/duty/KillAuditLog.java b/server/src/main/java/org/apache/druid/server/coordinator/duty/KillAuditLog.java index 265cbc91d97e..651cf6b60c96 100644 --- a/server/src/main/java/org/apache/druid/server/coordinator/duty/KillAuditLog.java +++ b/server/src/main/java/org/apache/druid/server/coordinator/duty/KillAuditLog.java @@ -51,7 +51,7 @@ public KillAuditLog( ); this.retainDuration = config.getCoordinatorAuditKillDurationToRetain().getMillis(); Preconditions.checkArgument(this.retainDuration >= 0, "coordinator audit kill retainDuration must be >= 0"); - log.info( + log.debug( "Audit Kill Task scheduling enabled with period [%s], retainDuration [%s]", this.period, this.retainDuration @@ -63,7 +63,6 @@ public KillAuditLog( public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) { if ((lastKillTime + period) < System.currentTimeMillis()) { - log.info("Running KillAuditLog duty"); lastKillTime = System.currentTimeMillis(); long timestamp = System.currentTimeMillis() - retainDuration; @@ -75,6 +74,7 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) auditRemoved ) ); + log.info("Finished running KillAuditLog duty. Removed %,d audit logs", auditRemoved); } return params; } From 90d8095b2ab33ed18530f76249324637965bd81c Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Mon, 19 Apr 2021 18:00:41 -0700 Subject: [PATCH 10/12] Address comments --- docs/configuration/index.md | 2 +- docs/operations/metrics.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/configuration/index.md b/docs/configuration/index.md index 2351b970af79..e80c8a3fec95 100644 --- a/docs/configuration/index.md +++ b/docs/configuration/index.md @@ -746,7 +746,7 @@ These Coordinator static configurations can be defined in the `coordinator/runti |Property|Description|Required?|Default| |--------|-----------|---------|-------| -|`druid.coordinator.period.metadataStoreManagementPeriod`|How often to run metadata management tasks in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. |No | PT1H| +|`druid.coordinator.period.metadataStoreManagementPeriod`|How often to run metadata management tasks in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. |No | `PT1H`| |`druid.coordinator.kill.audit.on`| Boolean value for whether to enable automatic deletion of audit logs. If set to true, Coordinator will periodically remove audit logs from the audit table entries in metadata storage.| No | False| |`druid.coordinator.kill.audit.period`| How often to do automatic deletion of audit logs in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| |`druid.coordinator.kill.audit.durationToRetain`| Duration of audit logs to be retained from created time in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| diff --git a/docs/operations/metrics.md b/docs/operations/metrics.md index 7d377484620a..221f7a02971b 100644 --- a/docs/operations/metrics.md +++ b/docs/operations/metrics.md @@ -256,7 +256,7 @@ These metrics are for the Druid Coordinator and are reset each time the Coordina |`interval/skipCompact/count`|Total number of intervals of this datasource that are skipped (not eligible for auto compaction) by the auto compaction.|datasource.|Varies.| |`coordinator/time`|Approximate Coordinator duty runtime in milliseconds. The duty dimension is the string alias of the Duty that is being run.|duty.|Varies.| |`coordinator/global/time`|Approximate runtime of a full coordination cycle in milliseconds. The `dutyGroup` dimension indicates what type of coordination this run was. i.e. Historical Management vs Indexing|`dutyGroup`|Varies.| -|`metadata/kill/audit/count`|Total number of audit logs deleted from metadata store audit table.| |Varies.| +|`metadata/kill/audit/count`|Total number of audit logs automatically deleted from metadata store audit table per each Coordinator kill audit duty run. This metric can help adjust `druid.coordinator.kill.audit.durationToRetain` configuration based on if more or less audit logs need to be deleted per cycle. Note that this metric is only emiited when `druid.coordinator.kill.audit.on` is set to true.| |Varies.| If `emitBalancingStats` is set to `true` in the Coordinator [dynamic configuration](../configuration/index.md#dynamic-configuration), then [log entries](../configuration/logging.md) for class From bbc67467658aba13cfbeb7949d290990e883f9e8 Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Mon, 19 Apr 2021 21:18:47 -0700 Subject: [PATCH 11/12] fix spelling --- docs/operations/metrics.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/operations/metrics.md b/docs/operations/metrics.md index 221f7a02971b..5a2d5fb76fd7 100644 --- a/docs/operations/metrics.md +++ b/docs/operations/metrics.md @@ -256,7 +256,7 @@ These metrics are for the Druid Coordinator and are reset each time the Coordina |`interval/skipCompact/count`|Total number of intervals of this datasource that are skipped (not eligible for auto compaction) by the auto compaction.|datasource.|Varies.| |`coordinator/time`|Approximate Coordinator duty runtime in milliseconds. The duty dimension is the string alias of the Duty that is being run.|duty.|Varies.| |`coordinator/global/time`|Approximate runtime of a full coordination cycle in milliseconds. The `dutyGroup` dimension indicates what type of coordination this run was. i.e. Historical Management vs Indexing|`dutyGroup`|Varies.| -|`metadata/kill/audit/count`|Total number of audit logs automatically deleted from metadata store audit table per each Coordinator kill audit duty run. This metric can help adjust `druid.coordinator.kill.audit.durationToRetain` configuration based on if more or less audit logs need to be deleted per cycle. Note that this metric is only emiited when `druid.coordinator.kill.audit.on` is set to true.| |Varies.| +|`metadata/kill/audit/count`|Total number of audit logs automatically deleted from metadata store audit table per each Coordinator kill audit duty run. This metric can help adjust `druid.coordinator.kill.audit.durationToRetain` configuration based on if more or less audit logs need to be deleted per cycle. Note that this metric is only emitted when `druid.coordinator.kill.audit.on` is set to true.| |Varies.| If `emitBalancingStats` is set to `true` in the Coordinator [dynamic configuration](../configuration/index.md#dynamic-configuration), then [log entries](../configuration/logging.md) for class From 4182ff9d2bb28786115ba34ce8d7f6c4dda23d90 Mon Sep 17 00:00:00 2001 From: Maytas Monsereenusorn Date: Tue, 20 Apr 2021 12:22:39 -0700 Subject: [PATCH 12/12] fix docs --- docs/configuration/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/configuration/index.md b/docs/configuration/index.md index e80c8a3fec95..d49bf75675fa 100644 --- a/docs/configuration/index.md +++ b/docs/configuration/index.md @@ -748,7 +748,7 @@ These Coordinator static configurations can be defined in the `coordinator/runti |--------|-----------|---------|-------| |`druid.coordinator.period.metadataStoreManagementPeriod`|How often to run metadata management tasks in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. |No | `PT1H`| |`druid.coordinator.kill.audit.on`| Boolean value for whether to enable automatic deletion of audit logs. If set to true, Coordinator will periodically remove audit logs from the audit table entries in metadata storage.| No | False| -|`druid.coordinator.kill.audit.period`| How often to do automatic deletion of audit logs in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| +|`druid.coordinator.kill.audit.period`| How often to do automatic deletion of audit logs in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.audit.on` is set to True.| No| `P1D`| |`druid.coordinator.kill.audit.durationToRetain`| Duration of audit logs to be retained from created time in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Only applies if `druid.coordinator.kill.audit.on` is set to True.| Yes if `druid.coordinator.kill.audit.on` is set to True| None| ##### Segment Management