HDDS-7740. [Snapshot] Implement SnapshotDeletingService #4244
Change-Id: I6fe3de2e9409757bb2871bfbcbac8abd7e11dd53
sumitagrawl
left a comment
@aswinshakil Thanks for working on this. I have left a few comments.
Relevant UT failure:
hemantk-12
left a comment
Thanks for the patch @aswinshakil
    omMetadataReader = new OmMetadataReader(keyManager, prefixManager,
        this, LOG, AUDIT, metrics);
    omSnapshotManager = new OmSnapshotManager(this);
    snapshotChainManager = new SnapshotChainManager(metadataManager);
@neils-dev mentioned that this should be placed in OmMetadataManagerImpl.
We could also create an instance of the Keydeletion/DirectoryDeletionService that operates on either a snapshot or the active OS instance. That way we can create as many instances as needed to scale.
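The suggestion above can be illustrated with a minimal sketch in which one deletion task is bound to exactly one store, so scaling means creating more task instances. `Store`, `DeletionTask`, and `batchLimit` are hypothetical names chosen for illustration; they are not the Ozone service classes, and the real services would talk to RocksDB tables rather than an in-memory queue.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from the Ozone code base.
final class PerStoreDeletionSketch {

  // Stand-in for one store (the active DB or one snapshot) and the keys
  // waiting to be purged from it.
  static final class Store {
    final String name;
    final Deque<String> pendingDeletes = new ArrayDeque<>();

    Store(String name, List<String> keys) {
      this.name = name;
      this.pendingDeletes.addAll(keys);
    }
  }

  // One task instance is bound to exactly one store, so adding capacity
  // means creating more instances rather than changing the task.
  static final class DeletionTask implements Runnable {
    private final Store store;
    private final int batchLimit;
    private int purged;

    DeletionTask(Store store, int batchLimit) {
      this.store = store;
      this.batchLimit = batchLimit;
    }

    // Purges at most batchLimit keys per invocation.
    @Override
    public void run() {
      for (int i = 0; i < batchLimit && !store.pendingDeletes.isEmpty(); i++) {
        store.pendingDeletes.poll();
        purged++;
      }
    }

    int purgedCount() {
      return purged;
    }
  }

  public static void main(String[] args) throws InterruptedException {
    List<Store> stores = Arrays.asList(
        new Store("active", Arrays.asList("k1", "k2", "k3")),
        new Store("snapshot-1", Arrays.asList("k4")));

    // One worker per store; each task touches only its own store.
    ExecutorService pool = Executors.newFixedThreadPool(stores.size());
    for (Store store : stores) {
      pool.submit(new DeletionTask(store, 2));
    }
    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.SECONDS);

    for (Store store : stores) {
      System.out.println(store.name + " pending=" + store.pendingDeletes.size());
    }
  }
}
```

Because each task owns its store exclusively, the sketch needs no locking between instances; whether that isolation holds for the real snapshot and active DB tables is a separate question.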
    if (nextSnapshot != null) {
      omNextSnapshot = (OmSnapshot) omSnapshotManager
          .checkForSnapshot(nextSnapshot.getVolumeName(),
My understanding is that validateAndUpdateCache() runs within the applyTransaction() method of the OzoneManager Ratis state machine, and therefore needs to run quickly. Is that not correct?
I know we are in a hurry, but snapshot delete is the most complicated and dangerous part of the snapshot system, so I'd like to see more tests for this subsystem. If you are too busy, we can create a separate PR and have someone on my team write the tests, in particular unit tests for the following methods:
In addition, I'd like a unit test that confirms that we are correctly starting and stopping within the bucket scope.
@GeorgeJahad I agree we need more tests; I'll add them eventually along with my other PRs.
I'm disabling the tests for this PR and will update them in another patch. A currently open PR and my upcoming patch would both break the tests when merged to master.
The latest change, where we no longer move keys into the active DB, LGTM (rather, those reclaimable keys are retained in the current snapshot checkpoint DB and would be cleaned up by SDT).
Follow-up TODOs from this PR:
- Re-enable tests
- Add more tests
- Finish the rename logic
- Revisit the locking
- Implement block reclamation in SDT
Thanks @aswinshakil for the main logic implementation. Thanks @GeorgeJahad @sumitagrawl @neils-dev @DaveTeng0 @hemantk-12 @prashantpogde for reviewing this.
* master: (262 commits)
  HDDS-8153. Integrate ContainerBalancer with MoveManager (apache#4391)
  HDDS-8090. When getBlock from a datanode fails, retry other datanodes. (apache#4357)
  HDDS-8163 Use try-with-resources to ensure close rockdb connection in SstFilteringService (apache#4402)
  HDDS-8065. Provide GNU long options (apache#4394)
  HDDS-7930. [addendum] input stream does not refresh expired block token.
  HDDS-7930. input stream does not refresh expired block token. (apache#4378)
  HDDS-7740. [Snapshot] Implement SnapshotDeletingService (apache#4244)
  HDDS-8076. Use container cache in Key listing API. (apache#4346)
  HDDS-8091. [addendum] Generate list of config tags from ConfigTag enum - Hadoop 3.1 compatibility fix (apache#4374)
  HDDS-8144. TestDefaultCertificateClient#testTimeBeforeExpiryGracePeriod fails as we approach DST. (apache#4382)
  HDDS-8151. Support fine grained lifetime for root CA certificate (apache#4386)
  HDDS-8150. RpcClientTest and ConfigurationSourceTest not run due to naming convention (apache#4388)
  HDDS-8131. Add Configuration for OM Ratis Log Purge Tuning Parameters. (apache#4371)
  HDDS-8133. Create ozone sh key checksum command (apache#4375)
  HDDS-8142. Check if no entries in Block DB for a container on container delete (apache#4379)
  HDDS-8118. Fail container delete on non empty chunks dir (apache#4367)
  HDDS-8028. JNI for RocksDB SST Dump tool (apache#4315)
  HDDS-8129. ContainerStateMachine allows two different tasks with the same container id running in parallel. (apache#4370)
  HDDS-8119. Remove loosely related AutoCloseable from SendContainerOutputStream (apache#4368)
  close db connection (apache#4366)
  ...
    private SnapshotInfo getNextActiveSnapshot(SnapshotInfo snapInfo,
        SnapshotChainManager chainManager, OmSnapshotManager omSnapshotManager)
        throws IOException {
      while (chainManager.hasNextPathSnapshot(snapInfo.getSnapshotPath(),
I think it might cause an infinite loop because snapInfo is not getting reset.
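The concern can be shown with a minimal, self-contained sketch. It assumes the loop is meant to walk forward along a snapshot chain until it finds an active snapshot; `Snap`, `Chain`, `hasNext`, `next`, and `nextActive` are hypothetical stand-ins for `SnapshotInfo`, `SnapshotChainManager`, and `getNextActiveSnapshot`, not the actual Ozone API. The point is only that the variable tested in the loop condition has to be advanced inside the loop body.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for SnapshotInfo and SnapshotChainManager. None of the
// type or method names below come from the Ozone code base.
final class SnapshotChainWalkSketch {

  enum Status { ACTIVE, DELETED }

  static final class Snap {
    final String id;
    final Status status;

    Snap(String id, Status status) {
      this.id = id;
      this.status = status;
    }
  }

  // Ordered chain of snapshots for one bucket path, oldest first.
  static final class Chain {
    private final List<Snap> snaps;

    Chain(List<Snap> snaps) {
      this.snaps = new ArrayList<>(snaps);
    }

    private int indexOf(Snap snap) {
      for (int i = 0; i < snaps.size(); i++) {
        if (snaps.get(i).id.equals(snap.id)) {
          return i;
        }
      }
      throw new IllegalArgumentException("Unknown snapshot: " + snap.id);
    }

    boolean hasNext(Snap snap) {
      return indexOf(snap) + 1 < snaps.size();
    }

    Snap next(Snap snap) {
      return snaps.get(indexOf(snap) + 1);
    }
  }

  // Returns the first ACTIVE snapshot after start, or null if there is none.
  static Snap nextActive(Snap start, Chain chain) {
    Snap cursor = start;
    while (chain.hasNext(cursor)) {
      Snap candidate = chain.next(cursor);
      if (candidate.status == Status.ACTIVE) {
        return candidate;
      }
      // Advance the cursor. If this assignment were missing, hasNext(cursor)
      // would be evaluated on the same snapshot forever.
      cursor = candidate;
    }
    return null;
  }

  public static void main(String[] args) {
    Snap s1 = new Snap("s1", Status.DELETED);
    Snap s2 = new Snap("s2", Status.DELETED);
    Snap s3 = new Snap("s3", Status.ACTIVE);
    Chain chain = new Chain(Arrays.asList(s1, s2, s3));

    Snap found = nextActive(s1, chain);
    System.out.println(found == null ? "none" : found.id); // prints "s3"
  }
}
```

In this sketch, dropping the `cursor = candidate` line reproduces the hang whenever the snapshot right after `start` is not active.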
What changes were proposed in this pull request?
The patch implements SnapshotDeletingService. It goes through the deleted snapshot's deletedTable and does either of the following.
[...] deletedTable of the current snapshot.
Follow-up TODO
Right now the SnapshotDeletingService doesn't handle the following; it will be done in the next patch. Tracked here: HDDS-7883
OMSnapshotPurgeRequest will do these cleanups.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-7740
How was this patch tested?
The patch was tested with UTs and manual testing.