Skip to content

Conversation

@kaijchen
Copy link
Member

@kaijchen kaijchen commented Mar 24, 2022

What changes were proposed in this pull request?

New class OmOpenKeyInfo was introduced.

Changed the return value of getExpiredOpenKeys from List<String> to List<OmOpenKeyInfo>.
So we don't have to construct OpenKeyBucket by parsing the open key names.
And it enables returning regular keys with FSO keys in the same call.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-6506

How was this patch tested?

Unit tests

@kaijchen
Copy link
Member Author

CC @errose28 and @captainzmc for review.

@kaijchen
Copy link
Member Author

Marking this as WIP to introduce OmOpenKeyInfo class.

@kaijchen kaijchen marked this pull request as draft March 25, 2022 07:26
@kaijchen kaijchen changed the title HDDS-6506. Return List<OpenKeyBucket> in getExpiredOpenKeys HDDS-6506. Return List<OpenKeyInfo> in getExpiredOpenKeys Mar 25, 2022
@kaijchen kaijchen changed the title HDDS-6506. Return List<OpenKeyInfo> in getExpiredOpenKeys HDDS-6506. Return List<OmOpenKeyInfo> in getExpiredOpenKeys Mar 25, 2022
@kaijchen kaijchen changed the title HDDS-6506. Return List<OmOpenKeyInfo> in getExpiredOpenKeys HDDS-6506. Introduce OmOpenKeyInfo for getExpiredOpenKeys Mar 25, 2022
@kaijchen
Copy link
Member Author

HDDS-6509 is included to avoid the checkstyle issue in the new class.
It will be removed when #3237 gets merged.

@kaijchen kaijchen marked this pull request as ready for review March 25, 2022 09:04
Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @kaijchen. I agree that splitting the return value out of a string and into an object will be easier to work with later, but could you clarify why we need the client ID and parent ID fields? My understanding is that we would need 4 fields:

  • volume name
  • bucket name
  • key name (regardless of FSO or not)
  • Whether key is in open file table or open key table

I think just taking the key name from the open key/file table, sending out via Ratis, and having followers delete the key by that name as it appears in the table should work, as long as they know which table the key came from.

Note that we cannot check the bucket layout type anywhere in the open key cleanup service, since it is a background operation. The original bucket may have been deleted, and a new one created with the same name and different layout, since the service is seeing the key.

@kaijchen
Copy link
Member Author

kaijchen commented Apr 6, 2022

Thanks for reviewing @errose28. HDDS-6491 shows how this class will be used.

but could you clarify why we need the client ID and parent ID fields?

Default key name in the openKeyTable /volume/bucket/key/clientID.
FSO key name in the openFileTable parentID/key/clientID.

Note that we cannot check the bucket layout type anywhere in the open key cleanup service

That's why I also included BucketLayout in OmOpenKeyInfo.

@errose28
Copy link
Contributor

errose28 commented Apr 6, 2022

Thanks for the quick reply.

Default key name in the openKeyTable /volume/bucket/key/clientID.
FSO key name in the openFileTable parentID/key/clientID.

This is correct, but the service is cleaning out the table regardless of the key format. It iterates the correct table, and posts deletions for the keys it finds. With the current implementation, we will find a key, split it into parts, then re-assemble it again later to get the key we need to delete. As long as we know what table we got the key from, we can leave it assembled and still retrieve it again.

That's why I also included BucketLayout in OmOpenKeyInfo.

Yes, I agree we need this field in the OmOpenKeyInfo, but it is currently set using OmMetadataManagerImpl#getBucketLayout which is a stub always returning DEFAULT. Even if the method were corrected to take a bucket name and check the layout, this information could be incorrect in the scenarios given in my previous comment.

IMO there should be two methods, maybe with a helper method to handle common code: OmMetadataManagerImpl#getExpiredOpenKeys and OmMetadataManagerImpl#getExpiredOpenFiles. Each method scans only its corresponding table, and sets the bucketLayout flag according to the table it scanned. This also seems consistent with the plan described in HDDS-6491, where the cleanup service must be able to choose which table should be scanned on the current run.

@kaijchen
Copy link
Member Author

kaijchen commented Apr 6, 2022

Even if the method were corrected to take a bucket name and check the layout, this information could be incorrect in the scenarios given in my previous comment.

The method can take a bucketLayout parameter as in the draft of HDDS-6491.
It's a design choice between selecting bucketLayout in getExpiredOpenKeys() or in OpenKeyCleanupService#call().

@kaijchen
Copy link
Member Author

kaijchen commented Apr 6, 2022

@errose28 Do you think it would be better to return List<OpenKeyBucket> in getExpiredOpenKeys?

For example: kaijchen/ozone@master...kaijchen:HDDS-6506-dev

@kaijchen
Copy link
Member Author

kaijchen commented Apr 19, 2022

Closing this for now. Please refer to #3226.

@kaijchen kaijchen closed this Apr 19, 2022
@kaijchen kaijchen deleted the HDDS-6506 branch October 20, 2022 03:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants