Conversation

@vtutrinov
Contributor

What changes were proposed in this pull request?

Cache Ozone key details for a short period to avoid spamming OM with identical consecutive requests.

When a large file is read through S3G, repeated requests for the same Ozone key details are sent to OM, even though each response is identical. The result of the first request can therefore be cached, and subsequent requests for the same key served from the cache.
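The idea above can be sketched as a small per-key TTL cache in front of the OM lookup. This is a minimal illustration, not the patch's actual code; the class and method names (`KeyDetailsCache`, `get`, `invalidate`) are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Minimal sketch of a per-key TTL cache; names are illustrative, not Ozone's API. */
public class KeyDetailsCache<K, V> {
  private static final class Entry<V> {
    final V value;
    final long expiresAtMillis;
    Entry(V value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
  private final long ttlMillis;

  public KeyDetailsCache(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  /** Returns the cached value if still fresh, otherwise loads (e.g. from OM) and caches it. */
  public V get(K key, Supplier<V> loader) {
    long now = System.currentTimeMillis();
    Entry<V> e = cache.get(key);
    if (e != null && now < e.expiresAtMillis) {
      return e.value;          // cache hit: no OM round trip
    }
    V value = loader.get();    // cache miss or expired entry: ask OM
    cache.put(key, new Entry<>(value, now + ttlMillis));
    return value;
  }

  /** Invalidate on key update/delete so writes through this S3G see fresh details. */
  public void invalidate(K key) {
    cache.remove(key);
  }
}
```

With a short TTL, consecutive ranged reads of the same object hit the cache instead of OM, while explicit invalidation covers updates routed through the same gateway.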

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-11636

How was this patch tested?

A new unit test checks that key details are retrieved from the cache when consecutive requests arrive within a short period. Existing robot tests verify that the cache is invalidated when the key, key metadata, or key tags are updated.

@ivandika3
Contributor

ivandika3 commented Nov 11, 2024

@vtutrinov Thanks for the patch.

While I understand the reasoning behind this change is to reduce the number of OM calls, I'm not sure about this caching solution. A single S3G will work correctly, but the consistency guarantee breaks quickly once there are concurrent requests to multiple S3Gs (e.g. a reverse proxy pointing to multiple S3Gs). That's why S3G should be as stateless as possible.

For example, suppose there are two S3Gs (s3g1 and s3g2) for a single "key1". If a request downloads "key1" through s3g1, it will be cached in s3g1. Afterwards, another request uploads "key1" with a new body through s3g2. The cache in s3g1 will not be invalidated, and a download request to s3g1 will still return the stale key content. This means there are two versions of the same object until the cache entry expires. Even worse, the data of the overwritten key might already be deleted while its details are still in the S3G cache (if the cache TTL is very large and the underlying blocks have been deleted).
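The stale-read scenario above can be reduced to a toy model: two gateways with independent local caches in front of one authoritative store. This is a deliberately simplified demo (plain `HashMap`s standing in for OM and the two S3G caches), not Ozone code.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy demo of the cross-gateway stale-read problem; not Ozone code. */
public class StaleCacheDemo {
  static final Map<String, String> om = new HashMap<>();        // authoritative store (OM)
  static final Map<String, String> s3g1Cache = new HashMap<>(); // s3g1's local cache
  static final Map<String, String> s3g2Cache = new HashMap<>(); // s3g2's local cache

  /** Read through a gateway: serve from its cache, fall through to OM on a miss. */
  static String readVia(Map<String, String> cache, String key) {
    return cache.computeIfAbsent(key, om::get);
  }

  /** Write through a gateway: update OM, but only that gateway's cache is invalidated. */
  static void writeVia(Map<String, String> cache, String key, String value) {
    om.put(key, value);
    cache.remove(key);
  }

  public static void main(String[] args) {
    om.put("key1", "v1");
    readVia(s3g1Cache, "key1");        // s3g1 caches "v1"
    writeVia(s3g2Cache, "key1", "v2"); // overwrite through s3g2
    // s3g1 keeps serving the stale version until its entry expires:
    System.out.println(readVia(s3g1Cache, "key1")); // prints v1 (stale)
    System.out.println(readVia(s3g2Cache, "key1")); // prints v2
  }
}
```

The write through s3g2 never reaches s3g1's cache, so the two gateways disagree about the object's content, which is exactly why a cross-gateway coherence mechanism would be needed.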

IMO, a correct implementation would require some kind of cache coherence mechanism (a witness) similar to the one described at https://www.allthingsdistributed.com/2021/04/s3-strong-consistency.html

I think we can first revisit #5565 to implement ranged GET requests for a specified part on the OM side (instead of only in S3G), so that OM returns only the List<OmKeyInfoLocation> belonging to the particular part, which reduces the size of the message returned by OM. However, we first need to verify whether the AWS S3 SDK downloads large MPU keys using ranged GETs per MPU part.

@vtutrinov
Contributor Author

@ivandika3 Thanks for the comment; that sounds reasonable. I will rethink the solution in more depth.

@adoroszlai adoroszlai marked this pull request as draft November 11, 2024 15:21
@ivandika3
Contributor

ivandika3 commented Mar 30, 2025

Let's close this for now, since the alternative implementation of HDDS-11699 (#7558) has been merged.

Introducing a caching layer to Ozone would be a large endeavor and should require a comprehensive design doc. We can explore these two systems that implement a caching layer on top of object storage.

@ivandika3 ivandika3 closed this Mar 30, 2025