-
Notifications
You must be signed in to change notification settings - Fork 3k
[Draft] Snapshot reference retention evaluation #4006
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Draft] Snapshot reference retention evaluation #4006
Conversation
Followed up on comments on apache#3104 Remove unused interface method for getting at all snapshot refs for a given snapshot id. TableMetadata removeSnapshots accepts a list of snapshot IDs. It's up to callers to build a list of the expected snapshot IDs to remove. Added validation that a -1 snapshot id can only happen for the main branch reference. Co-authored-by: wangzeyu <1249369293@qq.com> Co-authored-by: Jack Ye <yzhaoqin@amazon.com>
|
|
||
| public SnapshotRef build() { | ||
| return new SnapshotRef(snapshotId, type, minSnapshotsToKeep, maxSnapshotAgeMs, maxRefAgeMs); | ||
| return new SnapshotRef(snapshotId, type, minSnapshotsToKeep, maxSnapshotAgeMs, maxRefAgeMs, System.currentTimeMillis()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am doubting this is the right way to set the timestamp. We'll need to make sure this is accurate at the time we commit the metadata operation.
|
@jackye1995 @rdblue It would be great to get your feedback on this Draft PR. Really appreciate it! |
4385f9a to
eeae071
Compare
eeae071 to
c54b9fa
Compare
| for (SnapshotRef ref : refs) { | ||
| if (ref.type().equals(SnapshotRefType.BRANCH)) { | ||
| Snapshot startingSnapshot = base.snapshot(ref.snapshotId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are looping over all refs including ones that could have been expired. Then when evaluating retention we use the branch policy. Although, the ref is expired. If we want the default to be that we fall back to global policy if the reference itself is expired, we should update this logic.
| Set<Long> ancestorsToRetain = Sets.newHashSet(); | ||
| while (ancestorIdx < ancestors.size()) { | ||
| Snapshot ancestor = tableMetadata.snapshot(ancestors.get(ancestorIdx)); | ||
| long comparisonAge = maxSnapshotAge == null ? globalExpireOlderThan : Math.max(maxSnapshotAge, globalExpireOlderThan); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This depends on what we want the global expireOlderThan to be used for (if at all). Currently as implemented, the branch does not really "override" the global policy if the global policy has only allows for older snapshots to be expired.
This is a draft PR for addressing #3897. It builds on the implementation of references in this PR #3883 . While we get clarity on what the APIs should look like, wanted to start a draft PR on the retention logic based on https://docs.google.com/document/d/1PvxK_0ebEoX3s7nS6-LOJJZdBYr_olTWH9oepNUfJ-A/edit#.
Some aspects I want to get feedback on:
1.) Currently, for the global expiration age, we set the expiration age in the operation based on a timestamp obtained in the constructor (this can be overridden in a setter as well). However in the calculation in the draft we are using timestamps closer to the time of calculating what should be retained. So we should certainly define a consistent time for comparisons.
2.) For the retention policy evaluation, we identify what are the snapshots to retain based on branch policies such as minSnapshots and max age for the branch and the global max age. After this it could be possible that we have not retained the min snapshots for the table level. So we go through the snapshots we would expire, using a heap to identify what are the latest snapshots to retain to reach the global table policy. Those snapshots are removed from expiration, and the remaining set is then passed to the removeSnapshots API. This logic maybe overkill so would like to get feedback on that.
3.) Related to 2, would also like to get feedback on how global snapshot age should fit in with retention? Should it be possible for global snapshot age to override what's on the branch level (as implemented)? This would force users to really have to think about what's defined at the table level and the branch level to make sure they get the retention experience they desire.
There's a lot of code cleanup to do, and will definitely write tests which span multiple commit graph shapes and retention policies as we get clarity on what we want the behavior to be.
Thank you!