Skip to content

feat: batch expired snapshots#5486

Merged
eakmanrq merged 2 commits intomainfrom
eakmanrq/add_batching_to_expired_snapshots
Oct 8, 2025
Merged

feat: batch expired snapshots#5486
eakmanrq merged 2 commits intomainfrom
eakmanrq/add_batching_to_expired_snapshots

Conversation

@eakmanrq
Copy link
Collaborator

@eakmanrq eakmanrq commented Oct 6, 2025

Batches the deletion of expired snapshots by paging over the candidates using the updated_at, name, and identifier as stable fields to do the paging. Simple viz explaining what is going on (just shows updated_at but know that it includes all 3 columns):

Screenshot 2025-10-06 at 9 46 31 AM

In that example, 3 batches are processed. Getting expired snapshots a limited since it would be unbounded otherwise. Instead of passing though snapshots directly into delete instead the "boundary" is passed into the delete. That becomes the "upper boundary" for the delete to then look backwards to perform the deletes. Limit is not needed since this is bounded. There is an underlying assumption that these upper boundaries are provided in paged order (on other words, the last batch isn't provided first since that would remove the benefit of batching).

This approach compared to passing in the deleted snapshot ids directly has the following benefits:

  • In a client/server architecture this places less trust on the input from the client by asking for just the boundary instead of the explicit list of snapshot ids to delete
  • Also in client/server architecture, less information needs to be passed to perform the delete
  • If something were to change between the calculation of the batch and the eventual deletion then this approach will correctly not delete the snapshot while if the explicit list if provided then the delete would still happen

@eakmanrq eakmanrq force-pushed the eakmanrq/add_batching_to_expired_snapshots branch 2 times, most recently from c12b748 to afe4d1e Compare October 6, 2025 21:09
Copy link
Collaborator

@erindru erindru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite grok UpperBatchBoundary vs LowerBatchBoundary and when to use one vs the other. I can see the problem that it solves though

batch_boundary: BatchBoundary,
current_ts: t.Optional[int] = None,
ignore_ttl: bool = False,
) -> t.Optional[ExpiredSnapshotBatch]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a reason to make this t.Optional vs just returning an empty list if there are no expired snapshots like it did before?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is correct

@eakmanrq eakmanrq force-pushed the eakmanrq/add_batching_to_expired_snapshots branch from afe4d1e to b3c689f Compare October 6, 2025 22:09
@eakmanrq eakmanrq force-pushed the eakmanrq/add_batching_to_expired_snapshots branch 5 times, most recently from 17019ad to 1096c84 Compare October 7, 2025 15:31

def delete_expired_snapshots(
state_sync: StateSync,
snapshot_evaluator: SnapshotEvaluator,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, I don't think this belongs here. I don't believe State Sync should have anything to do with SnapshotEvaluator. Plus this creates a circular dependency between modules.

The logical relationship should be SnapshotEvaluator -> StateSync.

Anticipating your question - yes, I think putting cleanup_expired_views in here was a mistake.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do you want it located then?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps it's time to create the core/janitor.py module?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we consider this out of scope for this PR? I'm open to following up if you think it is important but it doesn't seem like this should be required to get this change in.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. But I do think it's important

@eakmanrq eakmanrq force-pushed the eakmanrq/add_batching_to_expired_snapshots branch 2 times, most recently from 08c6053 to 0f2211c Compare October 7, 2025 17:16
@eakmanrq eakmanrq force-pushed the eakmanrq/add_batching_to_expired_snapshots branch from 0f2211c to c95209a Compare October 7, 2025 17:39
@eakmanrq eakmanrq merged commit bbdbd48 into main Oct 8, 2025
34 of 36 checks passed
@eakmanrq eakmanrq deleted the eakmanrq/add_batching_to_expired_snapshots branch October 8, 2025 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants