JAMES-4156 ADR: Deleted message vault single bucket usage #2894
Open: Arsnael wants to merge 2 commits into apache:master from Arsnael:adr-dmv-single-bucket
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| # 76. Deleted Message Vault: single bucket usage | ||
|
|
||
| Date: 2026-02-01 | ||
|
|
||
| ## Status | ||
|
|
||
| Accepted (lazy consensus). | ||
|
|
||
| ## Context | ||
|
|
||
| The [deleted message vault](https://issues.apache.org/jira/browse/JAMES-4156) currently uses multiple buckets to store users' deleted messages. Each bucket is generated | ||
| with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`. | ||
|
|
||
| Then, when we run the purge task, every bucket older than the defined retention period is deleted. | ||
|
|
||
| However, this solution can be costly in terms of bucket count on S3 object stores, and can affect performance by | ||
| issuing multiple API calls on multiple buckets at once. Also, some providers, like OVH, limit the number of buckets per account. | ||
|
|
||
| ## Decision | ||
|
|
||
| Use a single bucket to store deleted messages instead. Objects in the single bucket would follow this naming pattern: | ||
| `[year]/[month]/[blob_id]`. S3 buckets are flat, but we can still use the year and month as a prefix of the object name. | ||
|
|
||
| For this we can: | ||
|
|
||
| - provide a new implementation of the blob store deleted message vault that stores deleted messages in a single bucket. | ||
| - write only to the single bucket, falling back to the old buckets for reads and deletes when necessary. | ||
| - add the single-bucket case to the purge task, which would clean both the new and the old buckets. | ||
|
|
||
| ## Consequences | ||
|
|
||
| - easier to maintain, only one bucket | ||
| - keep the bucket count for James low on S3 object storages | ||
| - read/write/delete operations on only one bucket, not multiple | ||
| - James would no longer require rights to create buckets at runtime when the deleted message vault is enabled | ||
| - migration is simple: old buckets will get removed with time until only the new single bucket remains | ||
|
|
||
| ## Alternatives | ||
|
|
||
| Specific James implementations could override an unchanged deleted message vault and provide their own. However, we believe | ||
| the problem and complexity of operating atop multiple buckets is detrimental to others in the community, for minimal to no gain. | ||
|
|
||
|
||
| ## References | ||
|
|
||
| - [0075-deleted-message-vault.md](0075-deleted-message-vault.md) | ||
| - [Deleted Message Vault: use a single bucket](https://issues.apache.org/jira/browse/JAMES-4156) | ||
|
|
||
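The naming and purge rules described in the ADR above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from this PR or from the James codebase: the class `SingleBucketVaultSketch` and all of its methods are hypothetical, zero-padding of the year and month is assumed, and the purge rule (a month becomes purgeable once its last day is older than the retention cutoff) is one possible reading of "older than the defined retention period".

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.Locale;
import java.util.Optional;

// Hypothetical sketch: none of these names exist in James.
final class SingleBucketVaultSketch {

    // Object name inside the single bucket: "[year]/[month]/[blob_id]".
    // Zero-padding of year and month is an assumption.
    static String objectName(YearMonth month, String blobId) {
        return String.format(Locale.ROOT, "%04d/%02d/%s",
            month.getYear(), month.getMonthValue(), blobId);
    }

    // Legacy per-month bucket name: "deleted-messages-[year]-[month]-01".
    static String legacyBucketName(YearMonth month) {
        return String.format(Locale.ROOT, "deleted-messages-%04d-%02d-01",
            month.getYear(), month.getMonthValue());
    }

    // Recovers the month from the "[year]/[month]/" prefix of an object name.
    // Returns empty when the name does not follow the pattern.
    static Optional<YearMonth> monthOf(String objectName) {
        String[] parts = objectName.split("/", 3);
        if (parts.length != 3) {
            return Optional.empty();
        }
        try {
            return Optional.of(YearMonth.of(
                Integer.parseInt(parts[0]), Integer.parseInt(parts[1])));
        } catch (RuntimeException e) {
            // NumberFormatException (not a number) or DateTimeException (bad month).
            return Optional.empty();
        }
    }

    // Assumed purge rule: a month can be purged once its last day
    // is strictly before (today - retentionDays).
    static boolean isPurgeable(YearMonth month, LocalDate today, int retentionDays) {
        LocalDate cutoff = today.minusDays(retentionDays);
        return month.atEndOfMonth().isBefore(cutoff);
    }

    public static void main(String[] args) {
        YearMonth month = YearMonth.of(2026, 2);
        System.out.println(objectName(month, "abc"));
        System.out.println(legacyBucketName(month));
        System.out.println(isPurgeable(YearMonth.of(2025, 1), LocalDate.of(2026, 2, 1), 365));
    }
}
```

Because the year and month form a key prefix, a purge task under this scheme could list objects by prefix within the one bucket and apply the same month-level check to the legacy bucket names during the transition.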
I would expect `src/adr` to be about James architecture. If I understand correctly, this one is about the Deleted Message Vault extension.
If so, maybe introducing a way to split extensions ADRs from the main codebase would help limit the amount of ADRs one has to read to understand James?
+1 on the overall goal to keep James understandable.
On my side, a conflicting goal comes to mind: discoverability. Having a single folder for ADRs makes it a single place to speak architecture, and at least I know where to look.
There might be a nice solution to reconcile both goals (like referencing new ADR locations from the main James ADR location, e.g. establishing `mailbox/src/adr`?). Maybe other projects using ADRs have already solved this problem. Also, would we intend to relocate the concerned ADRs?
IMO it would be an awesome mailing list topic!
@mbaechler Not against it, but I agree with @chibenwa that it should then be discussed with the community (ML) and handled in another PR once we reach a proper consensus.
If we start having ADRs all over the place, though, I'm afraid it might be confusing. What about subfolders instead?
I think it's clear: it's still in one place, subfolders make it easy for someone to look only at the core James ADRs if they just wish to understand James' basic core concepts, and extension ADRs are still easy to find if one needs them. IMO it would be confusing to have to look into the modules of those extensions to find potential ADRs related to them.
Will start an ML thread on this today anyway :)
+1
ML started
@Arsnael your proposal seems nice
Hmmm seems my mail did not reach the server-dev ML for some reason sorry... Will see
Alright mistake on the destination first time (I changed laptop sorry)... Now the ML is created.
@mbaechler are you ok if we merge this ADR now and reorganize later when the consensus is reached via the ML discussion? :)