From 38b459943f0151e5129d321ecfcfd2c8f48540f2 Mon Sep 17 00:00:00 2001
From: Rene Cordier
Date: Fri, 2 Jan 2026 15:34:31 +0700
Subject: [PATCH 1/2] JAMES-4156 ADR: Deleted message vault single bucket usage

---
 ...076-deleted-message-vault-single-bucket.md | 40 +++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 src/adr/0076-deleted-message-vault-single-bucket.md

diff --git a/src/adr/0076-deleted-message-vault-single-bucket.md b/src/adr/0076-deleted-message-vault-single-bucket.md
new file mode 100644
index 00000000000..9feb334a7e8
--- /dev/null
+++ b/src/adr/0076-deleted-message-vault-single-bucket.md
@@ -0,0 +1,40 @@
+# 76. Deleted Message Vault: single bucket usage
+
+Date: 2026-02-01
+
+## Status
+
+Accepted (lazy consensus).
+
+## Context
+
+At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated
+with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`.
+
+Then we when run the GC tasks, every bucket that is older than the defined retention period is being deleted.
+
+However, this solution can be a bit costly in terms of bucket count with S3 object storages and can affect performance by
+doing multiple API calls on multiple buckets at once.
+
+## Decision
+
+Using a single bucket for storing deleted messages instead! The objects in the single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year and month as a prefix for the object name.
+
+For this we can:
+
+- provide a new implementation for the blob store deleted message vault that would store deleted messages on a single bucket.
+- write only on the single bucket, fall back if necessary on old buckets for read and delete
+- add the single bucket usage case to the GC task, that would do cleaning on both new and old buckets.
+
+## Consequences
+
+- easier to maintain, only one bucket!
+- keep the bucket count for James low on S3 object storages
+- read/write/delete operations on only one bucket, not multiple.
+
+# References
+
+- [0075-deleted-message-vault.md](0075-deleted-message-vault.md)
+- [Deleted Message Vault: use a single bucket](https://issues.apache.org/jira/browse/JAMES-4156)
+

From d4de2ca8bdb91cde52433028da52399473428ce3 Mon Sep 17 00:00:00 2001
From: Rene Cordier
Date: Mon, 5 Jan 2026 10:19:33 +0700
Subject: [PATCH 2/2] fixup! JAMES-4156 ADR: Deleted message vault single bucket usage

---
 ...076-deleted-message-vault-single-bucket.md | 23 ++++++++++++-------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/adr/0076-deleted-message-vault-single-bucket.md b/src/adr/0076-deleted-message-vault-single-bucket.md
index 9feb334a7e8..e7c7ba6e6b6 100644
--- a/src/adr/0076-deleted-message-vault-single-bucket.md
+++ b/src/adr/0076-deleted-message-vault-single-bucket.md
@@ -8,30 +8,37 @@ Accepted (lazy consensus).
 
 ## Context
 
-At the moment, the current deleted message vault uses multiple buckets to store deleted messages of users. Each bucket is generated
+At the moment, the current [deleted message vault](https://issues.apache.org/jira/browse/JAMES-4156) uses multiple buckets to store deleted messages of users. Each bucket is generated
 with a name corresponding to a year and a month, following this pattern: `deleted-messages-[year]-[month]-01`.
 
-Then we when run the GC tasks, every bucket that is older than the defined retention period is being deleted.
+Then we when run the purge tasks, every bucket that is older than the defined retention period is being deleted.
 
 However, this solution can be a bit costly in terms of bucket count with S3 object storages and can affect performance by
-doing multiple API calls on multiple buckets at once.
+doing multiple API calls on multiple buckets at once. Also some provider, like OVH, put limits on count of buckets per account.
 
 ## Decision
 
-Using a single bucket for storing deleted messages instead! The objects in the single bucket would be following this name pattern:
-`[year]/[month]/[blob_id]`. S3 buckets are flat but we cna still use the year and month as a prefix for the object name.
+Using a single bucket for storing deleted messages instead. The objects in the single bucket would be following this name pattern:
+`[year]/[month]/[blob_id]`. S3 buckets are flat but we can still use the year and month as a prefix for the object name.
 
 For this we can:
 
 - provide a new implementation for the blob store deleted message vault that would store deleted messages on a single bucket.
 - write only on the single bucket, fall back if necessary on old buckets for read and delete
-- add the single bucket usage case to the GC task, that would do cleaning on both new and old buckets.
+- add the single bucket usage case to the purge task, that would do cleaning on both new and old buckets.
 
 ## Consequences
 
-- easier to maintain, only one bucket!
+- easier to maintain, only one bucket
 - keep the bucket count for James low on S3 object storages
-- read/write/delete operations on only one bucket, not multiple.
+- read/write/delete operations on only one bucket, not multiple
+- James would no longer require rights to create buckets at runtime when the deleted message vault is enabled
+- migration is simple: old buckets will get removed with time until only the new single bucket remains
+
+# Alternatives
+
+Specific James implementation could overload an unchanged deleted message vault and provide their own however we believe
+the problem and complexity of operating atop multiple bucket is detrimental for others in the community for minimal to no gains.
 
 # References