[PULP-402] Greatly improve memory usage during collection syncs#2454
Conversation
The memory reduction comes in two parts: 1. Refactor first stage to limit number of coroutines running at once to 100 2. Avoid when possible json serialization of docs_blob, files, manifest & contents fields The memory growth should remain flat (or extremely minimal) after first batch (500) processing of collection versions. Most syncs should now complete with less than 1 GB of memory usage. https://issues.redhat.com/browse/PULP-402 Assssisted by: clause-opus-4.6
mdellweg
left a comment
There was a problem hiding this comment.
This is a rather huge change. But there is a recurring theme: Too much unbounded data fields (do we really need them) in the database put a lot of pressure on the memory. Can we gain something by moving them into a separate one-to-one table?
| async def _fetch_collection_version_metadata(self, api_version, collection_version_url): | ||
| async def _fetch_collection_version_metadata( | ||
| self, api_version, collection_version_url | ||
| ) -> list[Coroutine]: |
There was a problem hiding this comment.
There's probably no use for adding type hints if you don't check them. Also I think what you want to return is Awaitable.
| with open(docs_blob_path, "r") as docs_blob_file: | ||
| blob = docs_blob_file.read() | ||
| sql = ( | ||
| "UPDATE ansible_collectionversion" |
There was a problem hiding this comment.
This sounds particularly troubling. Content is supposed to be immutable, because it is shared.
Update violates it. That means there is a race condition here.
There was a problem hiding this comment.
My current reasoning is that there is no race condition based on the checks I do before hand. Before we reach the ContentSaver stage we go through the QueryContent stage, so there are two outcomes from this. 1. Content did not exists before, 2. Content exists and was replaced on the dcontent. If the content already existed then this part isn't ran. Now if we are creating the content there are two scenarios: 1. We are the first task to create the content, 2. We are actually the second+ task to create it, another parallel task/upload got to it first. The checks on whether the pulp_id is the same across _pre_save and _post_save is meant to distinguish between the two scenarios. We only want to update in scenario one to ensure only one task is updating the content post fact. This should prevent race conditions between parallel tasks.
Backport to 0.24: 💔 cherry-picking failed — conflicts found❌ Failed to cleanly apply afdaf7c on top of patchback/backports/0.24/afdaf7cdf96f38dda4b92e8d9eca3e661c2c181a/pr-2454 Backporting merged PR #2454 into main
🤖 @patchback |
Backport to 0.29: 💚 backport PR created✅ Backport PR branch: Backported as #2472 🤖 @patchback |
Backport to 0.25: 💔 cherry-picking failed — conflicts found❌ Failed to cleanly apply afdaf7c on top of patchback/backports/0.25/afdaf7cdf96f38dda4b92e8d9eca3e661c2c181a/pr-2454 Backporting merged PR #2454 into main
🤖 @patchback |
Backport to 0.28: 💔 cherry-picking failed — conflicts found❌ Failed to cleanly apply afdaf7c on top of patchback/backports/0.28/afdaf7cdf96f38dda4b92e8d9eca3e661c2c181a/pr-2454 Backporting merged PR #2454 into main
🤖 @patchback |
|
@gerrod3 Is the |
Backport to 0.22: 💔 cherry-picking failed — conflicts found❌ Failed to cleanly apply afdaf7c on top of patchback/backports/0.22/afdaf7cdf96f38dda4b92e8d9eca3e661c2c181a/pr-2454 Backporting merged PR #2454 into main
🤖 @patchback |
The memory reduction comes in two parts:
The memory growth should remain flat (or extremely minimal) after first batch (500) processing of collection versions. Most syncs should now complete with less than 1 GB of memory usage.
https://issues.redhat.com/browse/PULP-402
Assssisted by: clause-opus-4.6
Results
Perform a sync on a clean system with this requirements file:
{ 'collections': [ {'name': 'amazon.aws'}, {'name': 'community.general'}, {'name': 'ansible.netcommon'}, {'name': 'ansible.utils'}, {'name': 'ansible.posix'}, {'name': 'community.crypto'} ] }Explanation
The majority of the savings comes from not loading the JSON fields into memory as dicts. The fields can be relatively large dicts, with docs_blob sometimes being multiple MB. Python's pymalloc allocator fragments memory for the many small nested objects within the field's dict and never returns it to the OS, causing RSS to grow monotonically. We bypass the object allocation with three changes:
The second part of the savings comes from refactoring the first stage to limit the number of concurrently running coroutines. We spawn 1 coroutine per collection version to sync and each one does a fetch to grab its metadata. This metadata is again JSON that gets trapped in memory until the coroutine finishes. Because we ran them all at once this would cause a rapid increase of memory per batch that wouldn't decrease till the first stage finished. We fix this with these three changes:
runmethod uses a singlegatherwith a batched (100) list of the coroutines.add_collection_versionto be able to free it as soon as possible