Reduce memory usage during collection deletion #2492

Merged: gerrod3 merged 4 commits into pulp:main from Funi1234:fix/AAP-53296 on Apr 14, 2026

@Funi1234 Funi1234 commented Apr 2, 2026

Reduce memory usage during collection deletion

Problem

Deleting a collection with many versions causes worker processes to consume excessive memory and be killed with SIGKILL. This occurs because Django loads every field, including the large JSON fields, for each CollectionVersion into memory. Multiplied across many versions, the total grows large enough that the worker process is terminated.

Solution

This PR applies targeted QuerySet field selection using .only() to load only the fields needed for deletion operations, avoiding the large JSON fields that aren't required.
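
To make the effect of field selection concrete, here is a minimal sketch that does not use Django at all. It uses the standard-library sqlite3 module and a hypothetical collection_version table (the table layout, row count, and blob size are assumptions for illustration only) to compare how much data comes back when every column is selected versus only the primary key, which is roughly the column list that .only("pk") asks the database for.

```python
import sqlite3

# Hypothetical stand-in for the CollectionVersion table: a few small columns
# plus one large JSON-like text column (named after docs_blob in the PR text).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collection_version ("
    "id INTEGER PRIMARY KEY, namespace TEXT, name TEXT, version TEXT, docs_blob TEXT)"
)
big_blob = "x" * 100_000  # pretend this is a large serialized JSON document
conn.executemany(
    "INSERT INTO collection_version (namespace, name, version, docs_blob) "
    "VALUES (?, ?, ?, ?)",
    [("acme", "tools", f"1.0.{i}", big_blob) for i in range(50)],
)
conn.commit()


def fetched_chars(sql):
    """Return the total number of characters across all fetched values."""
    return sum(len(str(value)) for row in conn.execute(sql) for value in row)


# Roughly what an unrestricted queryset does: every column comes back.
all_fields = fetched_chars("SELECT * FROM collection_version")
# Roughly what .only("pk") does: only the primary key column is selected.
pk_only = fetched_chars("SELECT id FROM collection_version")

print(f"all fields: {all_fields} chars, pk only: {pk_only} chars")
```

In Django itself, a field left out by .only() is deferred rather than dropped: accessing it later on an instance triggers an extra query for that instance, so the selected fields should match what the code actually reads.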

Changes:

  • pulp_ansible/app/galaxy/v3/views.py:
    • Use .only("pk") when iterating collection versions and their repositories in CollectionViewSet.destroy()
    • Batch AnsibleRepository lookup with filter(pk__in=...) to prevent N+1 queries
    • Use .only("namespace", "name", "version") when loading collection dependents
  • pulp_ansible/app/tasks/deletion.py:
    • Use .only("pk") when loading collection versions and iterating their repositories in delete_collection() task

By loading only the required fields, we avoid pulling large JSON blobs (docs_blob, metadata, dependencies, etc.) into memory when they're not needed for the deletion operation.
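
The batched lookup mentioned in the list of changes can be illustrated in the same spirit. The sketch below is not the PR's code: it uses sqlite3 with a hypothetical repository table and made-up primary keys, and counts the SELECT statements issued when rows are fetched one primary key at a time versus with a single IN query, which is the kind of SQL that filter(pk__in=...) produces.

```python
import sqlite3

# Hypothetical repository table; names and ids are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repository (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO repository (id, name) VALUES (?, ?)",
    [(i, f"repo-{i}") for i in range(1, 21)],
)
conn.commit()

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement executed


def select_count():
    """Count the recorded statements that are SELECTs."""
    return sum(1 for s in statements if s.lstrip().upper().startswith("SELECT"))


wanted = [2, 5, 8, 13]

# N+1 style: one SELECT per primary key.
one_by_one = [
    conn.execute("SELECT id, name FROM repository WHERE id = ?", (pk,)).fetchone()
    for pk in wanted
]
n_plus_one_queries = select_count()

statements.clear()

# Batched style, analogous to filter(pk__in=wanted): a single SELECT.
placeholders = ", ".join("?" for _ in wanted)
batched = conn.execute(
    f"SELECT id, name FROM repository WHERE id IN ({placeholders}) ORDER BY id",
    wanted,
).fetchall()
batched_queries = select_count()

print(n_plus_one_queries, batched_queries)
```

Both approaches return the same rows; the batched form simply replaces one round trip per key with a single query.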

Related Work

Funi1234 commented Apr 2, 2026

This will need to be backported to 0.25 (AAP 2.4/2.5/2.6) and 0.28 (AAP upstream/2.7), please.

@gerrod3 gerrod3 left a comment

set_collection_deferred_fields is a blunt instrument used to work around having to change multiple spots in our collection sync code (some of them being in pulpcore!), and as such we shouldn't overuse it when trying to optimize pulp-ansible. In this case we can easily fix the issues by being smarter with the querysets we write.

Comment thread pulp_ansible/app/galaxy/v3/views.py Outdated
Comment thread pulp_ansible/app/tasks/deletion.py Outdated
@Funi1234 Funi1234 requested a review from gerrod3 April 4, 2026 21:43
Comment thread pulp_ansible/app/galaxy/v3/views.py
@Funi1234 Funi1234 requested a review from gerrod3 April 7, 2026 10:06

@gerrod3 gerrod3 left a comment

Thanks for the fixes!

@gerrod3 gerrod3 merged commit fca4331 into pulp:main Apr 14, 2026
13 of 14 checks passed

patchback Bot commented Apr 14, 2026

Backport to 0.21: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply fca4331 on top of patchback/backports/0.21/fca4331513d9d85b0580873555a48efbbc7c2d57/pr-2492

Backporting merged PR #2492 into main

  1. Ensure you have a local repo clone of your fork. Unless you cloned it
    from the upstream, this would be your origin remote.
  2. Make sure you have an upstream repo added as a remote too. In these
    instructions you'll refer to it by the name upstream. If you don't
    have it, here's how you can add it:
    $ git remote add upstream https://github.com/pulp/pulp_ansible.git
  3. Ensure you have the latest copy of upstream and prepare a branch
    that will hold the backported code:
    $ git fetch upstream
    $ git checkout -b patchback/backports/0.21/fca4331513d9d85b0580873555a48efbbc7c2d57/pr-2492 upstream/0.21
  4. Now, cherry-pick the contents of PR #2492 (Reduce memory usage during collection deletion) into that branch:
    $ git cherry-pick -x fca4331513d9d85b0580873555a48efbbc7c2d57
    If it fails with something like fatal: Commit fca4331513d9d85b0580873555a48efbbc7c2d57 is a merge but no -m option was given., add -m 1 as follows instead:
    $ git cherry-pick -m1 -x fca4331513d9d85b0580873555a48efbbc7c2d57
  5. At this point, you'll probably encounter some merge conflicts. You must
    resolve them so as to preserve the patch from PR #2492 as close to the
    original as possible.
  6. Push this branch to your fork on GitHub:
    $ git push origin patchback/backports/0.21/fca4331513d9d85b0580873555a48efbbc7c2d57/pr-2492
  7. Create a PR, ensure that the CI is green. If it's not — update it so that
    the tests and any other checks pass. This is it!
    Now relax and wait for the maintainers to process your pull request
    when they have some cycles to do reviews. Don't worry — they'll tell you if
    any improvements are necessary when the time comes!

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback Bot commented Apr 14, 2026

Backport to 0.22: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply fca4331 on top of patchback/backports/0.22/fca4331513d9d85b0580873555a48efbbc7c2d57/pr-2492

patchback Bot commented Apr 14, 2026

Backport to 0.24: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply fca4331 on top of patchback/backports/0.24/fca4331513d9d85b0580873555a48efbbc7c2d57/pr-2492

patchback Bot commented Apr 14, 2026

Backport to 0.29: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply fca4331 on top of patchback/backports/0.29/fca4331513d9d85b0580873555a48efbbc7c2d57/pr-2492

patchback Bot commented Apr 14, 2026

Backport to 0.25: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply fca4331 on top of patchback/backports/0.25/fca4331513d9d85b0580873555a48efbbc7c2d57/pr-2492

patchback Bot commented Apr 14, 2026

Backport to 0.28: 💔 cherry-picking failed — conflicts found

❌ Failed to cleanly apply fca4331 on top of patchback/backports/0.28/fca4331513d9d85b0580873555a48efbbc7c2d57/pr-2492

gerrod3 added a commit to gerrod3/pulp_ansible that referenced this pull request Apr 14, 2026
mdellweg pushed a commit that referenced this pull request Apr 15, 2026
mdellweg pushed a commit that referenced this pull request Apr 22, 2026
gerrod3 added a commit to gerrod3/pulp_ansible that referenced this pull request Apr 22, 2026
mdellweg pushed a commit that referenced this pull request Apr 23, 2026