Skip to content

helm-operator: reduce cache memory footprint#6377

Merged
joelanford merged 1 commit intooperator-framework:masterfrom
joelanford:helm-operator-cache-selectors
Oct 4, 2023
Merged

helm-operator: reduce cache memory footprint#6377
joelanford merged 1 commit intooperator-framework:masterfrom
joelanford:helm-operator-cache-selectors

Conversation

@joelanford
Copy link
Copy Markdown
Member

@joelanford joelanford commented Mar 24, 2023

Description of the change:
This PR updates the Helm operator to use a custom cache function so that we can:

  1. Use informer label selectors (rather than a predicate) to watch only the primary CRs that match the selector in the watches.yaml file
  2. Use informer label selectors to watch only the operand objects that are being managed by the operator (rather than ALL objects that have the same kind that the operator happens to be managing)

Motivation for the change:
This should be a significant improvement in the memory usage of helm-operator, especially on clusters with a large number of objects.

Closes #6255

Checklist

If the pull request includes user-facing changes, extra documentation is required:

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2023
@joelanford joelanford temporarily deployed to deploy March 24, 2023 01:37 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 01:37 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 01:37 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 01:37 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 01:37 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 01:37 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 01:37 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 01:37 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 01:37 — with GitHub Actions Inactive
@joelanford joelanford force-pushed the helm-operator-cache-selectors branch from eec96ff to 3611b3b Compare March 24, 2023 20:10
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:10 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:10 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:10 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:10 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:10 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:10 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:10 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:10 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:10 — with GitHub Actions Inactive
@joelanford joelanford force-pushed the helm-operator-cache-selectors branch from 3611b3b to 7ab70c8 Compare March 24, 2023 20:19
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:19 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:19 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:19 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:19 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:19 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:19 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:20 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy March 24, 2023 20:20 — with GitHub Actions Inactive
@joelanford joelanford reopened this Sep 19, 2023
@joelanford joelanford added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Sep 19, 2023
@joelanford joelanford force-pushed the helm-operator-cache-selectors branch from 0d1ab36 to c6eca8f Compare September 21, 2023 12:05
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 21, 2023
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:05 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:05 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:05 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:05 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:05 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:05 — with GitHub Actions Inactive
@joelanford joelanford force-pushed the helm-operator-cache-selectors branch from c6eca8f to fd49b96 Compare September 21, 2023 12:29
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:30 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:30 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:30 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:30 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:30 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy September 21, 2023 12:30 — with GitHub Actions Inactive
@joelanford joelanford force-pushed the helm-operator-cache-selectors branch from fd49b96 to ce8c010 Compare October 2, 2023 13:31
@joelanford joelanford temporarily deployed to deploy October 2, 2023 13:31 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy October 2, 2023 13:31 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy October 2, 2023 13:31 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy October 2, 2023 13:31 — with GitHub Actions Inactive
@joelanford joelanford temporarily deployed to deploy October 2, 2023 13:31 — with GitHub Actions Inactive
@joelanford joelanford marked this pull request as ready for review October 2, 2023 13:31
Comment thread internal/cmd/helm-operator/run/cmd.go Outdated
defaultSelector := labels.NewSelector().Add(*req)

return cache.BuilderWithOptions(cache.Options{
SelectorsByObject: selectorsByObject,
Copy link
Copy Markdown
Member

@varshaprasad96 varshaprasad96 Oct 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to understand this right, we cache objects based on labels from all the charts in watches.yaml only when - a single namespace, or all namespaces are watched. When multiple namespaces are watched, this gets overridden here (

options.NewCache = cache.MultiNamespacedCacheBuilder(strings.Split(namespace, ","))
) and we watch all the objects from the specified multiple namespaces. Is this intentional?

Seems like it shouldn't be the case, and if we are adding a label selector for caching then single or multiple namespaces shouldn't matter. Or am I missing something?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good catch. I totally missed that. We might need to bump the controller-runtime dependency to get the new and improved cache options to handle that case.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure on how urgent it is to have this in. We can either track this through an issue and make sure we get this fixed as a follow up after the bump or block the PR on #6514

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I managed to account for multiple namespaces without having to bump controller-runtime. PTAL!

Make use of label selectors used by informers for both the primary CR
and for chart manifest objects

Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
Copy link
Copy Markdown
Member

@varshaprasad96 varshaprasad96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm Indicates that a PR is ready to be merged. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Disable caching for Helm Operator when watching all namespaces to avoid OOMKilled in big clusters

5 participants