Skip to content

Migrate kubectl-datadog autoscaling cluster to client-go pager#2960

Merged
L3n41c merged 2 commits intomainfrom
lenaic/use-pager
Apr 30, 2026
Merged

Migrate kubectl-datadog autoscaling cluster to client-go pager#2960
L3n41c merged 2 commits intomainfrom
lenaic/use-pager

Conversation

@L3n41c
Copy link
Copy Markdown
Member

@L3n41c L3n41c commented Apr 29, 2026

What does this PR do?

Migrates the kubectl-datadog autoscaling cluster install/uninstall commands from manual Kubernetes list pagination loops (metav1.ListOptions{Limit, Continue}) to client-go's pager.ListPager.EachListItem. Replaces several lo.SomeBy / lo.ContainsBy calls with stdlib slices.ContainsFunc where the semantics match.

Motivation

The previous code mixed manual pagination loops at three sites (Node and Deployment listings) with non-paginated List() calls at others. Moving to pager.ListPager.EachListItem gives a consistent high-level idiom and extends pagination to two sites that previously risked OOM on dense clusters: detectClusterAutoscaler (Deployments cluster-wide) and listKarpenterNodes (label-filtered Nodes).

In GetNodesProperties, the per-batch EC2 work was implicitly tied to the K8s page size. The migration decouples them via an explicit ec2DescribeBatchSize = 100, and stores only the Node fields downstream needs (Labels, Taints) in a small pendingNode struct — avoiding both deep corev1.Node copies and the pinning of pager page backing arrays through long-lived *corev1.Node references.

Describe how to test/QA your changes

  • go test ./cmd/kubectl-datadog/autoscaling/cluster/...
  • make lint
  • make kubectl-datadog
  • Manual smoke test: kubectl datadog autoscaling cluster install and uninstall on an EKS cluster.

Possible Drawbacks / Trade-offs

  • detectClusterAutoscaler no longer warns when several Cluster Autoscaler Deployments coexist on the same cluster. The cluster-wide enumeration order is unspecified, so when multiple matches coexist (a configuration we don't expect in practice), which one wins is non-deterministic. We accept that to keep the scan short-circuited and avoid materialising every Deployment in memory.

Additional Notes

None.

🤖 Generated with Claude Code

Replace the manual metav1.ListOptions{Limit, Continue} pagination loops
with client-go's pager.ListPager.EachListItem in the kubectl-datadog
autoscaling cluster commands. Extend pagination to two sites that
previously fetched everything in one call: detectClusterAutoscaler
(Deployments cluster-wide) and listKarpenterNodes (label-filtered
Nodes).

In GetNodesProperties, decouple the EC2 batch size from the K8s page
size via an explicit ec2DescribeBatchSize, and use a small pendingNode
struct holding only Labels and Taints — the fields downstream needs —
to avoid copying full Node structs and pinning pager page backing
arrays.

Also replace lo.SomeBy / lo.ContainsBy with stdlib slices.ContainsFunc
where the semantics match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@L3n41c L3n41c added this to the v1.27.0 milestone Apr 29, 2026
@L3n41c L3n41c changed the title Migrate kubectl-datadog autoscaling cluster to client-go pager Migrate kubectl-datadog autoscaling cluster to client-go pager Apr 29, 2026
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 42.53731% with 77 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.37%. Comparing base (099f33f) to head (43b5c54).

Files with missing lines Patch % Lines
...toscaling/cluster/install/guess/nodesproperties.go 1.61% 61 Missing ⚠️
...datadog/autoscaling/cluster/uninstall/uninstall.go 0.00% 12 Missing ⚠️
...autoscaling/cluster/common/clusterinfo/classify.go 90.00% 3 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2960      +/-   ##
==========================================
- Coverage   41.38%   41.37%   -0.02%     
==========================================
  Files         327      327              
  Lines       28952    28960       +8     
==========================================
- Hits        11982    11981       -1     
- Misses      16109    16120      +11     
+ Partials      861      859       -2     
Flag Coverage Δ
unittests 41.37% <42.53%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...g/autoscaling/cluster/install/guess/eksautomode.go 88.88% <100.00%> (ø)
...oscaling/cluster/install/guess/foreignkarpenter.go 100.00% <100.00%> (ø)
...autoscaling/cluster/common/clusterinfo/classify.go 91.52% <90.00%> (+2.54%) ⬆️
...datadog/autoscaling/cluster/uninstall/uninstall.go 0.00% <0.00%> (ø)
...toscaling/cluster/install/guess/nodesproperties.go 11.93% <1.61%> (-0.72%) ⬇️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 099f33f...43b5c54. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@L3n41c
Copy link
Copy Markdown
Member Author

L3n41c commented Apr 29, 2026

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 👍

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@L3n41c L3n41c marked this pull request as ready for review April 30, 2026 10:15
@L3n41c L3n41c requested a review from a team April 30, 2026 10:15
@L3n41c L3n41c requested a review from a team as a code owner April 30, 2026 10:15
Fold the predicate into the slices.ContainsFunc lambda inside
hasKarpenterControllerContainer (its only caller), and collapse
"if a { return true } return b" into "return a || b" — both branches
were structurally equivalent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@datadog-prod-us1-4

This comment has been minimized.

@L3n41c L3n41c merged commit 658de74 into main Apr 30, 2026
44 of 45 checks passed
@L3n41c L3n41c deleted the lenaic/use-pager branch April 30, 2026 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants