Gather EndpointSlices rather than Endpoints #74084

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:master from danwinship:gather-extra-endpointslices
Feb 6, 2026

@danwinship
Contributor

The Endpoints API is deprecated (https://kubernetes.io/blog/2025/04/24/endpoints-deprecation/). We should be gathering EndpointSlices, not Endpoints. (The EndpointSlices have all of the information that the Endpoints do, plus information about terminating endpoints, topology, and dual-stack endpoints.)
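To make the difference concrete, here is a minimal Python sketch of the extra data an EndpointSlice carries. It assumes the gathered artifact is `discovery.k8s.io/v1` JSON; the helper name `summarize_slice` and the sample object are hypothetical and not part of this PR.

```python
from typing import Any


def summarize_slice(slice_obj: dict[str, Any]) -> dict[str, Any]:
    """Collect fields of one EndpointSlice that an Endpoints object does not carry."""
    endpoints = slice_obj.get("endpoints") or []
    # Addresses of endpoints whose "terminating" condition is set.
    terminating = [
        addr
        for ep in endpoints
        if (ep.get("conditions") or {}).get("terminating")
        for addr in ep.get("addresses", [])
    ]
    # Topology: the zone recorded for each endpoint, if any.
    zones = sorted({ep["zone"] for ep in endpoints if ep.get("zone")})
    return {
        # Dual-stack: each slice declares a single address family.
        "addressType": slice_obj.get("addressType"),
        "terminating": terminating,
        "zones": zones,
    }


# Hypothetical sample object, shaped like `oc get endpointslices -o json` items.
example_slice = {
    "apiVersion": "discovery.k8s.io/v1",
    "kind": "EndpointSlice",
    "metadata": {
        "name": "example-abc12",
        "labels": {"kubernetes.io/service-name": "example"},
    },
    "addressType": "IPv6",
    "endpoints": [
        {
            "addresses": ["fd00::1"],
            "conditions": {"ready": True, "serving": True, "terminating": False},
            "zone": "us-east1-b",
        },
        {
            "addresses": ["fd00::2"],
            "conditions": {"ready": False, "serving": True, "terminating": True},
            "zone": "us-east1-c",
        },
    ],
    "ports": [{"name": "http", "port": 8080, "protocol": "TCP"}],
}

summary = summarize_slice(example_slice)
print(summary)
```

None of the three summarized fields has a counterpart in the gathered Endpoints objects, which is the gap described above.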

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 28, 2026
@openshift-ci openshift-ci Bot requested review from neisw and smg247 January 28, 2026 17:08
@openshift-ci-robot
Contributor

[REHEARSALNOTIFIER]
@danwinship: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-cloud-provider-gcp-main-regression-clusterinfra-gcp-ipi-ccm | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.23-regression-clusterinfra-gcp-ipi-ccm | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.22-regression-clusterinfra-gcp-ipi-ccm | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.21-regression-clusterinfra-gcp-ipi-ccm | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.20-regression-clusterinfra-gcp-ipi-ccm | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.19-regression-clusterinfra-gcp-ipi-ccm | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.18-regression-clusterinfra-gcp-ipi-ccm | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.17-regression-clusterinfra-gcp-ipi-ccm | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-main-okd-scos-e2e-aws-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.21-okd-scos-e2e-aws-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.20-okd-scos-e2e-aws-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.19-okd-scos-e2e-aws-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.18-okd-scos-e2e-aws-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-main-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.23-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.22-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.21-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.20-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.19-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.18-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.17-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.16-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.15-e2e-gcp-ovn | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.14-e2e-gcp-ovn-techpreview | openshift/cloud-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cloud-provider-gcp-release-4.13-e2e-gcp-ovn-techpreview | openshift/cloud-provider-gcp | presubmit | Registry content changed |

A total of 36043 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.

A full list of affected jobs can be found here
Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@danwinship
Contributor Author

/pj-rehearse periodic-ci-openshift-release-master-ci-4.22-e2e-gcp-ovn

@openshift-ci-robot
Contributor

@danwinship: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@danwinship
Contributor Author

/test pull-ci-openshift-cloud-provider-gcp-main-e2e-gcp-ovn

@danwinship
Contributor Author

/pj-rehearse pull-ci-openshift-cloud-provider-gcp-main-e2e-gcp-ovn

@openshift-ci-robot
Contributor

@danwinship: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@danwinship
Contributor Author

whatever, e2e-gcp is busted right now, but the gather-extra step worked as expected and gathered EndpointSlices rather than Endpoints.

@danwinship
Contributor Author

/assign @rikatz
(Among other things, this will make it possible to validate openshift/cluster-dns-operator#457 based on its e2e artifacts, which isn't currently possible.)

@rikatz
Member

rikatz commented Jan 30, 2026

Even if it is deprecated, would it make sense to keep gathering both? I am not sure whether we (or the support team) have some tooling that may rely on it.

@danwinship
Contributor Author

If anyone is depending on Endpoints, they need to update their tooling to use EndpointSlices instead. Endpoints are inherently incomplete (no topology, no terminating endpoints, no dual-stack).

Also, must-gather is for debugging purposes, and for debugging purposes it does not make sense to look at Endpoints, since nothing running in the cluster actually uses Endpoints any more.

@rikatz
Member

rikatz commented Jan 30, 2026

Yes, I mean someone using that for debug tools, etc. I don't really oppose it; I am just not sure whether someone will miss it. Checking with Miciah here just because he may have some more history on "if we have anyone using endpoints from must-gather to debug", but lgtm.

@Miciah
Contributor

Miciah commented Jan 30, 2026

The router and CoreDNS migrated to EndpointSlice in OpenShift 4.6 and 4.8, respectively (see openshift/router#154 and openshift/coredns#52; internally, the router converts endpointslices into endpoints objects, which can be confusing, but it is consuming endpointslices).

@alebedev87, are aws-load-balancer-controller and external-dns using the Endpoints API? It looks like the operands support endpointslices, but external-dns-operator configures RBAC only for endpoints, not endpointslices: https://github.com/openshift/external-dns-operator/blob/4abf935dbaf067db15b8418ca9b0ac7541cb256d/bundle/manifests/external-dns_rbac.authorization.k8s.io_v1_clusterrole.yaml#L18

@Miciah
Contributor

Miciah commented Jan 30, 2026

Related question: Is aiding in the diagnosis of issues with add-on operators that are using deprecated APIs part of must-gather's purview?

@alebedev87
Contributor

> are aws-load-balancer-controller and external-dns using the Endpoints API? It looks like the operands support endpointslices, but external-dns-operator configures RBAC only for endpoints, not endpointslices: https://github.com/openshift/external-dns-operator/blob/4abf935dbaf067db15b8418ca9b0ac7541cb256d/bundle/manifests/external-dns_rbac.authorization.k8s.io_v1_clusterrole.yaml#L18

Only upstream; downstream we are on very old versions which didn't migrate the code to EndpointSlices yet. Downstream ExternalDNS is based on 0.14.7, and the EndpointSlices migration was done in 0.18.0. Downstream ALBC is based on a 2.8.z version, while EndpointSlices support was added in 2.14.0.

@danwinship
Contributor Author

> downstream we are on very old versions which didn't migrate the code to EndpointSlices yet

(That will need to be fixed as part of OCSTRAT-886 (AWS dual-stack support).)

But anyway, that doesn't really answer the question of "do we need to include Endpoints in the CI artifacts?" The Endpoints data should be a strict subset of the EndpointSlice data, so even if your operator uses Endpoints, you can debug based on the EndpointSlice data instead. (If there ever were mismatches, then presumably that would have caused us debugging problems in the past in the opposite direction. I can't remember that ever happening (and presumably if it ever had, someone would have fixed the lack of EndpointSlices in the artifacts before now...))

I mean, we could keep both, but it seems like a waste of space.
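The "strict subset" argument can be illustrated with a small Python sketch that derives an Endpoints-style view from gathered EndpointSlices. This is a simplification under stated assumptions, not the logic of any Kubernetes controller: the function name `endpoints_view` is hypothetical, only one address family is kept (Endpoints have no dual-stack), terminating endpoints are dropped (Endpoints do not list them), and an unset `ready` condition is treated as ready.

```python
from collections import defaultdict
from typing import Any

# Well-known label tying an EndpointSlice to its Service.
SERVICE_LABEL = "kubernetes.io/service-name"


def endpoints_view(
    slices: list[dict[str, Any]], address_type: str = "IPv4"
) -> dict[tuple[Any, str], dict[str, list[str]]]:
    """Group slices by (namespace, service) and split addresses by readiness."""
    view: dict = defaultdict(lambda: {"addresses": set(), "notReadyAddresses": set()})
    for s in slices:
        # Assumption: keep a single address family, since Endpoints are single-stack.
        if s.get("addressType") != address_type:
            continue
        meta = s.get("metadata") or {}
        service = (meta.get("labels") or {}).get(SERVICE_LABEL)
        if service is None:
            continue
        key = (meta.get("namespace"), service)
        for ep in s.get("endpoints") or []:
            conditions = ep.get("conditions") or {}
            # Assumption: terminating endpoints have no place in the Endpoints view.
            if conditions.get("terminating"):
                continue
            # Assumption: a missing or null "ready" counts as ready.
            ready = conditions.get("ready", True) is not False
            bucket = "addresses" if ready else "notReadyAddresses"
            view[key][bucket].update(ep.get("addresses", []))
    return {
        key: {bucket: sorted(addrs) for bucket, addrs in buckets.items()}
        for key, buckets in view.items()
    }


# Hypothetical dual-stack Service "web" in namespace "ns1".
sample_slices = [
    {
        "metadata": {"namespace": "ns1", "labels": {SERVICE_LABEL: "web"}},
        "addressType": "IPv4",
        "endpoints": [
            {"addresses": ["10.0.0.1"], "conditions": {"ready": True}},
            {"addresses": ["10.0.0.2"], "conditions": {"ready": False, "terminating": True}},
            {"addresses": ["10.0.0.3"], "conditions": {"ready": False}},
        ],
    },
    {
        "metadata": {"namespace": "ns1", "labels": {SERVICE_LABEL: "web"}},
        "addressType": "IPv6",
        "endpoints": [{"addresses": ["fd00::1"], "conditions": {"ready": True}}],
    },
]

result = endpoints_view(sample_slices)
print(result)
```

Every step of the sketch discards data (the second address family, the terminating endpoint, the per-endpoint conditions), so nothing in the resulting view is missing from the slices it was built from.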

@alebedev87
Contributor

> But anyway, that doesn't really answer the question of "do we need to include Endpoints in the CI artifacts?" The Endpoints data should be a strict subset of the EndpointSlice data, so even if your operator uses Endpoints, you can debug based on the EndpointSlice data instead.

Right, sorry, I was answering Miciah's question. I don't remember going to the endpoints level very often to debug ExternalDNS/ALBO issues, but since EndpointSlices will have all the available endpoints data anyway, I don't see any blocker. The migration to EndpointSlices for these 2 operators will have to be addressed separately.

@danwinship danwinship force-pushed the gather-extra-endpointslices branch from 6211161 to dad96a7 Compare February 5, 2026 15:40
@danwinship
Contributor Author

/retitle Gather EndpointSlices in addition to Endpoints

@danwinship
Contributor Author

/retest

@rikatz
Member

rikatz commented Feb 6, 2026

/lgtm
/approve
The PR now collects endpointslices in addition to endpoints, rather than excluding one in favor of the other.
Thanks!
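For readers unfamiliar with what "collecting both" amounts to, here is a rough Python sketch of gathering each resource type into its own JSON file. It is not the gather-extra step's actual script: the function names, the output file names, and the artifact directory are all assumptions; only the `oc get <resource> --all-namespaces -o json` invocation is standard CLI usage.

```python
import subprocess
from pathlib import Path

# Both resource types are gathered side by side.
RESOURCES = ("endpoints", "endpointslices")


def gather_commands(artifact_dir: str) -> list[tuple[Path, list[str]]]:
    """Return (output file, argv) pairs, one per resource type."""
    return [
        (
            Path(artifact_dir) / f"{resource}.json",  # hypothetical file name
            ["oc", "get", resource, "--all-namespaces", "-o", "json"],
        )
        for resource in RESOURCES
    ]


def run_gather(artifact_dir: str) -> None:
    """Run each command and write its stdout to the matching file.

    Requires an `oc` binary and cluster credentials; defined here but not called.
    """
    for out_path, argv in gather_commands(artifact_dir):
        out_path.parent.mkdir(parents=True, exist_ok=True)
        with out_path.open("w") as out:
            # check=False: one failed resource should not abort the whole gather.
            subprocess.run(argv, stdout=out, check=False)


commands = gather_commands("artifacts")
for out_path, argv in commands:
    print(out_path.name, " ".join(argv))
```

Keeping the command list separate from the code that runs it makes the set of gathered resources easy to inspect without cluster access.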

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Feb 6, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Feb 6, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, rikatz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@danwinship
Contributor Author

/pj-rehearse pull-ci-openshift-cluster-network-operator-master-e2e-gcp-ovn

@openshift-ci-robot
Contributor

@danwinship: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@danwinship
Contributor Author

/pj-rehearse ack

@openshift-ci-robot
Contributor

@danwinship: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot openshift-ci-robot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Feb 6, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Feb 6, 2026

@danwinship: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/periodic-ci-openshift-release-master-ci-4.22-e2e-gcp-ovn | 6211161 | link | unknown | /pj-rehearse periodic-ci-openshift-release-master-ci-4.22-e2e-gcp-ovn |
| ci/rehearse/openshift/cluster-network-operator/master/e2e-gcp-ovn | dad96a7 | link | unknown | /pj-rehearse pull-ci-openshift-cluster-network-operator-master-e2e-gcp-ovn |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit 2993e0e into openshift:master Feb 6, 2026
9 of 10 checks passed
@danwinship danwinship deleted the gather-extra-endpointslices branch February 6, 2026 22:20
Sau1506mya pushed a commit to Sau1506mya/release that referenced this pull request Feb 9, 2026
mohit-sheth pushed a commit to mohit-sheth/release that referenced this pull request Feb 9, 2026
richardsonnick pushed a commit to richardsonnick/release that referenced this pull request Feb 18, 2026
memodi pushed a commit to memodi/release that referenced this pull request Feb 18, 2026
dhensel-rh pushed a commit to dhensel-rh/release that referenced this pull request Feb 19, 2026
rrasouli pushed a commit to rrasouli/release that referenced this pull request Mar 3, 2026
kannon92 pushed a commit to kannon92/release that referenced this pull request Mar 3, 2026
wangke19 pushed a commit to wangke19/release that referenced this pull request Mar 4, 2026
rrasouli pushed a commit to rrasouli/release that referenced this pull request Mar 5, 2026
weinliu pushed a commit to weinliu/release that referenced this pull request Mar 6, 2026
sdodson pushed a commit to sdodson/release that referenced this pull request Mar 8, 2026
tareqalayan pushed a commit to tareqalayan/release that referenced this pull request Mar 13, 2026
qiliRedHat pushed a commit to qiliRedHat/release that referenced this pull request Mar 13, 2026
MayXuQQ pushed a commit to MayXuQQ/release that referenced this pull request Mar 17, 2026
sairameshv pushed a commit to sairameshv/release that referenced this pull request Mar 23, 2026
zhouying7780 pushed a commit to zhouying7780/release that referenced this pull request Mar 25, 2026
rrasouli pushed a commit to rrasouli/release that referenced this pull request Mar 25, 2026
anpingli pushed a commit to anpingli/release that referenced this pull request Mar 30, 2026

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. rehearsals-ack Signifies that rehearsal jobs have been acknowledged


5 participants