Skip to content

OCPBUGS-3176: Disable global ipv4 and ipv6 forwarding for OVN deployments#3676

Merged
openshift-merge-robot merged 3 commits intoopenshift:masterfrom
trozet:OCPBUGS-3176
May 23, 2023
Merged

OCPBUGS-3176: Disable global ipv4 and ipv6 forwarding for OVN deployments#3676
openshift-merge-robot merged 3 commits intoopenshift:masterfrom
trozet:OCPBUGS-3176

Conversation

@trozet
Copy link
Copy Markdown
Contributor

@trozet trozet commented Apr 12, 2023

Enable forwarding globally can cause undesirable effects for customers with their server acting as a router (non-default behavior). OVNK now enables and restricts forwarding on a per managed interface basis. So there is no need to enable this globally anymore.

@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Apr 12, 2023
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

@trozet: This pull request references Jira Issue OCPBUGS-3176, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.14.0) matches configured target version for branch (4.14.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Enable forwarding globally can cause undesirable effects for customers with their server acting as a router (non-default behavior). OVNK now enables and restricts forwarding on a per managed interface basis. So there is no need to enable this globally anymore.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci Bot requested a review from anuragthehatter April 12, 2023 20:55
@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented Apr 12, 2023

/assign @tssurya

@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented Apr 12, 2023

/hold

want to launch a cluster bot and ensure this works as expected

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 12, 2023
@openshift-ci openshift-ci Bot requested review from cgwalters and jkyros April 12, 2023 20:57
@trozet trozet force-pushed the OCPBUGS-3176 branch 2 times, most recently from 94a6dc3 to f0978e1 Compare April 28, 2023 17:21
@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented Apr 28, 2023

/test e2e-metal-ipi

@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 3, 2023

metal passed without enabling ip forwarding, but that is most likely because the node ip interface == br-ex interface and ovn-kube will enable forwarding on the br-ex interface for us. We need to cover a case where the node ip interface differs from br-ex. For these cases keepalived will use the node ip interface according to @cybertron so we need to make sure node ip enables ip forwarding on those interfaces. Will add one more commit to this PR to take care of that.

@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 3, 2023

/test e2e-metal-ipi

@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 4, 2023

looks to be working...removing hold:

May 03 21:47:48.477113 master-0 configure-ip-forwarding.sh[2334]: +++ ip -j addr
May 03 21:47:48.477242 master-0 configure-ip-forwarding.sh[2335]: +++ jq -r 'first(.[] | select(any(.addr_info[]; .local=="192.168.111.20"))) | .ifname'
May 03 21:47:48.478206 master-0 resolv-prepender.sh[2236]: Writing manifest to image destination
May 03 21:47:48.478248 master-0 resolv-prepender.sh[2236]: Storing signatures
May 03 21:47:48.487865 master-0 resolv-prepender.sh[2236]: 9d8b54b39e892388bd8b873fb43dbcadaab0b8954797c1014fe45c9a652ed920
May 03 21:47:48.487958 master-0 podman[2236]: 2023-05-03 21:47:48.190102867 +0000 UTC m=+0.021496222 image pull  registry.build05.ci.openshift.org/ci-op-2yj26jvi/stable@sha256:2e4b1c3bccf9cdaafdfd5349977c9be2b8471ef7785ed3d816554debd5ad2f1f
May 03 21:47:48.489413 master-0 systemd[1]: var-lib-containers-storage-overlay.mount: Deactivated successfully.
May 03 21:47:48.490282 master-0 resolv-prepender.sh[2233]: NM resolv-prepender: Download of baremetal runtime cfg image completed
May 03 21:47:48.501301 master-0 configure-ip-forwarding.sh[2330]: ++ iface=enp2s0
May 03 21:47:48.501301 master-0 configure-ip-forwarding.sh[2330]: ++ [[ -n enp2s0 ]]
May 03 21:47:48.501365 master-0 configure-ip-forwarding.sh[2330]: ++ echo enp2s0
May 03 21:47:48.501586 master-0 configure-ip-forwarding.sh[2329]: + iface=enp2s0
May 03 21:47:48.501642 master-0 configure-ip-forwarding.sh[2329]: + [[ -z enp2s0 ]]
May 03 21:47:48.501670 master-0 configure-ip-forwarding.sh[2329]: + echo 'Node IP interface determined as: enp2s0. Enabling IP forwarding...'
May 03 21:47:48.501670 master-0 configure-ip-forwarding.sh[2329]: Node IP interface determined as: enp2s0. Enabling IP forwarding...
May 03 21:47:48.501670 master-0 configure-ip-forwarding.sh[2329]: + sysctl -w net.ipv4.conf.enp2s0.forwarding=1
May 03 21:47:48.502666 master-0 configure-ip-forwarding.sh[2342]: net.ipv4.conf.enp2s0.forwarding = 1
May 03 21:47:48.502793 master-0 configure-ip-forwarding.sh[2329]: + sysctl -w net.ipv6.conf.enp2s0.forwarding=1
May 03 21:47:48.503625 master-0 configure-ip-forwarding.sh[2343]: net.ipv6.conf.enp2s0.forwarding = 1
May 03 21:47:48.504533 master-0 systemd[1]: nodeip-configuration.service: Deactivated successfully.

@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 4, 2023

/hold cancel

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 4, 2023
@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 4, 2023

/assign @cybertron

@cybertron
Copy link
Copy Markdown
Member

/lgtm

This should be fine from an on-prem perspective.

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 5, 2023
Copy link
Copy Markdown
Contributor

@tssurya tssurya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @trozet some suggestions, questions

Comment thread pkg/controller/template/render.go Outdated
Comment thread pkg/controller/template/render_test.go Outdated
Comment thread pkg/controller/template/test_data/controller_config_forwarding_ovn.yaml Outdated
Comment thread pkg/controller/template/render_test.go Outdated
Comment thread templates/common/_base/files/sysctl-forward-conf.yaml Outdated
Comment thread templates/common/on-prem/files/configure-ip-forwarding.yaml Outdated
Comment thread templates/common/on-prem/files/configure-ip-forwarding.yaml Outdated
Comment thread templates/common/on-prem/files/configure-ip-forwarding.yaml Outdated
Comment thread templates/common/on-prem/files/configure-ip-forwarding.yaml Outdated
trozet added 3 commits May 8, 2023 10:20
Enable forwarding globally can cause undesirable effects for customers
with their server acting as a router (non-default behavior). OVNK now
enables and restricts forwarding on a per managed interface basis. So
there is no need to enable this globally anymore.

Signed-off-by: Tim Rozet <trozet@redhat.com>
A user or IDE may unsuspectingly add a newline to the end of
conditionalized rendered templates that will fail the check to see if
there is any data actually to be rendered.

Signed-off-by: Tim Rozet <trozet@redhat.com>
Service will configure ip forwarding for the selected node ip interface.

Signed-off-by: Tim Rozet <trozet@redhat.com>
@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 15, 2023

/assign @sinnykumari

Copy link
Copy Markdown
Contributor

@sinnykumari sinnykumari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 15, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cybertron, jcaamano, sinnykumari, trozet, tssurya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 15, 2023
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

/retest-required

Remaining retests: 0 against base HEAD 49f32d4 and 2 for PR HEAD cf172ff in total

@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 17, 2023

/retest-required

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

/retest-required

Remaining retests: 0 against base HEAD 655ba33 and 1 for PR HEAD cf172ff in total

@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 22, 2023

/retest-required

1 similar comment
@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 23, 2023

/retest-required

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 23, 2023

@trozet: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-alibabacloud-ovn cf172ff link false /test e2e-alibabacloud-ovn
ci/prow/okd-scos-e2e-gcp-ovn-upgrade cf172ff link false /test okd-scos-e2e-gcp-ovn-upgrade

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@trozet
Copy link
Copy Markdown
Contributor Author

trozet commented May 23, 2023

/retest-required

@openshift-merge-robot openshift-merge-robot merged commit 0dcaf88 into openshift:master May 23, 2023
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

@trozet: Jira Issue OCPBUGS-3176: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-3176 has been moved to the MODIFIED state.

Details

In response to this:

Enable forwarding globally can cause undesirable effects for customers with their server acting as a router (non-default behavior). OVNK now enables and restricts forwarding on a per managed interface basis. So there is no need to enable this globally anymore.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

trozet added a commit to trozet/api that referenced this pull request May 24, 2023
With ovn-kubernetes/ovn-kubernetes#3374 and
openshift/machine-config-operator#3676
we stopped enabling IP forwarding globally in the OCP host. It is now
only enabled on interfaces that OVNK manages. Additionally iptables
rules are put in place to limit what traffic is forwarded. This was to
resolve an issue where our node by default acts as a router:
https://issues.redhat.com/browse/OCPBUGS-3176

With these changes, by default in OCP we will no longer forward IP
traffic. Most users should welcome this change as it means the OCP node
will not be forwarding traffic unintentionally on their network.
However, in case there is some corner case where a user expects traffic
to be forwarded we will expose this new config knob to allow them to
go back to the old behavior.

Signed-off-by: Tim Rozet <trozet@redhat.com>
trozet added a commit to trozet/api that referenced this pull request May 24, 2023
With ovn-kubernetes/ovn-kubernetes#3374 and
openshift/machine-config-operator#3676
we stopped enabling IP forwarding globally in the OCP host. It is now
only enabled on interfaces that OVNK manages. Additionally iptables
rules are put in place to limit what traffic is forwarded. This was to
resolve an issue where our node by default acts as a router:
https://issues.redhat.com/browse/OCPBUGS-3176

With these changes, by default in OCP we will no longer forward IP
traffic. Most users should welcome this change as it means the OCP node
will not be forwarding traffic unintentionally on their network.
However, in case there is some corner case where a user expects traffic
to be forwarded we will expose this new config knob to allow them to
go back to the old behavior.

Signed-off-by: Tim Rozet <trozet@redhat.com>
trozet added a commit to trozet/api that referenced this pull request May 24, 2023
With ovn-kubernetes/ovn-kubernetes#3374 and
openshift/machine-config-operator#3676
we stopped enabling IP forwarding globally in the OCP host. It is now
only enabled on interfaces that OVNK manages. Additionally iptables
rules are put in place to limit what traffic is forwarded. This was to
resolve an issue where our node by default acts as a router:
https://issues.redhat.com/browse/OCPBUGS-3176

With these changes, by default in OCP we will no longer forward IP
traffic. Most users should welcome this change as it means the OCP node
will not be forwarding traffic unintentionally on their network.
However, in case there is some corner case where a user expects traffic
to be forwarded we will expose this new config knob to allow them to
go back to the old behavior.

Signed-off-by: Tim Rozet <trozet@redhat.com>
trozet added a commit to trozet/sdn that referenced this pull request May 25, 2023
With changes to disable IP forwarding by default with OVNK:
openshift/machine-config-operator#3676

OVN to SDN migration is now broken, because pods roll out first that
require IP forwarding. IP forwarding for SDN is enabled via MCO node
rollout, which happens later.

This change enables IP forwarding in the SDN pod itself if needed.

Signed-off-by: Tim Rozet <trozet@redhat.com>
trozet added a commit to trozet/cluster-node-tuning-operator that referenced this pull request May 30, 2023
With openshift/machine-config-operator#3676 we
removed globally setting ip forwarding in MCO. We should not be setting
it in the default profile for OpenShift. We now set it on a per
interface basis as needed. However, by setting proc/sys/net/ipv4/forward
rather than (forwarding) it will reset all the values:

"This variable is special, its change resets all configuration
parameters to their default state (RFC1122 for hosts, RFC1812
for routers)"

We suspect this causes upgrade to fail. NTO sets this to 1, which then
resets all the per interface config that OVNK wrote. Then during upgrade
when there is a tuned profile change, the config change is rolled back
so forward is 0, and now there is no connectivity to kapi and upgrade
fails.

Signed-off-by: Tim Rozet <trozet@redhat.com>
trozet added a commit to trozet/tuned that referenced this pull request May 30, 2023
With openshift/machine-config-operator#3676 we
removed globally setting ip forwarding in MCO. We should not be setting
it in the default profile for OpenShift. We now set it on a per
interface basis as needed. However, by setting proc/sys/net/ipv4/forward
rather than (forwarding) it will reset all the values:

"This variable is special, its change resets all configuration
parameters to their default state (RFC1122 for hosts, RFC1812
for routers)"

We suspect this causes upgrade to fail. NTO sets this to 1, which then
resets all the per interface config that OVNK wrote. Then during upgrade
when there is a tuned profile change, the config change is rolled back
so forward is 0, and now there is no connectivity to kapi and upgrade
fails.

Signed-off-by: Tim Rozet <trozet@redhat.com>
@openshift-merge-robot
Copy link
Copy Markdown
Contributor

Fix included in accepted release 4.14.0-0.nightly-2023-09-11-201102

@openshift-merge-robot
Copy link
Copy Markdown
Contributor

Fix included in accepted release 4.14.0-0.nightly-2023-09-12-024050

@openshift-merge-robot
Copy link
Copy Markdown
Contributor

Fix included in accepted release 4.14.0-0.nightly-2023-09-15-101929

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants