Add _netdev option to mount Azure ephemeral disk #1213
Conversation
The ephemeral disk depends on a functional network to be mounted. Even though it depends on cloud-init.service, sometimes an ordering cycle is noticed on the instance. If the option "_netdev" is added the problem is gone. rhbz: #1998445 Signed-off-by: Eduardo Otubo <otubo@redhat.com>
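For readers unfamiliar with the option, here is a minimal sketch of what the resulting fstab entry could look like. It is not cloud-init's actual code: `fstab_line`, the device path, and the default option list are assumptions for illustration. Only `_netdev`, the `/mnt` mount point, and the `x-systemd.requires=cloud-init.service` dependency come from this thread.

```python
# Hypothetical helper, NOT cloud-init's implementation: builds one /etc/fstab
# line for the ephemeral disk and makes sure "_netdev" is among the options.

def fstab_line(device, mount_point, fs_type="auto", options=None):
    # Assumed baseline options; the real defaults may differ.
    if options is None:
        options = ["defaults", "x-systemd.requires=cloud-init.service"]
    opts = list(options)
    # "_netdev" tells systemd to treat the mount as a network mount, so it is
    # grouped with remote-fs.target instead of local-fs.target.
    if "_netdev" not in opts:
        opts.append("_netdev")
    # fstab fields: device, mount point, type, options, dump, pass
    return "\t".join([device, mount_point, fs_type, ",".join(opts), "0", "2"])


# The device path below is an assumed example, not taken from this PR.
line = fstab_line("/dev/disk/cloud/azure_resource-part1", "/mnt")
print(line)
```

The helper appends the option only when it is missing, so calling it on an option list that already contains `_netdev` does not duplicate it.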
|
For some reason tox didn't complain about the line length on my local copy. Anyways, just updated. Thanks! |
|
@otubo Can you elaborate on why this is the case "The ephemeral disk depends on a functional network to be mounted"? Why is this an issue for RHEL 9 and not other distros (or earlier versions of RHEL)? |
|
@anhvoms you probably want to check the whole discussion, more specifically David's comment on the original BZ. TL;DR he says a recent update on systemd made it visible, but the problem might have been there all along. |
Anh is out for a couple of days. I don't have access to that bug so I cannot read along, but the mount already has a dependency on cloud-init.service, which itself should have a dependency on network being online. Has that changed for RHEL9? RHEL 8.5: Can you share the error you are seeing? I'm also not quite sure why networking needs to be online to mount the disk... Thanks! |
|
To summarize from the BZ... A change in systemd (systemd/systemd@fa138f5) triggers an ordering cycle because local-fs.target will now set After=mnt.mount (and local-fs.target is required for cloud-init). Previously, it would just be Wants=mnt.mount, avoiding the ordering cycle.
The argument made in the BZ is that, due to its dependency on cloud-init.service, the ephemeral disk should not be considered local, but should instead be marked as _netdev so that it is not treated as a local filesystem. It seems to me that this grouping is slightly at odds with the x-systemd.requires configuration, but from what I gather reading the systemd.mount manpage, mounts fall into two groups: local-fs.target or remote-fs.target, depending on whether the file system is local or remote. If our mount cannot make local-fs.target, it should declare _netdev to group in with remote-fs.target. Another option would be to instead use nofail so that it is not required by either local or remote targets - but that would likely cause other issues :) From my perspective given what I've seen, this seems a reasonable change (and not specific to Azure). 👍 Are there possible side-effects from grouping with remote-fs.target? |
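To make the cycle concrete, here is a rough sketch that models the ordering edges described above as a small graph and searches it for a loop. It is a simplified model, not a dump of real unit files: the edge `cloud-init.service` after `local-fs.target` stands in for "local-fs.target is required for cloud-init", and `find_cycle` is a hypothetical helper, not anything systemd or cloud-init ships.

```python
# Simplified model of the ordering discussed in this thread.
# Each key is a unit; its value lists the units it is ordered After=.

def find_cycle(after):
    """Return one ordering cycle as a list of unit names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(unit):
        state[unit] = GREY
        path.append(unit)
        for dep in after.get(unit, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: we closed a loop.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[unit] = BLACK
        return None

    for unit in after:
        if state.get(unit, WHITE) == WHITE:
            found = visit(unit)
            if found:
                return found
    return None


# Edges shared by all three scenarios (assumed simplification).
base = {
    "mnt.mount": ["cloud-init.service"],        # x-systemd.requires=cloud-init.service
    "cloud-init.service": ["local-fs.target"],  # cloud-init needs local filesystems
}

# Before the systemd change: local-fs.target only Wants=mnt.mount (no ordering).
before_change = dict(base, **{"local-fs.target": []})
# After the systemd change: local-fs.target is ordered After=mnt.mount.
after_change = dict(base, **{"local-fs.target": ["mnt.mount"]})
# With _netdev: the mount is ordered before remote-fs.target instead.
with_netdev = dict(base, **{"local-fs.target": [], "remote-fs.target": ["mnt.mount"]})

print(find_cycle(before_change))
print(find_cycle(after_change))
print(find_cycle(with_netdev))
```

Only the second scenario contains a loop, which matches the summary: the cycle appears once local-fs.target is ordered after the mount, and disappears when the mount is grouped with remote-fs.target.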
|
@cjp256 thanks for the explanation. This looks reasonable to me |
TheRealFalcon
left a comment
Thanks for the explanation @cjp256 . Given that this module runs after networking has been applied, it looks like a reasonable change.
|
@otubo , looks like there's still a formatting issue. Can you run |
|
I hate style issues, sorry for the inconvenience guys. Should be good to go now. Thanks! |
Fixes the spaces introduced in #1213 Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
When a Scylla node starts, the scylla-image-setup.service invokes the `scylla_swap_setup` script to provision swap. This script allocates a swap file and creates a swap systemd unit to delegate control to systemd.

By default, systemd injects a Before=swap.target dependency into every swap unit, allowing other services to use swap.target to wait for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on the ephemeral disk [1] which has network dependencies (`_netdev` mount option, configured by cloud-init [2]). This makes the swap.target indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target -> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target using DefaultDependencies=no. The swap unit will still be activated via WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix to all clouds to keep the code simple.

Fixes scylladb#26519.

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
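As an illustration of the fix described in that commit message, the swap unit could be rendered roughly as follows. This is a sketch under stated assumptions, not the actual `scylla_swap_setup` code: `render_swap_unit` and `swap_unit_name` are hypothetical helpers, and the `/mnt/swapfile` path is inferred from the `mnt-swapfile.swap` unit name in the cycle above.

```python
# Hypothetical helpers, NOT the real scylla_swap_setup implementation.

def swap_unit_name(swapfile):
    # Simplified path-to-unit-name mapping; real systemd escaping also
    # handles characters such as "-" inside path components.
    return swapfile.strip("/").replace("/", "-") + ".swap"


def render_swap_unit(swapfile):
    return "\n".join([
        "[Unit]",
        # Drops the implicit Before=swap.target ordering that closed the cycle.
        "DefaultDependencies=no",
        "",
        "[Swap]",
        f"What={swapfile}",
        "",
        "[Install]",
        # Swap is still enabled, just later than early boot.
        "WantedBy=multi-user.target",
        "",
    ])


name = swap_unit_name("/mnt/swapfile")
unit = render_swap_unit("/mnt/swapfile")
print(name)
print(unit)
```

With `DefaultDependencies=no`, any other ordering the unit still needs has to be declared explicitly, which is the trade-off the commit message accepts in exchange for breaking the cycle.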
Proposed Commit Message
The ephemeral disk depends on a functional network to be mounted. Even
though it depends on cloud-init.service, sometimes an ordering cycle is
noticed on the instance. If the option "_netdev" is added the problem is
gone.
rhbz: #1998445
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
Additional Context
Test Steps
Deploy a RHEL-9/cloud-init-21.1 instance on Azure and observe that an ordering cycle sometimes appears in the logs and the instance has no network connectivity.