Skip to content

Add _netdev option to mount Azure ephemeral disk#1213

Merged
TheRealFalcon merged 4 commits into
canonical:mainfrom
otubo:1998445
Feb 4, 2022
Merged

Add _netdev option to mount Azure ephemeral disk#1213
TheRealFalcon merged 4 commits into
canonical:mainfrom
otubo:1998445

Conversation

@otubo
Copy link
Copy Markdown
Contributor

@otubo otubo commented Jan 26, 2022

Proposed Commit Message

The ephemeral disk depends on a functional network to be mounted. Even
though it depends on cloud-init.service, sometimes an ordering cycle is
noticed on the instance. If the option "_netdev" is added the problem is
gone.

rhbz: #1998445

Signed-off-by: Eduardo Otubo otubo@redhat.com

Additional Context

Test Steps

Deploy a RHEL-9/cloud-init-21.1 instance on Azure and observe that sometimes there's an ordering cycle on the logs and there's no network connectivity.

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

otubo added 2 commits January 26, 2022 12:31
The ephemeral disk depends on a functional network to be mounted. Even
though it depends on cloud-init.service, sometimes an ordering cycle is
noticed on the instance. If the option "_netdev" is added the problem is
gone.

rhbz: #1998445

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
@otubo
Copy link
Copy Markdown
Contributor Author

otubo commented Jan 26, 2022

For some reason tox didn't complain about the line length on my local copy. Anyways, just updated. Thanks!

@anhvoms
Copy link
Copy Markdown
Contributor

anhvoms commented Jan 30, 2022

@otubo Can you elaborate on why this is the case "The ephemeral disk depends on a functional network to be mounted"? Why is this an issue for RHEL 9 and not other distros (or earlier versions of RHEL)?
If this is indeed an issue that is specific to Azure, I think we should just scope it to DSAzure. DSAzure should pass in the mount options when it tries to mount the ephemeral resource disk.

@otubo
Copy link
Copy Markdown
Contributor Author

otubo commented Jan 31, 2022

@anhvoms you probably want to check the whole discussion, more specifically David's comment on the original BZ. TL;DR he says a recent update on systemd made it visible, but the problem might have been there all along.

@cjp256
Copy link
Copy Markdown
Contributor

cjp256 commented Jan 31, 2022

@anhvoms you probably want to check the whole discussion, more specifically David's comment on the original BZ. TL;DR he says a recent update on systemd made it visible, but the problem might have been there all along.

Anh is out out for a couple of days. I don't have access to that bug so I cannot read along, but the mount already has a dependency on cloud-init.service, which itself should have a dependency on network being online. Has that changed for RHEL9?

RHEL 8.5:

# /usr/lib/systemd/system/cloud-init.service

[Unit]
Description=Initial cloud-init job (metadata service crawler)
Wants=cloud-init-local.service
Wants=sshd-keygen.service
Wants=sshd.service
After=cloud-init-local.service
After=NetworkManager.service network.service
After=NetworkManager-wait-online.service
...

Can you share the error you are seeing? I'm also not quite sure why networking needs to be online to mount the disk...

Thanks!

@cjp256
Copy link
Copy Markdown
Contributor

cjp256 commented Jan 31, 2022

To summarize from the BZ...

A change in systemd (systemd/systemd@fa138f5) triggers an ordering cycle because local-fs.target will now set After=mnt.mount (and local-fs.target is required for cloud-init). Previously, it would just be Wants=mnt.mount, avoiding the order cycling.

systemd-248-7.el9: - No mnt.mount in "After"

systemctl show -p After local-fs.target

After=systemd-remount-fs.service tmp.mount run-user-0.mount
systemd-fsck-root.service boot-efi.mount local-fs-pre.target

systemd-249-2.el9: - Have mnt.mount in "After"

systemctl show -p After local-fs.target

After=systemd-fsck-root.service mnt.mount run-user-0.mount boot-efi.mount
run-user-1000.mount local-fs-pre.target systemd-remount-fs.service tmp.mount
-.mount

The argument made in the BZ is that due to a dependency on cloud-init.service, the ephemeral disk should not be considered local, but instead mark it as _netdev so that is not considered a local filesystem.

It seems to me that this grouping is slightly at odds with the x-systemd.requires configuration, but from what I gather reading the systemd.mount manpage, mounts fall into two groups: local-fs.target or remote-fs.target, depending whether the file system is local or remote. If our mount cannot make the local-fs.target, it should declare _netdev to group in with remote-fs.target.

Another option would be to instead use nofail so that it is not required by either local or remote targets - but that would likely cause other issues :)

From my perspective given what I've seen, this seems a reasonable change (and not specific to Azure). 👍 Are there possible side-effects from grouping with remote-fs.target?

@anhvoms
Copy link
Copy Markdown
Contributor

anhvoms commented Feb 2, 2022

@cjp256 thanks for the explanation. This looks reasonable to me

Copy link
Copy Markdown
Contributor

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation @cjp256 . Given that this module runs after networking has been applied, it looks like a reasonable change.

@TheRealFalcon
Copy link
Copy Markdown
Contributor

TheRealFalcon commented Feb 2, 2022

@otubo , looks like there's still a formatting issue. Can you run tox -e format to fix it (or after a rebase it would be tox -e do_format)?

@otubo
Copy link
Copy Markdown
Contributor Author

otubo commented Feb 4, 2022

I hate style issues, sorry for the inconvenience guys. Should be good to go now. Thanks!

@TheRealFalcon TheRealFalcon merged commit a1cfd1b into canonical:main Feb 4, 2022
blackboxsw pushed a commit that referenced this pull request Feb 10, 2022
Fixes the spaces introduced in #1213

Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
ndragazis added a commit to ndragazis/scylladb that referenced this pull request Feb 2, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes scylladb#26519.

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
ndragazis added a commit to ndragazis/scylladb that referenced this pull request Mar 9, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes scylladb#26519.

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
avikivity pushed a commit to scylladb/scylladb that referenced this pull request Apr 5, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes #26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes #28504
scylladbbot pushed a commit to scylladbbot/scylladb that referenced this pull request Apr 5, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes scylladb#26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes scylladb#28504

(cherry picked from commit 6d50e67)
avikivity pushed a commit to scylladb/scylladb that referenced this pull request Apr 6, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes #26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes #28504

(cherry picked from commit 6d50e67)

Closes #29339
scylladbbot pushed a commit to scylladbbot/scylladb that referenced this pull request Apr 6, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes scylladb#26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes scylladb#28504

(cherry picked from commit 6d50e67)

Closes scylladb#29339
timtimb0t pushed a commit to timtimb0t/scylladb that referenced this pull request Apr 9, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes scylladb#26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes scylladb#28504
denesb pushed a commit to scylladb/scylladb that referenced this pull request Apr 16, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes #26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes #28504

(cherry picked from commit 6d50e67)

Closes #29339

Closes #29354
scylladbbot pushed a commit to scylladbbot/scylladb that referenced this pull request Apr 16, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes scylladb#26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes scylladb#28504

(cherry picked from commit 6d50e67)
scylladbbot pushed a commit to scylladbbot/scylladb that referenced this pull request Apr 16, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes scylladb#26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes scylladb#28504

(cherry picked from commit 6d50e67)

Closes scylladb#29339

Closes scylladb#29354
scylladbbot pushed a commit to scylladbbot/scylladb that referenced this pull request Apr 16, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes scylladb#26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes scylladb#28504

(cherry picked from commit 6d50e67)

Closes scylladb#29339

Closes scylladb#29354

(cherry picked from commit 7ed7728)
denesb pushed a commit to scylladb/scylladb that referenced this pull request Apr 24, 2026
When a Scylla node starts, the scylla-image-setup.service invokes the
`scylla_swap_setup` script to provision swap. This script allocates a
swap file and creates a swap systemd unit to delegate control to
systemd. By default, systemd injects a Before=swap.target dependency
into every swap unit, allowing other services to use swap.target to wait
for swap to be enabled.

On Azure, this doesn't work so well because we store the swap file on
the ephemeral disk [1] which has network dependencies (`_netdev` mount
option, configured by cloud-init [2]). This makes the swap.target
indirectly depend on the network, leading to dependency cycles such as:

swap.target -> mnt-swapfile.swap -> mnt.mount -> network-online.target
-> network.target -> systemd-resolved.service -> tmp.mount -> swap.target

This patch breaks the cycle by removing the swap unit from swap.target
using DefaultDependencies=no. The swap unit will still be activated via
WantedBy=multi-user.target, just not during early boot.

Although this problem is specific to Azure, this patch applies the fix
to all clouds to keep the code simple.

Fixes #26519.
Fixes SCYLLADB-1257

[1] scylladb/scylla-machine-image#426
[2] canonical/cloud-init#1213 (comment)

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>

Closes #28504

(cherry picked from commit 6d50e67)

Closes #29339

Closes #29354

(cherry picked from commit 7ed7728)

Closes #29517
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants