fix(raspberry-pi-os): Systemd network service template fix #6459
Signed-off-by: paulober <paul.oberosler@raspberrypi.com>
|
Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close. If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging blackboxsw, and he will ensure that someone takes a look soon. (If the pull request is closed and you would like to continue working on it, please do tag blackboxsw to reopen it.) |
|
Hi @paulober, can you please clarify the need for this PR?
This says what the code does, but not why. With service file changes especially, it is really important to understand the reasons for changes like this. What does this fix? Did it work before, and if not how did it get merged? What was done to validate this that would have caught the issue before? Etc, etc. |
|
@holmanb Sure, no problem. Originally I built cloud-init support for Raspberry Pi OS based on Debian bookworm, with the potential to also run on bullseye. But that changed this year: cloud-init will now be included from the beginning when Raspberry Pi OS based on Trixie launches. This of course brought many changes with it, so I'm now updating everything to best support the upcoming release and some tooling around it. This is also the reason for my other PRs. I did not want to combine everything in one PR because that would probably take months to review and adjust, like last year. So there is no specific issue, just improving boot-to-screen time and compatibility with the upcoming release and tooling. Hope this explains it. Let me know if you have any other questions. |
|
@blackboxsw Any updates on getting this PR merged? |
|
@paulober any time we see systemd ordering changes, it gives us pause when we don't have specific justification for the issues being avoided or resolved. There have been many occasions when a seemingly innocuous change introduced systemd ordering cycles, which resulted in one of cloud-init's boot stages being ejected from the boot goals. These ordering cycles show up due to third-party software or services that may not be present in default OS images, and they are hard to discover in base image testing. Hence the request by @holmanb to provide specific justification about the symptom being addressed or the motivations for the PR.
Your statement above is vague and doesn't detail the improvement achieved or why RPi OS doesn't care about ordering Before=avahi anymore:
The comment above is not something tangible we can point to in order to justify re-ordering systemd services.
Note as well that Ubuntu Desktop, as a downstream of cloud-init, behaves slightly differently from Ubuntu Server images: Ubuntu Desktop has to take into account ordering Before=NetworkManager.service, which is documented in Ubuntu's livecd-rootfs systemd drop-ins/overrides for cloud-init. I only reference this Ubuntu Desktop image systemd ordering to further show the impact of trying to move the cloud-init-network service earlier in boot, and the unexpected relationship with both dbus.socket/service and sysinit.target, which we need to take into account on this PR.
In this PR's "suggested commit message", can you please describe at least items 1 and 2:
Also, the scheduling If possible, can we get the output of the following in RaspberryPi OS with your changeset applied?
Or alternatively, just providing the /var/log/cloud-init.log and journalctl -b0 output (which can be collected in a gz with |
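To illustrate the ordering-cycle concern raised in this comment, here is a minimal Python sketch (not cloud-init or systemd code) that models Before=/After= relationships as a graph and reports a cycle. The unit names and edges are hypothetical and chosen only for illustration; systemd's own cycle detection and job-dropping logic is more involved than this.

```python
from graphlib import CycleError, TopologicalSorter


def find_ordering_cycle(after):
    """Return the units forming an ordering cycle, or None if there is none.

    `after` maps each unit to the set of units it must start after
    (an After= edge; Before=X on unit U is equivalent to After=U on X).
    """
    try:
        # static_order() raises CycleError when no valid start order exists.
        list(TopologicalSorter(after).static_order())
    except CycleError as exc:
        # args[1] lists the nodes of the detected cycle; first node repeats last.
        return exc.args[1]
    return None


# Hypothetical ordering, for illustration only (not the real unit files).
acyclic = {
    "example-network.service": {"example-local.service"},
    "sysinit.target": {"example-network.service"},
}

# One extra edge (for example from a third-party drop-in) closes a loop.
cyclic = dict(acyclic)
cyclic["example-local.service"] = {"sysinit.target"}

print(find_ordering_cycle(acyclic))
print(find_ordering_cycle(cyclic))
```

The second call shows how a single additional ordering edge, which may only exist on images with extra software installed, is enough to make the whole set unschedulable.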
|
CC: @tdewey-rpi for visibility to the 2 open requests/questions before we can land this PR |
|
Hi @blackboxsw, thanks for taking another look at this PR. I’ll try to summarize the background and reasoning.
Last year was the first time I worked with the cloud-init codebase, when adding initial Raspberry Pi OS support based on Bookworm. Since cloud-init wasn’t part of any official RPi OS images at the time, real-world testing was very limited. Combined with my inexperience with cloud-init back then, a few mistakes made it into the initial implementation (which is currently upstream).
This year, cloud-init has been included in Raspberry Pi OS based on Trixie, which gave us the opportunity for much broader testing, and with more testers. During that testing we identified issues that simply didn’t surface last year, and this PR corrects those mistakes. All changes included here (and in the other open PRs) are already shipped downstream in the current RPi OS images and have been working reliably.
To address your questions:
1. About NetworkManager/networkd ordering
The ordering rules in last year’s implementation were primarily introduced to ensure that cloud-init ran in the right order, specifically before the Raspberry Pi OS setup wizards (both desktop and console). This was required to make the user setup in cloud-init function correctly in that environment. Since then, the cloud-init user setup logic for RPi OS has been updated, and there have also been complementary changes outside cloud-init in RPi OS tooling. With these updates in place, we can now align the ordering more closely with the upstream/default expectations. The requirement for correct sequencing still exists, but the workaround from last year is no longer necessary in its previous form. There have also been downstream NetworkManager/Netplan fixes; the configuration in this PR works better with that new environment and matches how we expect startup to perform.
2. Dropping Before=avahi-daemon.service
If I remember correctly, this ordering was part of the earlier workaround related to the setup wizards, or I added it because I thought it would be required for some networking behavior. With the revised user setup flow and improved integration, I confirmed that this dependency is no longer needed, so it was removed to simplify the unit and to align more closely with what other distros do.
3. Running before sysinit.target
Yes, this means that some early-boot services may not yet be fully available.
4. Logs and systemd-analyze output
At the moment I don’t have the time to rebuild the full test environment from scratch and re-run everything without the downstream patches, especially since all of these changes are already deployed in the current RPi OS images and evaluated in that real-world environment. Given the explanations above and the fact that the ordering logic has been validated through actual downstream usage, these specific logs shouldn’t be required to justify the updated ordering anymore.
I hope this provides a clear explanation of how the ordering evolved and why the updated approach works for RPi OS. |
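As a rough aid for reviewing an ordering change like the one described in this comment, the following Python sketch extracts Before=/After= values from a unit's [Unit] section and diffs two versions. The unit text below is hypothetical and is not the actual cloud-init template; the parser is deliberately simplified and ignores drop-ins and line continuations.

```python
def ordering_directives(unit_text):
    """Collect Before=/After= values from the [Unit] section of a unit file."""
    ordering = {"Before": [], "After": []}
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section != "Unit" or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in ordering:
            # A directive may repeat and may list several space-separated units.
            ordering[key].extend(value.split())
    return ordering


# Hypothetical unit text for illustration; NOT the real cloud-init template.
OLD = """\
[Unit]
After=example-local.service
Before=avahi-daemon.service
Before=sysinit.target
"""
NEW = """\
[Unit]
After=example-local.service
Before=sysinit.target
"""

removed = set(ordering_directives(OLD)["Before"]) - set(ordering_directives(NEW)["Before"])
print(sorted(removed))  # ['avahi-daemon.service']
```

A diff like this only shows which ordering edges changed; it does not replace the boot logs and systemd-analyze output requested earlier in the thread.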
|
Thank you Paul for the update here. What I'd like to do for posterity is capture some portion of the reasoning in the proposed commit message description at the top of this PR, which will be used as the git commit message. I'd suggest something like the following, but maybe you would like to word it a bit differently? |
|
@blackboxsw Done! |
|
Thank you much @paulober for your patience here. Let's work through this list of open PRs this week. I'll get you feedback on others today. |
…deps (canonical#6459)
Align systemd network ordering with current Raspberry Pi OS behavior. The previous sequencing was designed around older cloud-init and setup-wizard requirements, but recent upstream and downstream changes make those workarounds unnecessary. The updated ordering matches default expectations and works reliably with newer NetworkManager/Netplan fixes.
Remove the no-longer-needed Before=avahi-daemon.service dependency, simplifying the unit and bringing it closer to other distros.
Run the network units before sysinit.target, which is consistent with how RPi OS boots and has been validated in current downstream images.
Signed-off-by: paulober <paul.oberosler@raspberrypi.com>