Skip to content

Conversation

@schaefi
Copy link
Collaborator

@schaefi schaefi commented Nov 2, 2025

As a less strict method to retain_last, now also the retain_past deployment option exists. In the retain_past mode the partition to preserve must not be part of the initial image. It can retain the content of any last partition if its start address is past the OS image. However, a gap in the partition table between the OS end and the start of the retained partition can occur for which this commit does not provide a resize method yet.

@schaefi schaefi requested a review from davidcassany November 2, 2025 18:34
@schaefi schaefi self-assigned this Nov 2, 2025
@schaefi
Copy link
Collaborator Author

schaefi commented Nov 2, 2025

@davidcassany inspired by our conversation I was coding the changes to allow retaining any last partition for the use case you described. This is just a draft PR and I haven't done testing of the feature in an adequate way. I'd like to seek your feedback what you think about it. Also see comments in code that I made.

Thanks much

Copy link
Collaborator

@davidcassany davidcassany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is the idea I had in mind from our previous conversation. Probably the two paths can be even closer with some small refactor.


# preserve pre-dump partition table
if getargbool 0 rd.kiwi.install.retain_past; then
fetch_local_partition_table "${image_target}" /tmp/parttable_pre
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this result could be used inside the compatible_to_retain method, aren't we getting the partition table twice? So we could compute the target_start in compatible_to_retain method from the /tmp/parttable_pre. This way we omit one additional partition table parsing and both cases logic (retain_last and retain_past) is closer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes this is true. I did not add this type of refactor to not interfere with the existing retain_last code. The draft patch here mostly adds code which I believe makes it easier to read. If I apply the suggested refactor (and I totally agree with you) then the patch will become a bit harder to read. So I did not do it at the moment such that we get an impression where all of this frankenstein surgery can go wrong :)

@davidcassany if we are in alignment that we want to support this mode, and I think we are on the same page now, I would create an example image which makes use of this feature and we can play with it.

Once we found a code base that we could live with I would apply the needed code refactor to avoid code duplication. What do you think ?

I also think we will not be able to make this rock solid as we inject an image that provides a partition table onto a system with a potentially different partition table. The other feature with retain_last would not allow this as it expects the partition table to be set by the image.

So we might also come to the conclusion that it is too dangerous ? not sure and I think we will only find out by testing.

Thoughts ?

As a less strict method to retain_last, now also the retain_past
deployment option exists. In the retain_past mode the partition
to preserve must not be part of the initial image. It can retain
the content of any last partition if its start address is past
the OS image. However, a gap in the partition table between the
OS end and the start of the retained partition can occur for
which this commit does not provide a resize method yet.
Add RetainPast profile to build image for testing custom
partition table changes not provided by the image
@schaefi
Copy link
Collaborator Author

schaefi commented Dec 1, 2025

@davidcassany I did several tests with the latest draft code. Here is my test plan

  • Build the image from the source git
poetry shell

sudo kiwi-ng --profile RetainPast system build --description build-tests/x86/tumbleweed/test-image-disk --target-dir /tmp/mytest/ --set-repo https://download.opensuse.org/tumbleweed/repo/oss --allow-existing-root

sudo cp -a dracut/modules.d/* /tmp/mytest/build/image-root/usr/lib/dracut/modules.d/

sudo kiwi-ng --profile RetainPast system build --description build-tests/x86/tumbleweed/test-image-disk --target-dir /tmp/mytest/ --set-repo https://download.opensuse.org/tumbleweed/repo/oss --allow-existing-root
  • Change to /tmp/mytest and run the initial deployment
cd /tmp/mytest
qemu-img create mydisk 20g
kvm -cdrom kiwi-test-image-disk.x86_64-1.42.1.install.iso -boot d -hda mydisk

... login and create a new partition, add some data
... halt the machine

  • Repeat the deployment
kvm -cdrom kiwi-test-image-disk.x86_64-1.42.1.install.iso -boot d -hda mydisk

... login, check the partition table,
... mount the partition created before, check the data
... halt the machine

  • Change the image description and make the image smaller
<size unit="G">2</size>

... rebuild the image and repeat the deployment

kvm -cdrom kiwi-test-image-disk.x86_64-1.42.1.install.iso -boot d -hda mydisk

... login, check the partition table. There should now be a gap of ~1G between the OS and the self created partition
... mount the partition created before, check the data
... halt the machine

  • Change the image description and make the image bigger
<size unit="G">4</size>

... rebuild the image and repeat the deployment

kvm -cdrom kiwi-test-image-disk.x86_64-1.42.1.install.iso -boot d -hda mydisk

... The image deployment must fail with an error message
... because the image would now overwrite the data partition

@schaefi
Copy link
Collaborator Author

schaefi commented Dec 1, 2025

@davidcassany These deployment types worked as I would expect them. I would be very interested in what you think about it. Thanks much

@davidcassany
Copy link
Collaborator

@davidcassany I did several tests with the latest draft code. Here is my test plan

Nice, this is a good testing procedure, great. Gonna try to go through a similar process next week and experiment with the feature.

So we might also come to the conclusion that it is too dangerous ?

Yes, this is a valid concern. I wonder how to narrow the use cases, I haven't read deep enough the code to realize what would happen in different contexts (e.g. luks, lvm, MBR instead of GPT) that go beyond a simple GPT partition table without any special setup. I'd say if test succeed then we should carefully think on ways to constraint its use cases with some early checks.

@schaefi
Copy link
Collaborator Author

schaefi commented Dec 15, 2025

Yes, this is a valid concern. I wonder how to narrow the use cases, I haven't read deep enough the code to realize what would happen in different contexts (e.g. luks, lvm, MBR instead of GPT) that go beyond a simple GPT partition table without any special setup. I'd say if test succeed then we should carefully think on ways to constraint its use cases with some early checks.

It should work for any of those. The code operates on the plain partition table and is compatible with all partition table types we support in the boot code. As all subsystems like luks, lvm, raid, etc etc are one level down the plain partition geometry, those subsystems are not affected.

Regarding the case when there is a gap in the partition geometry. If this gap should be closed then things becomes interesting because that would require a pretty complicated resize operation which then also affects all used subsystems. This PR does not cover the resize code and if at all I would add that in an extra PR. Actually we have all resize code available, the only missing part is to identify the correct partition that can resize to close the gap, which is the last partition of the dumped image blob. We also have this data. As such it is possible and I also believe it can be done in a safe way.

But first we need to do some more testing to check if we have overlooked some severe issue

@schaefi
Copy link
Collaborator Author

schaefi commented Dec 15, 2025

I think one important sanity would be hard to implement. The geometry check as of today takes the address of the last partition from the image and compares it with the start address of the last partition on the target disk. If the address of the image is smaller than the address of the last partition on the target we say this is good to go.

  • What if the target disk has more than one "data" partition relevant to the system after the OS system partitions ?
  • What if the OS and data partitions are not linear aligned, e.g data partition first and then the OS ? or a mix ?

As retain_past allows to dump over any prior existing system and data it's probably impossible to check if the dump would destroy important data and/or retain the correct partition. The dump itself will always lead to a functional OS but we don't know if what we retained makes sense or not.

I believe this can only be documented as risk factor to the feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants