WIP: Resize encrypted partitions #1032
def resize_encrypted(blockdev):
    subp.subp(['cryptsetup', '--key-file', '<keyfile>', 'resize', blockdev])
This can't work as-is, but not sure where I'm supposed to get the key from.
You would not need key-file at all, as it would be in kernel keyring already, because the drive is unlocked.
However, there is a backup recovery keyfile on disk too.
Ok, I pushed a change to remove that argument. It should be testable now.
This won't actually work unless you supply a key. The cryptsetup documentation is a bit confusing here for the resize action.
By default, cryptsetup passes the volume key to the dm-crypt target via the kernel keyring by supplying the key description to the device mapper. But the key is added to a thread keyring which is deleted once cryptsetup exits. When you run "cryptsetup resize", there is nothing in the keyring that can help you here, and cryptsetup needs to supply the volume key to the device mapper again when reconfiguring the dm-crypt target.
We do add keys to the root user keyring during early boot, but the descriptions of those are in a format that's private to https://github.com/snapcore/secboot/blob/master/keyring.go. And snap-bootstrap probably needs rebasing on the latest snapd code for this to even be true.
Perhaps we need to agree a way to pass a key description from early boot to cloud-init so that it can retrieve the key we add to the kernel, which it would then need to supply via --key-file (which can read from stdin ok).
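Supplying the key on stdin via --key-file could look roughly like the sketch below. It is not the PR's implementation: it uses the standard library subprocess module in place of cloud-init's subp, and the function names and the injectable run parameter are hypothetical. It relies on cryptsetup treating "-" as "read the key from standard input".

```python
import subprocess


def build_resize_command(blockdev):
    """Return the argv that resizes an active mapping, key read from stdin."""
    # "--key-file -" tells cryptsetup to read the key from standard input.
    return ["cryptsetup", "--key-file", "-", "resize", blockdev]


def resize_encrypted(blockdev, key, run=subprocess.run):
    """Resize the dm-crypt mapping, passing the key bytes on stdin.

    `run` is injectable (a hypothetical design choice) so the call can be
    exercised without root or a real LUKS device.
    """
    return run(build_resize_command(blockdev), input=key, check=True)
```

Passing the key on stdin keeps it out of the process argument list, where it would otherwise be visible to other users.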
@chrisccoulson, is there something we can do in the meantime, or is there currently no way for cloud-init to get this key?
There is currently no way for cloud-init to get a key.
I have another idea - I could have the provisioning tool add another keyslot and drop a key inside the encrypted root filesystem. Cloud-init could use this, and then delete the file and the corresponding keyslot from the LUKS2 container once the resize is done. How does that sound?
That should work for me!
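A rough sketch of that flow: read the provisioned key, resize, then retire the keyslot and the key file. The file layout ({"key": "<base64>", "slot": <int>}), the function name, and the run parameter are all assumptions made for illustration; nothing in this thread fixes the format. luksKillSlot is run with --batch-mode so that it does not prompt for a remaining passphrase.

```python
import base64
import json
import os
import subprocess


def resize_with_temporary_key(mapped_dev, partition, keydata_path,
                              run=subprocess.run):
    """Resize using a provisioned key, then remove the key and its keyslot.

    Assumed (hypothetical) file format: {"key": "<base64>", "slot": <int>}.
    """
    with open(keydata_path) as f:
        keydata = json.load(f)
    key = base64.b64decode(keydata["key"])
    slot = str(keydata["slot"])

    # Resize the live mapping; "-" makes cryptsetup read the key from stdin.
    run(["cryptsetup", "--key-file", "-", "resize", mapped_dev],
        input=key, check=True)

    # Wipe the temporary keyslot from the LUKS header on the partition.
    run(["cryptsetup", "luksKillSlot", "--batch-mode", partition, slot],
        check=True)

    # Finally delete the key file from the encrypted root filesystem.
    os.remove(keydata_path)
```

Here cleanup only happens after a successful resize, so a failed attempt can be retried. Cleaning up unconditionally (for example in a finally block) would avoid leaving the key on disk after a failure, at the cost of making a retry impossible.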
2df9678 to 1a897e1
def get_underlying_partition(blockdev):
    command = 'dmsetup deps -o devname {}'.format(blockdev)
    dep = subp.subp(command.split())[0]
Generally we should prefer the following, which avoids split and format and uses long-form arguments (for readability):
dep = subp.subp(["dmsetup", "deps", "--options=devname", blockdev])[0]
Also, you really should check the error code. If this returns non-zero and you try to parse stdout, you'll likely get an IndexError and raise an unhelpful exception.
Thanks. I was planning some additional error checking in other places too, though subp will already raise an error on non-zero exits. I just don't remember how ugly that is in the logs.
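One way to keep the log readable is to catch the non-zero exit and re-raise with context. The sketch below is an assumption-laden illustration: it uses subprocess.CalledProcessError as a stand-in for cloud-init's subp.ProcessExecutionError, and the dmsetup_deps name and run parameter are hypothetical.

```python
import subprocess


def dmsetup_deps(blockdev, run=subprocess.run):
    """Return the raw stdout of `dmsetup deps` or raise a descriptive error."""
    cmd = ["dmsetup", "deps", "--options=devname", blockdev]
    try:
        result = run(cmd, capture_output=True, text=True, check=True)
    except subprocess.CalledProcessError as e:
        # Re-raise with the device name and stderr so the log line is useful.
        raise RuntimeError(
            "Could not determine dependencies of %s: dmsetup exited with %d: %s"
            % (blockdev, e.returncode, (e.stderr or "").strip())
        ) from e
    return result.stdout
```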
Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close. If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging mitechie, and he will ensure that someone takes a look soon. (If the pull request is closed and you would like to continue working on it, please do tag mitechie to reopen it.)
b5ae9e2 to 700cf1a
chrisccoulson left a comment
Thanks for working on this, I've left a few comments.
    blockdev should look something like '/dev/mapper/disk1' if True,
    otherwise something like /dev/vdb1.
    """
    is_mapped = blockdev.startswith('/dev/mapper/')
I wouldn't check for /dev/mapper here - I'd resolve the symlink to the underlying /dev/dm-* device node first and check that.
@chrisccoulson, how would I know which underlying /dev/dm-* device to use?
/dev/mapper/whatever is a symlink to the underlying block device. It depends a bit on what you're really asking here, but "os.path.realpath(blockdev).startswith('/dev/dm-')" is one possible implementation of this logic that's better than what you have here.
Yes, sorry it wasn't clear. The block device path you get could be any one of the user-space managed links in /dev/mapper or /dev/disk/. The more correct thing to do is to resolve the symbolic link to obtain the actual block device path, and then check whether that path begins with /dev/dm-.
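A minimal sketch of that check, assuming nothing beyond os.path.realpath. The dev_root parameter is hypothetical and exists only so the logic can be tried against a scratch directory instead of the real /dev.

```python
import os


def is_mapped_device(blockdev, dev_root="/dev"):
    """Return True if blockdev resolves to a device-mapper node (dm-*)."""
    # Follow /dev/mapper/* or /dev/disk/* symlinks to the real device node.
    real = os.path.realpath(blockdev)
    # Resolve dev_root as well so a symlinked root does not break the prefix.
    prefix = os.path.join(os.path.realpath(dev_root), "dm-")
    return real.startswith(prefix)
```

With the default dev_root, a path such as /dev/mapper/disk1 that links to ../dm-0 resolves to /dev/dm-0 and matches, while /dev/vdb1 does not.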
is_encrypted = True
if not is_encrypted:
    with suppress(subp.ProcessExecutionError):
        subp.subp(['cryptsetup', 'isLuks', blockdev])
cryptsetup isLuks only works on the underlying block device.
In any case, the cryptsetup resize action can work on other types of targets created by cryptsetup (other than bitlocker and truecrypt containers). Perhaps parsing the output of cryptsetup status and making sure the "type" field is not "n/a", "TCRYPT" or "BITLK" might be an idea instead, as an alternative to using cryptsetup isLuks? If you did that, you might want to rename is_encrypted to something else.
It's fine to limit the resizing only to mapped devices that are backed by luks containers, but you'll need to resolve the device mapper path to the underlying block device path first before calling cryptsetup isLuks.
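The "parse cryptsetup status" idea could be sketched as below. The parser works on the command's text output, which is assumed here to contain an indented "type: <value>" line; the function names and the UNSUPPORTED_TYPES constant are hypothetical.

```python
# Types named above as ones that "cryptsetup resize" should not be run on.
UNSUPPORTED_TYPES = {"n/a", "TCRYPT", "BITLK"}


def mapping_type(status_output):
    """Extract the 'type' field from `cryptsetup status` output, or None."""
    for line in status_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "type":
            return value.strip()
    return None


def is_resizable_mapping(status_output):
    """True if the mapping reports a type that is not in the unsupported set."""
    mtype = mapping_type(status_output)
    return mtype is not None and mtype not in UNSUPPORTED_TYPES
```

Keeping the parsing separate from the subprocess call makes it easy to unit test against captured output.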
@chrisccoulson The goal here was to support both types of paths. If cryptsetup status passed, then we return True. If not, check to see if we've been passed the underlying device and try cryptsetup isLuks instead. Given how I'm passing things currently, that isLuks check is kind of pointless, so I can just remove it.
Would it be valid to simply call cryptsetup status on the device, and return True if that returns 0?
except (TypeError, ValueError) as e:
    info.append((devent, RESIZE.SKIPPED,
                 "device_part_info(%s) failed: %s" % (blockdev, e),))
if is_mapped_device(blockdev) and is_encrypted(blockdev):
I was going to question whether the is_mapped_device() and is_encrypted() is the right logic here, because you could potentially have a mapped device where you want to grow the underlying partition without doing the corresponding cryptsetup resize step, which just updates the live dm-crypt mapping to reflect the new underlying partition size.
But thinking about it, it's probably ok as it is like this - it seems fair enough that we probably don't want to touch the underlying partition if we don't know what tool manages the device mapping and therefore we don't know how to update the live mapping.
> it seems fair enough that we probably don't want to touch the underlying partition if we don't know what tool manages the device mapping and therefore we don't know how to update the live mapping.
I don't understand. The code here is meant to resize both the underlying partition via growpart, then the mapped device with the cryptsetup resize. Is that not what you're seeing?
2a8dc4e to e3fb9d5
PR updated based on comments. cloud-init now reads the key off a json file on disk as suggested.
holmanb left a comment
So far this looks good. I made one comment. I assume that testing/docs are planned prior to merge.
I don't see any integration tests for this module so far. I don't know if that's due to the complexity of LXC with filesystems or if we just never got around to it before. Assuming we want to add them, one idea is to mount a loop device (file-backed drive) into an LXC container as a block device for testing. If LXC can't handle something about that, qemu would work, but I think we should try LXC first to avoid adding a dependency to the integration tests. If help is wanted with integration tests for this, feel free to ping.
command = ["dmsetup", "deps", "--options=devname", blockdev]
dep = subp.subp(command)[0]
try:
    # Returned result should look something like:
Based on the source code [1], the output could look like:
1 dependencies : (vdb1)
or
2 dependencies : (vdb1) (vdb2)
or
N dependencies : (vdb1) (vdb2) ...(vdbN)
I don't have a sense for how likely this is, but it looks like this isn't handled currently. I think a warning when N>1 might be more appropriate (or perhaps even handling multiple devices).
[1] Relevant snippet:
for (i = 0; i < deps->count; i++) {
    major = (int) MAJOR(deps->device[i]);
    minor = (int) MINOR(deps->device[i]);

    if ((_dev_name_type == DN_BLK || _dev_name_type == DN_MAP) &&
        dm_device_get_name(major, minor, _dev_name_type == DN_BLK,
                           dev_name, PATH_MAX))
        printf(" (%s)", dev_name);
    else
        printf(" (%d, %d)", major, minor);
}
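A parser that copes with any number of dependencies might look like the sketch below. The function names are hypothetical, and raising on N != 1 is just one option; logging a warning and skipping the resize would equally match the suggestion above. Note that when dmsetup falls back to printing "(major, minor)" pairs, the extracted entries would be number pairs rather than device names.

```python
import re


def parse_deps(output):
    """Return every parenthesised entry from `dmsetup deps -o devname` output."""
    return re.findall(r"\(([^)]+)\)", output)


def single_underlying_partition(output):
    """Return /dev/<name> for the sole dependency, or raise if N != 1."""
    deps = parse_deps(output)
    if len(deps) != 1:
        raise ValueError(
            "Expected exactly one dependency, found %d: %r" % (len(deps), deps)
        )
    return "/dev/" + deps[0]
```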
Proposed Commit Message
Additional Context
Test Steps
Can't test on a fully encrypted system yet, but approximated what we're trying to achieve on openstack with ephemeral drives.
Using this cloud-config:
Launch an instance with:
add floating ip here
upload cloud-init deb and install it
Run:
lsblk should show /dev/vdb1 and /dev/vdc1, each with a partition size of 859M.
possibly reboot here because jbd2 won't let go of vdb and rerun parted
After reboot, run:
lsblk should show resized sizes for both the encrypted and unencrypted devices, with no errors in /var/log/cloud-init.log
Checklist: