Skip to content

Add module 'write-deferred-files' executed in stage 'final'#916

Merged
TheRealFalcon merged 16 commits into
canonical:mainfrom
lucendio:patch-1
Oct 25, 2021
Merged

Add module 'write-deferred-files' executed in stage 'final'#916
TheRealFalcon merged 16 commits into
canonical:mainfrom
lucendio:patch-1

Conversation

@lucendio
Copy link
Copy Markdown
Contributor

@lucendio lucendio commented Jun 7, 2021

Proposed Commit Message

summary: Change module order: run 'users-groups' before 'write-files'

The module 'write-files' allows to configure ownership. If the configured user (and/or group)
does not exist at the point when the 'write-files' module is applied, the whole stage fails.
Moving 'users-groups' solves the issue.

Additional Context

The reasoning in full length can be found here: https://bugs.launchpad.net/cloud-init/+bug/1486113

Test Steps

Please see this comment to find a reproducible example.

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

@TheRealFalcon
Copy link
Copy Markdown
Contributor

Hey @lucendio , unfortunately we cannot just swap these two modules for backwards compatibility reasons. One of the comments in that bug gives the reasoning and some possible ways to solve this:

As a (possibly contrived) example, as it is right now you can use write_files to replace 'adduser' and then cloud-init would call the replaced version of adduser when adding users. If write_files happened later you could not do that.

I would like to get this to work though.
2 options:
a.) add 'write_files_late' that runs after users are created (or write_files_early and write_files) basically 2 different points in boot.
b.) magically determine if 'owner' isn't around that we should re-try this after the usergroups are created.
c.) extend format of write_files to include 'early' or 'late' that would indicate to cloud-init explicitly that it should happen after users or before users creation.

a would probably be the easiest to implement, though I'd be open any solution that preserves backwards compatibility.

@lucendio
Copy link
Copy Markdown
Contributor Author

lucendio commented Jun 9, 2021

Hey @TheRealFalcon ,

while, from a logical point of view, I understand the reasoning, although I honestly doubt that sb actually does this replacing adduser before using the users module. Anyway, the likeliness of sb holding cloud-init that way seems out of the equation when we want to focus on solving this w/o breaking backward compatibility, which would be my primary goal here and the main reason why I opened this PR.

I agree with your observation about a. Though, from an ergonomic point of view, I propose a write_files in a users item? Ownership, for example, could be implied - or reasonably default.

A completely different approach could be to change how a user is being created from python - to eliminate the argument brought up. In some places on the internet I read sth about libuser and a python binding.

In any case, I'd like to help moving this dangling issue towards a solution :)

@TheRealFalcon
Copy link
Copy Markdown
Contributor

I agree with your observation about a. Though, from an ergonomic point of view, I propose a write_files in a users item? Ownership, for example, cloud be implied - or reasonably default.

This sounds reasonable to me. I'd be ok with it.

A completely different approach could be to change how a user is being created from python - to eliminate the argument brought up. In some places on the internet I read sth about libuser and a python binding.

We would still have the issue of write_file need to run before creating the user, right? I think this would be a little more invasive that I'd prefer, especially since it would involve dragging a new dependency into cloud-init.

If you want to go ahead with the users option feel free. I probably would have gone with having another write_files type thing much later in boot, like the distinction we have between bootcmd and runcmd. I'm fine with either solution though.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jul 2, 2021

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging mitechie, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag mitechie to reopen it.)

@github-actions github-actions Bot added the stale-pr Pull request is stale; will be auto-closed soon label Jul 2, 2021
@lucendio
Copy link
Copy Markdown
Contributor Author

lucendio commented Jul 4, 2021

A completely different approach could be to change how a user is being created from python - to eliminate the argument brought up. In some places on the internet I read sth about libuser and a python binding.

We would still have the issue of write_file need to run before creating the user, right? I think this would be a little more invasive that I'd prefer, especially since it would involve dragging a new dependency into cloud-init.

Yes. this approach would still involve placingusers-groups before write-files like the initial commit of this PR proposes. But, it would bypass the argument of useing write_files to replace 'adduser', because it would simply not be possible anymore to change the behaviour of adduser.
My gut would even say, prohibiting this would increase the security (or maybe I'm missing sth here). But then again, what's the attack vector? A malicious adduser is pulled into the system during cloud-init run? Not sure if it's that serious...
But now that I come to think of it, would it not even possible to temper with the cloud-init logic/code with write_files during execution?

If you want to go ahead with the users option feel free. I probably would have gone with having another write_files type thing much later in boot, like the distinction we have between bootcmd and runcmd. I'm fine with either solution though.

I would argue, that it is easier to teach users what users[*].write_files would be for instead of a write_files_late, but maybe that's just me.

So, to summarize, as it stands, we currently have two different approaches:

  1. Add a second possibility to write-files which will be carried out after (or alongside of) users-groups: e.g. write_files_late or users[*].write_files
  2. replace the low-level implementation of users - which is founded on external binaries like adduser - with libuser, and move users-groups before write-files

While the first one would extend the user API, the second seems to introduce another dependency and would change the default order.
Right now, we are leaning towards the first approach.

@TheRealFalcon I'd like to start working on this. Would you mind removing the stale-pr tag? I will also leave a message in the bug tracker linked above, to a) indicate progress on this matter and b) to maybe trigger some more feedback.

@TheRealFalcon TheRealFalcon removed the stale-pr Pull request is stale; will be auto-closed soon label Jul 5, 2021
@TheRealFalcon
Copy link
Copy Markdown
Contributor

hey @lucendio , thanks for driving this. Your plan sounds good to me. Let me know if you have any additional questions or need help.

@github-actions
Copy link
Copy Markdown

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging mitechie, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag mitechie to reopen it.)

@github-actions github-actions Bot added the stale-pr Pull request is stale; will be auto-closed soon label Jul 22, 2021
@github-actions github-actions Bot closed this Jul 30, 2021
@desrod
Copy link
Copy Markdown

desrod commented Jul 30, 2021

Let's not let this one die on the vine. I have a need for this as well, by injecting files (openvpn, ~/.ssh/config and others) at write_files time.

I like the idea of having a specific construct of when to trigger the write_files directive, by classifying it either in its precedence in the list (possibly with a priority flag?) or making sure it comes after another known plugin in the process.

I'm inclined to support the idea of a global write_files, but also another one inside (nested under) user-data, so they can be logically separated (host vs. user context).

Thoughts?

lucendio added a commit to lucendio/cloud-init that referenced this pull request Aug 1, 2021
During a follow-up discussion it was desiced to extend the user interface instead
of shuffling modules around. So, this commit reverts the initial approach of the PR
canonical#916 before starting to implement the
solution being discused.

This reverts commit 1888f4f.
@lucendio
Copy link
Copy Markdown
Contributor Author

lucendio commented Aug 1, 2021

I put together a first proposal for the implementation. Most of the details can be found in the commit messages. The branch used for the proposal is the continuation of the one referred in this PR.

I chose this specific location, because I can see cases, in which users would like to refer to users/groups created through package installation. Does this make sense?

The current approach introduces sth like this:

users:
  - name: 'alice'
    files:
    - path: '/home/alice/.profile'
      content: |
        PATH="/usr/local/opt/python37/bin:$PATH"
      append: true

but additionally, I could also imagine the following:

write_files:
  - path: /etc/nginx/conf.d/example.com.conf
    content: |
      server {
          server_name example.com;
          ...
      }
    owner: 'nginx:nginx'
    permissions: '0440'
    defer: true

So, not only users[*].files would end up in the list of deferred files but also entries in write_files that are marked as defer: true.

Please let me know what you think about the current implementation. I'd love to hear your feedback!

PS: @TheRealFalcon would you mind re-opning this PR, so I can change the origin branch? Also, please let me know if this doesn't meet your level of code quality. And, I ended up making write_deferred_files largely depend on write_files. Not sure if that's such a good idea. Internally, you might have certain requirements when it comes to inter-module dependencies.

@TheRealFalcon TheRealFalcon reopened this Aug 2, 2021
@TheRealFalcon TheRealFalcon removed the stale-pr Pull request is stale; will be auto-closed soon label Aug 2, 2021
@lucendio lucendio changed the title Change module order: run 'users-groups' before 'write-files' Add module 'write-deferred-files' running in stage 'final' Aug 2, 2021
@lucendio lucendio changed the title Add module 'write-deferred-files' running in stage 'final' Add module 'write-deferred-files' executed in stage 'final' Aug 2, 2021
@TheRealFalcon
Copy link
Copy Markdown
Contributor

Also, please let me know if this doesn't meet your level of code quality. And, I ended up making write_deferred_files largely depend on write_files. Not sure if that's such a good idea. Internally, you might have certain requirements when it comes to inter-module dependencies.

Quality overall looks good. There's no problem with reusing write_files in the way you did. I noticed a number of pep8-type things (mostly line lengths), but other than that the implementation mostly looks good. I haven't done a full review yet though.

Backing up to your earlier comment though...

but additionally, I could also imagine the following:

write_files:
  - path: /etc/nginx/conf.d/example.com.conf
    content: |
      server {
          server_name example.com;
          ...
      }
    owner: 'nginx:nginx'
    permissions: '0440'
    defer: true

Now that you bring it up, I like this approach a lot more. When you had initially mentioned adding the files option to user-groups, I was thinking it would modifying the module directly. If we're already adding a "deferred" module that will run later in boot, I think this approach using defer would be a lot simpler. If defer option is False or not set, we write the file in the local stage. If True, we write the file in the final stage. I think this is better UI too, because the option will be right there documented in the write-files module, rather than having to know to go look in the user-groups module.

I don't think we should have this and the user-groups option at the same time though. Having two different options to enable the same functionality would be confusing. I know it's annoying that I approved one option and am now asking to scratch that and go with a different option, but I think it will be a much better direction overall.

What do you think?

@lucendio
Copy link
Copy Markdown
Contributor Author

lucendio commented Aug 5, 2021

I have to admit, I almost couldn't resist the urge to implement defer: true along the way, too. As soon as I had the name, the defer flag hit me. But I kept telling myself: one thing at a time - think about the git history.

Anyway, I must say, I still believe there is value in having both.

I think this is better UI too, because the option will be right there documented in the write-files module, rather than having to know to go look in the user-groups module.

I guess, that could be solved by linking. I didn't render the docs yet, but I hope it works referring from one scheme description to another. And being able to define files that belong to a user right next to it and as simple as:

users:
  - name: 'alice'
    files:
      - path: '/home/alice/.profile'
        content: |
          PATH="/usr/local/opt/python37/bin:$PATH"
          append: true

It would even be possible to simplify path a bit, by requiring it to be relative and inferring ${HOME}, to lower chances of making typos in the user name. I was also thinking about the argument of avoiding more than one way of doing things. But it feels a bit weak considering, for example, SSH key or password configuration, not do mention runcmd.

When you had initially mentioned adding the files option to user-groups, I was thinking it would modifying the module directly.

Oh, you mean, calling the write_files from the cc_users_groups.handler directly, after create_user. Interesting, that didn't cross my mind. Right from the start, I had new module in mind that would be placed somewhere downstream after users-groups. Probably because that's how everything started - messing up the module order ;)
Very valid thought though, and probably less effort!

So, I hope you reconsider. It's not a blocker, because it's more impotent to even get this functionality in. Maybe other folks have opinions too :)

@TheRealFalcon
Copy link
Copy Markdown
Contributor

It would even be possible to simplify path a bit, by requiring it to be relative and inferring ${HOME}, to lower chances of making typos in the user name.

I don't think this will work. Root is still executing everything, so the user's HOME env var won't be set.

Anyway, I must say, I still believe there is value in having both

Having one write files definition that can write at 3 separate times seems excessive to me. We also need to consider cohesiveness and maintenance burden. In general, most modules have their yaml definition and operate only on that. There are obviously exceptions to this "rule", but I think it's important to keep those exceptions to a minimum. Having a write-files definition that is defined in the users yaml but not actually during user handling would lead to a lot of confusion.

My preference would be that we choose one. Either keep the users files definition (handled in the users module) or add the defer key to write-files that get's executed at final time.

Why do you think it's important to have both?

Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lucendio thank you for picking up this work and running with it and for driving conversation here with James.

I know there has been a bit of back and forth here trying to define the best support for this deferred set of write_files operations and I appreciate the effort you've put into this so far.

I recognize you are coming from the perspective of solving the creation of files in a user-based directories with proper perms etc. It seems you've added some helper functionality to aid in creation specific to the users. While the extension to the "users" module directives are helpful, it feels a bit like we are adding an alias with sugar which can already be done by explicit directives in the existing 'write_files:' schema.

I feel like providing two ways to do things in this case might be more of a burden from doc/maintenance stand point than just doing one thing that serves a more broad set of needs.

I'd vote instead that this PR tries limiting scope to only adding the ability to perform write_files type operations during cloud-init final stage.
We can do that in one of two ways, both of which involve adding a cc_write_files_deferred.py module:

  1. Add a "defer: true" config directive to existing write_files: schema, cc_write_files.py will ignore deferred items, cc_write_files_deferred.py will consume all defer: true items under write_files:
    -- OR --
  2. Don't add a defer:true option to write_files: schema, just add a write_files_deferred: top-level key which cc_write_files_deferred will consume

Are there any specific configuration directives missing from the existing write_files schema that don't ultimately meet the needs you tried addressing with your extract_deferred_files that maybe could be added to the base functionality of write_files ? It seems maybe you'd like the option to discover the user:group based on the working directory you are operating in?

I guess I'd probably lean toward option 2 for a couple reasons:

  1. If someone has a need to write deferred files or earlier files I'm just guessing they have a group of those operations that they would like to perform at the same stage in cloud-init. To more easily group those write_file operations with minimal #cloud-config under a single top-level key "write_files_deferred:" instead of appending defer:true on every item under the current write_files: directive.
  2. Most cloud-init config modules cc_*py are scoped to own unique top-level #cloud-config keys. So, it probably makes things a bit more intuitive for cc_write_files.py to honor write_files: config key and cc_write_files_deferred: to honor write_files_deferred: key instead of two modules having to pop or filter unrelated items out of their config.

In either case here's a diff against your PR that might explain what I was thinking for option 2
https://paste.ubuntu.com/p/5MF77rqgMV/

But if we think option 1 is best: here's a diff to support a defer: true and avoid honoring a new top-level #cloud-config key.
https://paste.ubuntu.com/p/MbZ3y5fdrh/

Comment thread cloudinit/config/cc_write_deferred_files.py Outdated
@lucendio
Copy link
Copy Markdown
Contributor Author

lucendio commented Aug 7, 2021

Having one write files definition that can write at 3 separate times seems excessive to me

If that would be the case, I would completely agree with you. I was aiming for two times, where times refers to the amount of modules writing files across all stages:

  1. cc_write_files during init stage whose list of files fed directly from the top-level key write-files[*].defer == false (or not set)
  2. cc_write_deferred_files during final stage whose list of files fed from entries in users[*].files and from the top-level key write-files[*].defer == true

Originally, the idea was to not expose a dedicated top-level key for cc_write_deferred_files. Introducing a new module was just my way of utilizing the cloud-init concept of ordered stages and modules to allow for doing sth at a later point in time, because the success (of creating a certain file) is actually dependent of when it happens, and thus the order of things.

If someone has a need to write deferred files or earlier files I'm just guessing they have a group of those operations that they would like to perform at the same stage in cloud-init. To more easily group those write_file operations with minimal #cloud-config under a single top-level key "write_files_deferred:" instead of appending defer:true on every item under the current write_files: directive.

One could say the oppositem too. Explicitly marking files whose creation is deferred, and finding it nicer to have all files managed by cloud-init in one place. No new top-level key one would need to be aware of, just a flag to slightly change the behaviour or preventing its creation from failing.

[...] instead of two modules having to pop or filter unrelated items out of their config.

In my opinion, that is exactly what logic behind a user interface is supposed to do. Often, humans understand data very differently compared to how machines do. That being said, well-structured and maintainable code is not necessarily dependent and still very much possible.

Why do you think it's important to have both?

Important is probably a bit strong. I'd say it feels intuitive to me: files, that I define in a user object would surely be created after the user has been created and would be owned by that user. And, a file marked as deferred is most likely be created at the very end of a cloud-init run, after anything that might interfere with the creation, did already happen.

But I guess intuition is also somewhat subjective :)

Above, I just wanted to make my case again to ensure that there is no misunderstanding about my train of thought and the implementation I propose. Just the perspective of a cloud-init user.
I understand that maintenance burden has to count for sth. I'm honestly just not sure what the ratio between that and the user experience is supposed to be in such case.

Despite our different point of views, I think we all share the goal of reaching consensus. And since you are the maintainers, I would like you to have the final verdict, because I'd rather have a solution addressing the original issue. If you like, I can still carry out the implementation.

One last technical thought though: From a backward compatibility perspective, does it make more or less sense to introduce a new top-level key or just extend existing schemes, when adding a new module to one of the stage lists in cloud.cfg?

Side note: only way where we would not need to link to write-files documentation, is the defer: true approach.


In either case here's a diff against your PR that might explain what I was thinking for option 2
https://paste.ubuntu.com/p/5MF77rqgMV/

But if we think option 1 is best: here's a diff to support a defer: true and avoid honoring a new top-level #cloud-config key.
https://paste.ubuntu.com/p/MbZ3y5fdrh/

@blackboxsw btw GH supports syntax highlighting for git diffs ;) although, a link to a diff served by GH is probably even more readable.

diff --git a/cloudinit/config/cc_write_files.py b/cloudinit/config/cc_write_files.py
index 8601e707..b19e47f6 100644
--- a/cloudinit/config/cc_write_files.py
+++ b/cloudinit/config/cc_write_files.py
@@ -151,6 +151,15 @@ schema = {
                             ``path`` exists. Default: **false**.
                         """),
                     },
+                    'defer': {
+                        'type': 'boolean',
+                        'default': False,
+                        'description': dedent("""\
+                            Whether to defer writing files until cloud-init's
+                            final stage, after user creation and
+                            package install.
+                        """),
+                    },
                 },
                 'required': ['path'],
                 'additionalProperties': False
@@ -163,7 +172,7 @@ __doc__ = get_schema_doc(schema)  # Supplement python help()
 
 
 def handle(name, cfg, _cloud, log, _args):
-    files = cfg.get('write_files')
+    files = get_deferred_files(cfg)
     if not files:
         log.debug(("Skipping module named %s,"
                    " no/empty 'write_files' key in configuration"), name)
@@ -172,6 +181,14 @@ def handle(name, cfg, _cloud, log, _args):
     write_files(name, files)
 
 
+def get_deferred_files(cfg, deferred=False):
+    """Filer files based on whether the items is deferred."""
+    filtered_files = []
+    for f in cfg.get('write_files', []):
+        filtered_files.append(f) if f.get("deferred", False) == deferred
+    return filtered_files
+
+
 def canonicalize_extraction(encoding_type):
     if not encoding_type:
         encoding_type = ''

@TheRealFalcon
Copy link
Copy Markdown
Contributor

We can do that in one of two ways, both of which involve adding a cc_write_files_deferred.py module:
Add a "defer: true" config directive to existing write_files: schema, cc_write_files.py will ignore deferred items, cc_write_files_deferred.py will consume all defer: true items under write_files:
-- OR --
Don't add a defer:true option to write_files: schema, just add a write_files_deferred: top-level key which cc_write_files_deferred will consume

I also agree with this point.

Despite our different point of views, I think we all share the goal of reaching consensus. And since you are the maintainers, I would like you to have the final verdict, because I'd rather have a solution addressing the original issue. If you like, I can still carry out the implementation.

Thank you for your flexibility @lucendio , and also thank you for the explanation. It's helpful to understand everybody's perspectives. Both myself and @blackboxsw prefer to keep the write-files definition out of the users-groups module, so let's limit it to one of the two choices listed here. I know @blackboxsw said he prefer's option 2, but I actually prefer option 1. I think both of us are flexible on this point and we've all explained our reasonings, so feel free to be the tie-breaker. If you're still interested in carrying out the implementation, we still welcome it.

One last technical thought though: From a backward compatibility perspective, does it make more or less sense to introduce a new top-level key or just extend existing schemes, when adding a new module to one of the stage lists in cloud.cfg

I think it makes sense to just extend existing schemas, and then the new module can refer people to the other schema in the docstring.

@github-actions
Copy link
Copy Markdown

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging mitechie, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag mitechie to reopen it.)

@github-actions github-actions Bot added the stale-pr Pull request is stale; will be auto-closed soon label Aug 24, 2021
@TheRealFalcon
Copy link
Copy Markdown
Contributor

@lucendio , are you still working on this?

@lucendio
Copy link
Copy Markdown
Contributor Author

@lucendio , are you still working on this?

Yes. Although, little progress due to some vacation days. I plan to pick of the pace over the upcoming weeks. I'm sorry for letting this go stale!

@TheRealFalcon TheRealFalcon removed the stale-pr Pull request is stale; will be auto-closed soon label Aug 26, 2021
lucendio and others added 15 commits October 25, 2021 11:13
During a follow-up discussion it was desiced to extend the user interface instead
of shuffling modules around. So, this commit reverts the initial approach of the PR
canonical#916 before starting to implement the
solution being discused.

This reverts commit 1888f4f.
THe main idea is to introduce a second module that takes care of
writing files, but in the 'final' stage.

While the introduction of a second module would allow for choosing
the appropriate place withing the order of modules (and stages),
there is no addition top-level directive being added to the cloud
configuration schema. Instead, existing schemes are being extended,
like the 'users' directive.

The new module 'write-deferred-files' tries to reuse as much as
possible of the 'write-files' functionality, including the scheme
or the 'write_files' function.

further details on how it would be used, are shown in the scheme
definition of the module.
This is an initial small chuck of unittests, which verifie that

 (a) a broken scheme ('path' is required) will log - similar
     to the scheme testing in test_handler_write_files.py; it
     even reuses some of the test data
 (b) when gathering file configurations from the 'users'
     directive, the defaults indicated by the docs are being
     set

As this module is build upon the 'write-files' module, a larger
part is already covered by unit tests. Still, more tests -
especially integration tests - are going to be added before
getting this branch into mainline.
This change set implements the feedback from recent discussions [1] on how to best
support writing files at some later point during execution, so that, for example, a
user who is configured as file owner would exist when the file is being created.

THe general idea here is to keep the structure of the handlers of both modules
(write_files[_deferred]) as similar as possible, since they do pretty much the
same. The main difference is the lambda function used to filter deferred or not
deferred files.

For now, there is just some unit testing for the module handler. Integration tests
should be added after the implementation has been reviewed.

Specific details worth mentioning:
* 'validate_cloudconfig_schema' has been moved to the top of the handler functions,
  because
    a) even if the 'write_files' top-level directive is configured and not empty, it
       might still be empty for one of the two write-file modules
    b) the schema validation is a formal check and has no awareness of any logical
       semantics, so it makes sense to first verify the former before moving forward
* the new module handling deferred files has been renamed
* write_files.handle has been slightly restructured to adhere to the changes in
  write_files_deferred.handle
* revert all changes made to cc_users_groups.py in this branche while working on
  the deferred files feature

[1] canonical#916
* mostly line length or too many spaces
This also fixes the last flake8 complaints.
If scheme validation fails, it will just warn but not stop the execution.

As @TheRealFalcon put it:

  it's better to log a warning and do what the user intended rather than log a warning and do the wrong thing.
THis test is supposed to verify that files flagged as deferred are
not being created by the 'write_files' module.
The test is supposed to verify that:

  * deferred file writings are indeed happening at a later
    stage (compared to the default behaviour of write_files)
  * deferred files are created after users & groups have
    been created
  * thus deferred files can be owned by such users
Should cause a failure during integration testing
The test is supposed to verify that:

  * deferred file writings are indeed happening at a later
    stage (compared to the default behaviour of write_files)
  * deferred files are created after users & groups have
    been created
  * thus deferred files can be configured with and owned by
    users or groups created during the same cloud-init run

Co-authored-by: James Falcon <james.falcon@canonical.com>
@TheRealFalcon
Copy link
Copy Markdown
Contributor

Thanks for all of your work on this @lucendio ! I'm pushing a rebase to fix the CI failure. If it passes I'll get this merged.

@TheRealFalcon TheRealFalcon dismissed blackboxsw’s stale review October 25, 2021 19:23

Got confirmation my review is sufficient

@TheRealFalcon TheRealFalcon merged commit a4236c3 into canonical:main Oct 25, 2021
@lucendio
Copy link
Copy Markdown
Contributor Author

Thanks for all of your work on this @lucendio ! I'm pushing a rebase to fix the CI failure. If it passes I'll get this merged.

My pleasure. And, thank you for your collaboration.

@jlanza
Copy link
Copy Markdown

jlanza commented Sep 22, 2022

I would like to point out what I consider is a bug.
If the folder where the file is created exists, the file is created with the appropriate ownership. However if it doesn't exists, the file is created with the proper ownership while the folder is with root ownership.

In my case the default user is called ubuntu ;)

write_files:
  - path: /home/ubuntu/.help/file-in.txt
    content: |
      hi in
    owner: 'ubuntu:ubuntu'
    permissions: '0644'
    defer: true
  - path: /home/ubuntu/file-out.txt
    content: |
      hi out
    owner: 'ubuntu:ubuntu'
    permissions: '0644'
    defer: true

After running everthing:

ubuntu@mycomputer:~$ ls -alrt file.out
-rw-r--r-- 5 ubuntu ubuntu      6 Sep  22 10:44 file-out.txt
ubuntu@mycomputer:~$ ls -alrt .help
drwxr-xr-x 3 root   root     4096 Sep  22 2022 .help
ubuntu@mycomputer:~$ ls -alrt .help/file-in.txt
-rw-r--r-- 5 ubuntu ubuntu      7 Sep  22 10:44 file-out.txt

I guess the correct way should be to inherit the ownership for the folder as well, unless pointed out in another way.

What do you think?

@aciba90
Copy link
Copy Markdown
Contributor

aciba90 commented Sep 22, 2022

That sounds reasonable, @jlanza. Would you mind creating a bug report?

https://cloudinit.readthedocs.io/en/latest/topics/bugs.html

@marcospgp
Copy link
Copy Markdown

marcospgp commented Feb 8, 2025

I just ran into this issue where I'm trying to assign a user as owner of a file, but cloud init says the user couldn't be found:

"getpwnam(): name not found

This PR was meant to fix that, but I'm still getting it on both ubuntu v22 and v24. Is there a way to know starting from what release this would be fixed?

@TheRealFalcon
Copy link
Copy Markdown
Contributor

@marcospgp , this functionality is already present on all supported Ubuntu releases including 20.04, 22.04, and 24.04. defer: true must be included as part of the write_files section of the cloud-config. If you're having further problems, please open an issue at https://github.com/canonical/cloud-init/issues

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants