Skip to content

Fall back to cached local ds if no valid ds found#4997

Merged
holmanb merged 16 commits into
canonical:mainfrom
PengpengSun:fall-back-to-cached-ds-if-no-valid-ds-found
Mar 29, 2024
Merged

Fall back to cached local ds if no valid ds found#4997
holmanb merged 16 commits into
canonical:mainfrom
PengpengSun:fall-back-to-cached-ds-if-no-valid-ds-found

Conversation

@PengpengSun
Copy link
Copy Markdown
Contributor

@PengpengSun PengpengSun commented Mar 4, 2024

Proposed Commit Message

Fall back to cached local ds if no valid ds found

Rebooting an instance which has finished VMware guest
customization with DataSourceVMware will load
DataSourceNone due to metadata is NOT available.

This is mostly a re-post of PR#229, few differences are:
1. Let ds decide if fallback is allowed, not always fall back
   to previous cached LOCAL ds.
2. No comparing instance-id of cached ds with previous instance-id
   due to I think they are always identical.

Fixes GH-3402 (GitHub Issue number. Remove line if irrelevant)
LP: #1835205 (Launchpad bug number. Remove line if irrelevant)

Additional Context

Fixes #3402

Test Steps

Checklist

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

Copy link
Copy Markdown
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When this happens, does ds-identify still correctly identify the platform?

Comment thread cloudinit/sources/__init__.py Outdated
# quickly (local check only) if self.instance_id is still
return False

def check_fallback(self):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this method name more descriptive please? Also we should add a docstring that will communicate to other datasource authors what the purpose of this method is. There are many fallbacks in cloud-init, so this isn't very meaningful on its own.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.
Yes, I've renamed this method(please suggest a better name if needed), and added a docstring.

@PengpengSun
Copy link
Copy Markdown
Contributor Author

When this happens, does ds-identify still correctly identify the platform?

Not sure if I get your question correctly, with my understanding, ds-identify identifies the platform before init --local, so this shall not have impact on ds identify process. The purpose of this change is to add an opportunity for the cached ds to be restored after no data can be gotten from ds in the ds list.

Comment thread cloudinit/stages.py Outdated
Comment thread cloudinit/stages.py Outdated
@holmanb holmanb self-assigned this Mar 9, 2024
@PengpengSun PengpengSun requested a review from holmanb March 15, 2024 05:40
@PengpengSun
Copy link
Copy Markdown
Contributor Author

Hi @holmanb , I'm removing WIP from title, please kindly review this again and let me know if any updates needed. Thanks!

@PengpengSun PengpengSun changed the title WIP: Fall back to cached local ds if no valid ds found Fall back to cached local ds if no valid ds found Mar 19, 2024
Comment thread cloudinit/stages.py Outdated
ds = self._restore_from_cache()
if (
ds
and hasattr(ds, "check_if_fallback_is_allowed")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is undesirable for two reasons:

  • old pickles will never acquire this attribute
  • encapsulation: we shift pickle handling outside of the class definition which should handle pickle issues directly

How about the following modification?

diff --git a/cloudinit/distros/__init__.py b/cloudinit/distros/__init__.py
index 2c4d54871..e225e4ee8 100644
--- a/cloudinit/distros/__init__.py
+++ b/cloudinit/distros/__init__.py
@@ -185,6 +185,8 @@ class Distro(persistence.CloudInitPickleMixin, metaclass=abc.ABCMeta):
             self._dhcp_client = None
         if not hasattr(self, "_fallback_interface"):
             self._fallback_interface = None
+        if not hasattr(self, "check_if_fallback_is_allowed"):
+            self.check_if_fallback_is_allowed = lambda: False
 
     def _validate_entry(self, entry):
         if isinstance(entry, str):
diff --git a/cloudinit/stages.py b/cloudinit/stages.py
index 13bc26d26..c228805d1 100644
--- a/cloudinit/stages.py
+++ b/cloudinit/stages.py
@@ -378,11 +378,7 @@ class Init:
                 if existing != "check":
                     raise e
                 ds = self._restore_from_cache()
-                if (
-                    ds
-                    and hasattr(ds, "check_if_fallback_is_allowed")
-                    and ds.check_if_fallback_is_allowed()
-                ):
+                if ds and ds.check_if_fallback_is_allowed():
                     LOG.info(
                         "Restored fallback datasource from checked cache: %s",
                         ds,

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @holmanb,
Thanks for the review! I've updated the code accordingly(I think it'd be sources/init.py), PTAL.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tox is not happy with the last commit, need to fix it.

Copy link
Copy Markdown
Contributor Author

@PengpengSun PengpengSun Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @holmanb , I've updated code again which fixed the test error, PTAL, thanks!

@PengpengSun PengpengSun requested a review from holmanb March 29, 2024 06:30
Copy link
Copy Markdown
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @PengpengSun!

@holmanb holmanb merged commit 9929a00 into canonical:main Mar 29, 2024
@PengpengSun PengpengSun deleted the fall-back-to-cached-ds-if-no-valid-ds-found branch April 2, 2024 06:45
holmanb pushed a commit that referenced this pull request Apr 3, 2024
Rebooting an instance which has finished VMware guest
customization with DataSourceVMware will load
DataSourceNone due to metadata is NOT available.

This is mostly a re-post of PR#229, few differences are:
1. Let ds decide if fallback is allowed, not always fall back
   to previous cached LOCAL ds.
2. No comparing instance-id of cached ds with previous instance-id
   due to I think they are always identical.

Fixes GH-3402
holmanb pushed a commit that referenced this pull request Apr 3, 2024
Rebooting an instance which has finished VMware guest
customization with DataSourceVMware will load
DataSourceNone due to metadata is NOT available.

This is mostly a re-post of PR#229, few differences are:
1. Let ds decide if fallback is allowed, not always fall back
   to previous cached LOCAL ds.
2. No comparing instance-id of cached ds with previous instance-id
   due to I think they are always identical.

Fixes GH-3402
holmanb pushed a commit that referenced this pull request Apr 3, 2024
Rebooting an instance which has finished VMware guest
customization with DataSourceVMware will load
DataSourceNone due to metadata is NOT available.

This is mostly a re-post of PR#229, few differences are:
1. Let ds decide if fallback is allowed, not always fall back
   to previous cached LOCAL ds.
2. No comparing instance-id of cached ds with previous instance-id
   due to I think they are always identical.

Fixes GH-3402
@ani-sinha
Copy link
Copy Markdown
Contributor

@holmanb is it possible to backport this patch to 24.1 train maybe in 24.1.5?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OVF datasource should check if instant id is still on VMware Platform

3 participants