-
Notifications
You must be signed in to change notification settings - Fork 1.1k
DataSourceOracle: refactor to use only OPC v1 endpoint #493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,22 +1,20 @@ | ||
| # This file is part of cloud-init. See LICENSE file for license information. | ||
| """Datasource for Oracle (OCI/Oracle Cloud Infrastructure) | ||
|
|
||
| OCI provides a OpenStack like metadata service which provides only | ||
| '2013-10-17' and 'latest' versions.. | ||
|
|
||
| Notes: | ||
| * This datasource does not support the OCI-Classic. OCI-Classic | ||
| provides an EC2 lookalike metadata service. | ||
| * The uuid provided in DMI data is not the same as the meta-data provided | ||
| * This datasource does not support OCI Classic. OCI Classic provides an EC2 | ||
| lookalike metadata service. | ||
| * The UUID provided in DMI data is not the same as the meta-data provided | ||
| instance-id, but has an equivalent lifespan. | ||
| * We do need to support upgrade from an instance that cloud-init | ||
| identified as OpenStack. | ||
| * Both bare-metal and vms use iscsi root | ||
| * Both bare-metal and vms provide chassis-asset-tag of OracleCloud.com | ||
| * Bare metal instances use iSCSI root, virtual machine instances do not. | ||
| * Both bare metal and virtual machine instances provide a chassis-asset-tag of | ||
| OracleCloud.com. | ||
| """ | ||
|
|
||
| import base64 | ||
| import json | ||
| import re | ||
|
|
||
| from cloudinit import log as logging | ||
| from cloudinit import net, sources, util | ||
|
|
@@ -26,7 +24,7 @@ | |
| get_interfaces_by_mac, | ||
| is_netfail_master, | ||
| ) | ||
| from cloudinit.url_helper import UrlError, combine_url, readurl | ||
| from cloudinit.url_helper import readurl | ||
|
|
||
| LOG = logging.getLogger(__name__) | ||
|
|
||
|
|
@@ -35,8 +33,9 @@ | |
| 'configure_secondary_nics': False, | ||
| } | ||
| CHASSIS_ASSET_TAG = "OracleCloud.com" | ||
| METADATA_ENDPOINT = "http://169.254.169.254/openstack/" | ||
| VNIC_METADATA_URL = 'http://169.254.169.254/opc/v1/vnics/' | ||
| METADATA_ROOT = "http://169.254.169.254/opc/v1/" | ||
| METADATA_ENDPOINT = METADATA_ROOT + "instance/" | ||
| VNIC_METADATA_URL = METADATA_ROOT + "vnics/" | ||
| # https://docs.cloud.oracle.com/iaas/Content/Network/Troubleshoot/connectionhang.htm#Overview, | ||
| # indicates that an MTU of 9000 is used within OCI | ||
| MTU = 9000 | ||
|
|
@@ -189,53 +188,39 @@ def _get_data(self): | |
| if not self._is_platform_viable(): | ||
| return False | ||
|
|
||
| self.system_uuid = _read_system_uuid() | ||
|
|
||
| # network may be configured if iscsi root. If that is the case | ||
| # then read_initramfs_config will return non-None. | ||
| if _is_iscsi_root(): | ||
| data = self.crawl_metadata() | ||
| data = read_opc_metadata() | ||
| else: | ||
| with dhcp.EphemeralDHCPv4(net.find_fallback_nic()): | ||
| data = self.crawl_metadata() | ||
| data = read_opc_metadata() | ||
|
|
||
| self._crawled_metadata = data | ||
| vdata = data['2013-10-17'] | ||
|
|
||
| self.userdata_raw = vdata.get('user_data') | ||
| self.system_uuid = vdata['system_uuid'] | ||
|
|
||
| vd = vdata.get('vendor_data') | ||
| if vd: | ||
| self.vendordata_pure = vd | ||
| try: | ||
| self.vendordata_raw = sources.convert_vendordata(vd) | ||
| except ValueError as e: | ||
| LOG.warning("Invalid content in vendor-data: %s", e) | ||
| self.vendordata_raw = None | ||
|
|
||
| mdcopies = ('public_keys',) | ||
| md = dict([(k, vdata['meta_data'].get(k)) | ||
| for k in mdcopies if k in vdata['meta_data']]) | ||
|
|
||
| mdtrans = ( | ||
| # oracle meta_data.json name, cloudinit.datasource.metadata name | ||
| ('availability_zone', 'availability-zone'), | ||
| ('hostname', 'local-hostname'), | ||
| ('launch_index', 'launch-index'), | ||
| ('uuid', 'instance-id'), | ||
| ) | ||
| for dsname, ciname in mdtrans: | ||
| if dsname in vdata['meta_data']: | ||
| md[ciname] = vdata['meta_data'][dsname] | ||
|
|
||
| self.metadata = md | ||
| return True | ||
| self.metadata = { | ||
| "availability-zone": data["ociAdName"], | ||
| "instance-id": data["id"], | ||
| "launch-index": 0, | ||
| "local-hostname": data["hostname"], | ||
| "name": data["displayName"], | ||
| } | ||
|
|
||
| if "metadata" in data: | ||
| user_data = data["metadata"].get("user_data") | ||
| if user_data: | ||
| self.userdata_raw = base64.b64decode(user_data) | ||
| self.metadata["public_keys"] = data["metadata"].get( | ||
| "ssh_authorized_keys" | ||
| ) | ||
|
|
||
| def crawl_metadata(self): | ||
| return read_metadata() | ||
| return True | ||
|
|
||
| def _get_subplatform(self): | ||
| """Return the subplatform metadata source details.""" | ||
| return 'metadata (%s)' % METADATA_ENDPOINT | ||
| return "metadata ({})".format(METADATA_ROOT) | ||
|
|
||
| def check_instance_id(self, sys_cfg): | ||
| """quickly check (local only) if self.instance_id is still valid | ||
|
|
@@ -292,72 +277,15 @@ def _is_iscsi_root(): | |
| return bool(cmdline.read_initramfs_config()) | ||
|
|
||
|
|
||
| def _load_index(content): | ||
| """Return a list entries parsed from content. | ||
|
|
||
| OpenStack's metadata service returns a newline delimited list | ||
| of items. Oracle's implementation has html formatted list of links. | ||
| The parser here just grabs targets from <a href="target"> | ||
| and throws away "../". | ||
|
|
||
| Oracle has accepted that to be buggy and may fix in the future | ||
| to instead return a '\n' delimited plain text list. This function | ||
| will continue to work if that change is made.""" | ||
| if not content.lower().startswith("<html>"): | ||
| return content.splitlines() | ||
| items = re.findall( | ||
| r'href="(?P<target>[^"]*)"', content, re.MULTILINE | re.IGNORECASE) | ||
| return [i for i in items if not i.startswith(".")] | ||
|
|
||
|
|
||
| def read_metadata(endpoint_base=METADATA_ENDPOINT, sys_uuid=None, | ||
| version='2013-10-17'): | ||
| """Read metadata, return a dictionary. | ||
|
|
||
| Each path listed in the index will be represented in the dictionary. | ||
| If the path ends in .json, then the content will be decoded and | ||
| populated into the dictionary. | ||
|
|
||
| The system uuid (/sys/class/dmi/id/product_uuid) is also populated. | ||
| Example: given paths = ('user_data', 'meta_data.json') | ||
| This would return: | ||
| {version: {'user_data': b'blob', 'meta_data': json.loads(blob.decode()) | ||
| 'system_uuid': '3b54f2e0-3ab2-458d-b770-af9926eee3b2'}} | ||
| def read_opc_metadata(): | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was some effort at some point previously to try to standardize and expose a commonly named
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
These don't have a consistent interface (some require a parameter, others don't), and 2 of them aren't in datasource modules. So I'm not sure we lose much by changing the name of this one.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1, since we have a lot of discussion about making datasources consistent and decided how that should be done, this certainly doesn't make things any harder so let's move forward with your changeset. |
||
| """ | ||
| endpoint = combine_url(endpoint_base, version) + "/" | ||
| if sys_uuid is None: | ||
| sys_uuid = _read_system_uuid() | ||
| if not sys_uuid: | ||
| raise sources.BrokenMetadata("Failed to read system uuid.") | ||
|
|
||
| try: | ||
| resp = readurl(endpoint) | ||
| if not resp.ok(): | ||
| raise sources.BrokenMetadata( | ||
| "Bad response from %s: %s" % (endpoint, resp.code)) | ||
| except UrlError as e: | ||
| raise sources.BrokenMetadata( | ||
| "Failed to read index at %s: %s" % (endpoint, e)) | ||
|
|
||
| entries = _load_index(resp.contents.decode('utf-8')) | ||
| LOG.debug("index url %s contained: %s", endpoint, entries) | ||
|
|
||
| # meta_data.json is required. | ||
| mdj = 'meta_data.json' | ||
| if mdj not in entries: | ||
| raise sources.BrokenMetadata( | ||
| "Required field '%s' missing in index at %s" % (mdj, endpoint)) | ||
|
|
||
| ret = {'system_uuid': sys_uuid} | ||
| for path in entries: | ||
| response = readurl(combine_url(endpoint, path)) | ||
| if path.endswith(".json"): | ||
| ret[path.rpartition(".")[0]] = ( | ||
| json.loads(response.contents.decode('utf-8'))) | ||
| else: | ||
| ret[path] = response.contents | ||
| Fetch metadata from the /opc/ routes. | ||
|
|
||
| return {version: ret} | ||
| :return: | ||
| The JSON-decoded value of the /opc/v1/instance/ endpoint on the IMDS. | ||
| """ | ||
| # retries=1 as requested by Oracle to address a potential race condition | ||
| return json.loads(readurl(METADATA_ENDPOINT, retries=1)._response.text) | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So we are no longer handling UrlErrors and raising BrokenMetadata with more specific failure failure information instead of UrlErrors. I realize cloud-init doesn't actually handle this specific BrokenMetadata uniquely, but what are your thoughts about us dropping BrokenMetadata in the exception specialization in the future from other/all datasources?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looking at the way I don't think
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. retries=1 is still just weird, and my argument against it stands. I don't think "the cloud platform requests us to do random non-sense things" is general good argument. Either they have good reasons for doing this, or they do not. But... as I said, I won't fight this. |
||
|
|
||
|
|
||
| # Used to match classes to dependencies | ||
|
|
@@ -373,17 +301,21 @@ def get_datasource_list(depends): | |
|
|
||
| if __name__ == "__main__": | ||
| import argparse | ||
| import os | ||
|
|
||
| parser = argparse.ArgumentParser(description='Query Oracle Cloud Metadata') | ||
| parser.add_argument("--endpoint", metavar="URL", | ||
| help="The url of the metadata service.", | ||
| default=METADATA_ENDPOINT) | ||
| args = parser.parse_args() | ||
| sys_uuid = "uuid-not-available-not-root" if os.geteuid() != 0 else None | ||
|
|
||
| data = read_metadata(endpoint_base=args.endpoint, sys_uuid=sys_uuid) | ||
| data['is_platform_viable'] = _is_platform_viable() | ||
| print(util.json_dumps(data)) | ||
|
OddBloke marked this conversation as resolved.
|
||
|
|
||
| description = """ | ||
| Query Oracle Cloud metadata and emit a JSON object with two keys: | ||
| `read_opc_metadata` and `_is_platform_viable`. The values of each are | ||
| the return values of the corresponding functions defined in | ||
| DataSourceOracle.py.""" | ||
| parser = argparse.ArgumentParser(description=description) | ||
| parser.parse_args() | ||
| print( | ||
| util.json_dumps( | ||
| { | ||
| "read_opc_metadata": read_opc_metadata(), | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Along the same lines of generalization of main output, is it worth just calling this key "metadata"? Admitedly, if required by our upcoming hotplug work, if we actually need to standardize on a common function name for all datasources, and common behavior of json output of a main in each datasource, this would be a flag day type PR that could address all datasources in one go, so maybe we don't have to quibble about it for this PR.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, similarly to my above response, there are only 3 datasources which implement this at all, and they are not consistent in how they do it: AIUI, this output is only really intended for debugging and is already inconsistent across datasources, so I went with key == function name so it's clear where the data is coming from internal to the file.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. works for me, I think this could a topic for us to discuss and refine if, or when, we decide to go this route. |
||
| "_is_platform_viable": _is_platform_viable(), | ||
| } | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it worth exposing the vnic network metadata here somehow from _add_network_config_from_opc_imds too to aid in introspection of content exposed to the vm when calling main?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Potentially, but @TheRealFalcon will be changing this codepath in his follow-up PR anyway (because |
||
| ) | ||
| ) | ||
|
|
||
| # vi: ts=4 expandtab | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm misreading this. it appears you are trying to limit the metadata preserved as a cache on the datasource by only instrumenting a few keys from the provided
data.I'm not certain why we would not want to cache all metadata here on the datasource? Why not just update the existing data with additional key aliases for the known availability-zone, instance-id, launch-index, local-hostname and name keys? At least from a triage point of view, we would be able to more easily inspect the system and "raw" values provided by the system more easily via cloud-init query if we get bugs, or have to develop more features etc.
An unwritten goal of
cloud-init queryis, at some point, to accurately represent any instance-data in as close to the original format as possible to potentially allow folks to leverage cloud-init's inherent knowledge of the 'best practices' to talk to a given datasource API (including retries and failure handling etc). This would allow customers on cloud-init driven platforms to simplify their scripts which rely on instance-data/meta-data to just cloud-init query blah instead of knowing the proper routes to curl etc and handle failures.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be the effect, but it is not the intent: I'm following the prior implementation of the datasource which did not include the entire metadata in
self.metadata. Looking at a few other datasources, it's inconsistent as to whether they include all metadata inself.metadataor use something likeself.metadata_full.These seem like two different namespaces to me: if we modify the dict that we get from the cloud, then it's hard for users to know which keys are set by cloud-init (and therefore they can rely on across platforms) and which are cloud-specific; all the keys appear similar. (This is further complicated by the fact that cloud-init uses OpenStack/EC2 names for metadata keys internally; on OpenStack/EC2 there wouldn't even be an obvious pattern to allow you to guess which keys are cross-substrate and which are cloud-specific.)
All that said, I think we already have this distinction between namespaces in the codebase: we have accessors like
get_public_ssh_keysoravailability_zonefor the things which are required for cloud-init to function. So the right move here is likely to store the metadata unmodified, but implement all the accessors whose default implementations assume the presence of OpenStack-named keys in the metadata. This should obviate the need for aliasing.Performing the aliasing required for cloud-init's core functionality in the same data structure as returned by the metadata service would run counter to this goal, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upon further inspection,
self._crawled_metadatais used incloud-init queryoutput if it's set (which it is in this PR). This also explains why the prior implementation of the DS was this way.So I don't think we actually need to change anything here after all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Thanks for the cloud-init query dump in IRC, I also launched instance before and after and determined (surprise) that I did misread the metadata setup, original metadata content is properly preserved and accessible under cloud-init query ds.meta_data so this PR didn't unintentionally redact or remove that information.
Oops.