Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
178 changes: 55 additions & 123 deletions cloudinit/sources/DataSourceOracle.py
Original file line number Diff line number Diff line change
@@ -1,22 +1,20 @@
# This file is part of cloud-init. See LICENSE file for license information.
"""Datasource for Oracle (OCI/Oracle Cloud Infrastructure)

OCI provides a OpenStack like metadata service which provides only
'2013-10-17' and 'latest' versions..

Notes:
* This datasource does not support the OCI-Classic. OCI-Classic
provides an EC2 lookalike metadata service.
* The uuid provided in DMI data is not the same as the meta-data provided
* This datasource does not support OCI Classic. OCI Classic provides an EC2
lookalike metadata service.
* The UUID provided in DMI data is not the same as the meta-data provided
instance-id, but has an equivalent lifespan.
* We do need to support upgrade from an instance that cloud-init
identified as OpenStack.
* Both bare-metal and vms use iscsi root
* Both bare-metal and vms provide chassis-asset-tag of OracleCloud.com
* Bare metal instances use iSCSI root, virtual machine instances do not.
* Both bare metal and virtual machine instances provide a chassis-asset-tag of
OracleCloud.com.
"""

import base64
import json
import re

from cloudinit import log as logging
from cloudinit import net, sources, util
Expand All @@ -26,7 +24,7 @@
get_interfaces_by_mac,
is_netfail_master,
)
from cloudinit.url_helper import UrlError, combine_url, readurl
from cloudinit.url_helper import readurl

LOG = logging.getLogger(__name__)

Expand All @@ -35,8 +33,9 @@
'configure_secondary_nics': False,
}
CHASSIS_ASSET_TAG = "OracleCloud.com"
METADATA_ENDPOINT = "http://169.254.169.254/openstack/"
VNIC_METADATA_URL = 'http://169.254.169.254/opc/v1/vnics/'
METADATA_ROOT = "http://169.254.169.254/opc/v1/"
METADATA_ENDPOINT = METADATA_ROOT + "instance/"
VNIC_METADATA_URL = METADATA_ROOT + "vnics/"
# https://docs.cloud.oracle.com/iaas/Content/Network/Troubleshoot/connectionhang.htm#Overview,
# indicates that an MTU of 9000 is used within OCI
MTU = 9000
Expand Down Expand Up @@ -189,53 +188,39 @@ def _get_data(self):
if not self._is_platform_viable():
return False

self.system_uuid = _read_system_uuid()

# network may be configured if iscsi root. If that is the case
# then read_initramfs_config will return non-None.
if _is_iscsi_root():
data = self.crawl_metadata()
data = read_opc_metadata()
else:
with dhcp.EphemeralDHCPv4(net.find_fallback_nic()):
data = self.crawl_metadata()
data = read_opc_metadata()

self._crawled_metadata = data
vdata = data['2013-10-17']

self.userdata_raw = vdata.get('user_data')
self.system_uuid = vdata['system_uuid']

vd = vdata.get('vendor_data')
if vd:
self.vendordata_pure = vd
try:
self.vendordata_raw = sources.convert_vendordata(vd)
except ValueError as e:
LOG.warning("Invalid content in vendor-data: %s", e)
self.vendordata_raw = None

mdcopies = ('public_keys',)
md = dict([(k, vdata['meta_data'].get(k))
for k in mdcopies if k in vdata['meta_data']])

mdtrans = (
# oracle meta_data.json name, cloudinit.datasource.metadata name
('availability_zone', 'availability-zone'),
('hostname', 'local-hostname'),
('launch_index', 'launch-index'),
('uuid', 'instance-id'),
)
for dsname, ciname in mdtrans:
if dsname in vdata['meta_data']:
md[ciname] = vdata['meta_data'][dsname]

self.metadata = md
return True
self.metadata = {
"availability-zone": data["ociAdName"],
"instance-id": data["id"],
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm misreading this. it appears you are trying to limit the metadata preserved as a cache on the datasource by only instrumenting a few keys from the provided data.

I'm not certain why we would not want to cache all metadata here on the datasource? Why not just update the existing data with additional key aliases for the known availability-zone, instance-id, launch-index, local-hostname and name keys? At least from a triage point of view, we would be able to more easily inspect the system and "raw" values provided by the system more easily via cloud-init query if we get bugs, or have to develop more features etc.

An unwritten goal of cloud-init query is, at some point, to accurately represent any instance-data in as close to the original format as possible to potentially allow folks to leverage cloud-init's inherent knowledge of the 'best practices' to talk to a given datasource API (including retries and failure handling etc). This would allow customers on cloud-init driven platforms to simplify their scripts which rely on instance-data/meta-data to just cloud-init query blah instead of knowing the proper routes to curl etc and handle failures.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm misreading this. it appears you are trying to limit the metadata preserved as a cache on the datasource by only instrumenting a few keys from the provided data.

This may be the effect, but it is not the intent: I'm following the prior implementation of the datasource which did not include the entire metadata in self.metadata. Looking at a few other datasources, it's inconsistent as to whether they include all metadata in self.metadata or use something like self.metadata_full.

Why not just update the existing data with additional key aliases for the known availability-zone, instance-id, launch-index, local-hostname and name keys?

These seem like two different namespaces to me: if we modify the dict that we get from the cloud, then it's hard for users to know which keys are set by cloud-init (and therefore they can rely on across platforms) and which are cloud-specific; all the keys appear similar. (This is further complicated by the fact that cloud-init uses OpenStack/EC2 names for metadata keys internally; on OpenStack/EC2 there wouldn't even be an obvious pattern to allow you to guess which keys are cross-substrate and which are cloud-specific.)

All that said, I think we already have this distinction between namespaces in the codebase: we have accessors like get_public_ssh_keys or availability_zone for the things which are required for cloud-init to function. So the right move here is likely to store the metadata unmodified, but implement all the accessors whose default implementations assume the presence of OpenStack-named keys in the metadata. This should obviate the need for aliasing.

An unwritten goal of cloud-init query is, at some point, to accurately represent any instance-data in as close to the original format as possible

Performing the aliasing required for cloud-init's core functionality in the same data structure as returned by the metadata service would run counter to this goal, I think.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon further inspection, self._crawled_metadata is used in cloud-init query output if it's set (which it is in this PR). This also explains why the prior implementation of the DS was this way.

So I don't think we actually need to change anything here after all?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon further inspection, self._crawled_metadata is used in cloud-init query output if it's set (which it is in this PR). This also explains why the prior implementation of the DS was this way.

So I don't think we actually need to change anything here after all?

Correct. Thanks for the cloud-init query dump in IRC, I also launched instance before and after and determined (surprise) that I did misread the metadata setup, original metadata content is properly preserved and accessible under cloud-init query ds.meta_data so this PR didn't unintentionally redact or remove that information.
Oops.

"launch-index": 0,
"local-hostname": data["hostname"],
"name": data["displayName"],
}

if "metadata" in data:
user_data = data["metadata"].get("user_data")
if user_data:
self.userdata_raw = base64.b64decode(user_data)
self.metadata["public_keys"] = data["metadata"].get(
"ssh_authorized_keys"
)

def crawl_metadata(self):
return read_metadata()
return True

def _get_subplatform(self):
"""Return the subplatform metadata source details."""
return 'metadata (%s)' % METADATA_ENDPOINT
return "metadata ({})".format(METADATA_ROOT)

def check_instance_id(self, sys_cfg):
"""quickly check (local only) if self.instance_id is still valid
Expand Down Expand Up @@ -292,72 +277,15 @@ def _is_iscsi_root():
return bool(cmdline.read_initramfs_config())


def _load_index(content):
"""Return a list entries parsed from content.

OpenStack's metadata service returns a newline delimited list
of items. Oracle's implementation has html formatted list of links.
The parser here just grabs targets from <a href="target">
and throws away "../".

Oracle has accepted that to be buggy and may fix in the future
to instead return a '\n' delimited plain text list. This function
will continue to work if that change is made."""
if not content.lower().startswith("<html>"):
return content.splitlines()
items = re.findall(
r'href="(?P<target>[^"]*)"', content, re.MULTILINE | re.IGNORECASE)
return [i for i in items if not i.startswith(".")]


def read_metadata(endpoint_base=METADATA_ENDPOINT, sys_uuid=None,
version='2013-10-17'):
"""Read metadata, return a dictionary.

Each path listed in the index will be represented in the dictionary.
If the path ends in .json, then the content will be decoded and
populated into the dictionary.

The system uuid (/sys/class/dmi/id/product_uuid) is also populated.
Example: given paths = ('user_data', 'meta_data.json')
This would return:
{version: {'user_data': b'blob', 'meta_data': json.loads(blob.decode())
'system_uuid': '3b54f2e0-3ab2-458d-b770-af9926eee3b2'}}
def read_opc_metadata():
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was some effort at some point previously to try to standardize and expose a commonly named read_metadata method from each datasource module so that at some point we could potentially use from a cloud-init query --no-cache call or hotplug logic to crawl fresh metadata for the datasource without updating the existing cached datasource metadata. specializing this function name seems to move away from that interest.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def read_metadata( only appears in 4 places in the codebase:

cloudinit/sources/DataSourceOracle.py
313:def read_metadata(endpoint_base=METADATA_ENDPOINT, sys_uuid=None,

cloudinit/sources/DataSourceExoscale.py
186:def read_metadata(metadata_url=METADATA_URL,

cloudinit/sources/helpers/hetzner.py
16:def read_metadata(url, timeout=2, sec_between=2, retries=30):

cloudinit/sources/helpers/digitalocean.py
185:def read_metadata(url, timeout=2, sec_between=2, retries=30):

These don't have a consistent interface (some require a parameter, others don't), and 2 of them aren't in datasource modules. So I'm not sure we lose much by changing the name of this one.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, since we have a lot of discussion about making datasources consistent and decided how that should be done, this certainly doesn't make things any harder so let's move forward with your changeset.

"""
endpoint = combine_url(endpoint_base, version) + "/"
if sys_uuid is None:
sys_uuid = _read_system_uuid()
if not sys_uuid:
raise sources.BrokenMetadata("Failed to read system uuid.")

try:
resp = readurl(endpoint)
if not resp.ok():
raise sources.BrokenMetadata(
"Bad response from %s: %s" % (endpoint, resp.code))
except UrlError as e:
raise sources.BrokenMetadata(
"Failed to read index at %s: %s" % (endpoint, e))

entries = _load_index(resp.contents.decode('utf-8'))
LOG.debug("index url %s contained: %s", endpoint, entries)

# meta_data.json is required.
mdj = 'meta_data.json'
if mdj not in entries:
raise sources.BrokenMetadata(
"Required field '%s' missing in index at %s" % (mdj, endpoint))

ret = {'system_uuid': sys_uuid}
for path in entries:
response = readurl(combine_url(endpoint, path))
if path.endswith(".json"):
ret[path.rpartition(".")[0]] = (
json.loads(response.contents.decode('utf-8')))
else:
ret[path] = response.contents
Fetch metadata from the /opc/ routes.

return {version: ret}
:return:
The JSON-decoded value of the /opc/v1/instance/ endpoint on the IMDS.
"""
# retries=1 as requested by Oracle to address a potential race condition
return json.loads(readurl(METADATA_ENDPOINT, retries=1)._response.text)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are no longer handling UrlErrors and raising BrokenMetadata with more specific failure failure information instead of UrlErrors. I realize cloud-init doesn't actually handle this specific BrokenMetadata uniquely, but what are your thoughts about us dropping BrokenMetadata in the exception specialization in the future from other/all datasources?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the way BrokenMetadata is used elsewhere it breaks down in two ways: (a) the content of the metadata is incorrect (e.g. random_seed is not base64 as we expect), or (b) there are multiple different "transports" being used, and BrokenMetadata is used to abstract away the differences between the exceptions that those "transports" would raise on failure. (The only example of (b) is OpenStack: ConfigDrive will raise filesystem-related errors whereas the IMDS will raise HTTP errors.)

I don't think BrokenMetadata gains us much here because it doesn't match either of these cases: the UrlError indicates a "transport" issue (rather than a content issue), and we only have one "transport" to worry about.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retries=1 is still just weird, and my argument against it stands. I don't think "the cloud platform requests us to do random non-sense things" is general good argument. Either they have good reasons for doing this, or they do not.

But... as I said, I won't fight this.



# Used to match classes to dependencies
Expand All @@ -373,17 +301,21 @@ def get_datasource_list(depends):

if __name__ == "__main__":
import argparse
import os

parser = argparse.ArgumentParser(description='Query Oracle Cloud Metadata')
parser.add_argument("--endpoint", metavar="URL",
help="The url of the metadata service.",
default=METADATA_ENDPOINT)
args = parser.parse_args()
sys_uuid = "uuid-not-available-not-root" if os.geteuid() != 0 else None

data = read_metadata(endpoint_base=args.endpoint, sys_uuid=sys_uuid)
data['is_platform_viable'] = _is_platform_viable()
print(util.json_dumps(data))
Comment thread
OddBloke marked this conversation as resolved.

description = """
Query Oracle Cloud metadata and emit a JSON object with two keys:
`read_opc_metadata` and `_is_platform_viable`. The values of each are
the return values of the corresponding functions defined in
DataSourceOracle.py."""
parser = argparse.ArgumentParser(description=description)
parser.parse_args()
print(
util.json_dumps(
{
"read_opc_metadata": read_opc_metadata(),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Along the same lines of generalization of main output, is it worth just calling this key "metadata"?
Also in the same vein, is_platform_viable function was an interest in generalizing for each datasource Is it worth just surfacing "is_platform_viable" key for this?

Admitedly, if required by our upcoming hotplug work, if we actually need to standardize on a common function name for all datasources, and common behavior of json output of a main in each datasource, this would be a flag day type PR that could address all datasources in one go, so maybe we don't have to quibble about it for this PR.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, similarly to my above response, there are only 3 datasources which implement this at all, and they are not consistent in how they do it:

cloudinit/sources/DataSourceOracle.py
184:    def _is_platform_viable(self):
286:def _is_platform_viable():

cloudinit/sources/DataSourceExoscale.py
137:    def _is_platform_viable(self):

cloudinit/sources/DataSourceAzure.py
502:    def _is_platform_viable(self):
1497:def _is_platform_viable(seed_dir):

AIUI, this output is only really intended for debugging and is already inconsistent across datasources, so I went with key == function name so it's clear where the data is coming from internal to the file.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works for me, I think this could a topic for us to discuss and refine if, or when, we decide to go this route.

"_is_platform_viable": _is_platform_viable(),
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth exposing the vnic network metadata here somehow from _add_network_config_from_opc_imds too to aid in introspection of content exposed to the vm when calling main?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, but @TheRealFalcon will be changing this codepath in his follow-up PR anyway (because read_opc_metadata will evolve); instead of writing more code that will be replaced, let's remember to get him to do this.

)
)

# vi: ts=4 expandtab
Loading