This repository was archived by the owner on Aug 19, 2019. It is now read-only.

Conversation

@igorpeshansky
Contributor

@igorpeshansky igorpeshansky commented Dec 19, 2017

  • Decouple metadata updates from polling.
  • Factor out common functionality in Kubernetes API response processing.

@igorpeshansky
Contributor Author

@supriyagarg, would be great if you could review this as well.
The last change in the description probably deserves to be a separate PR, as it's not really semantics-preserving.

Contributor

@bmoyles0117 bmoyles0117 left a comment

First pass, will comb through again.


json::value KubernetesReader::ComputePodAssociations(const json::Object* pod)
const throw(json::Exception) {
const std::string platform = "gce"; // TODO: detect other platforms.
Contributor

Should this be GKE, or are we specifically talking about the compute platform? If it's the compute platform, I think a clarifying comment would help, given the rename.

Contributor Author
@igorpeshansky igorpeshansky Dec 19, 2017

No, this is the platform of the instance on which the node is running. We'll probably want to remove this anyway, because it's already present in the node metadata.

Contributor

SGTM

top_level_metadata->Get<json::String>("uid") != pod_id) {
LOG(ERROR) << "Internal error; top-level controller without 'kind' "
<< *top_level_controller
<< " not the same as pod " << *pod;
Contributor

Should we be returning out of this, so that we don't assume that the entity we're looking at is a pod?

Contributor Author

I think in this case, assuming that it's a pod is fairly harmless and won't break anything downstream. We do log an error in case someone wanted to investigate. Not worth terminating the poll (i.e., throwing an exception) for.

Contributor

SGTM
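
(A self-contained sketch of the log-and-continue behavior discussed above, with plain standard-library stand-ins for the project's json:: accessors and LOG macro; the fallback of treating the pod as its own top-level controller is an assumption drawn from the discussion, not a quote of the actual code.)

#include <iostream>
#include <string>

// Simplified stand-in for the fields read off the top-level controller object.
struct OwnerInfo {
  std::string kind;  // empty when the object has no "kind" field
  std::string uid;
};

// Soft assertion: if the walked-up controller has no "kind" and its uid does
// not match the pod's, log an error but keep going, treating the pod as its
// own top-level controller rather than aborting the poll.
std::string TopLevelControllerUid(const OwnerInfo& top_level,
                                  const std::string& pod_uid) {
  if (top_level.kind.empty() && top_level.uid != pod_uid) {
    std::cerr << "Internal error; top-level controller without 'kind' "
              << top_level.uid << " not the same as pod " << pod_uid
              << std::endl;
    return pod_uid;  // harmless fallback: assume the pod is its own controller
  }
  return top_level.uid;
}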

kResourceTypeSeparator);
return MetadataUpdater::ResourceMetadata(
std::vector<std::string>{k8s_pod_id, k8s_pod_name},
k8s_pod,
Contributor

Fits on previous line.

Contributor Author

I'm keeping this as one parameter per line, for readability.

Contributor

SGTM

bool is_deleted = false;

const MonitoredResource k8s_container("k8s_container", {
{"cluster_name", cluster_name},
Contributor

Building and using these labels seems heavily shared between containers and pods; can we recycle some code here?

Contributor Author

We already share the label values. Unfortunately, once you start dynamically building arrays, this stops looking like a literal, which was the whole purpose.

Contributor

SGTM
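
(A self-contained illustration of the trade-off described above, using a simplified stand-in for the project's MonitoredResource and illustrative label names: the label values are computed once and shared, while each resource is still written as a brace literal so its full label set stays readable.)

#include <map>
#include <string>
#include <utility>

// Simplified stand-in for the project's MonitoredResource type.
struct ResourceSketch {
  std::string type;
  std::map<std::string, std::string> labels;
};

// Label values are shared between the pod and container resources; the label
// names below are illustrative, not the exact production set.
std::pair<ResourceSketch, ResourceSketch> MakePodAndContainer(
    const std::string& cluster_name, const std::string& namespace_name,
    const std::string& pod_name, const std::string& container_name) {
  const ResourceSketch k8s_pod{"k8s_pod", {
    {"cluster_name", cluster_name},
    {"namespace_name", namespace_name},
    {"pod_name", pod_name},
  }};
  const ResourceSketch k8s_container{"k8s_container", {
    {"cluster_name", cluster_name},
    {"namespace_name", namespace_name},
    {"pod_name", pod_name},
    {"container_name", container_name},
  }};
  return {k8s_pod, k8s_container};
}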


const MonitoredResource gke_container("gke_container", {
{"cluster_name", cluster_name},
{"namespace_id", namespace_name},
Contributor

Caution: pretty sure the legacy resource type is using the namespace uid here, not the name. Just had to deal with this in the integration tests.

Contributor Author

Hmm. The logging agent uses the namespace name (and pod name, for that matter) in the legacy resources, as I recall. If anything, I should be changing the pod id to pod name instead...
Good catch, but I think it's orthogonal to this refactoring. Let's open a bug and follow up.

Contributor

SGTM, I'll file the bug and we'll address these issues there.

);
}

MetadataUpdater::ResourceMetadata KubernetesReader::GetLegacyResource(
Contributor
@bmoyles0117 bmoyles0117 Dec 19, 2017

Wasn't there a strange hack around using the gke_container resource to monitor instances? Do we need to be capturing those as well?

Contributor Author

I think you mean s/instances/nodes/. Yes, there was a hack for nodes (and pods), but it relied on setting the container name appropriately, so we don't have to worry about this -- this code deals with actual containers.

Contributor

SGTM

status->Get<json::Array>("containerStatuses");

if (container_specs->size() != container_list->size()) {
LOG(ERROR) << "Container specs and statuses arrays "
Contributor

Do I understand correctly that you are logging an error when the Kubernetes pod spec's containers array doesn't match the number of containers that actually exist within the pod? If so, it's important to note that this SHOULD be expected. The pod only needs to be consistent with its parent replicaset; it doesn't need to be consistent with its parent deployment, as it might be undergoing an active change.

Contributor Author

Nope, this checks that the containers and container_specs arrays have the same size within a single pod returned by a single API call. If they don't, something's gone terribly wrong.

Contributor

Is this even possible to occur? If it's such an extreme scenario, are we essentially validating the Kubernetes API at this point?

Contributor Author

Yes, kind of. I don't know if it's possible. This is basically an assertion. I didn't want to crash when this occurs, because there's an obvious recovery path, so I just log an error and continue.
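
(A standalone sketch of that soft assertion, with standard-library containers in place of the project's json helpers; processing the common prefix of the two arrays is an assumed recovery path, illustrating the "log an error and continue" idea rather than quoting the real code.)

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Pair up spec entries with status entries for a single pod returned by a
// single API call. If the arrays ever disagree in size, log the discrepancy
// and process the common prefix instead of crashing.
void ProcessContainers(const std::vector<std::string>& container_specs,
                       const std::vector<std::string>& container_statuses) {
  if (container_specs.size() != container_statuses.size()) {
    std::cerr << "Container specs and statuses arrays have different sizes: "
              << container_specs.size() << " vs " << container_statuses.size()
              << std::endl;
  }
  const std::size_t n =
      std::min(container_specs.size(), container_statuses.size());
  for (std::size_t i = 0; i < n; ++i) {
    // ... build per-container metadata from container_specs[i] and
    // container_statuses[i] ...
  }
}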

#endif
);

std::vector<MetadataUpdater::ResourceMetadata> pod_metadata =
Contributor

Nit: I understand the naming, but pod_metadata still led me to believe this was just metadata for pod resource types.

Contributor Author

Renamed.

return QueryMaster(path_component.first + "/namespaces/" + ns + "/" +
path_component.second + "/" + name);
owner = std::move(
QueryMaster(path_component.first + "/namespaces/" + ns + "/" +
Contributor

Why not just use api_version and kind here?

Contributor Author

I'm confused. The path_component is based on api_version and kind, but there's more to it.

Contributor
@bmoyles0117 bmoyles0117 Dec 19, 2017

Sorry, I didn't realize how much KindPath was doing. SGTM.

Contributor Author

Yeah, the joys of the Kubernetes Master API spec...
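
(For context: path_component is derived from the owner reference's apiVersion and kind. A hedged sketch of the kind of mapping KindPath has to perform is below; the function name KindPathSketch, the URL prefixes, and the naive pluralization are illustrative assumptions, not the project's actual implementation.)

#include <cctype>
#include <string>
#include <utility>

// Hypothetical helper: map (apiVersion, kind) to the two path components used
// when querying the Kubernetes master, e.g.
//   ("v1", "ReplicationController") -> {"/api/v1", "replicationcontrollers"}
//   ("apps/v1beta1", "ReplicaSet")  -> {"/apis/apps/v1beta1", "replicasets"}
std::pair<std::string, std::string> KindPathSketch(
    const std::string& api_version, const std::string& kind) {
  const std::string prefix = (api_version.find('/') == std::string::npos)
      ? "/api/" + api_version    // core API group
      : "/apis/" + api_version;  // named API group
  std::string plural;
  for (char c : kind) {
    plural.push_back(std::tolower(static_cast<unsigned char>(c)));
  }
  plural.push_back('s');  // naive pluralization, for illustration only
  return {prefix, plural};
}

// Usage mirroring the snippet above:
//   auto path_component = KindPathSketch(api_version, kind);
//   QueryMaster(path_component.first + "/namespaces/" + ns + "/" +
//               path_component.second + "/" + name);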

store_->UpdateResource(
result.ids, result.resource, std::move(result.metadata));
UpdateResourceCallback(result);
UpdateMetadataCallback(std::move(result));
Contributor

Unfamiliarity with C++ here: will there be any implications from using move once we start cleaning up resource metadata after it's been synced? Should the resource callback get a copy? Perhaps it already does; just airing concerns.

Contributor Author

std::move hands off the ownership of result to the callback. From here on, result does not own any memory.

Contributor
@bmoyles0117 bmoyles0117 Dec 19, 2017

My concern is that previously you passed in result.resource, and used std::move only on the metadata, whereas now you're moving the entire object and sharing it. Will that have any negative ramifications for the resource callback?

Contributor Author

Nope, the callbacks are invoked sequentially, and the resource callback doesn't modify the result (and, in fact, doesn't touch result.metadata). After the metadata callback is invoked, the result's metadata is gone. As for moving the whole result as opposed to result.metadata, that support is added in dc2e1d3.

Contributor

SGTM.
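
(A self-contained illustration of the ownership hand-off being described: the resource callback only reads the result, and the metadata callback consumes it via std::move, so invoking them in this order is safe. The types below are simplified stand-ins for the project's ResourceMetadata, not its real definition.)

#include <iostream>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ResourceMetadataSketch {
  std::vector<std::string> ids;
  std::string resource;
  std::unique_ptr<std::string> metadata;  // move-only, like the real metadata blob
};

// Reads ids/resource only; never touches result.metadata.
void UpdateResourceCallbackSketch(const ResourceMetadataSketch& result) {
  std::cout << "resource: " << result.resource << std::endl;
}

// Takes ownership; after this call the caller's object no longer owns the blob.
void UpdateMetadataCallbackSketch(ResourceMetadataSketch result) {
  std::cout << "metadata: " << *result.metadata << std::endl;
}

int main() {
  ResourceMetadataSketch result{
      {"k8s_pod.some-uid"}, "k8s_pod",
      std::make_unique<std::string>("{...}")};
  UpdateResourceCallbackSketch(result);             // safe: read-only
  UpdateMetadataCallbackSketch(std::move(result));  // hands off ownership
  // From here on, result.metadata is null; don't use it again.
}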

Contributor Author

@igorpeshansky igorpeshansky left a comment

Addressed feedback. PTAL.



Contributor Author

@igorpeshansky igorpeshansky left a comment

Responded. PTAL.


const std::string k8s_node_id = boost::algorithm::join(
std::vector<std::string>{kK8sNodeResourcePrefix, node_id},
kResourceTypeSeparator);
// TODO: do we need this?
Contributor

I can't think of a use case where this would be helpful for now. It's simple enough to add the resource identifier later on that we might as well remove it for now.

Contributor Author

Removed.


// TODO: find pod_deleted.
//const json::Object* status = pod->Get<json::Object>("status");
bool pod_deleted = false;
Contributor
@bmoyles0117 bmoyles0117 Dec 19, 2017

Nit: Rename to is_deleted or rename is_deleted in the container section to container_deleted for consistency.

Contributor Author

Done.

const json::Object* status = pod->Get<json::Object>("status");

const json::Array* container_specs = spec->Get<json::Array>("containers");
const json::Array* container_list =
Contributor

Confused by the name; can this be container_statuses instead?

Contributor Author

Done. Also renamed c_element.

);

std::vector<MetadataUpdater::ResourceMetadata> full_pod_metadata =
GetPodAndContainerMetadata(pod, collected_at);
Contributor

Can you clarify: are we only running subqueries for pods on the node that we're running on? I can't seem to find where we would list pods that may not be associated with a node due to scheduling restrictions by Kubernetes (CPU / memory / host ports / etc.). Not having a local resource ID to query for is fine, but not syncing metadata would be bad from an observability point of view.

Contributor Author

Correct. Unscheduled pods are out of scope for the current iteration. For now we only require metadata for pods/containers that have either logs or metrics associated with them, and all of those would have been scheduled to a node.

Contributor

SGTM

store_->UpdateResource(
result.ids, result.resource, std::move(result.metadata));
UpdateResourceCallback(result);
UpdateMetadataCallback(std::move(result));
Contributor

SGTM.

Contributor Author

@igorpeshansky igorpeshansky left a comment

Addressed comments. PTAL.


@igorpeshansky igorpeshansky force-pushed the igorp-kubernetes-refactoring branch 3 times, most recently from bad0e2d to 02e756f on December 19, 2017 21:37
@igorpeshansky
Contributor Author

Pulled out owner memoization into #46.

}
}

void MetadataAgent::UpdateMetadata(const MonitoredResource& resource,


Just curious, why is this second map not keyed by resource_id as well?

Contributor Author

Because that map is sent to the Resource Metadata API, which neither knows nor cares about the local resource ids.


Ah, got it. Makes sense.

Contributor

Curious - how is the local resource_id different from the monitored resource ID?

Contributor Author

The local resource_id is an easily discoverable identifier for the resource that is unique within the current context (e.g., a Kubernetes node), but may not be globally unique, whereas the monitored resource (which actually is an id, so "monitored resource id" is redundant) is globally unique.

@igorpeshansky igorpeshansky force-pushed the igorp-kubernetes-refactoring branch from 02e756f to 38d3db0 on December 19, 2017 23:03
@igorpeshansky igorpeshansky changed the base branch from igorp-miscellaneous-cleanup to master on December 19, 2017 23:04
@igorpeshansky
Contributor Author

Rebased. PTAL.

@igorpeshansky igorpeshansky force-pushed the igorp-kubernetes-refactoring branch 3 times, most recently from f7c65d9 to 38d3db0 on December 20, 2017 19:56
@qingling128 qingling128 left a comment

LGTM (Learning how to use github more efficiently LOL)

if (config_.VerboseLogging()) {
LOG(INFO) << "Container: " << *c_status;
}
const json::Object* container = c_status->As<json::Object>();
Contributor
@bmoyles0117 bmoyles0117 Dec 20, 2017

Would renaming this to container_status cause a bunch of line formatting issues? I got confused below when we sent the status metadata and provided the container; it would have been clearer if it were container_status. Also throughout.

Contributor Author

We'll have to change this code anyway to fix the list ordering issue -- it would make sense to change this in that PR. WDYT?

Contributor

SGTM.

{"version", json::string(kKubernetesApiVersion)},
{"raw", container->Clone()},
})},
{"labels", json::object({
Contributor

Container labels will never vary from the pod labels; should we really be sending this up? Is this simply a feature request for simplicity on the UI side?

Contributor Author

This is an artifact of treating each container as a separate resource, so that's what the API expects.

Contributor

I'm confused; the node doesn't need to send labels. These are raw Kubernetes labels in the raw metadata, not resource labels.

Contributor
@bmoyles0117 bmoyles0117 Dec 20, 2017

When I say "node doesn't need to", I'm specifically referring to the part way above where the node sends off the raw metadata blob.

Contributor Author

Both nodes and pods get the full Kubernetes object metadata, which includes the labels. Containers, on the other hand, have an artificially constructed metadata blob, which, instead of "raw", includes "spec", "status", and "labels" -- that's just the way it was defined. All of these are, in fact, redundant, because they're present in the pod metadata, but because containers are stored separately, the information is replicated. Note that this is the way it's been since the start, and is orthogonal to this refactoring.

Contributor

Alright, agreed that it's orthogonal. SGTM.

{"zone", zone},
});

const std::string gke_container_id = boost::algorithm::join(
Contributor

Shouldn't this be using the container ID instead of the pod_id name? If not, can we rename the variable? I know this will likely be deprecated, but it caught my eye for now.

Contributor Author

Renamed.

);

std::vector<MetadataUpdater::ResourceMetadata> full_pod_metadata =
GetPodAndContainerMetadata(pod, collected_at);
Contributor

SGTM

Contributor Author

@igorpeshansky igorpeshansky left a comment

Addressed feedback. PTAL.


Contributor

@bmoyles0117 bmoyles0117 left a comment

LGTM 🐿

{"version", json::string(kKubernetesApiVersion)},
{"raw", container->Clone()},
})},
{"labels", json::object({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, agreed that it's orthogonal. SGTM.

if (config_.VerboseLogging()) {
LOG(INFO) << "Container: " << *c_status;
}
const json::Object* container = c_status->As<json::Object>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM.

}
}

void MetadataAgent::UpdateMetadata(const MonitoredResource& resource,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious - how is the local resource_id different from the monitored resource ID?

const MonitoredResource& resource);

// Updates metadata for a given resource.
// Adds a metadata mapping from the `resource` to the metadata `entry`.
Contributor

What does "metadata mapping" mean? Is just "mapping" sufficient?

Contributor Author

The metadata agent maintains two mappings:

  • local resource map, from a locally unique id to the monitored resource (there may be multiple locally unique ids, all referring to the same monitored resource), which is exposed via a local endpoint to other agents so they don't have to discover all parts of the monitored resource themselves, and
  • metadata map, from the monitored resource to the metadata blob, which is sent to the Resource Metadata API.
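
(A compact sketch of those two maps, with standard-library stand-ins for the project's types; the member and type names are illustrative, not the actual fields of MetadataAgent.)

#include <map>
#include <string>
#include <tuple>

// Stand-in for the project's MonitoredResource (a type plus labels), which is
// what identifies a resource globally.
struct MonitoredResourceSketch {
  std::string type;
  std::map<std::string, std::string> labels;
  bool operator<(const MonitoredResourceSketch& other) const {
    return std::tie(type, labels) < std::tie(other.type, other.labels);
  }
};

struct MetadataAgentSketch {
  // 1. Local resource map: several locally unique ids (node name, pod uid,
  //    "namespace/pod/container", ...) may point at the same monitored
  //    resource; served over a local endpoint to the other agents.
  std::map<std::string, MonitoredResourceSketch> resource_map;

  // 2. Metadata map: one metadata blob per monitored resource; this is what
  //    gets sent to the Resource Metadata API, which never sees local ids.
  std::map<MonitoredResourceSketch, std::string> metadata_map;
};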

MetadataUpdater::ResourceMetadata GetContainerMetadata(
const json::Object* pod, int container_index, json::value associations,
Timestamp collected_at) const throw(json::Exception);
// Given a pod object and container index, return the legacy resource.
Contributor

legacy resource -> legacy container metadata?

Contributor Author

There will be no actual metadata in the returned object (see the next line), so this is intentional.
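
(To make "no actual metadata" concrete, a simplified sketch in which the legacy entry carries only the id-to-resource mapping and leaves the metadata empty; the id format, field names, and function name are illustrative assumptions, not the real GetLegacyResource.)

#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for the project's ResourceMetadata.
struct LegacyEntrySketch {
  std::vector<std::string> ids;         // local resource ids
  std::string resource_type;            // the monitored resource this maps to
  std::optional<std::string> metadata;  // intentionally empty for legacy resources
};

LegacyEntrySketch MakeLegacyEntry(const std::string& namespace_name,
                                  const std::string& pod_id,
                                  const std::string& container_name) {
  return LegacyEntrySketch{
      // Id format is illustrative only.
      {"gke_container." + namespace_name + "." + pod_id + "." + container_name},
      "gke_container",
      std::nullopt,  // no metadata: only the id -> resource mapping matters
  };
}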

@igorpeshansky igorpeshansky force-pushed the igorp-kubernetes-refactoring branch from d623dcb to 02e756f on December 20, 2017 22:39
Contributor Author

@igorpeshansky igorpeshansky left a comment

@supriyagarg, responded to your comments. PTAL.


Contributor

@supriyagarg supriyagarg left a comment

Thanks for the clarifications, Igor.

LGTM.

@igorpeshansky igorpeshansky force-pushed the igorp-kubernetes-refactoring branch 3 times, most recently from d20cef7 to d623dcb on December 20, 2017 23:04
@igorpeshansky igorpeshansky merged commit 277e6ba into master Dec 20, 2017
@igorpeshansky igorpeshansky deleted the igorp-kubernetes-refactoring branch January 23, 2018 20:15