A set of refactorings to enable Kubernetes watch. #44
Conversation
@supriyagarg, would be great if you could review this as well.
bmoyles0117
left a comment
First pass, will comb through again.
json::value KubernetesReader::ComputePodAssociations(const json::Object* pod)
    const throw(json::Exception) {
  const std::string platform = "gce";  // TODO: detect other platforms.
Should this be GKE, or are we specifically talking about the compute platform? If the compute platform, I think a clarifying comment would help since the rename.
No, this is the platform of the instance the node is running on. We'll probably want to remove this anyway, because it's already present in the node metadata.
SGTM
    top_level_metadata->Get<json::String>("uid") != pod_id) {
  LOG(ERROR) << "Internal error; top-level controller without 'kind' "
             << *top_level_controller
             << " not the same as pod " << *pod;
Should we be returning out of this, so that we don't assume that the entity we're looking at is a pod?
I think in this case, assuming that it's a pod is fairly harmless and won't break anything downstream. We do log an error in case someone wanted to investigate. Not worth terminating the poll (i.e., throwing an exception) for.
SGTM
      kResourceTypeSeparator);
  return MetadataUpdater::ResourceMetadata(
      std::vector<std::string>{k8s_pod_id, k8s_pod_name},
      k8s_pod,
Fits on previous line.
I'm keeping this as one parameter per line, for readability.
SGTM
  bool is_deleted = false;

  const MonitoredResource k8s_container("k8s_container", {
    {"cluster_name", cluster_name},
Building and using these labels seems heavily shared between containers and pods, can we recycle some code here?
We already share the label values. Unfortunately, once you start dynamically building arrays, this stops looking like a literal, which was the whole purpose.
SGTM
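To illustrate the trade-off being discussed, here's a minimal sketch of the literal style. The MonitoredResource class and label names below are stand-ins inferred from the excerpts, not the actual source:

```cpp
#include <map>
#include <string>
#include <utility>

// Stand-in for the project's MonitoredResource: a type plus labels.
class MonitoredResource {
 public:
  MonitoredResource(const std::string& type,
                    std::map<std::string, std::string> labels)
      : type_(type), labels_(std::move(labels)) {}

 private:
  std::string type_;
  std::map<std::string, std::string> labels_;
};

void BuildResources(const std::string& cluster_name,
                    const std::string& namespace_name,
                    const std::string& pod_name,
                    const std::string& container_name) {
  // Each resource reads like a literal declaration, even though the
  // label values (cluster_name, namespace_name, ...) are shared.
  const MonitoredResource k8s_pod("k8s_pod", {
    {"cluster_name", cluster_name},
    {"namespace_name", namespace_name},
    {"pod_name", pod_name},
  });
  const MonitoredResource k8s_container("k8s_container", {
    {"cluster_name", cluster_name},
    {"namespace_name", namespace_name},
    {"pod_name", pod_name},
    {"container_name", container_name},
  });
  // Factoring the shared labels into a dynamically built map would
  // deduplicate the code but lose this at-a-glance literal form.
}
```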
  const MonitoredResource gke_container("gke_container", {
    {"cluster_name", cluster_name},
    {"namespace_id", namespace_name},
Caution, pretty sure the legacy resource type is using the namespace uid here, not the name. Just had to deal with this with the integration tests.
Hmm. The logging agent uses the namespace name (and pod name, for that matter) in the legacy resources, as I recall. If anything, I should be changing the pod id to pod name instead...
Good catch, but I think it's orthogonal to this refactoring. Let's open a bug and follow up.
SGTM, I'll file the bug and we'll address these issues there.
  );
}

MetadataUpdater::ResourceMetadata KubernetesReader::GetLegacyResource(
Wasn't there a strange hack around using the gke_container resource to monitor instances? Do we need to be capturing those as well?
I think you mean s/instances/nodes/. Yes, there was a hack for nodes (and pods), but it relied on setting the container name appropriately, so we don't have to worry about this -- this code deals with actual containers.
SGTM
      status->Get<json::Array>("containerStatuses");

  if (container_specs->size() != container_list->size()) {
    LOG(ERROR) << "Container specs and statuses arrays "
Do I understand correctly that you are logging an error when the Kubernetes pod spec's containers array doesn't match the number of containers that actually exist within the pod? If so, it's important to note that this SHOULD be expected. The pod only needs to not conflict with its parent ReplicaSet; it doesn't need to be consistent with its parent Deployment, as it might be undergoing an active change.
Nope, this checks that the containers and container_specs arrays have the same size within a single pod returned by a single API call. If they don't, something's gone terribly wrong.
Is this even possible to occur? If it's such an extreme scenario, are we essentially validating the Kubernetes API at this point?
Yes, kind of. I don't know if it's possible. This is basically an assertion. I didn't want to crash when this occurs, because there's an obvious recovery path, so I just log an error and continue.
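A minimal sketch of the log-and-continue assertion pattern described above (the function and types are hypothetical, not the agent's actual code):

```cpp
#include <algorithm>
#include <vector>

#include <glog/logging.h>

struct ContainerSpec { /* parsed from spec.containers[i] */ };
struct ContainerStatus { /* parsed from status.containerStatuses[i] */ };

void ProcessPodContainers(const std::vector<ContainerSpec>& specs,
                          const std::vector<ContainerStatus>& statuses) {
  // Within a single pod object from a single API call, the two arrays
  // should always have the same size. If they don't, log an error and
  // recover instead of crashing: process only the elements both have.
  if (specs.size() != statuses.size()) {
    LOG(ERROR) << "Container specs and statuses arrays have different sizes: "
               << specs.size() << " vs " << statuses.size();
  }
  const std::size_t n = std::min(specs.size(), statuses.size());
  for (std::size_t i = 0; i < n; ++i) {
    // ... handle container i ...
  }
}
```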
src/kubernetes.cc
Outdated
#endif
  );

  std::vector<MetadataUpdater::ResourceMetadata> pod_metadata =
Nit: I understand the naming, but pod_metadata still led me to believe this was just metadata for pod resource types.
Renamed.
src/kubernetes.cc
Outdated
    return QueryMaster(path_component.first + "/namespaces/" + ns + "/" +
                       path_component.second + "/" + name);
  owner = std::move(
      QueryMaster(path_component.first + "/namespaces/" + ns + "/" +
Why not just use api_version and kind here?
I'm confused. The path_component is based on api_version and kind, but there's more to it.
Sorry, I didn't realize how much KindPath was doing. SGTM.
Yeah, the joys of the Kubernetes Master API spec...
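For context on why a KindPath-style helper has to exist at all: the Kubernetes master API places core-group kinds under /api/&lt;version&gt; and everything else under /apis/&lt;group&gt;/&lt;version&gt;, so the path can't be derived from api_version and kind by simple concatenation. A rough sketch, where the helper name and return shape are assumptions modeled on the excerpt above:

```cpp
#include <string>
#include <utility>

// Returns {path prefix, plural resource segment} for a given apiVersion
// and lowercased plural kind. Core-group resources ("v1") live under
// /api, group resources ("apps/v1", "extensions/v1beta1") under /apis.
std::pair<std::string, std::string> KindPathSketch(
    const std::string& api_version, const std::string& plural_kind) {
  const bool core_group = api_version.find('/') == std::string::npos;
  const std::string prefix =
      (core_group ? "/api/" : "/apis/") + api_version;
  return {prefix, plural_kind};
}

// Mirroring the QueryMaster call in the excerpt:
//   auto pc = KindPathSketch("apps/v1", "replicasets");
//   pc.first + "/namespaces/" + ns + "/" + pc.second + "/" + name
//   == "/apis/apps/v1/namespaces/<ns>/replicasets/<name>"
```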
  store_->UpdateResource(
      result.ids, result.resource, std::move(result.metadata));
  UpdateResourceCallback(result);
  UpdateMetadataCallback(std::move(result));
Unfamiliarity with C++ here: will there be any implications from using move once we start cleaning up resource metadata after it's been synced? Should the resource callback get a copy? Perhaps it already does; just airing concerns.
std::move hands off the ownership of result to the callback. From here on, result does not own any memory.
My concern is that previously you passed in result.resource, and used std::move only on the metadata, whereas now you're moving the entire object and sharing it. Will that have any negative ramifications for the resource callback?
Nope, the callbacks are invoked sequentially, and the resource callback doesn't modify the result (and, in fact, doesn't touch result.metadata). After the metadata callback is invoked, the result's metadata is gone. As for moving the whole result as opposed to result.metadata, that support is added in dc2e1d3.
SGTM.
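For readers less familiar with C++ move semantics, a simplified sketch of the pattern under discussion (the types are stand-ins, not the agent's actual classes):

```cpp
#include <memory>
#include <string>
#include <utility>

struct ResourceMetadata {
  std::string resource;                   // read by reference below
  std::unique_ptr<std::string> metadata;  // owned; emptied by the move
};

// Reads result.resource and never touches result.metadata.
void UpdateResourceCallback(const ResourceMetadata& result) {}

// Takes ownership of the entire result, including its metadata.
void UpdateMetadataCallback(ResourceMetadata&& result) {
  ResourceMetadata owned = std::move(result);
  // ... hand owned.metadata off for storage ...
}

void HandleResult(ResourceMetadata result) {
  // The callbacks run sequentially: first the by-reference callback,
  // then the one that consumes the result. After the second call,
  // result.metadata no longer owns any memory.
  UpdateResourceCallback(result);
  UpdateMetadataCallback(std::move(result));
}
```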
igorpeshansky
left a comment
Addressed feedback. PTAL.
igorpeshansky
left a comment
Responded. PTAL.
src/kubernetes.cc
Outdated
  const std::string k8s_node_id = boost::algorithm::join(
      std::vector<std::string>{kK8sNodeResourcePrefix, node_id},
      kResourceTypeSeparator);
  // TODO: do we need this?
I can't think of a use-case where this would be helpful for now. It's simple enough to add the resource identifier back later that we might as well remove it for now.
Removed.
src/kubernetes.cc
Outdated
  // TODO: find pod_deleted.
  //const json::Object* status = pod->Get<json::Object>("status");
  bool pod_deleted = false;
Nit: Rename to is_deleted or rename is_deleted in the container section to container_deleted for consistency.
Done.
src/kubernetes.cc
Outdated
  const json::Object* status = pod->Get<json::Object>("status");

  const json::Array* container_specs = spec->Get<json::Array>("containers");
  const json::Array* container_list =
Confused by the name, can this be container_statuses instead?
Done. Also renamed c_element.
  );

  std::vector<MetadataUpdater::ResourceMetadata> full_pod_metadata =
      GetPodAndContainerMetadata(pod, collected_at);
Can you clarify: are we only running subqueries for pods on the node we're running on? I can't seem to find where we would list pods that may not be associated with a node because of Kubernetes scheduling restrictions (CPU / memory / host ports / etc.). Not having a local resource ID to query for is fine, but not syncing metadata would be bad from an observability point of view.
Correct. Unscheduled pods are out of scope for the current iteration. For now we only require metadata for pods/containers that have either logs or metrics associated with them, and all of those would have been scheduled to a node.
SGTM
igorpeshansky
left a comment
Addressed comments. PTAL.
bad0e2d to 02e756f (Compare)
Pulled out owner memoization into #46.
  }
}

void MetadataAgent::UpdateMetadata(const MonitoredResource& resource,
Just curious, why is this second map not keyed by resource_id as well?
Because that map is sent to the Resource Metadata API, which neither knows nor cares about the local resource ids.
Ah, got it. Makes sense.
Curious - how is the local resource_id different from the monitored resource ID?
The local resource_id is an easily discoverable identifier for the resource that is unique within the current context (e.g., a Kubernetes node), but may not be globally unique, whereas the monitored resource (which actually is an id, so "monitored resource id" is redundant) is globally unique.
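A small sketch of that distinction; the id formats and label values below are made up for illustration:

```cpp
#include <map>
#include <string>

// A monitored resource is globally unique: a type plus its full label set.
struct MonitoredResource {
  std::string type;
  std::map<std::string, std::string> labels;
};

void Example() {
  const MonitoredResource pod{
      "k8s_pod",
      {{"cluster_name", "my-cluster"},
       {"namespace_name", "default"},
       {"pod_name", "nginx-1234"}}};

  // Local resource ids are easy to discover on the node but only unique
  // within the current context; several ids may alias one resource.
  const std::map<std::string, MonitoredResource> local_resource_map = {
      {"k8s_pod.4ba115f2", pod},            // keyed by pod UID
      {"k8s_pod.default.nginx-1234", pod},  // keyed by namespace + name
  };
}
```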
…gMetadataUpdater constructor arguments.
…adata. Use separate locks for each map. Make it explicit that UpdateMetadataCallback takes ownership of the result.
02e756f to 38d3db0 (Compare)
Rebased. PTAL.
f7c65d9 to 38d3db0 (Compare)
qingling128
left a comment
LGTM (Learning how to use GitHub more efficiently LOL)
    if (config_.VerboseLogging()) {
      LOG(INFO) << "Container: " << *c_status;
    }
    const json::Object* container = c_status->As<json::Object>();
Would renaming this to container_status cause a bunch of line formatting issues? I got confused below when we sent the status metadata and provided the container; it would have been clearer if it were container_status. Also throughout.
We'll have to change this code anyway to fix the list ordering issue -- it would make sense to change this in that PR. WDYT?
SGTM.
| {"version", json::string(kKubernetesApiVersion)}, | ||
| {"raw", container->Clone()}, | ||
| })}, | ||
| {"labels", json::object({ |
Container labels will never vary from the pod labels, should we really be sending this up? Is this simply a feature request for simplicity on the UI side?
This is an artifact of treating each container as a separate resource, so that's what the API expects.
I'm confused, the node doesn't need to send labels, these are raw kubernetes labels in the raw metadata, not resource labels.
When I say "node doesn't need to", I'm specifically referring to the part way above where the node sends off the raw metadata blob.
Both nodes and pods get the full Kubernetes object metadata, which includes the labels. Containers, on the other hand, have an artificially constructed metadata blob, which, instead of "raw", includes "spec", "status", and "labels" -- that's just the way it was defined. All of these are, in fact, redundant, because they're present in the pod metadata, but because containers are stored separately, the information is replicated. Note that this is the way it's been since the start, and is orthogonal to this refactoring.
Alright, agreed that it's orthogonal. SGTM.
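To make the contrast concrete, here is a sketch of the two blob shapes in the json::object style visible in the excerpts above. This is inferred from the discussion, not copied from the source; treat the field layout and the surrounding variables (pod, container_spec, container_status, pod_labels) as assumptions:

```cpp
// Pods and nodes carry the full Kubernetes object under "raw"
// (which already includes metadata.labels):
json::value pod_metadata = json::object({
  {"version", json::string(kKubernetesApiVersion)},
  {"raw", pod->Clone()},
});

// Containers get an artificially constructed blob with "spec",
// "status", and "labels" pulled out of the enclosing pod object,
// so this information is replicated from the pod's metadata:
json::value container_metadata = json::object({
  {"version", json::string(kKubernetesApiVersion)},
  {"spec", container_spec->Clone()},
  {"status", container_status->Clone()},
  {"labels", pod_labels->Clone()},
});
```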
src/kubernetes.cc
Outdated
| {"zone", zone}, | ||
| }); | ||
|
|
||
| const std::string gke_container_id = boost::algorithm::join( |
Shouldn't this be using the container ID instead of the pod name? If not, can we rename the variable? I know this will likely be deprecated, but it caught my eye for now.
Renamed.
igorpeshansky
left a comment
Addressed feedback. PTAL.
bmoyles0117
left a comment
LGTM 🐿
| {"version", json::string(kKubernetesApiVersion)}, | ||
| {"raw", container->Clone()}, | ||
| })}, | ||
| {"labels", json::object({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, agreed that it's orthogonal. SGTM.
| if (config_.VerboseLogging()) { | ||
| LOG(INFO) << "Container: " << *c_status; | ||
| } | ||
| const json::Object* container = c_status->As<json::Object>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM.
| } | ||
| } | ||
|
|
||
| void MetadataAgent::UpdateMetadata(const MonitoredResource& resource, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious - how is the local resource_id different from the monitored resource ID?
| const MonitoredResource& resource); | ||
|
|
||
| // Updates metadata for a given resource. | ||
| // Adds a metadata mapping from the `resource` to the metadata `entry`. |
What does "metadata mapping" mean. Is just "mapping" sufficient?
The metadata agent maintains two mappings:
- local resource map, from a locally unique id to the monitored resource (there may be multiple locally unique ids, all referring to the same monitored resource), which is exposed via a local endpoint to other agents so they don't have to discover all parts of the monitored resource themselves, and
- metadata map, from the monitored resource to the metadata blob, which is sent to the Resource Metadata API.
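A minimal sketch of those two mappings as data structures (simplified stand-ins, not the agent's actual members). The separate locks echo the commit message later in this thread:

```cpp
#include <map>
#include <mutex>
#include <string>
#include <tuple>

struct MonitoredResource {
  std::string type;
  std::map<std::string, std::string> labels;
  bool operator<(const MonitoredResource& o) const {
    return std::tie(type, labels) < std::tie(o.type, o.labels);
  }
};

struct MetadataEntry {
  std::string metadata_blob;  // what gets sent to the API
};

class MetadataStoreSketch {
 private:
  // Local resource map: locally unique id -> monitored resource.
  // Several ids may point at the same resource. Served to other agents
  // via a local endpoint so they don't have to rediscover resources.
  std::map<std::string, MonitoredResource> resource_map_;
  std::mutex resource_mu_;

  // Metadata map: monitored resource -> metadata blob. This is what is
  // sent to the Resource Metadata API, which neither knows nor cares
  // about the local ids.
  std::map<MonitoredResource, MetadataEntry> metadata_map_;
  std::mutex metadata_mu_;
};
```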
  MetadataUpdater::ResourceMetadata GetContainerMetadata(
      const json::Object* pod, int container_index, json::value associations,
      Timestamp collected_at) const throw(json::Exception);
  // Given a pod object and container index, return the legacy resource.
legacy resource -> legacy container metadata?
There will be no actual metadata in the returned object (see the next line), so this is intentional.
d623dcb to 02e756f (Compare)
igorpeshansky
left a comment
@supriyagarg, responded to your comments. PTAL.
supriyagarg
left a comment
Thanks for the clarifications Igor.
LGTM.
d20cef7 to d623dcb (Compare)