Baggage discovery #6779
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
…com:istio/proxy into baggage-discovery
    {InstanceNameToken, BaggageToken::InstanceName},
};

static absl::flat_hash_map<absl::string_view, BaggageToken> ALL_BAGGAGE_TOKENS = {
It's used in getField(), which is in turn used to generate baggage.
ALL_BAGGAGE_TOKENS is used to convert baggage into metadata, and ALL_METADATA_FIELDS is used for creating baggage from metadata. At first I thought we had getField being used also in other parts of the code, but looks like we don't. I'll unify the mappings and change the tests accordingly.
Actually, I'm afraid WorkloadMetadataObject can be accessed from Envoy's API, so we may need to maintain compatibility. If anything out there accesses data using the old telemetry labels, we probably don't want to break that.
I've added some comments, I hope it makes things clearer.
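To illustrate the split discussed in this thread, here is a minimal sketch of keeping two separate lookup tables: one keyed by baggage keys (used only when parsing a header into metadata) and one keyed by the legacy field names (used only for field access). It uses only the standard library rather than absl, and every name and key string in it (`Token`, `kBaggageKeys`, `kFieldNames`, `Workload`, `fromBaggage`) is hypothetical rather than taken from istio-proxy.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>

// Every name and key string here is illustrative, not copied from istio-proxy.
enum class Token : std::size_t { Namespace = 0, Cluster, ServiceName, InstanceName, Count };

// Direction 1: baggage key -> token. Consulted only when parsing a header.
const std::unordered_map<std::string_view, Token> kBaggageKeys = {
    {"k8s.namespace.name", Token::Namespace},
    {"k8s.cluster.name", Token::Cluster},
    {"service.name", Token::ServiceName},
    {"k8s.instance.name", Token::InstanceName},
};

// Direction 2: legacy field name -> token. Consulted only by getField(),
// so callers that rely on the old labels keep working.
const std::unordered_map<std::string_view, Token> kFieldNames = {
    {"namespace", Token::Namespace},
    {"cluster", Token::Cluster},
    {"service", Token::ServiceName},
    {"name", Token::InstanceName},
};

struct Workload {
  // One stored value per token, indexed by the token's numeric value.
  std::array<std::string, static_cast<std::size_t>(Token::Count)> values;

  // Field access by legacy name; unknown names yield std::nullopt.
  std::optional<std::string> getField(std::string_view name) const {
    const auto it = kFieldNames.find(name);
    if (it == kFieldNames.end()) return std::nullopt;
    return values[static_cast<std::size_t>(it->second)];
  }
};

// Parses "k1=v1,k2=v2" (no whitespace trimming, no percent-decoding).
// Members with unknown keys or without '=' are ignored.
Workload fromBaggage(std::string_view header) {
  Workload w;
  while (!header.empty()) {
    const auto comma = header.find(',');
    const std::string_view member = header.substr(0, comma);
    header = (comma == std::string_view::npos) ? std::string_view{} : header.substr(comma + 1);
    const auto eq = member.find('=');
    if (eq == std::string_view::npos) continue;
    const auto it = kBaggageKeys.find(member.substr(0, eq));
    if (it != kBaggageKeys.end()) {
      w.values[static_cast<std::size_t>(it->second)] = std::string(member.substr(eq + 1));
    }
  }
  return w;
}
```

With this layout, renaming or extending the baggage keys leaves the `getField()` contract untouched, because the two tables only share the token enum.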
      factory_context.serverFactoryContext()));
  break;
case io::istio::http::peer_metadata::Config::DiscoveryMethod::MethodSpecifierCase::kBaggage:
  methods.push_back(std::make_unique<BaggageDiscoveryMethod>(
This method only works for downstream metadata discovery. To be consistent with the upstream peer metadata discovery, it makes sense to check the downstream parameter, add the method only when it is set, and maybe log a warning otherwise.
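A rough sketch of that guard, assuming a boolean `downstream` flag is available where the methods are registered. The types below are simplified stand-ins, `addBaggageMethod` is a hypothetical helper, and the warning goes to a plain `std::ostream` where the real filter would presumably use Envoy's logging.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for the filter's discovery method types.
struct DiscoveryMethod {
  virtual ~DiscoveryMethod() = default;
  virtual std::string name() const = 0;
};

struct BaggageDiscoveryMethod : DiscoveryMethod {
  std::string name() const override { return "baggage"; }
};

using Methods = std::vector<std::unique_ptr<DiscoveryMethod>>;

// Registers baggage discovery only for downstream peers. For upstream the
// method is skipped and a warning is written to `log` instead.
// Returns true when the method was added.
bool addBaggageMethod(bool downstream, Methods& methods, std::ostream& log) {
  if (!downstream) {
    log << "baggage discovery is downstream-only; ignoring it for upstream\n";
    return false;
  }
  methods.push_back(std::make_unique<BaggageDiscoveryMethod>());
  return true;
}
```

Skipping with a warning, rather than failing configuration, keeps a misconfigured upstream list usable while still surfacing the problem.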
Fixes istio/istio#58829
  WorkloadMetadataObject::getField(absl::string_view field_name) const {
-   const auto it = ALL_BAGGAGE_TOKENS.find(field_name);
-   if (it != ALL_BAGGAGE_TOKENS.end()) {
+   const auto it = ALL_METADATA_FIELDS.find(field_name);
This feels like a contract change; even if the naming is clearer, we CANNOT change the contract. Please revert
ALL_METADATA_FIELDS contains the original contract, unchanged. Keeping the ALL_BAGGAGE_TOKENS name seemed confusing to me.
Ah I see what you're doing. Make sure you double check all of the other filters in istio-proxy to ensure they're not depending on the old naming. If possible, I'd prefer to split the rename into its own change later just so we and future reviewers can focus on the substantive changes
IMO this change is reasonably safe. ALL_BAGGAGE_TOKENS (the old name) and even ALL_METADATA_FIELDS are not used anywhere else in the repo (checked with a find grep).
case BaggageToken::AppVersion:
  app_version = parts.second;
  break;
case BaggageToken::WorkloadName:
Why the change if this worked before?
This is old code; it worked with the previous baggage keys. Now that the workload type is embedded in the baggage key, we have to get both the type and the name from the WorkloadName token. The workload name is still handled by the same piece of code, but the type now has to be extracted from the key, unlike before.
Note how it now comes from parts.first, which is the key, instead of parts.second, which is the value.
Oh good catch; yeah the ztunnel format and old pilot format are apparently different
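For readers following this thread, a small sketch of what "type from the key, name from the value" could look like. It assumes a key shaped like `k8s.<type>.name` and an allowlist of workload types; both the key shape and the list of types are assumptions made for illustration, and `WorkloadRef` and `parseWorkloadMember` are hypothetical names.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>
#include <string_view>

struct WorkloadRef {
  std::string type;  // taken from the baggage key
  std::string name;  // taken from the baggage value
};

// Assumed key shape: "k8s.<type>.name". The allowlist keeps keys such as
// "k8s.namespace.name" from being mistaken for a workload entry.
std::optional<WorkloadRef> parseWorkloadMember(std::string_view key, std::string_view value) {
  constexpr std::string_view kPrefix = "k8s.";
  constexpr std::string_view kSuffix = ".name";
  static const std::set<std::string_view> kWorkloadTypes = {"deployment", "pod", "cronjob", "job"};

  // The key must be long enough to hold a non-empty type between the affixes.
  if (key.size() <= kPrefix.size() + kSuffix.size()) return std::nullopt;
  if (key.substr(0, kPrefix.size()) != kPrefix) return std::nullopt;
  if (key.substr(key.size() - kSuffix.size()) != kSuffix) return std::nullopt;

  const std::string_view type =
      key.substr(kPrefix.size(), key.size() - kPrefix.size() - kSuffix.size());
  if (kWorkloadTypes.count(type) == 0) return std::nullopt;

  return WorkloadRef{std::string(type), std::string(value)};
}
```

In terms of the snippet above, `key` plays the role of `parts.first` and `value` the role of `parts.second`.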
keithmattix left a comment
LGTM. Should merge after a rebase
…com:istio/proxy into baggage-discovery
/test release-test-arm64
74cabca into istio:experimental-ambient-multicluster-telemetry
* Add Baggage metadata propagation
* clang-tidy
* basics for baggage discovery downstream
* removing unnecessary tests
* reverting crazy claude changes in release-binary.sh
* fixing tests, fixing baggage key tokens
* removing comment
* make lint
* fixing unit tests for metadata_object
* make lint
* suggestions from PR
* clarifying use of mappings for baggage and field access
* make lint

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Co-authored-by: Keith Mattix II <keithmattix@microsoft.com>
* Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch (#6772)
  * Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch
  * Use single match - creating multiple matches means that the later overrides the earlier
* Add Baggage metadata propagation (#6776)
  * Add Baggage metadata propagation
  * clang-tidy
  * Go back to old baggage impl
  * Fix baggage format
  * Actually use new baggage approach
* Introduce new filters discovering peer metadata from baggage header (#6771)
  * Introduce new filters discovering peer metadata from baggage header

    This is a combination of two filters that have to be used together:
    - regular network filter (expected to be configured in connect_originate or inner_connect_originate listeners before the TCP Proxy filter)
    - upstream network filter (expected to be configured in all clusters that use HBONE or double-HBONE for endpoints)

    Those two filters together basically create a tunnel. The tunnel protocol just prepends a fixed size header to the data stream coming from the regular network filter to the upstream network filter, followed by the peer metadata encoded as a protobuf Any containing a protobuf Struct inside (I'm just re-using existing code from Istio proxy, that's why the encoding is such as it is).

    The regular network filter only triggers when there is some data coming from the upstream connection in response. It's not correct in general, but in waypoints we do know that we proxy an L7 protocol (http or gRPC), so we do expect some data in reply.

    The regular network filter relies on the TCP Proxy filter extracting response headers and saving them in the filter state. It then extracts and parses the baggage header from the saved headers.

    In all cases I explicitly communicate when no peer metadata has been discovered by sending some data downstream. This ensures that the upstream network filter running downstream can always remove the prefix from the data stream and does not really need to guess if it's there or not.

    NOTE: We still do some checks to confirm that the prefix is there, but we cannot really rely on those checks for correctness in all the cases.

    The upstream network filter, as pointed out above, extracts the data sent by the regular network filter from the data stream, parses it, and populates filter state based on that. Unlike the HTTP peer metadata filter, this one runs in the context of the upstream connection, so it populates the upstream filter state and not the regular one.

    I plan to add support to the HTTP peer metadata filter option for new upstream metadata discovery via upstream filter metadata, thus propagating it all the way to the istio stats filter.

    NOTE: None of those filters are yet generated by pilot and there are certainly some additional options to configure (e.g., maybe we can come up with a good way to transfer metadata via Envoy TLS instead of injecting it into the data stream directly - this way, in principle, we could avoid creating a custom upstream filter altogether, if the http peer metadata filter could get the peer metadata directly from the connect_originate listener). All in all, it's not the final implementation.
  * Fix BUILD formatting
  * Fix formatting of C++ code
  * Update HTTP peer_metadata filter to consume filter state set by upstream peer_metadata filter

    This basically taps the upstream peer metadata into the regular filter state consumed by the istio stats filter. The http peer metadata filter also takes care of priorities between different discovery methods - we just need to put different discovery methods in the right order in the configuration.
  * Populate peer principal in the upstream workload metadata as well
  * Support propagating baggage header to upstream and additional safety checks for upstream network filter
  * Only register UpstreamFilterState peer metadata discovery method for upstream peer discovery
  * Move peer_metadata filter proto config in the same directory
  * Fix typo
* Baggage discovery (#6779)
  * Add Baggage metadata propagation
  * clang-tidy
  * basics for baggage discovery downstream
  * removing unnecessary tests
  * reverting crazy claude changes in release-binary.sh
  * fixing tests, fixing baggage key tokens
  * removing comment
  * make lint
  * fixing unit tests for metadata_object
  * make lint
  * suggestions from PR
  * clarifying use of mappings for baggage and field access
  * make lint
* Add locality to proxy metadata (#6780)
  * Add locality to proxy metadata
  * Clang-tidy
  * Buildifier format
  * Rebase and fix some bugs
* Drop app labels from baggage and propagate principal (#6791)
  * Drop app labels from baggage and propagate principal

    I think I confused folks a bit when I mentioned that the app field is missing from the baggage - it wasn't. In fact, the canonical name of the workload and app in ambient are the same thing, that's why baggage does not actually need an app label - it already has service.name that encodes what we need. I updated the design document, but it happened after I mentioned here and there that we need to add a missing field to the baggage. This change corrects the implementation and that makes istio stats populate the app label correctly.

    The other field that has not been populated is principal. WorkloadMetadataObject had an identity field that held the principal in principle, but the methods used to convert WorkloadMetadataObject to a protobuf Struct and back ignored that field and never populated it, so it got lost and istio stats never used it.

    We haven't noticed that before because in ambient we used xDS-based peer metadata discovery by default and it triggers a different code path that does not rely on the methods that convert protobuf Struct to WorkloadMetadataObject, and the code path used there didn't have the same issue.
  * Keep backwards compatibility for app.service and app.version baggage fields
* Fix some test compilation errors
* Merge master branch and resolve merge conflicts properly (#6795)
  * Automator: update envoy@ in istio/proxy@master (#6777)
  * Automator: update envoy@ in istio/proxy@master (#6778)
  * Don't do workload discovery for cross-network traffic (#6767)
    * Get the implementation compiling
    * Add tests for cross-network peer metadata
    * clang-tidy
    * One more tidy
    * Switch to debug for logging
  * Automator: update envoy@ in istio/proxy@master (#6782)
  * Automator: update envoy@ in istio/proxy@master (#6784)
  * Automator: update go-control-plane in istio/proxy@master (#6786)
  * Automator: update envoy@ in istio/proxy@master (#6787)
  * Automator: update envoy@ in istio/proxy@master (#6788)
  * update x-network header key (#6790)
  * Automator: update envoy@ in istio/proxy@master (#6794)
  * Merge upstream/master and resolve merge conflicts
  * Missed one
  * Fixed a wrong one

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
Co-authored-by: Krinkin, Mike <mkrinkin@microsoft.com>
Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com>
Co-authored-by: Istio Automation <istio-testing-bot@google.com>
Co-authored-by: Ian Rudie <ilrudie@gmail.com>
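The tunnel framing described in the #6771 commit message above (a fixed-size header prepended to the data stream, followed by the encoded peer metadata) can be sketched roughly as follows. The commit message does not give the header layout, so the 4-byte magic plus 4-byte big-endian length used here is purely an assumption, as are the names `kMagic`, `encodeFrame` and `decodeFrame`. The payload is treated as opaque bytes standing in for the serialized protobuf Any.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Assumed layout: 4-byte magic + 4-byte big-endian payload length.
constexpr std::string_view kMagic = "PMD1";
constexpr std::size_t kHeaderSize = 8;

// Sender side: always emits a header, even when no metadata was discovered
// (empty payload), so the receiver never has to guess whether a prefix exists.
std::string encodeFrame(std::string_view payload) {
  assert(payload.size() <= 0xffffffffu);
  std::string out(kMagic);
  const auto n = static_cast<std::uint32_t>(payload.size());
  for (int shift = 24; shift >= 0; shift -= 8) {
    out.push_back(static_cast<char>((n >> shift) & 0xff));
  }
  out.append(payload);
  return out;
}

struct Decoded {
  std::string metadata;  // opaque encoded peer metadata, possibly empty
  std::string rest;      // application bytes that followed the frame
};

// Receiver side: strips the frame from the front of the stream. Returns
// std::nullopt if the buffer is too short or the magic does not match; a real
// filter would need to tell "wait for more data" apart from "not our prefix".
std::optional<Decoded> decodeFrame(std::string_view stream) {
  if (stream.size() < kHeaderSize) return std::nullopt;
  if (stream.substr(0, kMagic.size()) != kMagic) return std::nullopt;
  std::uint32_t n = 0;
  for (std::size_t i = kMagic.size(); i < kHeaderSize; ++i) {
    n = (n << 8) | static_cast<unsigned char>(stream[i]);
  }
  if (stream.size() - kHeaderSize < n) return std::nullopt;
  return Decoded{std::string(stream.substr(kHeaderSize, n)),
                 std::string(stream.substr(kHeaderSize + n))};
}
```

Sending an empty frame for the "no metadata" case mirrors the point made in the commit message: the receiving filter can unconditionally remove the prefix instead of probing the stream.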
* Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch (#6772)
  * Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch
  * Use single match - creating multiple matches means that the later overrides the earlier
* Add Baggage metadata propagation (#6776)
  * Add Baggage metadata propagation
  * clang-tidy
  * Go back to old baggage impl
  * Fix baggage format
  * Actually use new baggage approach
* Introduce new filters discovering peer metadata from baggage header (#6771)
  * Introduce new filters discovering peer metadata from baggage header

    This is a combination of two filters that have to be used together:
    - regular network filter (expected to be configured in connect_originate or inner_connect_originate listeners before the TCP Proxy filter)
    - upstream network filter (expected to be configured in all clusters that use HBONE or double-HBONE for endpoints)

    Together, these two filters basically create a tunnel. The tunnel protocol just prepends a fixed-size header to the data stream going from the regular network filter to the upstream network filter, followed by the peer metadata encoded as a protobuf Any containing a protobuf Struct (I'm just re-using existing code from Istio proxy, which is why the encoding is what it is).

    The regular network filter only triggers when there is some data coming from the upstream connection in response. That is not correct in general, but in waypoints we do know that we proxy an L7 protocol (http or gRPC), so we do expect some data in reply.

    The regular network filter relies on the TCP Proxy filter extracting response headers and saving them in the filter state. It then extracts and parses the baggage header from the saved headers.

    In all cases I explicitly communicate when no peer metadata has been discovered by sending some data downstream. This ensures that the upstream network filter running downstream can always remove the prefix from the data stream and does not need to guess whether it's there or not.

    NOTE: We still do some checks to confirm that the prefix is there, but we cannot rely on those checks for correctness in all cases.

    The upstream network filter, as pointed out above, extracts the data sent by the regular network filter from the data stream, parses it, and populates filter state based on that. Unlike the HTTP peer metadata filter, this one runs in the context of the upstream connection, so it populates the upstream filter state and not the regular one.

    I plan to add an option to the HTTP peer metadata filter for new upstream metadata discovery via upstream filter metadata, thus propagating it all the way to the istio stats filter.

    NOTE: None of those filters are yet generated by pilot, and there are certainly some additional options to configure (e.g., maybe we can come up with a good way to transfer metadata via Envoy TLS instead of injecting it into the data stream directly - this way, in principle, we could avoid creating a custom upstream filter altogether, if the http peer metadata filter could get the peer metadata directly from the connect_originate listener). All in all, it's not the final implementation.
  * Fix BUILD formatting
  * Fix formatting of C++ code
  * Update HTTP peer_metadata filter to consume filter state set by upstream peer_metadata filter

    This basically taps the upstream peer metadata into the regular filter state consumed by the istio stats filter. The http peer metadata filter also takes care of priorities between different discovery methods - we just need to put the discovery methods in the right order in the configuration.
  * Populate peer principal in the upstream workload metadata as well
  * Support propagating baggage header to upstream and additional safety checks for upstream network filter
  * Only register UpstreamFilterState peer metadata discovery method for upstream peer discovery
  * Move peer_metadata filter proto config in the same directory
  * Fix typo
* Baggage discovery (#6779)
  * Add Baggage metadata propagation
  * clang-tidy
  * basics for baggage discovery downstream
  * removing unnecessary tests
  * reverting crazy claude changes in release-binary.sh
  * fixing tests, fixing baggage key tokens
  * removing comment
  * make lint
  * fixing unit tests for metadata_object
  * make lint
  * suggestions from PR
  * clarifying use of mappings for baggage and field access
  * make lint
* Add locality to proxy metadata (#6780)
  * Add locality to proxy metadata
  * Clang-tidy
  * Buildifier format
  * Rebase and fix some bugs
* Drop app labels from baggage and propagate principal (#6791)
  * Drop app labels from baggage and propagate principal

    I think I confused folks a bit when I mentioned that the app field is missing from the baggage - it wasn't. In fact, the canonical name of the workload and app in ambient are the same thing, which is why baggage does not actually need an app label - it already has service.name that encodes what we need. I updated the design document, but that happened after I had mentioned here and there that we need to add a missing field to the baggage. This change corrects the implementation, which makes istio stats populate the app label correctly.

    The other field that has not been populated is principal. WorkloadMetadataObject had an identity field that, in principle, contained the principal, but the methods used to convert WorkloadMetadataObject to a protobuf Struct and back ignored that field and never populated it, so it got lost and istio stats never used it. We haven't noticed that before because in ambient we used xDS-based peer metadata discovery by default, and it triggers a different code path that does not rely on the methods that convert protobuf Struct to WorkloadMetadataObject; that code path didn't have the same issue.
  * Keep backwards compatibility for app.service and app.version baggage fields
* Fix some test compilation errors
* Merge upstream/master and resolve merge conflicts
* Missed one
* Fixed a wrong one
* Merge master branch and resolve merge conflicts properly (#6795)
  * Automator: update envoy@ in istio/proxy@master (#6777)
  * Automator: update envoy@ in istio/proxy@master (#6778)
  * Don't do workload discovery for cross-network traffic (#6767)
    * Get the implementation compiling
    * Add tests for cross-network peer metadata
    * clang-tidy
    * One more tidy
    * Switch to debug for logging
  * Automator: update envoy@ in istio/proxy@master (#6782)
  * Automator: update envoy@ in istio/proxy@master (#6784)
  * Automator: update go-control-plane in istio/proxy@master (#6786)
  * Automator: update envoy@ in istio/proxy@master (#6787)
  * Automator: update envoy@ in istio/proxy@master (#6788)
  * update x-network header key (#6790)
  * Automator: update envoy@ in istio/proxy@master (#6794)
  * Merge upstream/master and resolve merge conflicts
  * Missed one
  * Fixed a wrong one
* Add e2e test for proxy baggage-based metadata discovery + fixes

  This adds an e2e test for the new proxy filters that verifies both baggage propagation and the metrics that the stats filter will generate.

  I also made some changes to how we parse and generate the baggage header. Basically, WorkloadMetadataObject supports cases where app and service might have different values. In the xDS-based metadata discovery implementation, however, app is always derived from the service name, at least in ztunnel. So there is a bit of a mismatch there.

  In practice this mismatch should not matter, but purely hypothetically, if I didn't change the logic, we could end up in a situation where waypoint node metadata is configured in such a way that app is set but service is not. If that happens, the waypoint will generate baggage that ztunnels cannot yet interpret correctly, which will result in metrics with unknown values.

  I figured that I can rewrite the code in a way that accounts for all corner cases like that by making sure that:

  1. When we parse baggage, we set both app and service, and if either of them is not provided in the baggage, we use the other to backfill.
  2. When we generate baggage, we always generate service (because ztunnel needs it), and we generate app if it's different from the service.

  All in all, under normal circumstances the baggage will only contain service; if app is different from the service, we will add it to the baggage as well; and when parsing baggage we will use either service or app, depending on what is provided.
* Fix formatting

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
Co-authored-by: Keith Mattix II <keithmattix@microsoft.com>
Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com>
Co-authored-by: Istio Automation <istio-testing-bot@google.com>
Co-authored-by: Ian Rudie <ilrudie@gmail.com>
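The tunnel framing described in the #6771 message (a fixed-size header that is always sent, followed by the encoded peer metadata) could look roughly like the sketch below. The header layout here is an assumption made for illustration: a 4-byte magic plus a 4-byte big-endian payload length. The actual filters define their own header, and the names `kMagic`, `encodeFrame`, and `decodeFrame` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical header layout: 4-byte magic + 4-byte big-endian payload length.
constexpr char kMagic[4] = {'P', 'M', 'D', '1'};
constexpr std::size_t kHeaderSize = 8;

// Always emits a header, even for an empty payload, so the receiving side
// never has to guess whether a prefix is present in the stream.
std::string encodeFrame(const std::string& payload) {
  std::string out(kMagic, sizeof(kMagic));
  const std::uint32_t size = static_cast<std::uint32_t>(payload.size());
  for (int shift = 24; shift >= 0; shift -= 8) {
    out.push_back(static_cast<char>((size >> shift) & 0xff));
  }
  return out + payload;
}

enum class DecodeStatus { NeedMoreData, Invalid, Ok };

struct DecodeResult {
  DecodeStatus status;
  std::string payload;   // encoded peer metadata; empty means none discovered
  std::size_t consumed;  // number of bytes to strip from the front of the stream
};

// Inspects the front of the stream and reports how many bytes belong to the
// tunnel prefix. The magic check is only a sanity check, not a guarantee.
DecodeResult decodeFrame(const std::string& stream) {
  if (stream.size() < kHeaderSize) {
    return {DecodeStatus::NeedMoreData, "", 0};
  }
  if (stream.compare(0, sizeof(kMagic), kMagic, sizeof(kMagic)) != 0) {
    return {DecodeStatus::Invalid, "", 0};
  }
  std::uint32_t size = 0;
  for (std::size_t i = sizeof(kMagic); i < kHeaderSize; ++i) {
    size = (size << 8) | static_cast<unsigned char>(stream[i]);
  }
  if (stream.size() - kHeaderSize < size) {
    return {DecodeStatus::NeedMoreData, "", 0};
  }
  return {DecodeStatus::Ok, stream.substr(kHeaderSize, size), kHeaderSize + size};
}
```

Because the sender emits a header even when the payload is empty, the receiver can strip `consumed` bytes unconditionally. That is the property the commit message relies on, with the magic check acting only as an extra safety net.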
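The two app/service rules from the e2e-test commit message can be sketched as follows. This is a simplified illustration, not the proxy's parser: `service.name` appears in the commit text, but the `app.name` key, the `Peer` struct, and the function names are assumptions, and the sketch skips whitespace trimming and percent-decoding of baggage members. Falling back to app when service is empty on the generate side is also an assumption about how "always generate service" is satisfied.

```cpp
#include <map>
#include <sstream>
#include <string>

// Baggage keys used by this sketch. "app.name" is a hypothetical key.
const std::string kServiceKey = "service.name";
const std::string kAppKey = "app.name";

struct Peer {
  std::string app;
  std::string service;
};

// Splits "k1=v1,k2=v2" into a map, skipping members without '='.
std::map<std::string, std::string> splitBaggage(const std::string& header) {
  std::map<std::string, std::string> out;
  std::istringstream stream(header);
  std::string member;
  while (std::getline(stream, member, ',')) {
    const std::size_t eq = member.find('=');
    if (eq == std::string::npos) continue;
    out[member.substr(0, eq)] = member.substr(eq + 1);
  }
  return out;
}

// Rule 1: set both app and service, backfilling whichever one is missing.
Peer parsePeer(const std::string& header) {
  const std::map<std::string, std::string> fields = splitBaggage(header);
  Peer peer;
  const auto service = fields.find(kServiceKey);
  if (service != fields.end()) peer.service = service->second;
  const auto app = fields.find(kAppKey);
  if (app != fields.end()) peer.app = app->second;
  if (peer.service.empty()) peer.service = peer.app;
  if (peer.app.empty()) peer.app = peer.service;
  return peer;
}

// Rule 2: always emit service; emit app only when it differs from service.
std::string generateBaggage(const Peer& peer) {
  const std::string service = peer.service.empty() ? peer.app : peer.service;
  std::string out = kServiceKey + "=" + service;
  if (!peer.app.empty() && peer.app != service) {
    out += "," + kAppKey + "=" + peer.app;
  }
  return out;
}
```

With these rules, a waypoint whose node metadata sets only app still produces a `service.name` entry, so a peer that reads only service does not end up with unknown values.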
What this PR does / why we need it:
This PR adds peer metadata discovery from the baggage header. Note that it also contains some of the baggage propagation changes introduced by Keith.
Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes #
Special notes for your reviewer: