Introduce new filters discovering peer metadata from baggage header #6771
This is a combination of two filters that have to be used together:

- regular network filter (expected to be configured in connect_originate or inner_connect_originate listeners before the TCP Proxy filter)
- upstream network filter (expected to be configured in all clusters that use HBONE or double-HBONE for endpoints)

Together, those two filters basically create a tunnel. The tunnel protocol prepends a fixed-size header to the data stream going from the regular network filter to the upstream network filter, followed by the peer metadata encoded as a protobuf Any containing a protobuf Struct (I'm just re-using existing code from Istio proxy, which is why the encoding is what it is).

The regular network filter only triggers when there is some data coming from the upstream connection in response. That's not correct in general, but in waypoints we know that we proxy an L7 protocol (HTTP or gRPC), so we do expect some data in reply.

The regular network filter relies on the TCP Proxy filter extracting response headers and saving them in the filter state. It then extracts and parses the baggage header from the saved headers.

In all cases I explicitly communicate when no peer metadata has been discovered by sending some data downstream. This ensures that the upstream network filter running downstream can always remove the prefix from the data stream and does not need to guess whether it's there or not.

NOTE: We still do some checks to confirm that the prefix is there, but we cannot rely on those checks for correctness in all cases.

The upstream network filter, as pointed out above, extracts the data sent by the regular network filter from the data stream, parses it and populates filter state based on it. Unlike the HTTP peer metadata filter, this one runs in the context of the upstream connection, so it populates the upstream filter state and not the regular one.
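To make the tunnel framing described above more concrete, here is a minimal sketch of the encode and strip steps. It is not the code from this PR: the magic value, the 8-byte header layout and all names (`kMagic`, `encodeFrame`, `decodeFrame`) are assumptions made for illustration, and the payload is treated as an opaque byte string where the real filters carry a serialized protobuf Any.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical frame layout, NOT the actual wire format used by the filters:
//   4 bytes magic | 4 bytes payload length (big-endian) | payload bytes
constexpr uint32_t kMagic = 0x504D4431;  // arbitrary illustrative value
constexpr std::size_t kHeaderSize = 8;

enum class DecodeStatus { Ok, NeedMoreData, BadPrefix };

struct DecodedFrame {
  std::string metadata;      // empty means "no peer metadata discovered"
  std::size_t consumed = 0;  // bytes to drain from the front of the stream
};

// Append a 32-bit value in big-endian byte order.
void putU32(std::string& out, uint32_t v) {
  for (int shift = 24; shift >= 0; shift -= 8) {
    out.push_back(static_cast<char>((v >> shift) & 0xFF));
  }
}

// Read a big-endian 32-bit value; the caller guarantees 4 bytes are available.
uint32_t getU32(const std::string& in, std::size_t offset) {
  uint32_t v = 0;
  for (std::size_t i = 0; i < 4; ++i) {
    v = (v << 8) | static_cast<uint8_t>(in[offset + i]);
  }
  return v;
}

// Sending side: a frame is always emitted, even with an empty payload, so
// the receiving side never has to guess whether a prefix is present.
std::string encodeFrame(const std::string& serialized_metadata) {
  std::string out;
  putU32(out, kMagic);
  putU32(out, static_cast<uint32_t>(serialized_metadata.size()));
  out += serialized_metadata;
  return out;
}

// Receiving side: validate the prefix and report how many bytes to strip.
DecodeStatus decodeFrame(const std::string& buffer, DecodedFrame& out) {
  if (buffer.size() < kHeaderSize) {
    return DecodeStatus::NeedMoreData;
  }
  if (getU32(buffer, 0) != kMagic) {
    return DecodeStatus::BadPrefix;  // sanity check only, not a guarantee
  }
  const uint32_t length = getU32(buffer, 4);
  if (buffer.size() - kHeaderSize < length) {
    return DecodeStatus::NeedMoreData;
  }
  out.metadata = buffer.substr(kHeaderSize, length);
  out.consumed = kHeaderSize + length;
  return DecodeStatus::Ok;
}
```

In this sketch the receiver would drain `consumed` bytes from the stream and pass the rest through unchanged. The magic check can only catch some cases of a missing prefix, which matches the NOTE above about not relying on those checks for correctness.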
I plan to add support to the HTTP peer metadata filter for a new upstream metadata discovery option via upstream filter metadata, thus propagating it all the way to the istio stats filter.

NOTE: None of those filters are generated by pilot yet, and there are certainly some additional options to consider (e.g., maybe we can come up with a good way to transfer metadata via Envoy TLS instead of injecting it into the data stream directly - this way, in principle, we could avoid creating a custom upstream filter altogether, if the HTTP peer metadata filter could get the peer metadata directly from the connect_originate listener). All in all, it's not the final implementation.

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
I assume TLS = thread-local-storage?
Might be interesting to see if we can integrate this within the existing metadata_exchange filter. But that's a future consideration IMO.
keithmattix
left a comment
Can you also update codeowners for this branch just so others besides me can approve?
Yes, that's what I meant. It's not super obvious how to build the communication we need on top of TLS, but it certainly should be possible.
It feels like metadata exchange filter has everything we really need, there are just two issues that I found:
So I think if we can get folks behind that idea, we can just make a few changes to the metadata exchange filter directly and use it basically as if we ran the metadata exchange protocol between the main_internal and connect_originate listeners. That might be a pretty small change.
@keithmattix created #6772 for that. I'm not sure if we will end up merging this PR (I want to try the metadata exchange filter approach as well), so to decouple the fate of the codeowners change from this PR, I'm updating codeowners in a different PR.
keithmattix
left a comment
Implementation looks correct; my only comments are high level
…peer_metadata filter

This basically taps the upstream peer metadata into the regular filter state consumed by the istio stats filter. The HTTP peer metadata filter also takes care of priorities between different discovery methods - we just need to put the discovery methods in the right order in the configuration.

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Updated the code to propagate peer metadata all the way to the HTTP peer_metadata filter - this takes care of making the metadata available to the istio stats filter the same way as with other peer metadata discovery methods.
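As a rough illustration of "priorities via configuration order", the sketch below tries discovery methods in the order they are listed and stops at the first one that yields metadata. The types and names (`PeerInfo`, `DiscoveryMethod`, `discoverPeer`) are hypothetical and only model the idea; they are not the HTTP peer_metadata filter's actual interfaces.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for discovered peer metadata.
struct PeerInfo {
  std::string workload;
  std::string source;  // which discovery method produced the result
};

// A discovery method either yields peer metadata or nothing.
using DiscoveryMethod = std::function<std::optional<PeerInfo>()>;

// Methods are tried in configuration order; the first hit wins, so the
// position in the list is what defines the priority.
std::optional<PeerInfo> discoverPeer(const std::vector<DiscoveryMethod>& methods) {
  for (const auto& method : methods) {
    if (auto result = method()) {
      return result;
    }
  }
  return std::nullopt;
}
```

With this shape, giving the upstream filter state method a higher priority than another method would only require listing it earlier.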
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
…checks for upstream network filter Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
…upstream peer discovery Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
…ter-telemetry' into experimental-ambient-multicluster-telemetry
Network::FilterStatus onNewConnection() override {
  ENVOY_LOG(trace, "New connection from downstream");
  populateBaggage();
So this filter is doing both downstream propagation AND upstream discovery? I'm a little confused; probably just not following
Yes, we need to extract baggage and also send our own baggage to the upstream.
Both of those things can only be done when the CONNECT tunnel is created, so this filter will do it.
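For readers unfamiliar with the header format, here is a small sketch of what extracting key/value pairs from a baggage header value could look like. It assumes the W3C Baggage list syntax (comma-separated `key=value` members with optional `;property` suffixes), skips percent-decoding, and uses a hypothetical `parseBaggage` helper rather than the function in this PR. The keys in the usage example are only illustrative.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Strip leading and trailing spaces and tabs.
std::string_view trim(std::string_view s) {
  const std::size_t begin = s.find_first_not_of(" \t");
  if (begin == std::string_view::npos) {
    return {};
  }
  const std::size_t end = s.find_last_not_of(" \t");
  return s.substr(begin, end - begin + 1);
}

// Hypothetical helper: split "k1=v1, k2=v2;prop" into a key/value map.
// Simplifications: no percent-decoding, properties after ';' are dropped,
// malformed members are skipped, and the first occurrence of a key wins.
std::map<std::string, std::string> parseBaggage(std::string_view header) {
  std::map<std::string, std::string> result;
  while (!header.empty()) {
    const std::size_t comma = header.find(',');
    std::string_view member = header.substr(0, comma);
    header = (comma == std::string_view::npos) ? std::string_view{}
                                               : header.substr(comma + 1);
    member = member.substr(0, member.find(';'));  // drop optional properties
    const std::size_t eq = member.find('=');
    if (eq == std::string_view::npos) {
      continue;  // not a key=value member
    }
    const std::string_view key = trim(member.substr(0, eq));
    const std::string_view value = trim(member.substr(eq + 1));
    if (key.empty()) {
      continue;
    }
    result.emplace(std::string(key), std::string(value));
  }
  return result;
}
```

For example, `parseBaggage("service.name=reviews, app.version=v1")` would yield two entries in this sketch; mapping those entries onto peer metadata fields is a separate step that is not shown here.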
/retest
eac628e
into
istio:experimental-ambient-multicluster-telemetry
It's a follow-up for istio/proxy#6771, which introduced the filters needed to extract the baggage header from the HBONE CONNECT response, turn it into peer metadata and propagate it all the way into the istio stats filter.

NOTE: Those filters also handle populating the baggage header in the HBONE CONNECT request, a.k.a. propagation.

The changes here include:

1. Adding the PeerMetadata filter to connect_originate and inner_connect_originate listeners
2. Adding PeerMetadata upstream network filters to service clusters to consume peer metadata returned in the data stream (the filters above add it to the data stream)
3. Updating the PeerMetadata HTTP filter to get the peer metadata from the filter state populated by the filter above and propagate it to the istio stats filter.

I will not go into details of the overall design - that has been discussed in the PR linked above. The main idea is that we have a pair of filters: a regular network filter and an upstream network filter. The network filter sits in the connect_originate and inner_connect_originate listener filter chains, where we actually create the HBONE tunnel, and therefore it can extract and populate the baggage header there. The network filter injects the peer metadata discovered from the baggage header value into the data stream (for reasons). The goal of the upstream network filter set at the cluster level is to remove the data injected by the network filter from the data stream and save it for the HTTP PeerMetadata filter to use.

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Pilot support for upstream metadata discovery using baggage
* Fix bugs
* Fix formatting

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
…6771)

* Introduce new filters discovering peer metadata from baggage header
* Fix BUILD formatting
* Fix formatting of C++ code
* Update HTTP peer_metadata filter to consume filter state set by upstream peer_metadata filter
* Populate peer principal in the upstream workload metadata as well
* Support propagating baggage header to upstream and additional safety checks for upstream network filter
* Only register UpstreamFilterState peer metadata discovery method for upstream peer discovery
* Move peer_metadata filter proto config in the same directory
* Fix typo

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch (#6772)
  * Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch
  * Use single match - creating multiple matches means that the later overrides the earlier
* Add Baggage metadata propagation (#6776)
  * Add Baggage metadata propagation
  * clang-tidy
  * Go back to old baggage impl
  * Fix baggage format
  * Actually use new baggage approach
* Introduce new filters discovering peer metadata from baggage header (#6771)
  * Introduce new filters discovering peer metadata from baggage header
  * Fix BUILD formatting
  * Fix formatting of C++ code
  * Update HTTP peer_metadata filter to consume filter state set by upstream peer_metadata filter
  * Populate peer principal in the upstream workload metadata as well
  * Support propagating baggage header to upstream and additional safety checks for upstream network filter
  * Only register UpstreamFilterState peer metadata discovery method for upstream peer discovery
  * Move peer_metadata filter proto config in the same directory
  * Fix typo
* Baggage discovery (#6779)
  * Add Baggage metadata propagation
  * clang-tidy
  * basics for baggage discovery downstream
  * removing unnecessary tests
  * reverting crazy claude changes in release-binary.sh
  * fixing tests, fixing baggage key tokens
  * removing comment
  * make lint
  * fixing unit tests for metadata_object
  * make lint
  * suggestions from PR
  * clarifying use of mappings for baggage and field access
  * make lint
* Add locality to proxy metadata (#6780)
  * Add locality to proxy metadata
  * Clang-tidy
  * Buildifier format
  * Rebase and fix some bugs
* Drop app labels from baggage and propagate principal (#6791)
  * Drop app labels from baggage and propagate principal

    I think I confused folks a bit when I mentioned that the app field is missing from the baggage - it wasn't. In fact, the canonical name of the workload and app are the same thing in ambient, which is why baggage does not actually need an app label - it already has service.name, which encodes what we need. I updated the design document, but that happened after I had mentioned here and there that we need to add a missing field to the baggage. This change corrects the implementation, which makes istio stats populate the app label correctly.

    The other field that has not been populated is principal. WorkloadMetadataObject had an identity field that, in principle, contained the principal, but the methods used to convert WorkloadMetadataObject to a protobuf Struct and back ignored that field and never populated it, so it got lost and istio stats never used it. We hadn't noticed that before because in ambient we used xDS-based peer metadata discovery by default; it triggers a different code path that does not rely on the methods that convert a protobuf Struct to WorkloadMetadataObject and does not have the same issue.
  * Keep backwards compatibility for app.service and app.version baggage fields
* Fix some test compilation errors
* Merge master branch and resolve merge conflicts properly (#6795)
  * Automator: update envoy@ in istio/proxy@master (#6777)
  * Automator: update envoy@ in istio/proxy@master (#6778)
  * Don't do workload discovery for cross-network traffic (#6767)
    * Get the implementation compiling
    * Add tests for cross-network peer metadata
    * clang-tidy
    * One more tidy
    * Switch to debug for logging
  * Automator: update envoy@ in istio/proxy@master (#6782)
  * Automator: update envoy@ in istio/proxy@master (#6784)
  * Automator: update go-control-plane in istio/proxy@master (#6786)
  * Automator: update envoy@ in istio/proxy@master (#6787)
  * Automator: update envoy@ in istio/proxy@master (#6788)
  * update x-network header key (#6790)
  * Automator: update envoy@ in istio/proxy@master (#6794)
  * Merge upstream/master and resolve merge conflicts
  * Missed one
  * Fixed a wrong one

Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
Co-authored-by: Krinkin, Mike <mkrinkin@microsoft.com>
Co-authored-by: Keith Mattix II <keithmattix@microsoft.com>
Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com>
Co-authored-by: Istio Automation <istio-testing-bot@google.com>
Co-authored-by: Ian Rudie <ilrudie@gmail.com>
* Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch (#6772) * Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch * Use single match - creating multiple matches means that the later overrides the earlier --------- * Add Baggage metadata propagation (#6776) * Add Baggage metadata propagation * clang-tidy * Go back to old baggage impl * Fix baggage format * Actually use new baggage approach --------- * Introduce new filters discovering peer metadata from baggage header (#6771) * Introduce new filters discovering peer metadata from baggage header This is a combination of two filters that have to be used together: - regular network filter (expected to be configured in connect_originate or inner_connect_originate listeners before TCP Proxy filter) - upstream network filter (expected to be configured in all clusters that use HBONE or double-HBONE for endpoints) Those two filters together basically create a tunnel. The tunnel protocol just prepends a fixed-size header to the data stream coming from the regular network filter to the upstream network filter, followed by the peer metadata encoded as protobuf Any containing a protobuf Struct inside (I'm just re-using existing code from Istio proxy, that's why the encoding is what it is). The regular network filter only triggers when there is some data coming from the upstream connection in response. It's not correct in general, but in waypoints we do know that we proxy an L7 protocol (http or gRPC), so we do expect some data in reply. The regular network filter relies on TCP Proxy filter extracting response headers and saving them in the filter state. It then extracts and parses the baggage header from the saved headers. In all cases I explicitly communicate when no peer metadata has been discovered by sending some data downstream. 
This ensures that upstream network filter running downstream can always remove the prefix from the data stream and does not really need to guess if it's there or not. NOTE: We still do some checks to confirm that the prefix is there, but we cannot really rely on those checks for correctness in all the cases. The upstream network filter, as pointed out above, extracts the data sent by the regular network filter from the data stream, it parses the data and populates filter state based on that. Unlike the HTTP peer metadata filter, this one runs in the context of the upstream connection, so it populates the upstream filter state and not the regular one. I plan to add an option to the HTTP peer metadata filter for new upstream metadata discovery via upstream filter metadata, thus propagating it all the way to the istio stats filter. NOTE: None of those filters are yet generated by pilot and there are certainly some additional options to configure (e.g., maybe we can come up with a good way to transfer metadata via Envoy TLS instead of injecting it into the data stream directly - this way, in principle, we could avoid creating a custom upstream filter altogether, if http peer metadata filter could get the peer metadata directly from connect_originate listener). All-in-all, it's not the final implementation. * Fix BUILD formatting * Fix formatting of C++ code * Update HTTP peer_metadata filter to consume filter state set by upstream peer_metadata filter This basically taps the upstream peer metadata into the regular filter state consumed by the istio stats filter. http peer metadata filter also takes care of priorities between different discovery methods - we just need to put different discovery methods in the right order in the configuration. 
* Populate peer principal in the upstream workload metadata as well * Support propagating baggage header to upstream and additional safety checks for upstream network filter * Only register UpstreamFilterState peer metadata discovery method for upstream peer discovery * Move peer_metadata filter proto config in the same directory * Fix typo --------- * Baggage discovery (#6779) * Add Baggage metadata propagation * clang-tidy * basics for baggage discovery downstream * removing unnecessary tests * reverting crazy claude changes in release-binary.sh * fixing tests, fixing baggage key tokens * removing comment * make lint * fixing unit tests for metadata_object * make lint * suggestions from PR * clarifying use of mappings for baggage and field access * make lint --------- * Add locality to proxy metadata (#6780) * Add locality to proxy metadata * Clang-tidy * Buildifier format * Rebase and fix some bugs --------- * Drop app labels from baggage and propagate principal (#6791) * Drop app labels from baggage and propagate principal I think I confused folks a bit when I mentioned that app field is missing from the baggage - it wasn't. In fact, canonical name of the workload and app in ambient are the same thing, that's why baggage does not actually need an app label - it already has service.name that encodes what we need. I updated the design document, but it happened after I mentioned here and there that we need to add a missing field to the baggage. This change corrects the implementation, which makes istio stats populate the app label correctly. The other field that has not been populated is principal. WorkloadMetadataObject contained an identity field that, in principle, held the principal, but the methods used to convert WorkloadMetadataObject to a protobuf Struct and back ignored that field and never populated it, so it got lost and istio stats never used it. 
We haven't noticed that before because in ambient we used xDS-based peer metadata discovery by default and it triggers a different code path that does not rely on the methods that convert protobuf Struct to WorkloadMetadataObject, and the code path used there didn't have the same issue. * Keep backwards compatibility for app.service and app.version baggage fields --------- * Fix some test compilation errors * Merge master branch and resolve merge conflicts properly (#6795) * Automator: update envoy@ in istio/proxy@master (#6777) * Automator: update envoy@ in istio/proxy@master (#6778) * Don't do workload discovery for cross-network traffic (#6767) * Get the implementation compiling * Add tests for cross-network peer metadata * clang-tidy * One more tidy * Switch to debug for logging --------- * Automator: update envoy@ in istio/proxy@master (#6782) * Automator: update envoy@ in istio/proxy@master (#6784) * Automator: update go-control-plane in istio/proxy@master (#6786) * Automator: update envoy@ in istio/proxy@master (#6787) * Automator: update envoy@ in istio/proxy@master (#6788) * update x-network header key (#6790) * Automator: update envoy@ in istio/proxy@master (#6794) * Merge upstream/master and resolve merge conflicts * Missed one * Fixed a wrong one --------- --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Signed-off-by: Ian Rudie <ian.rudie@solo.io> Co-authored-by: Krinkin, Mike <mkrinkin@microsoft.com> Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com> Co-authored-by: Istio Automation <istio-testing-bot@google.com> Co-authored-by: Ian Rudie <ilrudie@gmail.com>
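Editor's sketch of the tunnel framing described in #6771 above. The commit message only says that a fixed-size header is prepended, followed by the encoded peer metadata, and that "no metadata" is always signalled explicitly. Everything else here is an assumption made for illustration: the `MAGIC` marker, the 8-byte header layout, the helper names, and the JSON payload (the real filters carry a protobuf Any wrapping a Struct).

```python
import json
import struct

# Hypothetical wire layout: 4-byte marker + 4-byte big-endian payload length.
MAGIC = b"PMD0"
HEADER = struct.Struct(">4sI")


def encode_prefix(metadata):
    """Build the prefix the regular network filter would inject.

    A zero-length payload explicitly signals "no peer metadata discovered",
    so the receiving side never has to guess whether a prefix is present.
    """
    payload = b"" if metadata is None else json.dumps(metadata, sort_keys=True).encode()
    return HEADER.pack(MAGIC, len(payload)) + payload


def strip_prefix(buffered):
    """Remove the prefix on the upstream network filter side.

    Returns None while more bytes are needed, otherwise a tuple of
    (metadata or None, remaining application bytes).
    Raises ValueError if the marker check fails.
    """
    if len(buffered) < HEADER.size:
        return None
    magic, length = HEADER.unpack_from(buffered)
    if magic != MAGIC:
        raise ValueError("peer metadata prefix missing")
    end = HEADER.size + length
    if len(buffered) < end:
        return None
    payload = buffered[HEADER.size:end]
    metadata = json.loads(payload) if payload else None
    return metadata, buffered[end:]
```

The marker check mirrors the NOTE in the commit message: it catches obvious misconfiguration, but application data could in theory start with the same bytes, so it cannot be relied on for correctness.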
* Peer metadata filter waypoint (#58873) * basics for downstream peer metadata filter in WP * adding downstream mx propagation * adding wds based discovery for MC * add flag ENABLE_AMBIENT_MULTINETWORK_BAGGAGE * Pilot support for upstream metadata discovery using baggage (#58890) * Pilot support for upstream metadata discovery using baggage It's a follow-up to istio/proxy#6771 that introduced filters needed to extract baggage header from the HBONE CONNECT response, turn it into peer metadata and propagate all the way into the istio stats filter. NOTE: Those filters also handle populating baggage header in the HBONE CONNECT request, a.k.a. propagation. The changes here include: 1. Adding PeerMetadata filter to connect_originate and inner_connect_originate listeners 2. Adding PeerMetadata upstream network filters to service clusters to consume peer metadata returned in the data stream (the filters above add it to the data stream) 3. Updating PeerMetadata HTTP filter to get the peer metadata from the filter state populated by the filter above and propagate it to the istio stats. I will not go into details of the overall design - that has been discussed in the PR linked above, the main idea is that we have a pair of filters - regular network and upstream network filter; network filter sits in the connect_originate and inner_connect_originate listener filter chain where we actually create HBONE tunnel and therefore it could extract and populate baggage header there; Network filter injects that peer metadata discovered from the baggage header value into the data stream (for reasons); The goal of the upstream network filter set on the cluster level is to remove the data injected by the network filter from the data stream and save it for the HTTP PeerMetadata filter to use. 
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Fix bugs Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Fix formatting Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Remove dead code from merge Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * suggestion from main PR (#58936) * suggestion from main PR * Update pilot/pkg/xds/filters/filters.go Co-authored-by: Keith Mattix II <keithmattix2@gmail.com> * moving from Once to OnceValue --------- Co-authored-by: Keith Mattix II <keithmattix2@gmail.com> * Address PR feedback Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fix format of filters Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com> Co-authored-by: Krinkin, Mike <mkrinkin@microsoft.com>
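Editor's sketch of how the HTTP PeerMetadata filter from item 3 could choose between discovery methods. The proxy PR states only that priority follows the order of methods in the configuration; the function name, the context dictionary, and the two example methods below are hypothetical and stand in for the real C++ filter logic.

```python
def discover_peer(methods, context):
    """Try discovery methods in configuration order; the first hit wins.

    `methods` is an ordered list of (name, callable) pairs. Each callable
    returns peer metadata or None when it has nothing to offer.
    """
    for name, method in methods:
        metadata = method(context)
        if metadata is not None:
            return name, metadata
    return None


# Hypothetical methods: one reads upstream filter state, one falls back to xDS.
def from_upstream_filter_state(context):
    return context.get("upstream_filter_state")


def from_xds(context):
    return context.get("xds")


configured = [
    ("upstream_filter_state", from_upstream_filter_state),
    ("xds", from_xds),
]
```

With this shape, changing which source takes precedence is a matter of reordering `configured`, which is the property the commit message relies on.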
* Ambient Multicluster Telemetry (#58901) * Peer metadata filter waypoint (#58873) * basics for downstream peer metadata filter in WP * adding downstream mx propagation * adding wds based discovery for MC * add flag ENABLE_AMBIENT_MULTINETWORK_BAGGAGE * Pilot support for upstream metadata discovery using baggage (#58890) * Pilot support for upstream metadata discovery using baggage It's a follow-up to istio/proxy#6771 that introduced filters needed to extract baggage header from the HBONE CONNECT response, turn it into peer metadata and propagate all the way into the istio stats filter. NOTE: Those filters also handle populating baggage header in the HBONE CONNECT request, a.k.a. propagation. The changes here include: 1. Adding PeerMetadata filter to connect_originate and inner_connect_originate listeners 2. Adding PeerMetadata upstream network filters to service clusters to consume peer metadata returned in the data stream (the filters above add it to the data stream) 3. Updating PeerMetadata HTTP filter to get the peer metadata from the filter state populated by the filter above and propagate it to the istio stats. I will not go into details of the overall design - that has been discussed in the PR linked above, the main idea is that we have a pair of filters - regular network and upstream network filter; network filter sits in the connect_originate and inner_connect_originate listener filter chain where we actually create HBONE tunnel and therefore it could extract and populate baggage header there; Network filter injects that peer metadata discovered from the baggage header value into the data stream (for reasons); The goal of the upstream network filter set on the cluster level is to remove the data injected by the network filter from the data stream and save it for the HTTP PeerMetadata filter to use. 
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Fix bugs Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Fix formatting Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Remove dead code from merge Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * suggestion from main PR (#58936) * suggestion from main PR * Update pilot/pkg/xds/filters/filters.go Co-authored-by: Keith Mattix II <keithmattix2@gmail.com> * moving from Once to OnceValue --------- Co-authored-by: Keith Mattix II <keithmattix2@gmail.com> * Address PR feedback Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fix format of filters Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com> Co-authored-by: Krinkin, Mike <mkrinkin@microsoft.com> * Add 1.29 release note Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com> Co-authored-by: Krinkin, Mike <mkrinkin@microsoft.com>
* Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch (#6772) * Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Use single match - creating multiple matches means that the later overrides the earlier Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Add Baggage metadata propagation (#6776) * Add Baggage metadata propagation Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * clang-tidy Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Go back to old baggage impl Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fix baggage format Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Actually use new baggage approach Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Introduce new filters discovering peer metadata from baggage header (#6771) * Introduce new filters discovering peer metadata from baggage header This is a combination of two filters that have to be used together: - regular network filter (expected to be configured in connect_originate or inner_connect_originate listeners before TCP Proxy filter) - upstream network filter (expected to be configured in all clusters that use HBONE or double-HBONE for endpoints) Those two filters together basically create a tunnel. The tunnel protocol just prepends a fixed-size header to the data stream coming from the regular network filter to the upstream network filter, followed by the peer metadata encoded as protobuf Any containing a protobuf Struct inside (I'm just re-using existing code from Istio proxy, that's why the encoding is what it is). The regular network filter only triggers when there is some data coming from the upstream connection in response. 
It's not correct in general, but in waypoints we do know that we proxy an L7 protocol (http or gRPC), so we do expect some data in reply. The regular network filter relies on TCP Proxy filter extracting response headers and saving them in the filter state. It then extracts and parses the baggage header from the saved headers. In all cases I explicitly communicate when no peer metadata has been discovered by sending some data downstream. This ensures that upstream network filter running downstream can always remove the prefix from the data stream and does not really need to guess if it's there or not. NOTE: We still do some checks to confirm that the prefix is there, but we cannot really rely on those checks for correctness in all the cases. The upstream network filter, as pointed out above, extracts the data sent by the regular network filter from the data stream, it parses the data and populates filter state based on that. Unlike the HTTP peer metadata filter, this one runs in the context of the upstream connection, so it populates the upstream filter state and not the regular one. I plan to add an option to the HTTP peer metadata filter for new upstream metadata discovery via upstream filter metadata, thus propagating it all the way to the istio stats filter. NOTE: None of those filters are yet generated by pilot and there are certainly some additional options to configure (e.g., maybe we can come up with a good way to transfer metadata via Envoy TLS instead of injecting it into the data stream directly - this way, in principle, we could avoid creating a custom upstream filter altogether, if http peer metadata filter could get the peer metadata directly from connect_originate listener). All-in-all, it's not the final implementation. 
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Fix BUILD formatting Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Fix formatting of C++ code Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Update HTTP peer_metadata filter to consume filter state set by upstream peer_metadata filter This basically taps the upstream peer metadata into the regular filter state consumed by the istio stats filter. http peer metadata filter also takes care of priorities between different discovery methods - we just need to put different discovery methods in the right order in the configuration. Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Populate peer principal in the upstream workload metadata as well Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Support propagating baggage header to upstream and additional safety checks for upstream network filter Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Only register UpstreamFilterState peer metadata discovery method for upstream peer discovery Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Move peer_metadata filter proto config in the same directory Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Fix typo Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Baggage discovery (#6779) * Add Baggage metadata propagation Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * clang-tidy Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * basics for baggage discovery downstream * removing unnecessary tests * reverting crazy claude changes in release-binary.sh * fixing tests, fixing baggage key tokens * removing comment * make lint * fixing unit tests for metadata_object * make lint * suggestions from PR * clarifying use of mappings for baggage and field access * make lint --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Co-authored-by: Keith Mattix II 
<keithmattix@microsoft.com> * Add locality to proxy metadata (#6780) * Add locality to proxy metadata Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Clang-tidy Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Buildifier format Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Rebase and fix some bugs Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Drop app labels from baggage and propagate principal (#6791) * Drop app labels from baggage and propagate principal I think I confused folks a bit when I mentioned that app field is missing from the baggage - it wasn't. In fact, canonical name of the workload and app in ambient are the same thing, that's why baggage does not actually need an app label - it already has service.name that encodes what we need. I updated the design document, but it happened after I mentioned here and there that we need to add a missing field to the baggage. This change corrects the implementation, which makes istio stats populate the app label correctly. The other field that has not been populated is principal. WorkloadMetadataObject contained an identity field that, in principle, held the principal, but the methods used to convert WorkloadMetadataObject to a protobuf Struct and back ignored that field and never populated it, so it got lost and istio stats never used it. We haven't noticed that before because in ambient we used xDS-based peer metadata discovery by default and it triggers a different code path that does not rely on the methods that convert protobuf Struct to WorkloadMetadataObject, and the code path used there didn't have the same issue. 
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Keep backwards compatibility for app.service and app.version baggage fields Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Co-authored-by: Keith Mattix II <keithmattix@microsoft.com> * Fix some test compilation errors Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Merge upstream/master and resolve merge conflicts Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Missed one Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Fixed a wrong one Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Merge master branch and resolve merge conflicts properly (#6795) * Automator: update envoy@ in istio/proxy@master (#6777) * Automator: update envoy@ in istio/proxy@master (#6778) * Don't do workload discovery for cross-network traffic (#6767) * Get the implementation compiling * Add tests for cross-network peer metadata Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * clang-tidy Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * One more tidy Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Switch to debug for logging Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Automator: update envoy@ in istio/proxy@master (#6782) * Automator: update envoy@ in istio/proxy@master (#6784) * Automator: update go-control-plane in istio/proxy@master (#6786) * Automator: update envoy@ in istio/proxy@master (#6787) * Automator: update envoy@ in istio/proxy@master (#6788) * update x-network header key (#6790) Signed-off-by: Ian Rudie <ian.rudie@solo.io> * Automator: update envoy@ in istio/proxy@master (#6794) * Merge upstream/master and resolve merge conflicts Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Missed one Signed-off-by: Mikhail Krinkin 
<mkrinkin@microsoft.com> * Fixed a wrong one Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Signed-off-by: Ian Rudie <ian.rudie@solo.io> Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> Co-authored-by: Istio Automation <istio-testing-bot@google.com> Co-authored-by: Keith Mattix II <keithmattix@microsoft.com> Co-authored-by: Ian Rudie <ilrudie@gmail.com> * Add e2e test for proxy baggage-based metadata discovery + fixes This adds an e2e test for new proxy filters that verifies both baggage propagation as well as metrics that stats filter will generate. I also made some changes to how we parse and generate baggage header. Basically, WorkloadMetadataObject supports cases where app and service might have different values. In xDS-based metadata discovery implementation however app is always derived from the service name, at least in ztunnel. So there is a bit of a mismatch there. In practice this mismatch should not matter, but purely hypothetically, if I didn't change the logic, we could end up in a situation where waypoint node metadata is configured in such a way that app is set, but service is not. And if that happens, waypoint will generate a baggage, that ztunnels cannot interpret correctly yet and it will result in metrics with unknown values. I figured that I can rewrite code in a way that accounts for all corner cases like that by making sure that: 1. When we parse baggage we set both app and service, and if any of those is not provided in the baggage, we use the other to backfill 2. When we generate baggage, we always generate service (because ztunnel needs it), and we generate app, if it's different from the service. All-in-all, under normal circumstances in the baggage we will only have service; if app is different from the service - we will add it to the baggage as well; and when parsing baggage we will use either service or app, depending on what is provided. 
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Fix formatting Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> Signed-off-by: Ian Rudie <ian.rudie@solo.io> Co-authored-by: Keith Mattix II <keithmattix@microsoft.com> Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com> Co-authored-by: Istio Automation <istio-testing-bot@google.com> Co-authored-by: Ian Rudie <ilrudie@gmail.com>
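Editor's sketch of the two baggage rules listed in the e2e-test commit above (backfill on parse, always emit service and emit app only when it differs). `service.name` is the key named in the commit messages; `app.name` is a hypothetical key chosen for illustration, and the backfill of an empty service from app when generating is an assumption drawn from the corner case the commit describes. Baggage properties and percent-encoding are ignored for brevity.

```python
SERVICE_KEY = "service.name"  # key mentioned in the commit messages
APP_KEY = "app.name"          # hypothetical key name, for illustration only


def parse_baggage(header):
    """Parse a comma-separated baggage header and backfill app/service."""
    fields = {}
    for member in header.split(","):
        key, sep, value = member.strip().partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    # Rule 1: set both values; whichever is missing is copied from the other.
    service = fields.get(SERVICE_KEY) or fields.get(APP_KEY, "")
    app = fields.get(APP_KEY) or service
    return {"service": service, "app": app}


def generate_baggage(service, app):
    """Generate the baggage members for service and app."""
    # Assumption: if only app is configured, use it as the service name so
    # that a service member is always present for ztunnel.
    service = service or app
    members = [f"{SERVICE_KEY}={service}"]
    # Rule 2: app is emitted only when it differs from the service.
    if app and app != service:
        members.append(f"{APP_KEY}={app}")
    return ",".join(members)
```

Under normal circumstances this yields a baggage value carrying only the service member, which matches the summary at the end of the commit message.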
…o#6798) * Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch (istio#6772) * Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Use single match - creating multiple matches means that the later overrides the earlier Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> --------- Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com> * Add Baggage metadata propagation (istio#6776) * Add Baggage metadata propagation Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * clang-tidy Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Go back to old baggage impl Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Fix baggage format Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Actually use new baggage approach Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> --------- Signed-off-by: Keith Mattix II <keithmattix@microsoft.com> * Introduce new filters discovering peer metadata from baggage header (istio#6771) * Introduce new filters discovering peer metadata from baggage header This is a combination of two filters that have to be used together: - regular network filter (expected to be configured in connect_originate or inner_connect_originate listeners before TCP Proxy filter) - upstream network filter (expected to be configured in all clusters that use HBONE or double-HBONE for endpoints) Those two filters together basically create a tunnel. The tunnel protocol just prepends a fixed-size header to the data stream coming from the regular network filter to the upstream network filter, followed by the peer metadata encoded as protobuf Any containing a protobuf Struct inside (I'm just re-using existing code from Istio proxy, that's why the encoding is what it is). The regular network filter only triggers when there is some data coming from the upstream connection in response. 
It's not correct in general, but in waypoints we do know that we proxy an L7 protocol (http or gRPC), so we do expect some data in reply. The regular network filter relies on TCP Proxy filter extracting response headers and saving them in the filter state. It then extracts and parses the baggage header from the saved headers. In all cases I explicitly communicate when no peer metadata has been discovered by sending some data downstream. This ensures that upstream network filter running downstream can always remove the prefix from the data stream and does not really need to guess if it's there or not. NOTE: We still do some checks to confirm that the prefix is there, but we cannot really rely on those checks for correctness in all the cases. The upstream network filter, as pointed out above, extracts the data sent by the regular network filter from the data stream, it parses the data and populates filter state based on that. Unlike the HTTP peer metadata filter, this one runs in the context of the upstream connection, so it populates the upstream filter state and not the regular one. I plan to add an option to the HTTP peer metadata filter for new upstream metadata discovery via upstream filter metadata, thus propagating it all the way to the istio stats filter. NOTE: None of those filters are yet generated by pilot and there are certainly some additional options to configure (e.g., maybe we can come up with a good way to transfer metadata via Envoy TLS instead of injecting it into the data stream directly - this way, in principle, we could avoid creating a custom upstream filter altogether, if http peer metadata filter could get the peer metadata directly from connect_originate listener). All-in-all, it's not the final implementation. 
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fix BUILD formatting
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fix formatting of C++ code
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Update HTTP peer_metadata filter to consume filter state set by upstream peer_metadata filter
This basically taps the upstream peer metadata into the regular filter state consumed by the istio stats filter. The http peer metadata filter also takes care of priorities between different discovery methods - we just need to put the discovery methods in the right order in the configuration.
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Populate peer principal in the upstream workload metadata as well
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Support propagating baggage header to upstream and additional safety checks for upstream network filter
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Only register UpstreamFilterState peer metadata discovery method for upstream peer discovery
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Move peer_metadata filter proto config in the same directory
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fix typo
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
---------
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Baggage discovery (istio#6779)
* Add Baggage metadata propagation
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* clang-tidy
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* basics for baggage discovery downstream
* removing unnecessary tests
* reverting crazy claude changes in release-binary.sh
* fixing tests, fixing baggage key tokens
* removing comment
* make lint
* fixing unit tests for metadata_object
* make lint
* suggestions from PR
* clarifying use of mappings for baggage and field access
* make lint
---------
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Co-authored-by: Keith Mattix II <keithmattix@microsoft.com>
* Add locality to proxy metadata (istio#6780)
* Add locality to proxy metadata
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Clang-tidy
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Buildifier format
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Rebase and fix some bugs
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
---------
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Drop app labels from baggage and propagate principal (istio#6791)
* Drop app labels from baggage and propagate principal
I think I confused folks a bit when I mentioned that the app field is missing from the baggage - it wasn't. In fact, the canonical name of the workload and app in ambient are the same thing, which is why baggage does not actually need an app label - it already has service.name that encodes what we need.
I updated the design document, but that happened after I mentioned here and there that we need to add a missing field to the baggage. This change corrects the implementation, which makes istio stats populate the app label correctly.
The other field that has not been populated is principal. WorkloadMetadataObject has an identity field that, in principle, contains the principal, but the methods used to convert WorkloadMetadataObject to a protobuf Struct and back ignored that field and never populated it, so it got lost and istio stats never used it.
We haven't noticed that before because in ambient we used xDS-based peer metadata discovery by default, and it triggers a different code path that does not rely on the methods that convert a protobuf Struct to WorkloadMetadataObject; that code path didn't have the same issue.
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Keep backwards compatibility for app.service and app.version baggage fields
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
---------
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Co-authored-by: Keith Mattix II <keithmattix@microsoft.com>
* Fix some test compilation errors
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Merge upstream/master and resolve merge conflicts
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Missed one
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fixed a wrong one
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Merge master branch and resolve merge conflicts properly (istio#6795)
* Automator: update envoy@ in istio/proxy@master (istio#6777)
* Automator: update envoy@ in istio/proxy@master (istio#6778)
* Don't do workload discovery for cross-network traffic (istio#6767)
* Get the implementation compiling
* Add tests for cross-network peer metadata
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* clang-tidy
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* One more tidy
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Switch to debug for logging
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
---------
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Automator: update envoy@ in istio/proxy@master (istio#6782)
* Automator: update envoy@ in istio/proxy@master (istio#6784)
* Automator: update go-control-plane in istio/proxy@master (istio#6786)
* Automator: update envoy@ in istio/proxy@master (istio#6787)
* Automator: update envoy@ in istio/proxy@master (istio#6788)
* update x-network header key (istio#6790)
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
* Automator: update envoy@ in istio/proxy@master (istio#6794)
* Merge upstream/master and resolve merge conflicts
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Missed one
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fixed a wrong one
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
---------
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Co-authored-by: Istio Automation <istio-testing-bot@google.com>
Co-authored-by: Keith Mattix II <keithmattix@microsoft.com>
Co-authored-by: Ian Rudie <ilrudie@gmail.com>
* Add e2e test for proxy baggage-based metadata discovery + fixes
This adds an e2e test for the new proxy filters that verifies both baggage propagation and the metrics that the stats filter will generate.
I also made some changes to how we parse and generate the baggage header. Basically, WorkloadMetadataObject supports cases where app and service might have different values. In the xDS-based metadata discovery implementation, however, app is always derived from the service name, at least in ztunnel. So there is a bit of a mismatch there.
In practice this mismatch should not matter, but purely hypothetically, if I didn't change the logic, we could end up in a situation where waypoint node metadata is configured in such a way that app is set but service is not. If that happens, the waypoint will generate a baggage header that ztunnels cannot yet interpret correctly, resulting in metrics with unknown values.
I figured that I can rewrite the code in a way that accounts for all corner cases like that by making sure that:
1. When we parse baggage we set both app and service, and if either of those is not provided in the baggage, we use the other to backfill
2. When we generate baggage, we always generate service (because ztunnel needs it), and we generate app if it's different from the service.
All in all, under normal circumstances the baggage will only contain service; if app is different from the service, we will add it to the baggage as well; and when parsing baggage we will use either service or app, depending on what is provided.
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fix formatting
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
---------
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
Co-authored-by: Keith Mattix II <keithmattix@microsoft.com>
Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com>
Co-authored-by: Istio Automation <istio-testing-bot@google.com>
Co-authored-by: Ian Rudie <ilrudie@gmail.com>
… (#6802)
* Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch (#6772)
* Include myself, Steven and Gustavo as owners of the experimental-ambient-multicluster-telemetry branch
* Use single match - creating multiple matches means that the later overrides the earlier
---------
* Add Baggage metadata propagation (#6776)
* Add Baggage metadata propagation
* clang-tidy
* Go back to old baggage impl
* Fix baggage format
* Actually use new baggage approach
---------
* Introduce new filters discovering peer metadata from baggage header (#6771)
* Introduce new filters discovering peer metadata from baggage header
This is a combination of two filters that have to be used together:
- regular network filter (expected to be configured in connect_originate or inner_connect_originate listeners before TCP Proxy filter)
- upstream network filter (expected to be configured in all clusters that use HBONE or double-HBONE for endpoints)
Those two filters together basically create a tunnel. The tunnel protocol just prepends a fixed-size header to the data stream flowing from the regular network filter to the upstream network filter, followed by the peer metadata encoded as a protobuf Any containing a protobuf Struct (I'm just re-using existing code from Istio proxy, which is why the encoding is what it is).
The regular network filter only triggers when there is some data coming from the upstream connection in response. That is not correct in general, but in waypoints we know that we proxy an L7 protocol (HTTP or gRPC), so we do expect some data in reply.
The regular network filter relies on the TCP Proxy filter extracting response headers and saving them in the filter state. It then extracts and parses the baggage header from the saved headers.
In all cases I explicitly communicate when no peer metadata has been discovered by sending some data downstream. This ensures that the upstream network filter running downstream can always remove the prefix from the data stream and does not need to guess whether it's there or not.
NOTE: We still do some checks to confirm that the prefix is there, but we cannot rely on those checks for correctness in all cases.
The upstream network filter, as pointed out above, extracts the data sent by the regular network filter from the data stream, parses it, and populates filter state based on it. Unlike the HTTP peer metadata filter, this one runs in the context of the upstream connection, so it populates the upstream filter state and not the regular one.
I plan to add an option to the HTTP peer metadata filter for the new upstream metadata discovery via upstream filter metadata, thus propagating it all the way to the istio stats filter.
NOTE: None of those filters are generated by pilot yet, and there are certainly some additional options to configure (e.g., maybe we can come up with a good way to transfer metadata via Envoy TLS instead of injecting it into the data stream directly - this way, in principle, we could avoid creating a custom upstream filter altogether, if the http peer metadata filter could get the peer metadata directly from the connect_originate listener). All in all, it's not the final implementation.
* Fix BUILD formatting
* Fix formatting of C++ code
* Update HTTP peer_metadata filter to consume filter state set by upstream peer_metadata filter
This basically taps the upstream peer metadata into the regular filter state consumed by the istio stats filter. The http peer metadata filter also takes care of priorities between different discovery methods - we just need to put the discovery methods in the right order in the configuration.
* Populate peer principal in the upstream workload metadata as well
* Support propagating baggage header to upstream and additional safety checks for upstream network filter
* Only register UpstreamFilterState peer metadata discovery method for upstream peer discovery
* Move peer_metadata filter proto config in the same directory
* Fix typo
---------
* Baggage discovery (#6779)
* Add Baggage metadata propagation
* clang-tidy
* basics for baggage discovery downstream
* removing unnecessary tests
* reverting crazy claude changes in release-binary.sh
* fixing tests, fixing baggage key tokens
* removing comment
* make lint
* fixing unit tests for metadata_object
* make lint
* suggestions from PR
* clarifying use of mappings for baggage and field access
* make lint
---------
* Add locality to proxy metadata (#6780)
* Add locality to proxy metadata
* Clang-tidy
* Buildifier format
* Rebase and fix some bugs
---------
* Drop app labels from baggage and propagate principal (#6791)
* Drop app labels from baggage and propagate principal
I think I confused folks a bit when I mentioned that the app field is missing from the baggage - it wasn't. In fact, the canonical name of the workload and app in ambient are the same thing, which is why baggage does not actually need an app label - it already has service.name that encodes what we need.
I updated the design document, but that happened after I mentioned here and there that we need to add a missing field to the baggage. This change corrects the implementation, which makes istio stats populate the app label correctly.
The other field that has not been populated is principal. WorkloadMetadataObject has an identity field that, in principle, contains the principal, but the methods used to convert WorkloadMetadataObject to a protobuf Struct and back ignored that field and never populated it, so it got lost and istio stats never used it.
We haven't noticed that before because in ambient we used xDS-based peer metadata discovery by default, and it triggers a different code path that does not rely on the methods that convert a protobuf Struct to WorkloadMetadataObject; that code path didn't have the same issue.
* Keep backwards compatibility for app.service and app.version baggage fields
---------
* Fix some test compilation errors
* Merge upstream/master and resolve merge conflicts
* Missed one
* Fixed a wrong one
* Merge master branch and resolve merge conflicts properly (#6795)
* Automator: update envoy@ in istio/proxy@master (#6777)
* Automator: update envoy@ in istio/proxy@master (#6778)
* Don't do workload discovery for cross-network traffic (#6767)
* Get the implementation compiling
* Add tests for cross-network peer metadata
* clang-tidy
* One more tidy
* Switch to debug for logging
---------
* Automator: update envoy@ in istio/proxy@master (#6782)
* Automator: update envoy@ in istio/proxy@master (#6784)
* Automator: update go-control-plane in istio/proxy@master (#6786)
* Automator: update envoy@ in istio/proxy@master (#6787)
* Automator: update envoy@ in istio/proxy@master (#6788)
* update x-network header key (#6790)
* Automator: update envoy@ in istio/proxy@master (#6794)
* Merge upstream/master and resolve merge conflicts
* Missed one
* Fixed a wrong one
---------
* Add e2e test for proxy baggage-based metadata discovery + fixes
This adds an e2e test for the new proxy filters that verifies both baggage propagation and the metrics that the stats filter will generate.
I also made some changes to how we parse and generate the baggage header. Basically, WorkloadMetadataObject supports cases where app and service might have different values. In the xDS-based metadata discovery implementation, however, app is always derived from the service name, at least in ztunnel. So there is a bit of a mismatch there.
In practice this mismatch should not matter, but purely hypothetically, if I didn't change the logic, we could end up in a situation where waypoint node metadata is configured in such a way that app is set but service is not. If that happens, the waypoint will generate a baggage header that ztunnels cannot yet interpret correctly, resulting in metrics with unknown values.
I figured that I can rewrite the code in a way that accounts for all corner cases like that by making sure that:
1. When we parse baggage we set both app and service, and if either of those is not provided in the baggage, we use the other to backfill
2. When we generate baggage, we always generate service (because ztunnel needs it), and we generate app if it's different from the service.
All in all, under normal circumstances the baggage will only contain service; if app is different from the service, we will add it to the baggage as well; and when parsing baggage we will use either service or app, depending on what is provided.
* Fix formatting
---------
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
Co-authored-by: Keith Mattix II <keithmattix@microsoft.com>
Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com>
Co-authored-by: Istio Automation <istio-testing-bot@google.com>
Co-authored-by: Ian Rudie <ilrudie@gmail.com>
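The parse and generate rules from the e2e test commit message above can be sketched roughly as follows. This is only an illustration in Go, not the proxy's C++ code: the `Workload` type and the `app.name` key are hypothetical, `service.name` is the key the commit message mentions, and deriving service from app when service is unset is an assumption about how "always generate service" is satisfied. Percent-decoding of baggage values is left out.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	serviceKey = "service.name" // key named in the commit message
	appKey     = "app.name"     // hypothetical key, for illustration only
)

// Workload is a hypothetical stand-in for the two fields discussed above.
type Workload struct {
	App     string
	Service string
}

// ParseBaggage reads "key=value" members separated by commas and backfills
// whichever of app/service is missing from the other one.
func ParseBaggage(header string) Workload {
	var w Workload
	for _, member := range strings.Split(header, ",") {
		key, value, ok := strings.Cut(strings.TrimSpace(member), "=")
		if !ok {
			continue // skip malformed members
		}
		switch key {
		case serviceKey:
			w.Service = value
		case appKey:
			w.App = value
		}
	}
	if w.App == "" {
		w.App = w.Service
	}
	if w.Service == "" {
		w.Service = w.App
	}
	return w
}

// FormatBaggage always emits service and emits app only when it differs.
func FormatBaggage(w Workload) string {
	service := w.Service
	if service == "" {
		service = w.App // assumption: fall back to app when service is unset
	}
	members := []string{serviceKey + "=" + service}
	if w.App != "" && w.App != service {
		members = append(members, appKey+"="+w.App)
	}
	return strings.Join(members, ",")
}

func main() {
	fmt.Println(FormatBaggage(Workload{App: "reviews"}))
	fmt.Printf("%+v\n", ParseBaggage("app.name=reviews"))
}
```

With these rules a waypoint whose node metadata sets only app still produces a header that carries service, which is the corner case the commit message is worried about.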
* fix cni shutdown treating NodeAffinity change as upgrade/restart (#58768)
* fix cni shutdown treating NodeAffinity change as upgrade/restart
* added releasenotes
* added not-nil guarantees
* updated releasenotes
* optimization: avoid service deep copies (#58743)
* avoid service deep copies
* cleanup benchmarks
* take write lock
* fix
* adjust benchmark
* tweak comments
* tweak more comments
* respect kind-config flag when setting up multicluster integration test kind clusters (#58835)
* Add input validation for excludeInterfaces annotation (#58785)
* Add input validation for excludeInterfaces annotation
* Move interface validation to pkg/util/net
* Address review feedback: extend validation to kubevirt interfaces
* Fix gci import formatting
* Update pkg/util/net/ip.go
Co-authored-by: Petr McAllister <petr@solo.io>
* Update tests for tightened interface name regex
---------
Co-authored-by: Petr McAllister <petr@solo.io>
* add source tags to waypoint trace (#58818)
* add source tags to waypoint trace
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* add new tags to the tests
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* lint
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
---------
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* Automator: update ztunnel@master in istio/istio@master (#58820)
* Automator: update proxy@master in istio/istio@master (#58822)
* rebase (#58866)
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* Set source tags in Waypoint spans (#58872)
* set source waypoint span tags
* update release note
* Automator: update proxy@master in istio/istio@master (#58869)
* Automator: update go-control-plane in istio/istio@master (#58896)
* Correct some conditionals for sending to ambient e/w gateways (#58849)
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Automator: update ztunnel@master in istio/istio@master (#58905)
* dnsPolicy and dnsConfig customization for ztunnel (#58711)
* helm definition for ztunnel dns settings
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* add release notes
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
---------
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* bump kiali addon to v2.21.0 (#58902)
Kiali v2.21.0 is the appropriate version for the next Istio release.
This PR bumps up the Kiali addon to that version.
* test: add TargetPortName support to echo framework (#58770)
Enable testing services that use named target ports by adding
TargetPortName field to echo.Port. When set, this uses the port
name in Service targetPort instead of the numeric WorkloadPort.
This supports standard Kubernetes behavior where Services can
reference container ports by name rather than number.
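A minimal sketch of the selection rule described above, assuming a trimmed-down, hypothetical `Port` struct (the real `echo.Port` has more fields and the real code builds a Kubernetes Service object rather than a string):

```go
package main

import (
	"fmt"
	"strconv"
)

// Port is a hypothetical, reduced version of the echo framework's port type.
type Port struct {
	Name           string
	ServicePort    int
	WorkloadPort   int
	TargetPortName string // optional; empty means "use the numeric port"
}

// TargetPort returns the value that would go into the Service's targetPort:
// the port name when TargetPortName is set, otherwise the numeric WorkloadPort.
func TargetPort(p Port) string {
	if p.TargetPortName != "" {
		return p.TargetPortName
	}
	return strconv.Itoa(p.WorkloadPort)
}

func main() {
	numeric := Port{Name: "http", ServicePort: 80, WorkloadPort: 8080}
	named := Port{Name: "http", ServicePort: 80, WorkloadPort: 8080, TargetPortName: "http-web"}
	fmt.Println(TargetPort(numeric), TargetPort(named))
}
```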
* Automator: update istio/client-go@master dependency in istio/istio@master (#58863)
* move sidecar annotation `statsCompression` to proxyConfig parameter (#58717)
* remove `sidecar.istio.io/statsCompression` annotation
* add option `statsCompression` (default true)
Compressing metrics reduces the size of the responses by 90%.
Since modern CPUs handle compression extremely efficiently, there
is virtually no downside in allowing metrics scrapers to ask for
compressed responses. In case no compression is wanted, a client
like Prometheus can just request with an appropriate `Accept-Encoding`
header (or none).
Fixes: #48051
xref: #30987 #47997
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
fixup
* krt: migrate config store controllers to krt (#58605)
* krt memory controller
* krt fake store
* krt config stores
* fix removed field assignment
* remove unneeded methods
* Add missing Pilot environment variable when enabling untaint controller (#58542)
Fixes #52050
Signed-off-by: Yann Soubeyrand <8511577+yann-soubeyrand@users.noreply.github.com>
* test: Show output of external control plane installer when failed (#58915)
Signed-off-by: mkralik3 <mkralik@redhat.com>
* feat: support multiple CUSTOM authorization providers per workload (#58082)
Fixes #57933, #55142, #34041
* addons: bump addons version (#58916)
Signed-off-by: xin.li <xin.li@daocloud.io>
* add few missing annotation validations (#58928)
* add few missing annotation validations
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* release notes
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
---------
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* Automator: update common-files@master in istio/istio@master (#58932)
* Set x-forwarded-network header (#58815)
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* improve the deployment controller safeguards (#58940)
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* tests: remove unused destination rule from a test (#58937)
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Update BASE_VERSION to master-2026-01-28T19-02-04 (#58947)
* Automator: update istio/client-go@master dependency in istio/istio@master (#58933)
* fix checking references when merging gateways (#58878)
* fix checking references when merging gateways
Signed-off-by: Mikhail Scherba <mikhail.scherba@flant.com>
* add a test case for multiple creds
Signed-off-by: Mikhail Scherba <mikhail.scherba@flant.com>
* Apply suggestion from @Stevenjin8
Co-authored-by: Steven Jin <stevenjin8@gmail.com>
* fix after applying suggestions
Signed-off-by: Mikhail Scherba <mikhail.scherba@flant.com>
---------
Signed-off-by: Mikhail Scherba <mikhail.scherba@flant.com>
Co-authored-by: Steven Jin <stevenjin8@gmail.com>
* Ambient Multicluster Telemetry (#58901)
* Peer metadata filter waypoint (#58873)
* basics for downstream peer metadata filter in WP
* adding downstream mx propagation
* adding wds based discovery for MC
* add flag ENABLE_AMBIENT_MULTINETWORK_BAGGAGE
* Pilot support for upstream metadata discovery using baggage (#58890)
* Pilot support for upstream metadata discovery using baggage
It's a followup for https://github.com/istio/proxy/pull/6771 that
introduced filters needed to extract baggage header from the HBONE
CONNECT response, turn it into peer metadata and propagate all the way
into the istio stats filter.
NOTE: Those filters also handle populating baggage header in the HBONE
CONNECT request, a.k.a. propagation.
The changes here include:
1. Adding PeerMetadata filter to connect_originate and
inner_connect_originate listeners
2. Adding PeerMetadata upstream network filters to service clusters to
consume peer metadata returned in the data stream (the filters above
add it to the data stream)
3. Update PeerMetadata HTTP filter to get the peer metadata from the
filter state populated by the filter above and propagate it to the
istio stats.
I will not go into details of the overall design, since that has been
discussed in the PR linked above. The main idea is that we have a pair
of filters: a regular network filter and an upstream network filter.
The network filter sits in the connect_originate and
inner_connect_originate listener filter chains, where we actually create
the HBONE tunnel, so it can extract and populate the baggage header
there. The network filter injects the peer metadata discovered from the
baggage header value into the data stream (for reasons). The goal of the
upstream network filter, set at the cluster level, is to remove the data
injected by the network filter from the data stream and save it for the
HTTP PeerMetadata filter to use.
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
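To make the inject/strip idea concrete, here is a small framing sketch in Go. It is not the proxy's wire format: the 4-byte magic value, the 4-byte big-endian length field, and the function names are all hypothetical, and the metadata is treated as opaque bytes (in the proxy it is a serialized protobuf Any). It also assumes the whole prefix arrives in one buffer, which a real filter cannot assume.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// Hypothetical fixed-size header: 4-byte magic followed by a 4-byte length.
var magic = []byte{'P', 'M', 'D', '1'}

const headerSize = 8

// Inject prepends the header and metadata to the response data. An empty
// metadata slice still produces a header, so the receiver never has to guess
// whether a prefix is present.
func Inject(metadata, data []byte) []byte {
	out := make([]byte, headerSize, headerSize+len(metadata)+len(data))
	copy(out[:4], magic)
	binary.BigEndian.PutUint32(out[4:8], uint32(len(metadata)))
	out = append(out, metadata...)
	return append(out, data...)
}

// Strip removes the prefix and returns the metadata and the remaining data.
func Strip(stream []byte) (metadata, data []byte, err error) {
	if len(stream) < headerSize {
		return nil, nil, errors.New("stream shorter than header")
	}
	if !bytes.Equal(stream[:4], magic) {
		return nil, nil, errors.New("unexpected magic value")
	}
	n := binary.BigEndian.Uint32(stream[4:8])
	if uint64(len(stream)-headerSize) < uint64(n) {
		return nil, nil, errors.New("metadata truncated")
	}
	end := headerSize + int(n)
	return stream[headerSize:end], stream[end:], nil
}

func main() {
	framed := Inject([]byte("peer-metadata"), []byte("HTTP/1.1 200 OK\r\n"))
	md, data, err := Strip(framed)
	fmt.Printf("%q %q %v\n", md, data, err)
}
```

The magic check plays the role of the sanity checks mentioned in the proxy PR: it can catch an obviously missing prefix, but correctness still depends on the sender always writing the header.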
* Fix bugs
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fix formatting
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
---------
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Remove dead code from merge
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* suggestion from main PR (#58936)
* suggestion from main PR
* Update pilot/pkg/xds/filters/filters.go
Co-authored-by: Keith Mattix II <keithmattix2@gmail.com>
* moving from Once to OnceValue
---------
Co-authored-by: Keith Mattix II <keithmattix2@gmail.com>
* Address PR feedback
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Fix format of filters
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
---------
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
Co-authored-by: Gustavo Meira <grnmeira@users.noreply.github.com>
Co-authored-by: Krinkin, Mike <mkrinkin@microsoft.com>
* Automator: update ztunnel@master in istio/istio@master (#58957)
* add namespace authorization to port 15014 debug endpoints (#58925)
* fix authorization for Pilot Debug Endpoints
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* release notes
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* address PR comment
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* add flag for kiali compatibility
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* Update pilot/pkg/xds/debug_test.go
Co-authored-by: Jackie Maertens (Elliott) <64559656+jaellio@users.noreply.github.com>
* address PR review comments
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
---------
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
Co-authored-by: Jackie Maertens (Elliott) <64559656+jaellio@users.noreply.github.com>
* Support retry in CNI plugin for verifying ambient enablement (#58379)
* Support retry in plugin for verifying ambient enablement
Adds support to retry getting pod and namespace to determine if
a pod is ambient enabled. Currently, if the plugin fails to
determine if a pod is ambient enabled it assumes it is not. This
enables a potential mesh bypass.
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
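A rough sketch of the retry idea, assuming a hypothetical `lookup` callback that stands in for fetching the pod and namespace. The function name, the sentinel error, and the choice to return an error once retries are exhausted (instead of silently treating the pod as non-ambient) are illustrative assumptions, not the plugin's actual code:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrLookupFailed is a hypothetical transient error from the API lookup.
var ErrLookupFailed = errors.New("pod or namespace lookup failed")

// AmbientEnabledWithRetry calls lookup up to attempts times, sleeping delay
// between failed attempts. It returns an error when every attempt fails, so
// the caller decides what to do rather than assuming "not ambient".
func AmbientEnabledWithRetry(attempts int, delay time.Duration, lookup func() (bool, error)) (bool, error) {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		enabled, err := lookup()
		if err == nil {
			return enabled, nil
		}
		lastErr = err
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return false, fmt.Errorf("ambient check failed after %d attempts: %w", attempts, lastErr)
}

func main() {
	calls := 0
	enabled, err := AmbientEnabledWithRetry(3, 10*time.Millisecond, func() (bool, error) {
		calls++
		if calls < 3 {
			return false, ErrLookupFailed // simulate two transient failures
		}
		return true, nil
	})
	fmt.Println(enabled, err, calls)
}
```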
* fix format
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Cleanup/respond to pr feedback
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix gen errors
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* fix lint
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix testdata value ordering
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Respond to PR feedback
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* add new flag and release note
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Updated cni helm chart
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Refactor ordering
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Update flag name
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Update json name
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Update test to remove istio owned dependency
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix retry value in golden
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
---------
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Return back peer metadata filter in inner_connect_originate listener (#58967)
This filter is needed for peer metadata discovery and telemetry to work
properly in multi-network environments. We had it at some point, but
with more recent reworks and merges we lost it.
Unfortunately, we didn't create any automatic tests to catch this issue.
No customer impact though - it's a new code path disabled by default.
This time I'm adding some unit tests that check that all the right
filters are generated in different configurations.
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fix source_app/destination_app to respect app.kubernetes.io/name label (#58436) (#58444)
* Fix source_app/destination_app to respect app.kubernetes.io/name label (#58436)
This change ensures that source_app, destination_app, source_version, and
destination_version metric labels properly respect Kubernetes well-known
labels (app.kubernetes.io/name and app.kubernetes.io/version) with the
correct priority order:
For app name:
1. service.istio.io/canonical-name
2. app.kubernetes.io/name
3. app
For version:
1. service.istio.io/canonical-revision
2. app.kubernetes.io/version
3. version
The fix adds a GetAppName function to the labels package that uses the
existing label priority lookup mechanism, and updates serviceClusterOrDefault
in bootstrap/config.go to use this function instead of directly accessing
the 'app' label.
This maintains backward compatibility while allowing users who only have
app.kubernetes.io/name or app.kubernetes.io/version labels to have their
metric labels properly populated.
Fixes #58436
* Address PR review feedback: backward-compatible label priority order
Changes from previous commit:
1. Reverted the order of checks to match original behavior: labels are
checked BEFORE WorkloadName for backward compatibility
2. Added GetAppNameBackwardCompatible function that checks labels in
order: app -> app.kubernetes.io/name -> canonical-name
3. Updated tests to reflect correct priority order
The change maintains exact backward compatibility:
- If 'app' label exists, it takes priority (same as before)
- If only 'app.kubernetes.io/name' exists, it is now used (new feature)
- If only canonical-name exists, it is used as last fallback (new feature)
- WorkloadName is only used if none of the above labels are present
This ensures existing users see no behavior change while enabling users
who only have app.kubernetes.io/name to have their metrics properly labeled.
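The backward-compatible lookup order described above can be sketched as follows. The helper names are hypothetical (the commit later renames its function to GetApp, whose exact signature is not shown here), and treating an empty label value as absent is an assumption:

```go
package main

import "fmt"

// firstLabel returns the first non-empty value among keys, in order.
func firstLabel(labels map[string]string, keys ...string) (string, bool) {
	for _, k := range keys {
		if v := labels[k]; v != "" {
			return v, true
		}
	}
	return "", false
}

// AppName applies the order: app, then app.kubernetes.io/name, then
// service.istio.io/canonical-name, and finally the workload name.
func AppName(labels map[string]string, workloadName string) string {
	if v, ok := firstLabel(labels,
		"app",
		"app.kubernetes.io/name",
		"service.istio.io/canonical-name",
	); ok {
		return v
	}
	return workloadName
}

func main() {
	labels := map[string]string{"app.kubernetes.io/name": "checkout"}
	fmt.Println(AppName(labels, "checkout-v1"))
}
```

Keeping `app` first is what preserves the old behavior: workloads that already carry an `app` label resolve to the same value as before, and the other two labels only matter when `app` is missing.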
* Add release note for issue #58436
Adds release note explaining the new support for app.kubernetes.io/name
and service.istio.io/canonical-name labels when populating source_app
and destination_app metric labels.
* Address PR feedback: remove unused GetAppName and rename GetAppNameBackwardCompatible to GetApp
* Automator: update go-control-plane in istio/istio@master (#58976)
* Automator: update ztunnel@master in istio/istio@master (#58975)
* Fix multinetwork ambient remote cluster store (#58942)
* Fix surprisingly rare concurrency bug with ambient remote clusters
We were using a read lock to write...I only ever saw this once:
https://prow.istio.io/view/gs/istio-prow/pr-logs/pull/istio_istio/58815/unit-tests-arm64_istio/2016461941756661760.
But the logic is clearly flawed. Fixed this by taking an explicit write
lock.
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
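For readers unfamiliar with this class of bug, here is a small Go sketch of the corrected pattern. The `ClusterStore` type and its methods are hypothetical and much simpler than the real remote cluster store; the point is only that map writes take `Lock` while reads take `RLock`:

```go
package main

import (
	"fmt"
	"sync"
)

// ClusterStore is a hypothetical store guarded by a read/write mutex.
type ClusterStore struct {
	mu       sync.RWMutex
	clusters map[string]string
}

func NewClusterStore() *ClusterStore {
	return &ClusterStore{clusters: map[string]string{}}
}

// Get only reads the map, so a shared read lock is enough.
func (s *ClusterStore) Get(id string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.clusters[id]
	return v, ok
}

// Store mutates the map, so it must take the exclusive write lock.
// Using RLock here would let two writers run at once and race on the map.
func (s *ClusterStore) Store(id, network string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.clusters[id] = network
}

// Len reports the number of stored clusters.
func (s *ClusterStore) Len() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.clusters)
}

// StoreConcurrently writes n entries from n goroutines.
func StoreConcurrently(s *ClusterStore, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.Store(fmt.Sprintf("cluster-%d", i), "network-1")
		}(i)
	}
	wg.Wait()
}

func main() {
	s := NewClusterStore()
	StoreConcurrently(s, 100)
	fmt.Println(s.Len())
}
```

Because two goroutines must hold the read lock at the same instant for the race to show up, a bug like this can stay hidden for a long time; running tests with the Go race detector (`go test -race`) is one way to surface it earlier.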
* Add more explicit locking
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Cleanup before submitting
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Remove unused function
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Remove confusing comment
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Don't poll syncers again
Apparently calling WaitUntilSynced blocks other callers from getting a
return value. I guess that makes sense if the syncers are backed by
buffered channels or something like that. Regardless, it's still good to
not duplicate the checks so we have a single source of truth.
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Remove debug logging
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* Add release note
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
---------
Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>
* tls: fix applying min TLS version to downstream TLS context (#58912)
* tls: fix applying min TLS version to downstream TLS context
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Add a release note
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Add unit tests for applyDownstreamTLSDefaults
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
---------
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Automator: update istio/client-go@master dependency in istio/istio@master (#58999)
* endpoints: use gateway for network-specific endpoints when local proxy network is unset (#58613)
* endpoints: use gateway for network-specific endpoints when local proxy network is unset
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Add a release note
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
---------
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* test: implements integration tests for CRL support in zTunnel (#58367)
* test: implements integration tests for CRL support in zTunnel
Signed-off-by: nilekh <1626598+nilekhc@users.noreply.github.com>
* chore: confirms CRL load via logs
Signed-off-by: nilekh <1626598+nilekhc@users.noreply.github.com>
* chore: removes unnecessary config
Signed-off-by: nilekh <1626598+nilekhc@users.noreply.github.com>
---------
Signed-off-by: nilekh <1626598+nilekhc@users.noreply.github.com>
* Update BASE_VERSION to master-2026-02-04T19-02-20 (#59005)
* Automator: update common-files@master in istio/istio@master (#59008)
* Automator: update istio/client-go@master dependency in istio/istio@master (#59009)
* gatewayapi: fix cors origin and preflight filters (#59018)
* gatewayapi: fix cors origin and preflight filters
* gatewayapi: update golden manifests for cors
* gatewayapi: add releasenotes for cors fixes
* Automator: update ztunnel@master in istio/istio@master (#59020)
* Reduce flakiness of the TestServiceRestart test (#59022)
* Reduce flakiness of the TestServiceRestart test
In my local environment the test flakes 2-3 times per 1000 runs. Looking
at the causes of the flakes I found two:
1. The echo process gets terminated prematurely - before the pod is
removed from the list of service endpoints
2. DNS resolution fails with timeout - I've only seen this once, so it's
quite rare and it's not an Istio issue, so I'm going to ignore this
cause for now (we can potentially harden echo implementation by
allowing retries of DNS failures, though)
A restart of a k8s deployment roughly happens in the
following order:
- deployment brings up a new pod
- deployment waits for the new pod to become healthy
- deployment terminates the old pod
In parallel with termination of the old pod, if deployment pods are
behind a service, the terminating pod gets removed from the list of
the service endpoints.
The important thing here is that terminating the pod and removing it
from the service endpoints list happen concurrently, so you could end up
in a situation where the processes in the pod have already terminated,
but the pod is still listed as a service endpoint and can get requests
routed to it. Obviously, such requests will fail.
In echo app implementation used in our tests we protect against that by
delaying echo process termination. The way it works is roughly as
follows:
1. Echo receives a SIGTERM from Kubelet - this is Kubelet's way to ask
the pod politely to shut down.
2. Echo starts a 1 second timer:
- if during this 1 second interval echo receives a request it resets
the timer and starts waiting for another 1 second
3. If Echo does not receive a request before timer elapses, Echo shuts
down.
And basically the intention here is to wait for as long as we get
requests routed to the pod. Once requests stop, we assume that the pod
has been deleted from service endpoints list and it's safe to shut down.
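The shutdown delay described above can be sketched as a timer that is reset on every request. This is a simplified illustration and not the echo implementation: `waitForQuiet` and the `requests` channel are hypothetical, and the example uses a 50ms grace period instead of 1 second to keep it quick to run.

```go
package main

import (
	"fmt"
	"time"
)

// waitForQuiet blocks until no request has arrived for a full grace
// period and returns how many requests were seen while waiting.
// requests is a hypothetical channel signaled once per incoming request.
func waitForQuiet(requests <-chan struct{}, grace time.Duration) int {
	seen := 0
	timer := time.NewTimer(grace)
	defer timer.Stop()
	for {
		select {
		case <-requests:
			seen++
			// Stop and drain the timer before resetting it, so a stale
			// expiry cannot trigger an early shutdown.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(grace)
		case <-timer.C:
			// A full grace period passed with no requests: safe to exit.
			return seen
		}
	}
}

func main() {
	requests := make(chan struct{})
	go func() {
		// Simulate three requests arriving after SIGTERM.
		for i := 0; i < 3; i++ {
			time.Sleep(10 * time.Millisecond)
			requests <- struct{}{}
		}
	}()
	seen := waitForQuiet(requests, 50*time.Millisecond)
	fmt.Println("requests seen before shutdown:", seen)
}
```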
The `TestServiceRestart` test currently sends a request every 100ms, so
during the 1 second the Echo waits for requests to come we will have ~10
attempts - that seems like enough, but let's count.
In the test setup normally we have 2 deployments (v1 and v2) each with 1
pod, so 2 pods in total. However, during the restart in one of the
deployments we first bring up a new pod and only after that start
terminating the old pod, so for a period of time we will have 3 pods to
choose from.
That means that each request, assuming uniformity, has about 2/3 chance
of *not* getting routed to the terminating pod. And assuming
independence all ~10 attempts can miss the terminating pod with about
1-2% chance, so it's small, but not insignificant.
If all 10 attempts miss the terminating pod, it will stop listening on
its ports and shut down. If by the time it happens the pod is not yet
removed from the service endpoint list it can still get new requests and
those requests, if routed to the pod, will fail, failing the test along
the way.
This PR changes the call interval from 100ms to 40ms. Because the
probability of missing the terminating pod decays exponentially with the
number of attempts, this change results in a significantly lower chance
of hitting the flake (e.g. more than 100 times lower chance, though the
calculations I did are rather rough).
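The rough estimate above can be reproduced with a few lines of Go, under the same assumptions stated in the message: uniform routing across 3 pods, independent requests, and a 1 second grace period. The helper name `missAll` is made up for this sketch.

```go
package main

import (
	"fmt"
	"math"
)

// missAll returns the probability that every request sent during the
// grace period is routed away from the terminating pod, assuming
// uniform and independent routing across `pods` endpoints.
func missAll(graceMs, intervalMs, pods int) float64 {
	attempts := graceMs / intervalMs
	perRequestMiss := float64(pods-1) / float64(pods)
	return math.Pow(perRequestMiss, float64(attempts))
}

func main() {
	before := missAll(1000, 100, 3) // ~10 attempts, (2/3)^10
	after := missAll(1000, 40, 3)   // ~25 attempts, (2/3)^25
	fmt.Printf("100ms interval: %.4f%%\n", before*100)
	fmt.Printf("40ms interval:  %.6f%%\n", after*100)
	fmt.Printf("improvement:    %.0fx\n", before/after)
}
```

With these inputs the 100ms case comes out at about 1.7%, in line with the 1-2% estimate, and the ratio is (3/2)^15, which is several hundred, so "more than 100 times lower" holds under this model.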
Related to #58226
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fix spelling and formatting
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
---------
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* update EOL information (#58991)
* improve bugreport output (#58977)
Signed-off-by: xin.li <xin.li@daocloud.io>
* Automator: update go-control-plane in istio/istio@master (#59025)
* test: add support for customizing min TLS version and ECDH curves in echo client/server (#58918)
* test: add support for customizing min TLS version and ECDH curves in echo client/server
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Break line to fix linter error
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Set TLS min version and curve preferences in the deployment template
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Deduplicate implementation of parse TLS settings
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Support only standard curve names
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Update curve names in flag descriptions
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Return error on invalid TLS version or curve preference
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Fix ineffassign issue
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Wrap parse errors
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
---------
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* add tls-inspector for listener with only TLS ports (#59028)
* wip: add tls-inspector when using tls dynamic dns so we have sni information for the dfp
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
* release note
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
* simplify, test
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
---------
Signed-off-by: Ian Rudie <ian.rudie@solo.io>
* Automator: update istio/client-go@master dependency in istio/istio@master (#59034)
* Automator: update istio/client-go@master dependency in istio/istio@master (#59036)
* gatewayapi: make the cors origin stricter with wildcards (#59026)
* handle remote cluster secret update in a more graceful way (#58567)
* handle remote cluster secret update in a more graceful way
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* fix ut
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* fix data race
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* rename
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* compare old informer and new informer
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* lint
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* address review comments
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
---------
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* Implement disableContextPropagation in Telemetry Tracing API (#59047)
* Implement disableContextPropagation in Telemetry Tracing API
Add support for disabling trace context header propagation (e.g., traceparent/tracestate for W3C, X-B3-* for Zipkin) independently from span reporting. This leverages Envoy's no_context_propagation field on HttpConnectionManager.Tracing.
Fixes #58871
* adding release notes
* Fixing the linter issues
* Adding the istio version check
* Automator: update common-files@master in istio/istio@master (#59058)
* test framework: fix Restart to wait for pods to actually roll (#59060)
* fix race in TestListRemoteClusters (#59061)
* add test for redirect only httproute in gateway api (#59051)
* add test for redirect only httproute in gateway api
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* remove claude settings
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
---------
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* fix data race in pending swap (#59070)
* fix data race in pending swap
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* add comments
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
---------
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* Automator: update proxy@master in istio/istio@master (#58885)
* Automator: update proxy@master in istio/istio@master (#59077)
* Add "ZtunnelNamespace" flag to specify Ztunnel deployment location (#58677)
When deploying Ambient mode, the Ztunnel resource can be deployed to
a namespace other than "istio-system".
When the TestZtunnelConfig and TestZtunnelRestart integration tests run
while the Ztunnel resource is deployed in a separate namespace,
the tests fail because they cannot locate the required resource.
Add a "ZtunnelNamespace" flag to specify the Ztunnel deployment location.
Defaults to "istio-system".
Signed-off-by: Maxim Babushkin <mbabushk@redhat.com>
* Use %w verb for error wrapping in pilot/pkg/config/file (#59078)
* Use %w verb for error wrapping in pilot/pkg/config/file/store.go
Replace fmt.Errorf %v with %w for proper error chain propagation,
enabling errors.Is() and errors.As() to work correctly.
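A small sketch of the difference between the two verbs. The sentinel error and helper names here are made up for illustration and are not taken from `store.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error used only in this sketch.
var errNotFound = errors.New("config not found")

// wrapV formats the cause with %v: the text is kept, the chain is lost.
func wrapV(err error) error {
	return fmt.Errorf("reading config: %v", err)
}

// wrapW wraps the cause with %w: the result unwraps back to err.
func wrapW(err error) error {
	return fmt.Errorf("reading config: %w", err)
}

// isNotFound reports whether errNotFound is anywhere in the error chain.
func isNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

func main() {
	fmt.Println(isNotFound(wrapV(errNotFound))) // false: chain was flattened to text
	fmt.Println(isNotFound(wrapW(errNotFound))) // true: chain is preserved
}
```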
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
* Add release note for error wrapping fix
Addresses the failing release-notes CI check by adding the required
release note YAML for the %w error wrapping change.
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Modify TestAuthorizationGateway ambient test (#58699)
The TestAuthorizationGateway ambient test has the ingress gateway
service account hardcoded as "istio-ingressgateway-service-account".
If Istio is deployed by a third-party control plane, the service account
name could differ.
Make the ingress gateway service account detection dynamic to fetch the
actual service account name.
Signed-off-by: Maxim Babushkin <mbabushk@redhat.com>
* Automator: update istio/client-go@master dependency in istio/istio@master (#59059)
* Fix flake in Ambient MC TestServices test (#59094)
* Fix flake in Ambient MC TestServices test
This test runs traffic between different types of workloads, including
sidecar to sidecar traffic.
When it comes to multicluster topology, ambient works a bit differently
from sidecar. Specifically, in sidecar mode, when we don't have an E/W
gateway, we assume that there is direct connectivity.
Ambient does not do that and requires an E/W gateway for multi-network
communications.
Additionally, ambient E/W gateway is different from sidecar E/W gateway,
so we can't replace a sidecar E/W gateway with an ambient E/W gateway.
Putting things together, in ambient multi-network tests we only deploy
ambient E/W gateway. When we generate EDS for sidecars and cannot find
a sidecar E/W gateway, Istio assumes direct connectivity between
networks, which does not exist in ambient multi-network test setup.
As a result, every time a sidecar tries to talk to another sidecar
and picks an endpoint on a remote network, the connection fails, failing
the test as well.
NOTE: I also cleaned up a stale TODO along the way in this PR.
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Explicitly set delay to be less than timeout
The default delay in the retry is 10ms. With the timeout set to 5ms, that
basically means that we test once and, if the test is not successful, we
fail immediately.
Given that the test involves running multiple goroutines underneath that
have to synchronize with each other, that's probably not the behavior we
want.
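The interaction between delay and timeout can be illustrated with a generic retry loop. This is a simplified sketch: `retryUntil` is hypothetical and is not the retry helper used by the Istio tests, whose exact behavior may differ.

```go
package main

import (
	"fmt"
	"time"
)

// retryUntil is a hypothetical retry loop: it calls check until it
// succeeds or until another delay would run past the timeout. It
// returns the number of attempts made and whether check succeeded.
func retryUntil(timeout, delay time.Duration, check func() bool) (int, bool) {
	deadline := time.Now().Add(timeout)
	attempts := 0
	for {
		attempts++
		if check() {
			return attempts, true
		}
		// If sleeping for delay would pass the deadline, give up now.
		if time.Now().Add(delay).After(deadline) {
			return attempts, false
		}
		time.Sleep(delay)
	}
}

func main() {
	alwaysFail := func() bool { return false }

	// Delay (10ms) larger than timeout (5ms): only one attempt is made.
	n, _ := retryUntil(5*time.Millisecond, 10*time.Millisecond, alwaysFail)
	fmt.Println("attempts with delay > timeout:", n)

	// Delay (1ms) smaller than timeout (5ms): several attempts fit.
	n, _ = retryUntil(5*time.Millisecond, time.Millisecond, alwaysFail)
	fmt.Println("attempts with delay < timeout:", n)
}
```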
NOTE: I think that's the reason this test flakes on the CI, though I
could not catch the flake locally.
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
---------
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Automator: update proxy@master in istio/istio@master (#59106)
* fix: resources improperly being set when trying to remove resource values (#58824)
* fix: resources improperly being set when trying to remove resource values
* add release note
* additional test case
* add more test cases
* removed unused resources, refine release notes
* remove unused files
* improve performance
* gst
* update release notes
* work for more resources than just limit/request
* fix releasenote schema, remove whitespace from chart output
* use trim on all helpers output
* fix: format template_test.go to pass gci linter
The gci linter was failing due to improper formatting in template_test.go.
Running make format resolved the issue.
* fix: convert YAML files to Unix line endings
* fix: include .tpl files in copy-templates sed replacements
---------
Co-authored-by: Minh Nguyen <minh.nguyen@airtable.com>
* backend policy references use proper dependency tracking (#59110)
* XDS debug endpoints to require authentication (#59054)
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* SSRF protection (#58969)
* Implemented SSRF protection
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* adjust filtering logic, add more tests
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* release notes
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* check multiple challenges
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* lint
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
---------
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* fix flake in TestJoinCollection (#59129)
* krt: make static singleton emit standard events (#58604)
* add tests to make sure replacing a key emits expected events
* fix static singleton conformance
* krt: make Singleton emit delete+event when key changes
* revert GetKey behavior change
* address comments
* remove assertion on key extraction - not feasible for simple types
* longline lint
* autoregistration unhealthy status race (#59111)
* Add namespace-level traffic-distribution annotation (#58701)
* add support for ns level annotation of traffic distribution
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* lint
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* address pr review
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* add namespace kclient to se-controller
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
---------
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* Add `--tls-min-version` flag to istiod (#58802)
* Add `tls-min-version` flag to istiod
Adds a flag that configures the minimum TLS version the istiod server and webhook allows.
Signed-off-by: Nick Fox <nfox@redhat.com>
* Fix linting issues
Signed-off-by: Nick Fox <nfox@redhat.com>
* Override minVersion after tlsConfig is constructed
Signed-off-by: Nick Fox <nfox@redhat.com>
---------
Signed-off-by: Nick Fox <nfox@redhat.com>
* fix ipallocate off-by-one (#59113)
* fix ipallocate off-by-one
* address comment
* reset env on cleanup
* test: add memory null and mixed resource test cases for injection (#59134)
Adding extra test scenarios to increase coverage after merging #58824
Signed-off-by: Francisco Herrera <fjglira@gmail.com>
* Use separate waypoints to enable server-first protocol testing (#58774)
* Use separate waypoints to enable server-first protocol testing
Server-first protocols (like MySQL, PostgreSQL, SMTP) require service-only
waypoints. This PR replaces the merged waypoint (--for all) with separate
instances of waypoint proxy (waypoint-service and waypoint-workload).
The server-first protocol tests (tcp-server) are now validated instead
of being skipped.
Fixes: https://github.com/istio/istio/issues/55420
Signed-off-by: Sridhar Gaddam <sgaddam@redhat.com>
* Fix failures in WasmPlugin and WaypointDNS tests
Signed-off-by: Sridhar Gaddam <sgaddam@redhat.com>
---------
Signed-off-by: Sridhar Gaddam <sgaddam@redhat.com>
* fix file controller flake (#59114)
* fix file controller flake
* bump interval
* unique collection key for each ReferenceGrant.to entry (#59069)
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
* Update BASE_VERSION to master-2026-02-17T19-01-26 (#59120)
* fix(pilot): consider ports of native sidecars when scanning for inbound ports (#59142)
fixes https://github.com/istio/istio/issues/59045
* tests: add integration tests for "pqc" compliance policy (#59042)
* tests: add test suite for "pqc" compliance policy
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Add tests for ingress
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Add tests for egress use cases
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Install E/W gateway
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Improve error message when ingress does not exist
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Add a test case with successful connection to the external server
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Change configuration to make sure that traffic is routed through the egress gateway
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Fix checkTLSHandshakeFailure that might cause nil pointer dereference
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
---------
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Automator: update ztunnel@master in istio/istio@master (#59074)
* Log configuration analysis messages in istiod for all resource types (#59105)
* Log configuration analysis messages in istiod for all resource types
* Adding release notes
* Deduping the messages
* Automator: update istio/client-go@master dependency in istio/istio@master (#59148)
* pilot: fix nil pointer dereference in meshwatcher adapter (#59156)
* pilot: fix nil pointer dereference in meshwatcher adapter
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Add a release note
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Add comments to adapter.Mesh() implementation
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
---------
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Update BASE_VERSION to master-2026-02-19T19-00-38 (#59157)
* fix nil pointer dereference in ServiceEntry DYNAMIC_DNS validation (#59171)
* fix nil pointer dereference in ServiceEntry DYNAMIC_DNS validation, add tests for nil pointers
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* lint, add release note
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
---------
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* Automator: update common-files@master in istio/istio@master (#59178)
* Automator: update ztunnel@master in istio/istio@master (#59181)
* Automator: update proxy@master in istio/istio@master (#59180)
* gatewayapi: allow tls.Options to override Gateway API TLS mode (#59098)
Move tls.Options[gateway.istio.io/tls-terminate-mode] processing to occur
after Gateway API standard TLS configuration (CertificateRefs and
CACertificateRefs) has been applied.
This change makes tls.Options act as an override mechanism rather than
initial configuration, ensuring Gateway API spec behavior takes precedence
with Istio-specific extensions applied afterwards.
Key benefits:
- Enables OPTIONAL_MUTUAL mode when Gateway API configures client cert
validation (CACertificateRefs), allowing "valid cert OR IP whitelist"
authorization patterns
- Preserves Gateway API standard semantics as the primary configuration
- Maintains backward compatibility for ISTIO_MUTUAL and ISTIO_SIMPLE modes
- Allows explicit opt-in to optional mTLS behavior via Istio extension
* Automator: update proxy@master in istio/istio@master (#59189)
* Set CNI config file permissions to 0600 for CIS benchmark compliance (#59075)
* Set CNI config file permissions to 0600 for CIS compliance
The Istio CNI plugin writes its config files with 0644 permissions,
which violates the CIS Kubernetes benchmark v1.12 requirement that
CNI config files should be 0600 or more restrictive.
Other CNI plugins like Cilium already write with 0600. This changes
both the initial write path in cniconfig.go and the re-write path
in install.go to use 0600, and updates the test helper to match.
Fixes #59071
* Add release note for CNI config file permission change
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
* Add configurable cni-conf-mode flag for CNI config file permissions
Addresses review feedback: make the CNI config file permissions
configurable via --cni-conf-mode flag (env: CNI_CONF_MODE), defaulting
to 0600. This lets users revert to 0644 if needed.
Also adds an upgrade note to the release notes.
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
* simplify CNI config permission flag to boolean toggle
Replace the freeform --cni-conf-mode integer flag with a boolean
--cni-conf-chgrp toggle. When disabled (default), permissions are
0600. When enabled, permissions are 0640 for group read access.
This prevents users from setting arbitrary (potentially insecure)
file modes on the CNI config.
* preserve existing file permissions for chained CNI config
When modifying the primary CNI config in chained mode, preserve the
existing file's permissions instead of overwriting them. This avoids
breaking CNI plugins that require specific permissions (e.g. 0660).
The configured mode (0600 default, or 0640 with --cni-conf-chgrp) is
still applied when creating Istio-owned or standalone config files.
Also fix import ordering in config.go and correct the CNIConfChgrp
comment to say "group read" instead of "group read/write" since 0640
only grants group read access.
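The permission rules from the last two commits can be sketched as below. The function names are hypothetical and only illustrate the described behavior: a boolean toggle selects 0600 or 0640 for Istio-owned files, and an existing chained config keeps its current permissions.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// configMode returns the mode for Istio-owned or standalone CNI config
// files: 0600 by default, 0640 when group read access is enabled.
func configMode(groupRead bool) fs.FileMode {
	if groupRead {
		return 0o640
	}
	return 0o600
}

// modeForWrite preserves the permission bits of an existing file (the
// chained case) and falls back to the configured mode for new files.
func modeForWrite(path string, fallback fs.FileMode) fs.FileMode {
	if info, err := os.Stat(path); err == nil {
		return info.Mode().Perm()
	}
	return fallback
}

func main() {
	dir, err := os.MkdirTemp("", "cni-conf")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Simulate a primary CNI config owned by another plugin with 0660.
	existing := filepath.Join(dir, "10-primary.conflist")
	if err := os.WriteFile(existing, []byte("{}"), 0o660); err != nil {
		panic(err)
	}
	// Chmod explicitly so the result does not depend on the umask.
	if err := os.Chmod(existing, 0o660); err != nil {
		panic(err)
	}

	fmt.Printf("existing chained config: %04o\n", uint32(modeForWrite(existing, configMode(false))))
	fmt.Printf("new Istio-owned config:  %04o\n", uint32(modeForWrite(filepath.Join(dir, "new.conf"), configMode(false))))
	fmt.Printf("new config, group read:  %04o\n", uint32(modeForWrite(filepath.Join(dir, "new.conf"), configMode(true))))
}
```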
* Address review feedback: rename flag, revert test default, add separate test case
1. Rename CNIConfChgrp/cni-conf-chgrp to CNIConfGroupRead/cni-conf-group-read
for a more descriptive flag name.
2. Revert defaultFileMode in test back to 0o644 to preserve existing test behavior.
3. Add rmCNIConfigRestrictive as a separate test helper for 0o600 mode instead
of modifying the existing rmCNIConfig function.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix gofmt alignment for CNIConfGroupRead fields
* Remove unused rmCNIConfigRestrictive function to fix lint
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
* trigger CI re-run after rebase on upstream master
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
* update release notes to use passthrough env flag per review
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
---------
Signed-off-by: Varun Chawla <varun_6april@hotmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Automator: update istio/client-go@master dependency in istio/istio@master (#59169)
* Make gateway transport socket connect timeout configurable (#59154)
The transport socket connect timeout on gateway listeners was hardcoded
to 15 seconds (added in #52852). This breaks clients that need more time
after the TLS handshake before sending data, such as certain IoT devices.
This change introduces the PILOT_GATEWAY_TRANSPORT_SOCKET_CONNECT_TIMEOUT
environment variable to allow operators to configure this timeout. The
default remains 15s for backward compatibility. Setting the value to 0s
disables the timeout entirely.
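A rough sketch of how such a setting could be interpreted. Only the environment variable name, the 15s default, and the "0s disables" rule come from the description above; the `connectTimeout` helper and the fallback to the default on an unparsable value are assumptions for illustration.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

const (
	envName        = "PILOT_GATEWAY_TRANSPORT_SOCKET_CONNECT_TIMEOUT"
	defaultTimeout = 15 * time.Second
)

// connectTimeout is a hypothetical helper. It returns the timeout to
// apply and whether a timeout should be set at all.
func connectTimeout(raw string) (time.Duration, bool) {
	if raw == "" {
		return defaultTimeout, true // unset: keep the 15s default
	}
	d, err := time.ParseDuration(raw)
	if err != nil || d < 0 {
		// Assumption: invalid values fall back to the default.
		return defaultTimeout, true
	}
	if d == 0 {
		return 0, false // 0s disables the timeout entirely
	}
	return d, true
}

func main() {
	d, enabled := connectTimeout(os.Getenv(envName))
	fmt.Println(d, enabled)
}
```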
Fixes #56320
Signed-off-by: rohansood10 <rohansood10@users.noreply.github.com>
Co-authored-by: rohansood10 <rohansood10@users.noreply.github.com>
* Automator: update ztunnel@master in istio/istio@master (#59197)
* addons: Bump addons version (#59198)
Signed-off-by: xin.li <xin.li@daocloud.io>
* Automator: update common-files@master in istio/istio@master (#59212)
* Automator: update proxy@master in istio/istio@master (#59199)
* Automator: update istio/client-go@master dependency in istio/istio@master (#59213)
* Automator: update proxy@master in istio/istio@master (#59215)
* Automator: update proxy@master in istio/istio@master (#59227)
* [agw] Refactor common gateway controller logic for future agw support (#59208)
* Refactor common gateway controller logic for future agw support
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix ref
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix import
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
---------
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* [WIP] chore: Bump gateway-api to v1.5.0 (#59221)
* bump gwapi to 1.5-rc1
* update listenersets in tests
* more changes
* be safe
* even more changes
* codegen
* more codegen
* skip tests
* skip ValidatingAdmissionPolicyBinding
* fix typo
* skip more
* refresh golden
* fix lint
* use NewSimpleClientset
* update lint check
* add tests to skip
* update static check
* remove vap from manifest
* skip GatewayBackendClientCertificateFeature
* bump gwapi
* remove vap
* Revert "remove vap"
This reverts commit 8835a27a6ead97540d114eef7f08b1dc72d73d37.
* unset mapper
* fix error
* remove mapper
* set attachedlistenersets only if >0
* Automator: update ztunnel@master in istio/istio@master (#59214)
* Allow HTTPRoute and GRPCRoute to coexist on the same gateway hostname (#59222)
* Allow HTTPRoute and GRPCRoute to coexist on the same gateway hostname
* Add release note
* Add e2e test for HTTPRoute and GRPCRoute coexistence
This test validates that HTTPRoute and GRPCRoute can coexist on the same
gateway hostname by sending actual HTTP and gRPC traffic through the gateway.
Without the fix, only one route would work due to VirtualService conflicts.
With the fix, both routes are merged and both traffic types succeed.
* Fix e2e test compilation error
Handle Call() return value correctly - it returns (CallResult, error)
* Automator: update proxy@master in istio/istio@master (#59243)
* Allowing TLSRoutes from E/W Gateway API. (#59223)
* configuring virtual services for E/W gateway
* clean up
* adding release notes
* removing feature flag
* renaming proxy.IsEastWestGateway to IsAmbientEastWestGateway
* Generate baggage peer metadata filters in all proxies that talk HBONE (#59225)
* Generate baggage peer metadata filters in all proxies that talk HBONE
This is a second PR required to fix
https://github.com/istio/istio/issues/59117 (the other one is
https://github.com/istio/proxy/pull/6851).
Basically it was reported that enabling baggage-based metadata discovery
breaks ingress gateways. That's because when baggage-based peer metadata
discovery is enabled we add `peer_metadata` filter to the
`connect_originate` and `inner_connect_originate` listeners.
However, for non-waypoints we didn't add a corresponding upstream
`peer_metadata` filter to the clusters that route to those internal
listeners.
And because those two filters communicate by injecting things into data
stream it broke things, as the filters have to come in pairs.
This PR fixes the issue by generating upstream `peer_metadata` filters
in the clusters that route to `connect_originate` and
`inner_connect_originate` listeners.
Why did we miss this before? We forgot to enable baggage-based peer
metadata discovery in the ambient multi-cluster integration tests -
those tests would have caught this issue (and more).
So this PR also enables baggage-based peer metadata discovery in ambient
multi-cluster tests to make sure that the feature is tested in all kinds
of configurations.
Enabling the tests reveals another issue that we overlooked - it's
possible to configure traffic policies using `DestinationRule` that
would make waypoint create a TLS connection inside HBONE or use PROXY
protocol for traffic inside HBONE.
Because peer metadata filters communicate by injecting into the data
stream, this breaks configurations that create TLS sessions and/or use
PROXY protocol, because that's done using upstream transport sockets.
We are working on a complete solution for this (basically we want those
filters to communicate via shared memory instead of relying on injecting
into the data stream), but in the meantime we want to disable
baggage-based peer metadata discovery in some cases.
For now, we do that on a cluster-by-cluster basis, so we annotate
clusters that use TLS or PROXY upstream transport sockets for HBONE
with metadata.
The metadata key is `istio.peer_metadata` and the field we populate to
disable baggage-based peer metadata discovery is
`disable_baggage_discovery`.
NOTE: We use key `istio.peer_metadata` instead of re-using existing
`istio` key, because `InternalUpstreamTransport` socket does not merge
metadata from multiple sources, so if you have metadata at the host
level with the key `istio` and metadata at the cluster level with key
`istio`, only one of those can be propagated upstream. While it's
possible to disable baggage-based peer metadata discovery on
endpoint-by-endpoint basis - it's not really necessary, because whether
we use TLS/PROXY for HBONE or not is defined on the cluster level.
Therefore we have to pick a different key to avoid endpoint-level
metadata overriding cluster-level metadata and vice versa.
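The key collision described in the note can be modeled with plain maps. This is a toy model, not Envoy or Istio code: the `metadata` type, the `propagate` function, the `workload` field, and the choice that host-level metadata wins are all assumptions made for illustration. Only the keys `istio` and `istio.peer_metadata` and the field `disable_baggage_discovery` come from the description above.

```go
package main

import "fmt"

// metadata models filter metadata: a top-level key maps to a set of fields.
type metadata map[string]map[string]any

// propagate models a non-merging propagation: when both levels define
// the same top-level key, one replaces the other wholesale. Here the
// host level wins, which is an assumption for this sketch.
func propagate(cluster, host metadata) metadata {
	out := metadata{}
	for k, v := range cluster {
		out[k] = v
	}
	for k, v := range host {
		out[k] = v
	}
	return out
}

// baggageDisabled reads disable_baggage_discovery under the given key.
func baggageDisabled(md metadata, key string) bool {
	v, ok := md[key]["disable_baggage_discovery"].(bool)
	return ok && v
}

func main() {
	// Hypothetical host-level metadata stored under the shared `istio` key.
	host := metadata{"istio": {"workload": "reviews-v1"}}

	// Cluster-level flag under the same key is overwritten and lost.
	shared := propagate(metadata{"istio": {"disable_baggage_discovery": true}}, host)
	fmt.Println(baggageDisabled(shared, "istio"))

	// Cluster-level flag under its own key survives alongside host metadata.
	separate := propagate(metadata{"istio.peer_metadata": {"disable_baggage_discovery": true}}, host)
	fmt.Println(baggageDisabled(separate, "istio.peer_metadata"))
}
```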
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Only add baggage upstream filter to the encap cluster and not all internal clusters
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Reformat the code to make the intent clearer
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Add release note
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Fix build
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* Lint warning fixes
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
---------
Signed-off-by: Mikhail Krinkin <mkrinkin@microsoft.com>
* multicluster: consolidate ambient and standard multicluster controllers (#59211)
* move ambient multicluster handling into a top level controller
* avoid sending stop and opts into the multicluster controller
* remove unnecessary duplicate function
* reuse nested collection implementations
* cleanup
* remove unused field
* small refactor mc controller
* restore comment
* minor adjustments to reduce diff
* fix typo
* remove duplicate mark synced
* add cluster equals method, restore missing test
* remove stale comment
* format
* adjust missing testing scenarios
* Automator: update proxy@master in istio/istio@master (#59261)
* Update BASE_VERSION to master-2026-02-28T19-02-07 (#59269)
* Automator: update go-control-plane in istio/istio@master (#59076)
* Automator: update proxy@master in istio/istio@master (#59270)
* Add port validations to the istioctl (#58584)
* Add port verification to the istioctl
Signed-off-by: xin.li <xin.li@daocloud.io>
* fix review
Signed-off-by: xin.li <xin.li@daocloud.io>
---------
Signed-off-by: xin.li <xin.li@daocloud.io>
* add failover priority support for strict dns clusters (#58674)
* add failover priority support for strict dns clusters
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* fix ut and lint
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* cleanup and comments
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* add release notes
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* address review comments
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* add comments
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* move to different localities
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* add simulation test
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* lint
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
---------
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* test: increase retry delay in ipallocate tests to fix flake (#59239)
* test flake: autoregistration retry assertion (#59275)
* add namespaces option for debug endpoint auth (#59238)
* add namespaces option for debug endpoint auth
* added releasenotes
* krt: serviceentry controller (#58951)
* krt service entry controller
* add some comments
* improve comment
* fix build
* fix lint
* [agw] Generate xds from krt collections (#59182)
* Generate xds config from krt
Part of https://github.com/istio/istio/pull/58893
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* gen and lint
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix lint
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Respond to PR feedback
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix ConfigKey
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
---------
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Automator: update istio/client-go@master dependency in istio/istio@master (#59287)
* [agw] Agentgateway gatewayclass (#59240)
* Agentgateway gatewayclass
Creates a NewAgentgatewayClassController for the
istio-agentgateway gatewayclass
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Move gatewayclass controller to gatewaycommon and move all
controller logic under the control of the gateway controller
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix comments
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix lint
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
---------
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* fix: clean stale ips for se (#59226)
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
* Automator: update common-files@master in istio/istio@master (#59286)
* Automator: update ztunnel@master in istio/istio@master (#59288)
* addon: Bump addons version (#59300)
Signed-off-by: xin.li <xin.li@daocloud.io>
* Automator: update proxy@master in istio/istio@master (#59271)
* Add OpenTelemetry semantic convention-aligned service attribute enric… (#59207)
* Add OpenTelemetry semantic convention-aligned service attribute enrichment
When serviceAttributeEnrichment is set to OTEL_SEMANTIC_CONVENTIONS on an OpenTelemetry tracing provider, compute service attributes following the OTel K8s service attributes specification:
https://opentelemetry.io/docs/specs/semconv/non-normative/k8s-attributes/#service-attributes
service.name (xDS push time):
Computed via fallback chain: resource.opentelemetry.io/service.name annotation → app.kubernetes.io/name label → owner resource name → pod name → container name (single) → unknown_service.
service.namespace, service.version, service.instance.id (injection time):
Injected as OTEL_RESOURCE_ATTRIBUTES env var on the sidecar. Each uses a fallback chain from resource.opentelemetry.io/* annotations to K8s metadata (namespace, app.kubernetes.io/version label, Pod UID).
The Environment resource detector is auto-enabled when the feature is active so Envoy reads OTEL_RESOURCE_ATTRIBUTES at startup.
* updating the release notes
* Fixing the linting issues
* Correcting the order of attributes to look for according to OTEL conventions
* Correcting the linting issues
* Adding the injection tests
* Adding the injection tests
* trigger CI
* trigger CI
* dns: intentionally do not skip node's own IP (#59306)
This commit removes an open TODO questioning whether we should skip the node's own IP in the DNS name table, as is done for outbound listeners.
For Headless Services, many StatefulSets expect to be able to resolve their own IP to discover their peers or identity, which matches Kubernetes' default DNS behavior.
* chore: bump gateway-api to 1.5.0 (#59299)
* chore: bump gateway-api to 1.5.0
Previous PR #59221 handled rc.3.
* revert ReferenceGrant v1
* Add CRD watcher filter
Cherry picked from https://github.com/istio/istio/pull/59309
Co-authored-by: John Howard <howardjohn@google.com>
---------
Co-authored-by: John Howard <howardjohn@google.com>
* add new app protocols (#59259)
* krt: disable index optimization when it cannot be used (#59164)
* krt: disable index optimization when it cannot be used
* review comments
* fix jwks uri redirect check by using dial context (#59236)
* fix jwks uri redirect check by using dial context
* switch to control function
* add release note
* Automator: update proxy@master in istio/istio@master (#59313)
* fix: cross-network WE panic when ambient is enabled without multicluster flag (#59321)
* fix: cross-network WE causes panic when ambient is enabled without ambientmultinetwork
* relnote
* add test
* test: replace time.Sleep with retry loops in jwks_resolver_test (#59246)
* test: replace time.Sleep with retry loops in jwks_resolver_test
Replace two time.Sleep calls in jwks_resolver_test.go with proper
event-driven retry loops to reduce unnecessary wait times:
1. verifyKeyLastRefreshedTime: replaced time.Sleep(200ms) with
retry.UntilSuccessOrFail polling loop (5ms interval) that exits
as soon as the refresh timestamp changes (or doesn't).
2. TestJwtPubKeyRefreshedWhenErrorsGettingOtherURLs: replaced
time.Sleep(2*refreshInterval) with retry.UntilOrFail polling loop
that exits as soon as the expected JwtPubKey2 is returned.
These tests were listed in https://github.com/istio/istio/issues/37555
as taking 1.25s and 1.06s respectively due to the fixed sleeps.
With retry-based polling, they will complete as soon as the condition
is met rather than always waiting the full sleep duration.
Resolves part of: https://github.com/istio/istio/issues/37555
* test: fix review comments - remove dead code and fix false-positive in wantChanged=false
- Remove dead code after retry.UntilOrFail in TestJwtPubKeyRefreshedWhenErrorsGettingOtherURLs
(err==nil and pk==JwtPubKey2 are already guaranteed by a successful retry loop)
- Fix false-positive in verifyKeyLastRefreshedTime when wantChanged=false:
wait for at least one refresh cycle (via refreshJobKeyChangedCount /
refreshJobFetchFailedCount) before asserting the timestamp did not change
---------
Co-authored-by: Nagendra Reddy <nagendrareddy10@users.noreply.github.com>
* krt: allow listing index collection (#59342)
This is useful for tests which dump the index output
* Allow configuring wasm binary size limit via pilot env var (#59335)
* fix: make wasm OCI image binary size limit configurable via env var
* Move ISTIO_WASM_OCI_MAX_BINARY_SIZE_BYTES env var to pilot/pkg/features
* add release note for wasm OCI image binary size limit configuration
* Apply configurable wasm binary size limit to extractOCIArtifactImage as well
* use int64 to match io.LimitReader signature
* Apply pilot feature to httpfetcher wasm binary size limit
* Automator: update go-control-plane in istio/istio@master (#59354)
* Automator: update proxy@master in istio/istio@master (#59336)
* Do not run Envoy in Prometheus TLS sidecar mode (#59149)
* Do not run Envoy in Prometheus TLS sidecar mode
Add commented-out configuration for lightweight TLS cert provisioning
using DISABLE_ENVOY=true and OUTPUT_CERTS in Prometheus addon manifests.
This allows Prometheus to scrape application metrics in Strict mTLS mode
without running a full Envoy sidecar - only pilot-agent is needed to
provision certificates.
Updated files:
- samples/addons/prometheus.yaml: Added TLS annotations, cert volume,
and volumeMount as commented-out blocks.
- manifests/addons/values-prometheus.yaml: Added equivalent Helm values
as commented-out podAnnotations.
Fixes #34768
* build: merge generated prometheus.yaml
---------
Co-authored-by: Nagendra Reddy <nagendrareddy10@users.noreply.github.com>
* `istioctl`: Add JSON/YAML output options to `proxy-status` subcommand (#58493)
* `istioctl`: Add JSON/YAML output options to `proxy-status` subcommand
* `XdsStatusWriter`: Fix tests, make use of existing test helper
* `XdsStatusWriter`: Use stable JSON serialization method
* [agw] Agentgateway gateway and listenerSet collections (#59258)
* [agw] Agentgateway gateway and listenerSet collections
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Rebased and created separate class maps for gateway and
agentgateway
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Cleanup some TODOs
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* fix lint
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* fix gen
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Address PR feedback
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
---------
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* feat: istio-discovery template supports IPv6 remotePilotAddress (#58828)
* feat: istio-discovery template supports IPv6 remotePilotAddress
Signed-off-by: 王然 <ranwang@alauda.io>
* fix: make gen
Signed-off-by: 王然 <ranwang@alauda.io>
* feat: add IPv6 remotePilotAddress unit test
Signed-off-by: 王然 <ranwang@alauda.io>
* feat: merge master and add nolint:unparam
Signed-off-by: 王然 <ranwang@alauda.io>
---------
Signed-off-by: 王然 <ranwang@alauda.io>
* Remove redundant runAsUser/runAsGroup from gateway chart deployment template (#59291)
The gateway helm chart's deployment template sets `runAsUser: 1337` and
`runAsGroup: 1337` on the istio-proxy container. However, these values are
already injected by the gateway injection template
(gateway-injection-template.yaml), making them redundant.
This redundancy causes `adjustInitContainerUser` to detect user-supplied
RunAsUser/RunAsGroup overrides and attempt to sync them to init containers.
Since the gateway injection template does not include istio-init or
istio-validation init containers, this results in a spurious warning:
"Could not find either istio-init or istio-validation container"
Removing these fields from the deployment template eliminates the warning
while maintaining the same final pod spec, as injection always applies
the same values.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Automator: update istio/client-go@master dependency in istio/istio@master (#59365)
* Wildcard hostname support in SE for HTTP and TLS (#58688)
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
* fix flake: TestServiceDiscoveryWorkloadUpdate/cleanup (#59324)
* fix static collection event handler syncing (#59385)
* Fix waypoint routing for multi-cluster on single network (#59330)
* Fix waypoint routing for multi-cluster on single network
* filter VIPs belonging to clusters on different networks
* fix gen
* change network lookup impl to use workloadapi.Service from ambientindex
* always include local cluster VIP even if missing from ambient index
* require global scope and refactor to be purely additive to getAllAddressesForProxy
* refactor global scope check and simplify test
* address comments
* krt: fix static collection register syncing (#59388)
* expand dependabot (#59382)
Signed-off-by: Daniel Hawton <daniel.hawton@solo.io>
* Add HTTP compression to pilot-agent server (#59252)
* Add HTTP compression to pilot-agent server
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
* chore: apply go fmt pilot-agen/status/server_test.go
---------
Signed-off-by: Christian Rohmann <christian.rohmann@inovex.de>
* test: fix index out of range error in ingressImpl.callEcho (#59403)
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* tests: refactor ambient test suite setup (#59407)
* tests: refactor ambient test suite setup
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* Refactor overwriting nativeNftables
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
---------
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* dependabot: ignore updates we don't want to update (#59405)
* more explicit samples ignore
Signed-off-by: Daniel Hawton <daniel.hawton@solo.io>
* we need to do more ignores for things we want locked to minor versions
Signed-off-by: Daniel Hawton <daniel.hawton@solo.io>
* only bump for security on release branches, this should help avert unnecessary bumps since we want these to be stable.
Signed-off-by: Daniel Hawton <daniel.hawton@solo.io>
---------
Signed-off-by: Daniel Hawton <daniel.hawton@solo.io>
* Use MS_SLAVE as fallback when MS_PRIVATE is blocked during sandbox setup (#59406)
On some nodes (e.g., Bottlerocket/EKS), the mount() call with MS_PRIVATE
can get blocked by the node's security policy, even after we've successfully
unshared the mount namespace. This causes runInSandbox() to give up entirely
and fall back to running iptables without any isolation while logging a warning
on every pod creation.
MS_SLAVE is a reasonable alternative as it still prevents our bind mounts from
leaking back to the host, which is all we actually need. With this change, if
MS_PRIVATE is blocked, we try MS_SLAVE before giving up on the sandbox.
Fixes: https://github.com/istio/istio/issues/59384
Signed-off-by: Sridhar Gaddam <sgaddam@redhat.com>
* we need to exclude from groups separately (#59415)
Signed-off-by: Daniel Hawton <daniel.hawton@solo.io>
* support istioctl proxy-status -oyaml/json list proxy in single namespace (#59378)
* support istioctl proxy-status -oyaml/json list proxy in single namespace
Signed-off-by: xin.li <xin.li@daocloud.io>
* add unit test
Signed-off-by: xin.li <xin.li@daocloud.io>
* fix reviewed
Signed-off-by: xin.li <xin.li@daocloud.io>
* fix pointer
Signed-off-by: xin.li <xin.li@daocloud.io>
* fix govet: MessageState contains sync.Mutex
Signed-off-by: xin.li <xin.li@daocloud.io>
---------
Signed-off-by: xin.li <xin.li@daocloud.io>
* one more fix (#59432)
Signed-off-by: Daniel Hawton <daniel.hawton@solo.io>
* Automator: update common-files@master in istio/istio@master (#59366)
* Update BASE_VERSION to master-2026-03-12T19-02-25 (#59424)
* Automator: update ztunnel@master in istio/istio@master (#59373)
* Automator: update proxy@master in istio/istio@master (#59363)
* Automator: update go-control-plane in istio/istio@master (#59440)
* Automator: update proxy@master in istio/istio@master (#59438)
* virtualservice controller (#59435)
* Automator: update istio/client-go@master dependency in istio/istio@master (#59442)
* fix webhook failurePolicy conflict on helm upgrade (with SSA) (#59367)
* fix server-side apply error during upgrade
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* release notes rewording
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* add control via values.yaml
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
---------
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* webhook header timeout (#59456)
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* limit decompressed size in wasm gzip fetch path (#59395)
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* Automator: update proxy@master in istio/istio@master (#59472)
* fallback to local mesh config when failing to read the remote (#59473)
* fallback to local mesh config when failing to read the remote
* relnote
* krt: fix goroutine leak when closing before ready (#59475)
* [agw] Route collection and conversion refactor (#59362)
* Add route building
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Add unit tests and clean up conversion.go by splitting it into
multiple files
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix gen and add inferencepolicy
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
Remove inferencePool collection from change
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
Remove conversion.go
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* Fix gen
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
---------
Signed-off-by: Jackie Elliott <jaellio@microsoft.com>
* proxy: fix peer authentication dependencies (#59331)
* fix proxy peer authentication dependencies
* format
* fix authn policies init for gateways
* fix endpoint builder test
* fix xds cache test
* fix endpoint builder test
* fix typo in comment
* reset sidecar scope if peer authentication is updated
* rebuild sidecar scopes when authn changes
* address comments
* new testdata
* reintroduce deleted line
* Automator: update common-files@master in istio/istio@master (#59480)
* fix base chart webhook failurePolicy conflict on helm upgrade with SSA (#59488)
* fix base chart webhook failurePolicy conflict on helm upgrade with SSA
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* fix helm template stderr pollution
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* lint
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
---------
Signed-off-by: Petr McAllister <petr.mcallister@gmail.com>
* Automator: update proxy@master in istio/istio@master (#59478)
* tests: add integration tests for PQC compliance policy in ambient mode (#59220)
* tests: add integration tests for PQC compliance policy in ambient mode
Signed-off-by: Jacek Ewertowski <jacek.ewertowski1@gmail.com>
* tests: add integration test for PQC compliance policy in ambient mode
Signed-…
What this PR does / why we need it:
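This is a combination of two filters that have to be used together:
- a regular network filter (expected to be configured in connect_originate or inner_connect_originate listeners before the TCP Proxy filter)
- an upstream network filter (expected to be configured in all clusters that use HBONE or double-HBONE for endpoints)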
Together, those two filters basically create a tunnel. The tunnel protocol prepends a fixed-size header to the data stream going from the regular network filter to the upstream network filter, followed by the peer metadata encoded as a protobuf Any containing a protobuf Struct (I'm just re-using existing code from Istio proxy, which is why the encoding is what it is).
The regular network filter only triggers when some data arrives from the upstream connection in response. That's not correct in general, but in waypoints we know that we proxy an L7 protocol (HTTP or gRPC), so we do expect some data in reply.
The regular network filter relies on TCP Proxy filter extracting response headers and saving them in the filter state. It then extracts and parses the baggage header from the saved headers.
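To make the baggage parsing step concrete, here is a minimal sketch in Go (the filter itself lives in the proxy and is not written in Go). It assumes the saved header value follows the W3C baggage shape of comma-separated `key=value` members with optional `;` properties. The function name `parseBaggage` and the key names in `main` are hypothetical and only serve as an illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseBaggage splits a W3C-style baggage header value into key/value pairs.
// Properties after ';' are dropped and malformed members are skipped.
func parseBaggage(header string) map[string]string {
	out := make(map[string]string)
	for _, member := range strings.Split(header, ",") {
		// Drop optional properties: "key=value;prop=x".
		if i := strings.Index(member, ";"); i >= 0 {
			member = member[:i]
		}
		key, value, ok := strings.Cut(member, "=")
		key = strings.TrimSpace(key)
		if !ok || key == "" {
			continue
		}
		value = strings.TrimSpace(value)
		// Values are percent-encoded; keep the raw value if decoding fails.
		if decoded, err := url.PathUnescape(value); err == nil {
			value = decoded
		}
		out[key] = value
	}
	return out
}

func main() {
	// Hypothetical header value; the key names are illustrative only.
	md := parseBaggage("k8s.namespace.name=default, service.name=reviews;prop=1,bad")
	fmt.Println(md["k8s.namespace.name"], md["service.name"], len(md))
}
```

Skipping malformed members instead of failing the whole header is a design choice of this sketch, not something the PR states.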
Even when no peer metadata has been discovered, I communicate that explicitly by sending some data downstream. This ensures that the upstream network filter running downstream can always remove the prefix from the data stream and never has to guess whether it's there.
NOTE: We still do some checks to confirm that the prefix is there, but we cannot really rely on those checks for correctness in all the cases.
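The framing idea can be sketched as follows. This is a minimal Go illustration under stated assumptions, not the filters' actual wire format: the 4-byte magic value, the 4-byte big-endian length field, and the names `encodeFrame` and `decodeFrame` are all hypothetical, and the payload is treated as opaque bytes rather than a serialized protobuf Any.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// Hypothetical layout: 4-byte magic, 4-byte big-endian payload length, payload.
var magic = [4]byte{'P', 'M', 'D', '1'}

const headerSize = 8

var (
	errIncomplete = errors.New("need more data")
	errBadMagic   = errors.New("prefix not found")
)

// encodeFrame always emits a header, even when payload is empty, so the
// receiver never has to guess whether a prefix is present.
func encodeFrame(payload []byte) []byte {
	buf := make([]byte, headerSize, headerSize+len(payload))
	copy(buf[:4], magic[:])
	binary.BigEndian.PutUint32(buf[4:8], uint32(len(payload)))
	return append(buf, payload...)
}

// decodeFrame strips the prefix from the start of stream and returns the
// payload plus the remaining application data.
func decodeFrame(stream []byte) (payload, rest []byte, err error) {
	if len(stream) < headerSize {
		return nil, nil, errIncomplete
	}
	// Sanity check only: application data could start with the same bytes.
	if !bytes.Equal(stream[:4], magic[:]) {
		return nil, nil, errBadMagic
	}
	n := binary.BigEndian.Uint32(stream[4:8])
	if uint64(len(stream)-headerSize) < uint64(n) {
		return nil, nil, errIncomplete
	}
	end := headerSize + int(n)
	return stream[headerSize:end], stream[end:], nil
}

func main() {
	// No metadata discovered: an empty frame is still sent ahead of the data.
	wire := append(encodeFrame(nil), []byte("HTTP/1.1 200 OK\r\n")...)
	payload, rest, err := decodeFrame(wire)
	fmt.Println(len(payload), len(rest), err)
}
```

The magic check here plays the role of the sanity checks mentioned in the note: it catches a missing prefix in common cases, but correctness still depends on the sender always emitting the frame.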
The upstream network filter, as pointed out above, extracts the data sent by the regular network filter from the data stream, parses it, and populates filter state based on the result.
Unlike the HTTP peer metadata filter, this one runs in the context of the upstream connection, so it populates the upstream filter state and not the regular one.
I plan to add an option to the HTTP peer metadata filter for a new kind of upstream metadata discovery, via upstream filter metadata, thus propagating it all the way to the istio stats filter.
NOTE: None of those filters is generated by pilot yet, and there are certainly some additional options to consider (e.g., maybe we can come up with a good way to transfer metadata via Envoy TLS instead of injecting it into the data stream directly; this way, in principle, we could avoid creating a custom upstream filter altogether, if the HTTP peer metadata filter could get the peer metadata directly from the connect_originate listener). All in all, it's not the final implementation.
Related to #58794
Special notes for your reviewer:
+cc @keithmattix @Stevenjin8 @grnmeira