tracing: Enable decorated operation in outbound (egress) listener to …#1858
tracing: Enable decorated operation in outbound (egress) listener to …#1858mattklein123 merged 10 commits intoenvoyproxy:masterfrom
Conversation
…be passed to inbound (ingress) listener to override server span operation Signed-off-by: Gary Brown <gary@brownuk.com>
|
@objectiser quick pre-review note that we will need docs for this change. It would be good to add this header into the router headers section that I mentioned in the issue, and also link to their from the tracing docs. Thank you. |
Signed-off-by: Gary Brown <gary@brownuk.com>
| x-envoy-decorator-operation | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| If this header is present on ingress requests, it's value will override any locally defined |
| HEADER_FUNC(EnvoyUpstreamRequestTimeoutAltResponse) \ | ||
| HEADER_FUNC(EnvoyUpstreamRequestTimeoutMs) \ | ||
| HEADER_FUNC(EnvoyUpstreamServiceTime) \ | ||
| HEADER_FUNC(EnvoyDecoratorOperation) \ |
| * This method returns the operation name. | ||
| * @return the operation name | ||
| */ | ||
| virtual const std::string getOperation() const PURE; |
| ConnectionManagerImpl::chargeTracingStats(tracing_decision.reason, | ||
| connection_manager_.config_.tracingStats()); | ||
|
|
||
| if (tracing_decision.is_tracing) { |
There was a problem hiding this comment.
nit: Now that you are in a new function, to reduce nesting can you do:
if (!tracing_decision.is_tracing) {
return;
}
| active_span_ = connection_manager_.tracer_.startSpan(*this, *request_headers_, request_info_); | ||
| if (cached_route_.value() && cached_route_.value()->decorator()) { | ||
| cached_route_.value()->decorator()->apply(*active_span_); | ||
| if (connection_manager_.config_.tracingConfig()->operation_name_ == |
There was a problem hiding this comment.
I didn't realize the header propagation was also part of this. Can you make that more clear in the docs. Also, can you add some comments here about the logic in place? It's a little hard to quickly untangle what we are doing and why.
|
|
||
| Envoy will potentially sanitize the following headers: | ||
|
|
||
| * :ref:`x-envoy-decorator-operation <config_http_filters_router_x-envoy-decorator-operation>` |
There was a problem hiding this comment.
I think this needs to be implemented? (Remove header on external requests, need to add test for that also in conn_manager_utility)
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
| if (request_headers_->EnvoyDecoratorOperation()) { | ||
| decorator_operation = request_headers_->EnvoyDecoratorOperation()->value().c_str(); | ||
| } | ||
|
|
There was a problem hiding this comment.
so, there is some work going on in Istio to be able to decode a request using swagger spec/openapi spec.. And if that is the case, one might want to set the operation name based on the API spec. Would it make sense to have an interface for a filter to overwrite the operation name, in addition to the inbound header name doing the full override?
There was a problem hiding this comment.
I think that makes sense, but let's track in a separate issue?
mattklein123
left a comment
There was a problem hiding this comment.
Looks good. Few small things.
|
|
||
| active_span_ = connection_manager_.tracer_.startSpan(*this, *request_headers_, request_info_); | ||
|
|
||
| // If a decorator has been defined, apply it to the active span |
There was a problem hiding this comment.
super nit: please end sentences with '.'. Here and below.
There was a problem hiding this comment.
should be sorted.
| state_.saw_connection_close_ = true; | ||
| } | ||
|
|
||
| std::string decorator_operation; |
There was a problem hiding this comment.
Instead of allocating a string here on the stack, it would be faster to just pass the pointer value of the header entry into the trace function as you did before. Any reason to switch it?
There was a problem hiding this comment.
I got a core dump when running the coverage tests - although strangely not when doing the bazel.dev build which also runs the tests. The problem seems to be caused by the header entry being removed/unallocated.
[----------] 38 tests from HttpConnectionManagerImplTest
[ RUN ] HttpConnectionManagerImplTest.HeaderOnlyRequestAndResponse
[ OK ] HttpConnectionManagerImplTest.HeaderOnlyRequestAndResponse (2 ms)
[ RUN ] HttpConnectionManagerImplTest.InvalidPathWithDualFilter
[ OK ] HttpConnectionManagerImplTest.InvalidPathWithDualFilter (1 ms)
[ RUN ] HttpConnectionManagerImplTest.StartAndFinishSpanNormalFlow
pure virtual method called
terminate called without an active exception
[2017-10-17 19:29:59.086][2295][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:101] Caught Aborted, suspect faulting address 0x3e8000008f7
[2017-10-17 19:29:59.086][2295][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:85] Backtrace obj</lib/x86_64-linux-gnu/libc.so.6> thr<0> (use tools/stack_decode.py):
[2017-10-17 19:29:59.086][2295][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<0> #0 0x7f26ff0d0428
[2017-10-17 19:29:59.086][2295][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:95] thr<0> #1 0x7f26ff0d2029
[2017-10-17 19:29:59.086][2295][critical][backtrace] bazel-out/local-dbg/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:93] thr<0> obj</usr/lib/x86_64-linux-gnu/libstdc++.so.6>
There was a problem hiding this comment.
Can you put the code back? If it fails I can help you figure out what the issue is.
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
57116af to
40b0b28
Compare
| state_.saw_connection_close_ = true; | ||
| } | ||
|
|
||
| const HeaderEntry* decorator_operation = request_headers_->EnvoyDecoratorOperation(); |
There was a problem hiding this comment.
So at minimum, I think you are grabbing a header before headers are cleaned, which means this can get deleted. Can you remove the local variable and do this inline below? (Or just pass the full constant header map ref to the trace function).
There was a problem hiding this comment.
I'll give it a go tomorrow - although I think the problem is that the x-envoy-decorator-operation is being removed in the sanitize step, as previously when I commented this line out the tests/coverage worked.
Signed-off-by: Gary Brown <gary@brownuk.com>
… internal request Signed-off-by: Gary Brown <gary@brownuk.com>
|
@mattklein123 Hi Matt, need some help. https://github.com/envoyproxy/envoy/pull/1858/files#diff-ed63d9053f7d0b3777c96f2aaf8f2f2fR400 - this test is failing when code for sanitizing the header is included. So this suggests this test is representing an external request, even though the comment https://github.com/envoyproxy/envoy/pull/1858/files#diff-ed63d9053f7d0b3777c96f2aaf8f2f2fR409 suggests it is internal. If this test is handling an external request, then I can change it to correctly check for a null header - and then introduce another test for internal request handling? If this is the case - how should I change the test to represent an internal request? Other option is to remove the santization code for now and deal with in a separate PR? thoughts? |
|
@objectiser I can check it out locally and see, but from a quick glance, given that |
|
@mattklein123 Adding that header made the test work - however looks like the sanitization is not having the desired effect. I tried out the jaegertracing example, and the decorated op name is not being propagated from the egress listener to the ingress: Is it ok to remove the sanitization for now and look at as a separate issue? |
Sure that's fine. Please open an issue so that we can track/remember. |
Signed-off-by: Gary Brown <gary@brownuk.com>
…1858) Description: Introduces an initial heuristic for attempting a connection using a socket with a bound interface. The heuristic is entirely contained within reportNetworkUsage(...), and as such is straightforward to iterate and experiment upon. In support of the heuristic, current network/configuration state is identified with a configuration_key that prevents stale information from influencing internal tracking. ~Current network state is stored and updated automatically.~ This allows static accessors to begin to report OS updates during Envoy's startup period, and ensures OS updates have a timely effect on new requests. The initial heuristic is as follows: For both WiFi and cellular network types, there is a default socket configuration that does not use a bound socket, and an alternative "override" configuration that will attempt to use an interface associated with the *opposite* network type (a cellular-bound socket when the preferred network type is WiFi, and a WiFi-bound socket when network type is cellular). A "network fault" is defined as a request that terminates in error while having received no upstream bytes. If a connection has never handled a request without a fault, it is allowed only one fault before the alternative will be tried. If a connection has ever handled a request without a fault, it is allowed up to three consecutive faults before the alternative will be tried. The above behavior is disabled by default, and may be enabled by settings enableInterfaceBinding(true) on the public EngineBuilders. Risk Level: Moderate Testing: Manual and new/updated coverage Signed-off-by: Mike Schore <mike.schore@gmail.com> Signed-off-by: JP Simard <jp@jpsim.com>
…1858) Description: Introduces an initial heuristic for attempting a connection using a socket with a bound interface. The heuristic is entirely contained within reportNetworkUsage(...), and as such is straightforward to iterate and experiment upon. In support of the heuristic, current network/configuration state is identified with a configuration_key that prevents stale information from influencing internal tracking. ~Current network state is stored and updated automatically.~ This allows static accessors to begin to report OS updates during Envoy's startup period, and ensures OS updates have a timely effect on new requests. The initial heuristic is as follows: For both WiFi and cellular network types, there is a default socket configuration that does not use a bound socket, and an alternative "override" configuration that will attempt to use an interface associated with the *opposite* network type (a cellular-bound socket when the preferred network type is WiFi, and a WiFi-bound socket when network type is cellular). A "network fault" is defined as a request that terminates in error while having received no upstream bytes. If a connection has never handled a request without a fault, it is allowed only one fault before the alternative will be tried. If a connection has ever handled a request without a fault, it is allowed up to three consecutive faults before the alternative will be tried. The above behavior is disabled by default, and may be enabled by settings enableInterfaceBinding(true) on the public EngineBuilders. Risk Level: Moderate Testing: Manual and new/updated coverage Signed-off-by: Mike Schore <mike.schore@gmail.com> Signed-off-by: JP Simard <jp@jpsim.com>
…up (#1858) Bumps the node group in /site with 1 update: [glob](https://github.com/isaacs/node-glob). Updates `glob` from 13.0.1 to 13.0.2 <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/isaacs/node-glob/commit/2135b0c3580caf6330e28dedb2d57cea75f15154"><code>2135b0c</code></a> 13.0.2</li> <li><a href="https://github.com/isaacs/node-glob/commit/52544940ddddead912084a436ac08604592c5a81"><code>5254494</code></a> ship minified by default, update deps</li> <li><a href="https://github.com/isaacs/node-glob/commit/0a603fe1857105aedf4a1f80a75375fc3e4c0385"><code>0a603fe</code></a> remove unused tshy config</li> <li>See full diff in <a href="https://github.com/isaacs/node-glob/compare/v13.0.1...v13.0.2">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore <dependency name> major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself) - `@dependabot ignore <dependency name> minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself) - `@dependabot ignore <dependency name>` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself) - `@dependabot unignore <dependency name>` will remove all of the ignore conditions of the specified dependency - `@dependabot unignore <dependency name> <ignore condition>` will remove the ignore condition of the specified dependency and ignore conditions </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Ignasi Barrera <ignasi@tetrate.io>

…be passed to inbound (ingress) listener to override server span operation
Resolves #1849
Signed-off-by: Gary Brown gary@brownuk.com