Add Open Telemetry Spec attributes to IMC Dispatcher span#4659
Conversation
|
I wasn't sure if there is a better approach to populate the destination attribute so the simple approach uses url, which is accurate but not as user friendly as those for the other 2 services. |
Codecov Report
@@ Coverage Diff @@
## master #4659 +/- ##
=======================================
Coverage 81.07% 81.08%
=======================================
Files 291 291
Lines 8216 8220 +4
=======================================
+ Hits 6661 6665 +4
Misses 1154 1154
Partials 401 401
Continue to review full report at Codecov.
|
27dfb0b to
8f47744
Compare
| The Channel MUST recognize and pass through all tracing information from sender | ||
| to subscribers using [W3C Tracecontext](https://w3c.github.io/trace-context/), | ||
| although internally it MAY use another mechanism(s) to propagate the tracing | ||
| information. The Channel SHOULD sample and write traces to the location | ||
| specified in | ||
| [`config-tracing`](https://github.com/knative/eventing/blob/master/config/config-tracing.yaml). |
There was a problem hiding this comment.
Same applies for broker
There was a problem hiding this comment.
It seems to me the observabiliity parts of broker and channel should be exactly the same. I am wondering if it worths it is own spec and just reference the spec in both channel and broker. Adam also mentioned that whether the broker should have some common metrics, but I remembered the last conversation ended with not going to implement it.
There was a problem hiding this comment.
@zhongduo thats a fair point.
In the current spec, the channel has an additional requirement of passing through tracing info from sender to subscribers. It sounds like that would be applicable for the Broker as well ?
There was a problem hiding this comment.
It sounds like that would be applicable for the Broker as well ?
Yep, and similar to channel, there is no constraint on how to propagate tracing informations
There was a problem hiding this comment.
I think it will be very confusing if there are difference between Channel and broker, as you can build a "broker" using channels.
There was a problem hiding this comment.
as you can build a "broker" using channels.
But that would be an implementation detail of how you build that particular broker
There was a problem hiding this comment.
I wasn't really thinking about "how to build the broker". My point is that in tracing side, most requirements in broker will apply in channel as well, so just wondering how we can avoid such an inconsistency. I do understand theoretically they can be very different, but at least now they looks related to me.
| - the backoff rate | ||
|
|
||
| #### Metrics | ||
| #### Observability |
There was a problem hiding this comment.
This name change makes perfect sense. Thanks for catching that.
| The Channel MUST recognize and pass through all tracing information from sender | ||
| to subscribers using [W3C Tracecontext](https://w3c.github.io/trace-context/), | ||
| although internally it MAY use another mechanism(s) to propagate the tracing | ||
| information. The Channel SHOULD sample and write traces to the location | ||
| specified in | ||
| [`config-tracing`](https://github.com/knative/eventing/blob/master/config/config-tracing.yaml). |
There was a problem hiding this comment.
It seems to me the observabiliity parts of broker and channel should be exactly the same. I am wondering if it worths it is own spec and just reference the spec in both channel and broker. Adam also mentioned that whether the broker should have some common metrics, but I remembered the last conversation ended with not going to implement it.
8f47744 to
1e6fa2e
Compare
|
I am not talking about the implementation. I guess my point is that in some
sense they are originally designed to be mostly equivalent. So keep specs,
including observability, synced should be the default, unless we have a
reason not to.
…On Wed, Dec 16, 2020 at 11:21 AM Francesco Guardiani < ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In docs/spec/channel.md
<#4659 (comment)>:
> +The Channel MUST recognize and pass through all tracing information from sender
+to subscribers using [W3C Tracecontext](https://w3c.github.io/trace-context/),
+although internally it MAY use another mechanism(s) to propagate the tracing
+information. The Channel SHOULD sample and write traces to the location
+specified in
+[`config-tracing`](https://github.com/knative/eventing/blob/master/config/config-tracing.yaml).
as you can build a "broker" using channels.
But that would be an implementation detail of how you build that
particular broker
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#4659 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACE6CNANVESZF5QFM3MY3DLSVDNA7ANCNFSM4U46QO3Q>
.
|
1e6fa2e to
26c3de1
Compare
26c3de1 to
a7b27a4
Compare
20d08bb to
10bf6a0
Compare
|
/retest |
|
I was able to run the rekt -> TestBrokerAsMiddleware locally Am I missing something or is the test a flake ? |
| // 3. Channel Dispatcher span | ||
| // 4. Channel sends event to Mutator pod. |
There was a problem hiding this comment.
Why 2 spans here? Are we sure this isn't an impl detail here and only 1 match should be done (the point 3)?
There was a problem hiding this comment.
I discovered the the span containing the cloudevent & tracing tags wasn't being tested previously.
Removing the assertion for the span corresponding to Channel sends event to Mutator pod doesn't impact the issue this PR is attempting to address. However, It would just loose the test case to ensure the http status of the resultant request is not tested. I'd defer the importance of testing/not testing to folks with more context.
Looking at the broker tests, it doesn't appear that they are asserting for the presence of the span with the http status either.
So, I am happy to drop the assertions for spans corresponding to 4, 7 and 10.
Does that make sense @slinkydeveloper ?
There was a problem hiding this comment.
Sorry, I think i explained myself wrongly. Why do assert the span:
Channel Dispatcher span
Isn't that an implementation detail?
There was a problem hiding this comment.
Thats the span that contains the Open Telemetry Spec Attributes that this PR adds
Ditto for spans labeled 6 and 9
|
|
||
| if span.IsRecordingEvents() { | ||
| err = kncloudevents.WriteHTTPRequestWithAdditionalHeaders(ctx, message, req, additionalHeaders, kncloudevents.PopulateSpan(span)) | ||
| err = kncloudevents.WriteHTTPRequestWithAdditionalHeaders(ctx, message, req, additionalHeaders, tracing.PopulateSpan(span, url.String())) |
There was a problem hiding this comment.
Since you're keeping kncloudevents.PopulateSpan (which is fine, to avoid breaking downstream), can you add the deprecated comment to such function to signal users to use this new one in tracing module?
There was a problem hiding this comment.
The kncloudevents.PopulateSpan function & file was removed in the PR. However, I hadn't considered how this change might impact downstream.
The only references to PopulateSpan were from message_dispatcher.go, which I assumed would be shipped together. Are there downstream uses cases that would be affected ?
There was a problem hiding this comment.
I honestly don't remember, try to remove the function and check if the downstream checks break 😄
There was a problem hiding this comment.
Since this PR removes the function I was looking at the downstream checks for the PR. It looks like all but the Kafka downstream checks are passing. I am not sure if the Kafka failure is related to the changes in the PR.
I also did a github search in knative org and knative-sandbox org and the only reference is message_dispatcher.go, so this change feels safe.
Any additional things I could check ?
There was a problem hiding this comment.
Hmm, rebased off master just now and now the downstream kafak test is passing but the rabbit and nats ones are failing due to go mod issues.
|
|
||
| expectedAttributes := map[string]interface{}{ | ||
| "messaging.system": "knative", | ||
| "messaging.protocol": "HTTP", |
There was a problem hiding this comment.
I see that http is the protocol used, do we support https? How about protocol version maybe we should be explicit as there is wip (if I understood correctly) for http/2?
There was a problem hiding this comment.
Hey @skonto
If I understood correctly, once the http/2 feature is wrapped up, the protocol string could change and hence hardcoding a value in the test is not a good idea.
The current implementation has it hardcoded here which would need to change once the http/2 feature is completed.
If that change is imminent;
- the assertion could be updated to match a regex
- the string could be pulled out and put into an enum ( would still only be configured for
HTTPfor now ) - and if the test assertion for protocol isn't very valuable, the assertion for protocol could be dropped
wdyt ?
slinkydeveloper
left a comment
There was a problem hiding this comment.
Minor things, but for me we're good to go
|
|
||
| if span.IsRecordingEvents() { | ||
| err = kncloudevents.WriteHTTPRequestWithAdditionalHeaders(ctx, message, req, additionalHeaders, kncloudevents.PopulateSpan(span)) | ||
| err = kncloudevents.WriteHTTPRequestWithAdditionalHeaders(ctx, message, req, additionalHeaders, tracing.PopulateSpan(span, url.String())) |
There was a problem hiding this comment.
I honestly don't remember, try to remove the function and check if the downstream checks break 😄
| // 3. Channel Dispatcher span | ||
| // 4. Channel sends event to Mutator pod. |
There was a problem hiding this comment.
Sorry, I think i explained myself wrongly. Why do assert the span:
Channel Dispatcher span
Isn't that an implementation detail?
- trace should include Open Telemetry Spec attributes (messaging.*) messaging.destination messaging.message_id messaging.protocol messaging.system
- change Metrics -> Observability, to match spec/broker.md - relocate trace pass through & trace config information from Misc to Observability sections - add section specifying messaging.* attributes that should be included in the trace span Signed-off-by: Sameer Vohra <vsameer@vmware.com>
- add check for Open Telemetry Spec tags (messaging.*) in spans emitted by message dispatcher - exported helpful regex host suffix - re-worked the channel trace tree to include all the missing dispatcher spans. Signed-off-by: Sameer Vohra <vsameer@vmware.com>
- nit: change transformer -> mutator for consistency Signed-off-by: Sameer Vohra <vsameer@vmware.com>
e54a35d to
9ed38da
Compare
|
The following is the coverage report on the affected files.
|
Hey @slinkydeveloper |
|
For me this is good to go and I see all comments were addressed |
|
Conformance / upgrade failed to bring the cluster up: Broker middlewater is #4637 |
|
Also rerunning the downstream tests, natss/rabbitmq should be fixed, but keda is being worked on. |
|
As expected keda is still failing, everything else is passing. Thanks for doing this @xtreme-sameer-vohra |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: slinkydeveloper, vaikas, xtreme-sameer-vohra The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Fixes #4578
Proposed Changes
update channel spec observability section to reference the Open Telemetry Spec attributes
update channel spec for readability
🧹 Update or clean up current behavior
Release Note