-
Notifications
You must be signed in to change notification settings - Fork 339
[webhook] OTel changes #3189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[webhook] OTel changes #3189
Conversation
|
/assign @Cali0707 @evankanderson @skonto |
40c7046 to
a3884bc
Compare
f32c228 to
62402c6
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3189 +/- ##
==========================================
- Coverage 76.05% 76.00% -0.05%
==========================================
Files 205 205
Lines 11751 11710 -41
==========================================
- Hits 8937 8900 -37
+ Misses 2541 2540 -1
+ Partials 273 270 -3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
0977ab8 to
bc10cae
Compare
|
This is ready for a review cc @evankanderson @Cali0707 Based on the feedback from the OTel folks I removed the package variable instruments. This makes unit tests easier to write and FYI the instruments are de-duped by the otel-go sdk when multiple webhooks are created (excluding the pointers in our webhook struct) |
| attrs = append(attrs, allowedAttr) | ||
| labeler.Add(allowedAttr) | ||
|
|
||
| wh.metrics.recordHandlerDuration(ctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We report time for bad requests too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad requests may have a different time distribution than "good" requests, which can be a useful hint when debugging, so I'm +1 on separating these.
evankanderson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
One concern about use of LabelerFromContext without looking at the second value in admission.go, but this seems like (not counting libraries), this somewhat simplifies our overall code.
| attrs = append(attrs, allowedAttr) | ||
| labeler.Add(allowedAttr) | ||
|
|
||
| wh.metrics.recordHandlerDuration(ctx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad requests may have a different time distribution than "good" requests, which can be a useful hint when debugging, so I'm +1 on separating these.
| } | ||
| r.Body = io.NopCloser(&bodyBuffer) | ||
|
|
||
| labeler, _ := otelhttp.LabelerFromContext(r.Context()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the second result is false, then:
In this case it is safe to use the Labeler but any attributes added to it will not be used.
Do we want / need to call otelhttp.ContextWithLabeler to set the labeler if this was false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't need to since - the otelhttp middleware sets it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone's using the handler without that middleware then the labeler is a noop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I didn't know if that was a concern. You can add a comment here that this relies on that middleware setting the labeler context as another option, since this looks like ignoring an error value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
webhook/conversion.go
Outdated
| labeler.Add( | ||
| ConversionResultStatus.With(strings.ToLower(response.Response.Result.Status)), | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do this after the recordHandlerDuration call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recordHandlerDuration will add the duration to our custom histogram metric that measures the call to the conversion controller.
The labeler on the other hand will add attributes that otelhttp middelware reports. The the order of these don't really mater in my mind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the labeler labels won't have any relation to the metric attributes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the labeler labels won't have any relation to the metric attributes?
There is no relationship. What we should probably do is actually get the attribute keys from the label when recording the metric - that way we'll have the exact same attributes (though there's a slice alloc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok - i moved this invocation upward and record will now fetch the labeler attributes
webhook/metrics.go
Outdated
| AdmissionOperation = attributekey.String("kn.webhook.admission.operation") | ||
| AdmissionGroup = attributekey.String("kn.webhook.admission.group") | ||
| AdmissionVersion = attributekey.String("kn.webhook.admission.version") | ||
| AdmissionKind = attributekey.String("kn.webhook.admission.kind") | ||
| AdmissionSubresource = attributekey.String("kn.webhook.admission.subresource") | ||
| AdmissionAllowed = attributekey.Bool("kn.webhook.admission.result.allowed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need admission in these keys? If not, they could cover all kinds of webhook metrics (conversion, defaulting, etc), with fewer definitions. (e.g. the admission and conversion status could use the same attribute)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible but a lot of the fields don't apply to conversion webhooks (maybe just two).
What are your concerns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having to type a lot of long strings into Prometheus queries. (Yes, it's more human laziness than technical cost, which should be fairly minimal with a reasonable interning implementation.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok - i did a pass take a look - i think it worked out
Cali0707
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/hold in case @dprotaso wants any more reviews
…o unit test Digging into OTel implementation it looks like instruments are cached. Thus when different webhooks request the same instrument a cache lookup for the instrument will occur.
7687b56 to
624cd94
Compare
|
rebased |
Cali0707
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Cali0707, dprotaso, evankanderson The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/hold cancel @evankanderson if you have any follow up comments let me know and i can do them in a follow up PR next week. |
This stacks onto the shared main PR - #3190
You can review the webhook changes here