Add activator metrics and dashboard#1567
akyyy wants to merge 11 commits into knative:master from akyyy:activator-metrics1
|
/test pull-knative-serving-go-coverage-dev |
|
/test pull-knative-serving-go-coverage |
|
@mdemirhan @josephburnett gentle ping. It would be nice if I could get this in by the end of this week. Thanks! |
|
/hold Wait until after we cut the first release. |
|
if resp != nil {
	rrt.logger.Infof("It took %d tries to get response code %d", i, resp.StatusCode)
	namespace := r.Header.Get(controller.GetRevisionHeaderNamespace())
What if these are empty? Should we still send metrics?
If a header is empty or doesn't exist, we'll get 500 errors, based on serving/pkg/activator/revision.go (line 68 in 5b8542f).
I think we should still report metrics for the given headers: all requests routed to the activator should have the headers set. If they are not set, we can use the metrics to debug.
	start time.Time
}

func (rrt retryRoundTripper) RoundTrip(r *http.Request) (*http.Response, error) {
The interface implementation should be on the pointer, not the value type.
func (rrt *retryRoundTripper) RoundTrip(...
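To illustrate the point about receivers, here is a minimal sketch. The `valueRT` and `pointerRT` types and the `stubResponse` helper are hypothetical stand-ins, not the PR's `retryRoundTripper`. With a value receiver, any state the method updates lives on a copy and is lost once the call returns; with a pointer receiver the update persists.

```go
package main

import (
	"fmt"
	"net/http"
)

// stubResponse is a hypothetical helper that avoids any network traffic.
func stubResponse(r *http.Request) *http.Response {
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: r}
}

// valueRT is a hypothetical round tripper with a value receiver.
type valueRT struct{ calls int }

// RoundTrip increments a copy of the struct, so the caller never sees it.
func (rt valueRT) RoundTrip(r *http.Request) (*http.Response, error) {
	rt.calls++
	return stubResponse(r), nil
}

// pointerRT is the same type with a pointer receiver.
type pointerRT struct{ calls int }

// RoundTrip increments the shared struct, so the count persists.
func (rt *pointerRT) RoundTrip(r *http.Request) (*http.Response, error) {
	rt.calls++
	return stubResponse(r), nil
}

// Compile-time check that the pointer type satisfies http.RoundTripper.
var _ http.RoundTripper = (*pointerRT)(nil)

// callCounts performs three round trips on each type and returns the counters.
func callCounts() (valueCalls, pointerCalls int) {
	v := valueRT{}
	p := &pointerRT{}
	req, _ := http.NewRequest(http.MethodGet, "http://example.invalid/", nil)
	for i := 0; i < 3; i++ {
		v.RoundTrip(req)
		p.RoundTrip(req)
	}
	return v.calls, p.calls
}

func main() {
	v, p := callCounts()
	fmt.Println("value receiver:", v, "pointer receiver:", p)
}
```

The value-receiver counter stays at zero while the pointer-receiver counter reaches three, which is why stateful round trippers are usually implemented on the pointer type.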
	rrt.reporter.ReportResponseCount(namespace, config, name, resp.StatusCode, i, 1.0)
	rrt.reporter.ReportResponseTime(namespace, config, name, resp.StatusCode, time.Now().Sub(rrt.start))
}
return resp, err
Why not report anything when all retries fail with an error (i.e. no response)?
Sure, I can do that. In that case, what status code dimension would you suggest using?
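One possible answer, sketched under assumptions and not necessarily what this PR ends up doing: report a sentinel status code when every retry fails without a response. The `noResponseCode` constant, the trimmed-down `reporter` interface, and the `failingTransport` type below are all hypothetical and exist only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// noResponseCode is a hypothetical sentinel for "no HTTP response at all".
const noResponseCode = 0

// reporter is a hypothetical, trimmed-down stand-in for the stats reporter.
type reporter interface {
	ReportResponseCount(statusCode, attempts int)
}

// recordingReporter remembers every status code it was asked to report.
type recordingReporter struct{ codes []int }

func (r *recordingReporter) ReportResponseCount(statusCode, attempts int) {
	r.codes = append(r.codes, statusCode)
}

// retryRoundTripper retries on transport errors and always reports once.
type retryRoundTripper struct {
	next       http.RoundTripper
	maxRetries int
	reporter   reporter
}

func (rrt *retryRoundTripper) RoundTrip(r *http.Request) (*http.Response, error) {
	var resp *http.Response
	var err error
	attempts := 0
	for attempts < rrt.maxRetries {
		attempts++
		resp, err = rrt.next.RoundTrip(r)
		if err == nil {
			break
		}
	}
	// Report even when there is no response, using the sentinel code.
	code := noResponseCode
	if resp != nil {
		code = resp.StatusCode
	}
	rrt.reporter.ReportResponseCount(code, attempts)
	return resp, err
}

// failingTransport is a hypothetical transport that never gets a response.
type failingTransport struct{}

func (failingTransport) RoundTrip(*http.Request) (*http.Response, error) {
	return nil, errors.New("connection refused")
}

// reportedCodeOnFailure returns the single code reported when all retries fail.
func reportedCodeOnFailure() int {
	rec := &recordingReporter{}
	rrt := &retryRoundTripper{next: failingTransport{}, maxRetries: 3, reporter: rec}
	req, _ := http.NewRequest(http.MethodGet, "http://example.invalid/", nil)
	rrt.RoundTrip(req)
	if len(rec.codes) != 1 {
		return -1
	}
	return rec.codes[0]
}

func main() {
	fmt.Println("reported code:", reportedCodeOnFailure())
}
```

A sentinel keeps "no response" distinguishable from real 5xx responses on a dashboard; mapping the failure to an existing code such as 502 would be an alternative with the opposite trade-off.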
}

func (a *activationHandler) handler(w http.ResponseWriter, r *http.Request) {
	namespace := r.Header.Get(controller.GetRevisionHeaderNamespace())
None of this new functionality in main.go has any unit tests. See #1585, which refactors main.go and adds unit tests. I would prefer waiting for that to get in and then adding this functionality so that it is tested properly.
All of this should be easy to test: inject headers, test edge cases, check that the reporter is being called, etc.
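A rough sketch of what such a test could exercise, assuming a handler that reads two revision headers and calls a reporter. The header names, the `fakeReporter`, and the simplified `handler` type are hypothetical and are not the code from this PR or from #1585.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Hypothetical header names; the real ones come from the controller package.
const (
	namespaceHeader = "X-Example-Revision-Namespace"
	nameHeader      = "X-Example-Revision-Name"
)

// fakeReporter records calls so a test can assert on them.
type fakeReporter struct{ calls []string }

func (f *fakeReporter) ReportRequest(namespace, name string) {
	f.calls = append(f.calls, namespace+"/"+name)
}

// handler is a simplified stand-in for the activation handler.
type handler struct {
	reporter interface{ ReportRequest(namespace, name string) }
}

func (h *handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	ns := r.Header.Get(namespaceHeader)
	name := r.Header.Get(nameHeader)
	// Report even when headers are missing so the metric can aid debugging.
	h.reporter.ReportRequest(ns, name)
	if ns == "" || name == "" {
		http.Error(w, "missing revision headers", http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// exercise injects the given headers and returns the status code and reporter calls.
func exercise(headers map[string]string) (int, []string) {
	rep := &fakeReporter{}
	h := &handler{reporter: rep}
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rep.calls
}

func main() {
	code, calls := exercise(map[string]string{namespaceHeader: "default", nameHeader: "rev-1"})
	fmt.Println(code, calls)
	code, calls = exercise(nil)
	fmt.Println(code, calls)
}
```

In a real test file the two `exercise` calls would become table-driven cases in a `TestXxx` function, with the fake reporter injected through whatever constructor the refactored handler exposes.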
// status code indicating why it could not.
type Activator interface {
	ActiveEndpoint(namespace, name string) (Endpoint, Status, error)
	ActiveEndpoint(namespace, configuration, name string) (Endpoint, Status, error)
Is configuration needed for activation? If not, it doesn't look right to add it here. Is there no better way to get this information for the stats?
The configuration parameter is needed for the activator to report metrics.
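For comparison, one alternative shape is sketched below under assumptions: the `Activator` interface keeps its two-argument signature and a wrapper looks up the configuration label only for reporting. The `Endpoint` and `Status` definitions, `metricsActivator`, `configFor`, and `stubActivator` are all hypothetical; this is not what the PR implements.

```go
package main

import "fmt"

// Endpoint and Status are simplified, hypothetical versions of the real types.
type Endpoint struct {
	FQDN string
	Port int32
}

type Status int

// Activator keeps the original two-argument signature.
type Activator interface {
	ActiveEndpoint(namespace, name string) (Endpoint, Status, error)
}

// metricsActivator decorates an Activator and reports metrics on each call.
type metricsActivator struct {
	inner     Activator
	configFor func(namespace, name string) string // hypothetical label lookup
	report    func(namespace, configuration, name string, status Status)
}

func (m *metricsActivator) ActiveEndpoint(namespace, name string) (Endpoint, Status, error) {
	ep, status, err := m.inner.ActiveEndpoint(namespace, name)
	m.report(namespace, m.configFor(namespace, name), name, status)
	return ep, status, err
}

// stubActivator is a hypothetical activator that always succeeds.
type stubActivator struct{}

func (stubActivator) ActiveEndpoint(namespace, name string) (Endpoint, Status, error) {
	return Endpoint{FQDN: name + "." + namespace + ".example.invalid", Port: 8080}, Status(200), nil
}

// demo returns the labels that the wrapper reported for one activation.
func demo() string {
	var label string
	var a Activator = &metricsActivator{
		inner:     stubActivator{},
		configFor: func(namespace, name string) string { return "config-of-" + name },
		report: func(namespace, configuration, name string, status Status) {
			label = fmt.Sprintf("%s/%s/%s/%d", namespace, configuration, name, status)
		},
	}
	a.ActiveEndpoint("default", "rev-1")
	return label
}

func main() {
	fmt.Println(demo())
}
```

The trade-off is an extra lookup per activation in exchange for an interface that only carries what activation itself needs.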
}
if reqs, ok := a.pendingRequests[id]; ok {
	a.reporter.ReportRequest(id.namespace, id.configuration, id.name, state, float64(len(reqs)))
	logger.Infof("Wrote request_count metric for revision %s for namespace %s with value %d", id.name, id.namespace, len(reqs))
We should not write logs to say that we wrote metrics. That seems extremely verbose. If you absolutely need this, I recommend using "Debug" rather than "Info".
var (
	measurements = []*stats.Float64Measure{
		RequestCountM: stats.Float64(
			"revision_request_count",
Is it intentional to use the same name as the revision request count from Istio? If so, do we have the same matching labels?
|
@mdemirhan I added a "start" field to the retryRoundTripper struct to calculate the response time. If we reuse a single retryRoundTripper, won't we lose the request start time? |
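One way to sidestep this concern, sketched with hypothetical names rather than the PR's code: capture the start time in a local variable inside `RoundTrip`, so a single shared round tripper can time every request independently. The `timingRoundTripper` type, its injectable `now` clock, and `stubTransport` are assumptions made to keep the example deterministic.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// timingRoundTripper is a hypothetical round tripper that is safe to reuse.
type timingRoundTripper struct {
	next   http.RoundTripper
	now    func() time.Time // injectable clock, so the example is deterministic
	report func(elapsed time.Duration)
}

func (t *timingRoundTripper) RoundTrip(r *http.Request) (*http.Response, error) {
	start := t.now() // local to this call, never stored on the struct
	resp, err := t.next.RoundTrip(r)
	t.report(t.now().Sub(start))
	return resp, err
}

// stubTransport is a hypothetical transport that answers without any network.
type stubTransport struct{}

func (stubTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: r}, nil
}

// measureTwice sends two requests through one shared round tripper.
func measureTwice() []time.Duration {
	clock := time.Unix(0, 0)
	var got []time.Duration
	rt := &timingRoundTripper{
		next: stubTransport{},
		// The fake clock advances 10ms every time it is read.
		now: func() time.Time {
			clock = clock.Add(10 * time.Millisecond)
			return clock
		},
		report: func(d time.Duration) { got = append(got, d) },
	}
	req, _ := http.NewRequest(http.MethodGet, "http://example.invalid/", nil)
	rt.RoundTrip(req)
	rt.RoundTrip(req)
	return got
}

func main() {
	fmt.Println(measureTwice())
}
```

Because the start time never lives on the struct, the second request's measurement is not skewed by the first, and the round tripper needs no per-request construction.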
|
This is moved to #1726 after creating a new repo. |
Fixes #743
Proposed Changes