-
Notifications
You must be signed in to change notification settings - Fork 100
[app_dart] Add a handler that pushes benchmark results to Metrics Center #788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a new service account or an existing one? Let's create a doc to document the origin, purpose and location. Going forward we'll need to track very closely all the service accounts used in flutter infrastructure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Added a comment as the best source of truth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flutter? this should have failed the licenses check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we have a handler to push metrics? can we reuse the existing one instead of creating a new ad hoc handler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comment below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused, why we need to read the benchmark from datastore instead of directly processing it from the origin. e.g the request payload.
Inserting to datastore and processing later is adding unnecessary jumps and complexity IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the comment below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can directly translate the payload in the future. I think this is mainly to migrate old metrics from cocoon to Skia perf for historical checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually reusing this handler for data migration is what I think we should avoid. Data migration should be addressed separately with a plan that includes: validation of the data migration in the target, how much resources is going to use, estimated duration of the migration, rollback plan, etc.
The dashboard is a production app that the team relies on and we need to avoid unnecessary risks as much as possible. For migration I'd suggest you create a tool that runs separately from the dashboard and we have more control over the processing rate it will be using.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can potentially move most of the content of this class to a service class and use it within the existing metrics handler to push the data. That way you won't need to synchronize timestamps and/or create an additional cronjob
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing metrics handler only handles new data, @liyuqian and I would like to also transfer existing data to skia perf. Although we could export the existing data, however there's no easy way to process the exported data (leveldb log format).
Besides adding more burden to existing handler will increase its latency, complexity and decrease stability. For example, a failed pushing to metrics center could be caused by network interruptions temporarily, after the incident is over, there's no way to retry the pushing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing metrics handler only handles new data, @liyuqian and I would like to also transfer existing data to skia perf. Although we could export the existing data, however there's no easy way to process the exported data (leveldb log format).
Besides adding more burden to existing handler will increase its latency, complexity and decrease stability. For example, a failed pushing to metrics center could be caused by network interruptions temporarily, after the incident is over, there's no way to retry the pushing.
If that is the case then I'd recommend to split the process in two parts one for the data migration and the other one for just sending the data as it arrives.
Data migration will require different processes, maybe running for days, data validation on the target, running after work hours to minimize disruption, etc.
For the second part the agent or the bots can call the two handlers with the same collected json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is partially implementing the data migration process.
For the second part, do you mean letting cocoon agent calls /api/push_benchmark_to_center in addition to /api/update-task-status?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, @kf6gpe has a plan to have the new metrics co-exist in both SkiaPerf/MetricsCenter and Cocoon. That way, he can compare between two dashboards' regression alerts, and make sure that SkiaPerf's alerts are properly set before sunsetting the Cocoon dashboard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is partially implementing the data migration process.
For the second part, do you mean letting cocoon agent calls
/api/push_benchmark_to_centerin addition to/api/update-task-status?
Yes, that is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flutter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would parseSinceMillis make more sense in the utilities class? it would be easier to reuse and I don't think it depends on the handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter is closely tied to the handler and I don't see the benefit from reusing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this condition seems not necessary: https://github.com/flutter/cocoon/blob/master/app_dart/lib/src/model/appengine/time_series.dart#L57
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move these functions private if not intended to be called by others?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Files in lib/src have been library private already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to add this function. You can use values[key]=value to add value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test for destination result is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to add more doc. about Metrics center storage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
app_dart/pubspec.yaml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a tracking issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe change "later than" to "strictly later than" to clarify that we're doing ">" instead of ">=" comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we'll need a more sophisticated batch size limit like https://github.com/liyuqian/metrics_center/blob/master/lib/src/flutter/center.dart#L121 ? Notice that we're returning points that are strictly later than a timestamp, and we would miss some points if many points have the same timestamp, and we cut the batch in the middle.
Some part of this PR's code is from #788. Related issue: flutter/flutter#73872
|
Closing as outdated. |
The handler reads benchmark results after [sinceMillis] parameter, transforms them into MetricPoints format, then writes to Metrics Center. The Metrics Center will transfer the data to Skia Perf later and let us evaluate the effectiveness of using Skia Perf.
Follow up PRs will add a cron job that runs this handler periodically.
Bug: flutter/flutter#58442
Currently the data store read operations of
flutter-dashboardis about 906 millions per day.This handler reads about 20 thousands data store records when transferring benchmark data for one day, so it's a tiny fraction of the overall usage.