-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Implement new metrics API / RFC #901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @tustvold would like your opinion on the suitability / consistency of the metrics.rs API in this PR to other metric APIs |
| /// Sums the values for metrics for which `f(metric)` returns | ||
| /// true, and returns the value. Returns None if no metrics match | ||
| /// the predicate. | ||
| pub fn sum<F>(&self, mut f: F) -> Option<usize> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some of the aggregation primitives (sum and group by partition). I feel this API may grow as we understand the usecases more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not looked in great detail at the instrumentation of the operators themselves, nor am I particularly familiar with what came before, but this makes a lot of sense to me. Very nice 👍
Edit: r.e. snake case vs camel case, FWIW most metrics systems I've interacted with don't support upper-case letters, nor hyphens, so snake case is pretty typical
| } | ||
|
|
||
| /// Add `n` to the metric's value | ||
| pub fn add(&self, n: usize) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that metric recording doesn't involve any string manipulation 👍
| impl Time { | ||
| /// Create a new [`Time`] wrapper suitable for recording elapsed | ||
| /// times for operations. | ||
| pub fn new(inner: Arc<SQLMetric>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if these should be verifying the MetricKind of the SQLMetric? Or alternatively be private and have a member function on SQLMetric that does the verification
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point. 🤔 though now I think about it the more Rust-y way of doing this would be to do it in the type system... so perhaps I will collapse MetricKind 🤔
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THanks @tustvold -- I am going to implement the Cow approach and also try and consolidate SQLMetric and MetricKind to see how that looks
| impl Time { | ||
| /// Create a new [`Time`] wrapper suitable for recording elapsed | ||
| /// times for operations. | ||
| pub fn new(inner: Arc<SQLMetric>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point. 🤔 though now I think about it the more Rust-y way of doing this would be to do it in the type system... so perhaps I will collapse MetricKind 🤔
|
@andygrove / @returnString as you implemented the From this PR's original description:
@tustvold notes that
Given that snake case is the standard in Rust as well, I would probably be inclined to update the metric names to use snake case as well |
This makes sense. I didn't even think about the casing. I just spend too much time looking at Spark query plans so that influenced my initial work. |
Thanks @andygrove -- I will then update this proposal to switch the names to snake_case then |
|
I have some non trivial feedback to incorporate so marking this as a draft (and maybe I will open a new PR after updating the proposal) |
andygrove
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have time to review in detail at the moment but I took a quick look through and LGTM
|
This PR has changed enough since initial feedback that I have opened a second one #908 with the updates. Closing this one. |
Which issue does this PR close?
Closes #679
Note: If people basically like this API I will go ahead and add unit tests for metrics.rs (e.g. for aggregate_by_partition)
Rationale for this change
See the description on #679 (comment) for the full rationale, but the TLDR version is:
SQLMetricdata model to ease integration in other metric systems (e.g. prometheus, influxdb, etc)What changes are included in this PR?
SQLMetricAPI to be in its own module, have labels, know about partitions, and allow for real time inspectionSQLMetricin DataFusion and Ballista to the new APIAre there any user-facing changes?
No
The SQLMetric API is basically now totally different so any code that creates / uses
SQLMetricswould have to be updated.Notes
In keeping with Rust's tradition of static typing, I also changed to using a more strongly typed version of these metrics to avoid mistakes such as adding a "time" to a counter value, as well as allowing other counter specific operations.
Open Questions:
The current SQL counters use "camel case" for the counter names (e.g.
numRows) rather than the Rust standard "snake case" (e.g.num_rows). I kept the same naming convention in this PR, but I wonder if we want to make them more Rust standard snake case given we are messing with them all anyways.Not included in this PR: