[percentiles] first pass at adding percentile sketch #293
First, if I understand this correctly, Distribution will only be available through DogStatsD?
I really think we should merge dist_context.go with context_metrics.go and dist_sampler.go with time_sampler.go. Right now we are duplicating the logic of both classes.
Also, could you split this PR into 2 commits: one with the new metric for the aggregator, and another with the binding to DogStatsD?
Finally, as a side note: what about v2 endpoints and protocol buffer serialization for Distribution? Is it planned? See: https://github.com/DataDog/agent-payload.
Could you rename the method to SubmitV1SketchSeries to match the naming convention?
The dogstatsdIn input is meant for outside metrics (coming only from DogStatsD, right now). If you want Distribution to be available from checks, you need to update the case using checkMetricIn.
I think it should be coming from dogstatsd only, at least for now. But I'll double check.
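For reference, a minimal sketch of what handling both inputs could look like. Only the channel names dogstatsdIn and checkMetricIn come from the comments above; the aggregator struct, metricSample, addSample, and step are hypothetical stand-ins, not the agent's actual code.

```go
package main

import "fmt"

// metricSample is a hypothetical, simplified sample type.
type metricSample struct {
	Name           string
	Value          float64
	IsDistribution bool
}

// aggregator is a hypothetical stand-in holding the two input channels
// named in the review comments.
type aggregator struct {
	dogstatsdIn   chan metricSample // samples arriving from DogStatsD
	checkMetricIn chan metricSample // samples arriving from checks
	distSamples   []metricSample
	otherSamples  []metricSample
}

// addSample routes a sample by type, regardless of which input it came from.
func (a *aggregator) addSample(s metricSample) {
	if s.IsDistribution {
		a.distSamples = append(a.distSamples, s)
		return
	}
	a.otherSamples = append(a.otherSamples, s)
}

// step blocks until one sample arrives on either input, then processes it.
// Handling distributions in both cases is what would make them available
// to checks as well as to DogStatsD.
func (a *aggregator) step() {
	select {
	case s := <-a.dogstatsdIn:
		a.addSample(s)
	case s := <-a.checkMetricIn:
		a.addSample(s)
	}
}

func main() {
	a := &aggregator{
		dogstatsdIn:   make(chan metricSample, 1),
		checkMetricIn: make(chan metricSample, 1),
	}
	a.dogstatsdIn <- metricSample{Name: "request.latency", Value: 12.5, IsDistribution: true}
	a.checkMetricIn <- metricSample{Name: "disk.latency", Value: 3.0, IsDistribution: true}
	a.step()
	a.step()
	fmt.Println("distribution samples received:", len(a.distSamples))
}
```

If distributions should stay DogStatsD-only for now, the checkMetricIn case would simply skip or reject samples flagged as distributions.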
The Distribution metric has the same interface as every other metric. The only difference is that instead of returning a []*Serie it returns a *SketchSerie.
The following class is the same as context_metrics.go, and dist_sampler.go has the same purpose and behavior as our time_sampler.go.
I think we could avoid having another sampler and context by reworking our interfaces. Off the top of my head: we could create an interface used by every metric (to replace Serie). But we can brainstorm for better ideas.
Agreed, and your suggestion on replacing Serie sounds good to me (haven't given it that much thought though).
Not sure if it's absolutely necessary to avoid code duplication right now, but this is something that'll have to be done at some point (these 2 very similar implementations of samplers will be hard to maintain otherwise).
At the very least, let's add a FIXME message at the top of this file and of dist_sampler.go.
Yes, there's a lot of code/logic duplication - I wanted to keep my additions as separate as possible in the beginning. I agree they should be integrated better into the codebase.
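To make the interface idea from this thread concrete, here is a rough sketch of how a shared output type could let one sampler flush both regular series and sketches. Serie and SketchSerie are named in the discussion, but their fields here are simplified guesses; FlushedMetric, Metric, gauge, distribution, and flushAll are hypothetical names invented for illustration.

```go
package main

import "fmt"

// Serie and SketchSerie are simplified stand-ins for the aggregator's types.
type Serie struct {
	Name   string
	Points []float64
}

type SketchSerie struct {
	Name  string
	Count int // toy summary; a real sketch would hold quantile data
}

// FlushedMetric is a hypothetical common output type that could take the
// place of []*Serie in the sampler's flush path.
type FlushedMetric interface {
	MetricName() string
	Kind() string
}

func (s *Serie) MetricName() string       { return s.Name }
func (s *Serie) Kind() string             { return "serie" }
func (s *SketchSerie) MetricName() string { return s.Name }
func (s *SketchSerie) Kind() string       { return "sketch" }

// Metric is the shared interface every metric type would implement.
type Metric interface {
	AddSample(value float64)
	Flush() []FlushedMetric
}

// gauge keeps the last value and flushes a Serie.
type gauge struct {
	name string
	last float64
}

func (g *gauge) AddSample(v float64) { g.last = v }
func (g *gauge) Flush() []FlushedMetric {
	return []FlushedMetric{&Serie{Name: g.name, Points: []float64{g.last}}}
}

// distribution only counts samples here and flushes a SketchSerie.
type distribution struct {
	name  string
	count int
}

func (d *distribution) AddSample(float64) { d.count++ }
func (d *distribution) Flush() []FlushedMetric {
	return []FlushedMetric{&SketchSerie{Name: d.name, Count: d.count}}
}

// flushAll shows how a single sampler could flush every metric type.
func flushAll(metrics []Metric) []FlushedMetric {
	var out []FlushedMetric
	for _, m := range metrics {
		out = append(out, m.Flush()...)
	}
	return out
}

func main() {
	metrics := []Metric{&gauge{name: "cpu.load"}, &distribution{name: "request.latency"}}
	for _, m := range metrics {
		m.AddSample(1.0)
		m.AddSample(2.5)
	}
	for _, f := range flushAll(metrics) {
		fmt.Println(f.MetricName(), f.Kind())
	}
}
```

With something along these lines, the serializer would type-switch on the flushed value instead of the aggregator keeping two parallel samplers.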
olivielpeau left a comment
Haven't looked at the specifics of the QSketch implementation, but overall it looks like this would work well with the existing aggregator.
Overall I agree with @hush-hush's comments, but see my comment on the code duplication.
This won't work with +Inf/-Inf values. Unfortunately, "infinity" doesn't exist in JSON (using +Inf/-Inf directly in the serialized byte slice would cause a serialization error).
If you need to be able to express "infinite" values, you could use a separate JSON field that would encode them in a custom way.
Good point! I think ignoring +Inf/-Inf values is the right thing to do.
* simple docker secrets
* simple docker tests
* lint
* lint
* lint
* Update backend/docker/secrets.go
  Co-authored-by: rahulkaukuntla <144174402+rahulkaukuntla@users.noreply.github.com>
* update error
* update path slash
* seperate `docker` backend into `file.file` wrapper
* add k8s file backend
* rename k8s function
* go mod tidy
* lint comment
* lint comment
* rename `file.file` to `file.text`
* lint
* trim & consts
* add file read max size
* add max file size tests
* clean up path check
* lint
* lint
---------
Co-authored-by: rahulkaukuntla <144174402+rahulkaukuntla@users.noreply.github.com>
What does this PR do?
Allows the agent to generate and submit percentile sketches of the raw data.
Motivation
We're working on adding accurate percentile capabilities to Datadog metrics.
Additional Notes