More OpenTelemetry attributes on worker span#11
Conversation
Addresses #8 by bringing in a number of other OpenTelemetry attributes for jobs on the worker side of the middleware. Specifically: * `id` * `created_at` * `priority` * `scheduled_at` * `tag` We also tag spans with an OpenTelemetry "kind" (consumer or producer), and tag spans as cancelled or snoozed in cases where one of those occurred. Fixes #8.
bgentry
left a comment
There was a problem hiding this comment.
LGTM albeit with a concern about cardinality. It's possible some of these should be opt-in or opt-out, as things that are helpful at small scale become untenable at scale.
| attrs := []attribute.KeyValue{ | ||
| attribute.Int64("id", job.ID), | ||
| attribute.Int("attempt", job.Attempt), | ||
| attribute.String("created_at", job.CreatedAt.Format(time.RFC3339)), |
There was a problem hiding this comment.
Hmm, aren't these attrs like id and timestamps going to have ~infinite cardinality? My understanding is that's a problem for large scale OTEL implementations. Maybe I'm not correctly understanding where these limits do/don't apply though.
There was a problem hiding this comment.
Yeah, I wasn't too sure on the cardinality thing either, but I tried to look into it, and I can't find any good sources that indicate that this kind of thing isn't recommended.
I found this one with respect to metrics (as opposed to spans), but it wouldn't really apply here because we do keep cardinality on metrics lower:
I also found this issue on the OpenTelemetry repo, but it went unanswered:
open-telemetry/opentelemetry-specification#3554
I'm having trouble finding other citations warning against this though. Do you know of any?
I figure that if you're mapping spans to something like Sentry, it would be pretty useful to have the job ID in there (and I can say fairly confidently that cardinality on Sentry spans doesn't matter since we regularly store things like request ID there).
There was a problem hiding this comment.
You know, I'm really only finding warnings about cardinality in metrics and not traces. Seems fine to ship it for now and see if we hear back about issues.
There was a problem hiding this comment.
K cool.
I think too if it becomes a problem later we could probably change this — removing an attribute wouldn't be anywhere near the same level of breaking change that an API change would be.
|
Thanks! |
This ones follows up #11. I said in the pull request that higher cardinality attributes aren't emitted with metrics, but I got that wrong as I'd forgotten I was feeding all span attributes into metrics as well. Here, bifurcate attributes so that high cardinality attributes are added to spans, but not to metrics.
This ones follows up #11. I said in the pull request that higher cardinality attributes aren't emitted with metrics, but I got that wrong as I'd forgotten I was feeding all span attributes into metrics as well. Here, bifurcate attributes so that high cardinality attributes are added to spans, but not to metrics.
Addresses #8 by bringing in a number of other OpenTelemetry attributes
for jobs on the worker side of the middleware. Specifically:
idcreated_atpriorityscheduled_attagWe also tag spans with an OpenTelemetry "kind" (consumer or producer),
and tag spans as cancelled or snoozed in cases where one of those
occurred.
Fixes #8.