Optionally handle DELTA metrics as counters#167
Optionally handle DELTA metrics as counters#167osabina wants to merge 3 commits intoprometheus-community:masterfrom
Conversation
…an gauges. This is accomplished by caching the "current" running value in this case and exporting that value as a counter. The helper code for this is in the collector namespace (I am using the fnv code and followed a similar pattern in adding it here). This is flagged as --aggregate-deltas.
This is done similarly to the DELTA INT64/DOUBLE code, with the major change being that we need to cache all the histogram buckets instead of a single value. It leans on existing code as much as possible. This is controlled by the same --aggregate-deltas flag.
Once this was rolled out to production, we pretty quickly realized we were doing (unallowed) concurrent map access. We're kind of using a sledgehammer here with a global mutex across each type of cache. But it's also the simplest fix if it doesn't cause too much lock contention.
|
I figured it might be easier to move this convo to the PR. Related to question 3 from here, I think this might be an issue for slower moving metrics. Everything works really well when the metric is constantly changing which is awesome! Here's an example with a slower moving metric, I'm not quite sure how to resolve this without going to the solution which always reports the counters even if GCP drops it for a longer period of time. Happy to try to merge your work with mine which attempts to solve the problem this way. I'm currently attempting to fix an issue with the implementation which moves the reported time too far in to the future. |
|
@kgeckhart thanks for the excellent explanation of the problem -- we must only be tracking things that are getting reasonable traffic since everyone seems happy (so far) with what we have. But, certainly If you can integrate a solution to this, that would be awesome! Perhaps have a flag that sets the TTL for continuing to export metrics no longer showing up, default 0s -- then folks could adjust for their needs? |
This PR introduces support for aggregating DELTA metrics in-memory. This implementation is somewhat based on #167 but with extra functionality which is intended to solve the problem identified here, #167 (comment). I chose to inject interfaces for the delta and distribution stores at the Collector level to try to keep library friendliness intact, #157 I did as much as I could to explain the solution and the tradeoffs with aggregating deltas in this manner in the README.md. The bulk of the changes are in `monitoring_metrics.go` where I tried to obfuscate as much of the delta logic as possible. Should hopefully resolve #116 and #25 Signed-off-by: Kyle Eckhart <kgeckhart@users.noreply.github.com>
|
Sorry meant to close this since @kgeckhart got this fix into the mainline already! |

I've cleaned up the internal set of commits we are (successfully, as far as we can determine) running as of this week (8/8/2022) and it seemed to make sense to drop this as a PR instead of a random branch floating out there.
Open to whatever cleanup/changes/suggestions folks have in order to get this merged into the mainline.