Response time histograms using the prometheus sink are no longer in seconds #728

@harpunius


We migrated from StatsD to the Prometheus sink and use the following mapper snippet to monitor our infrastructure:

    - match: "ratelimit_server.*.response_time"
      name: "ratelimit_service_response_time_seconds"
      timer_type: histogram
      labels:
        grpc_method: "$1"

These metrics used to be reported in seconds, but are now reported in milliseconds.

As stated in the statsd-exporter README:

Statsd timer data is transmitted in milliseconds, while Prometheus expects the unit to be seconds. The exporter converts all timer observations to seconds. Histogram and distribution events (h and d metric type) are not subject to unit conversion.

This conversion used to happen when parsing observer events: https://github.com/prometheus/statsd_exporter/blob/c18857b71b4afc2c304e4d34aa431a41234843ac/pkg/line/line.go#L82. In the new implementation, the histogram value is taken as-is, with no conversion to seconds.

This change (regression?) means that the default histogram buckets no longer make sense. I think we need to implement the same kind of unit switch.
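To make the proposal concrete, here is a minimal sketch of the kind of unit switch I have in mind. It is not the actual sink or statsd_exporter code: `eventKind`, its constants, and `toPrometheusUnit` are hypothetical names chosen for illustration. The only behavior taken from the README quote above is that timer (`ms`) observations are divided by 1000 while `h` and `d` observations are left untouched.

```go
package main

// eventKind is a hypothetical stand-in for the StatsD metric type suffix
// attached to an observation; it is not a type from statsd_exporter.
type eventKind string

const (
	kindTimer        eventKind = "ms" // StatsD timer, transmitted in milliseconds
	kindHistogram    eventKind = "h"  // histogram event, no unit conversion
	kindDistribution eventKind = "d"  // distribution event, no unit conversion
)

// toPrometheusUnit returns the value to observe on the Prometheus side.
// Timer observations are converted from milliseconds to seconds;
// histogram and distribution observations are passed through unchanged.
func toPrometheusUnit(kind eventKind, value float64) float64 {
	if kind == kindTimer {
		return value / 1000.0
	}
	return value
}
```

With this, `toPrometheusUnit(kindTimer, 250)` yields `0.25`, while `toPrometheusUnit(kindHistogram, 250)` stays `250`. Assuming the default buckets are the Prometheus Go client defaults (upper bounds from 0.005 to 10), an unconverted 250 ms observation would land above the largest bucket, which is why the defaults stop being useful.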

WDYT?
