v0.14.0-rc2 breaks previously-working textfile metrics #509

@matthiasr

We had metrics like

# TYPE smartmon_device_info gauge
smartmon_device_info{disk="/dev/bus/0",type="sat+megaraid,0",model_family="Seagate Constellation.2 (SATA)",device_model="ST91000640NS",serial_number="<redacted>",firmware_version="AA07"} 1
…
smartmon_device_info{disk="/dev/sda",type="scsi",vendor="DELL",product="PERC H710",revision="3.13",lun_id="<redacted>"} 1
…

(both repeated for 4 disks total)

from a script that parses smartctl output. The duplication is caused by devices being represented multiple times via the RAID controller.

With 0.14.0-rc2, we now get this error on the metrics endpoint:

An error has occurred during metrics gathering:

4 error(s) occurred:
* collected metric smartmon_device_info label:<name:"disk" value:"/dev/sda" > label:<name:"lun_id" value:"<redacted>" > label:<name:"product" value:"PERC H710" > label:<name:"revision" value:"3.13" > label:<name:"type" value:"scsi" > label:<name:"vendor" value:"DELL" > gauge:<value:1 >  has label dimensions inconsistent with previously collected metrics in the same metric family
…

I suspect this is a side effect of the vendoring update #372 bringing in stricter checks from client_golang.
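To find affected textfiles before upgrading, something like the following could be run against the `.prom` output. This is only a rough sketch: the function names `label_names_by_metric` and `inconsistent_metrics` are made up for illustration, the parsing is a simplified regex rather than a full exposition-format parser, and it groups by sample name rather than by metric family (so histogram or summary suffixes are not folded together).

```python
import re
from collections import defaultdict

# Matches one sample line such as: name{a="x",b="y"} 1
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+\S+')
# Matches one label pair; values may contain commas and escaped quotes.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def label_names_by_metric(text):
    """Map each metric name to the set of distinct label-name tuples seen."""
    seen = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE/comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, labels = match.group(1), match.group(2) or ""
        names = tuple(sorted(k for k, _ in LABEL_RE.findall(labels)))
        seen[name].add(names)
    return seen


def inconsistent_metrics(text):
    """Return only the metrics that appear with more than one label-name set."""
    return {
        name: sets
        for name, sets in label_names_by_metric(text).items()
        if len(sets) > 1
    }
```

For our file, `inconsistent_metrics(open(path).read())` would report `smartmon_device_info` with two different label-name sets, one per device representation.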

The check and the error are correct: we should not produce inconsistent metrics. However, this completely breaks metrics collection for textfile metrics that previously worked. The check should either be disabled, or the breaking change should at least be clearly marked in the changelog.
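On the script side, one possible way to make the output consistent is to give every sample the same label names and fill the ones that do not apply with an empty value, which Prometheus treats the same as an absent label. The sketch below assumes the new check only compares label names, which I have not verified against rc2; `pad_labels` and `render` are hypothetical helpers, not part of our script or of node_exporter.

```python
def pad_labels(samples):
    """Give every sample the union of all label names, using "" for missing ones."""
    all_names = sorted({name for labels in samples for name in labels})
    return [{name: labels.get(name, "") for name in all_names} for labels in samples]


def render(metric, labels, value):
    """Render one sample line in the text exposition format."""
    def escape(v):
        # Escape backslash, double quote and newline in label values.
        return v.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    body = ",".join(f'{k}="{escape(v)}"' for k, v in labels.items())
    return f"{metric}{{{body}}} {value}"


if __name__ == "__main__":
    # Shortened versions of the two label sets from this report.
    disks = [
        {"disk": "/dev/bus/0", "type": "sat+megaraid,0", "device_model": "ST91000640NS"},
        {"disk": "/dev/sda", "type": "scsi", "vendor": "DELL", "product": "PERC H710"},
    ]
    print("# TYPE smartmon_device_info gauge")
    for labels in pad_labels(disks):
        print(render("smartmon_device_info", labels, 1))
```

The trade-off is that every series then carries every label, so a cleaner long-term fix is probably to split the two device representations into separate metric names.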
