-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Description
Disclaimer: I always find the weirdest bugs and apologize in advance. :)
Host operating system: output of uname -a
Linux hho 5.10.18 #3 SMP Wed Feb 24 14:19:12 CET 2021 x86_64 AMD Ryzen 7 PRO 4750U with Radeon Graphics AuthenticAMD GNU/Linux
node_exporter version: output of node_exporter --version
node_exporter, version 1.1.1 (branch: non-git, revision: 4e837d4)
build user: portage@localhost
build date: 20210223-15:25:02
go version: go1.16
platform: linux/amd64
node_exporter command line flags
--collector.textfile.directory=/run/node_exporter/ --collector.ntp --collector.ntp.server-is-local --collector.qdisc --no-collector.bcache --no-collector.bonding --no-collector.conntrack --no-collector.infiniband --no-collector.ipvs --no-collector.mdadm --no-collector.nfs --no-collector.nfsd --no-collector.powersupplyclass --no-collector.schedstat --no-collector.time --no-collector.uname --no-collector.xfs --no-collector.zfs --log.level=warn
Are you running node_exporter in Docker?
No
What did you do that produced an error?
Enabled collector.powersupplyclass (disabled above)
What did you expect to see?
Valid output of battery status
What did you see instead?
panic: label value "LNV-5B10W13894\xc0\xc0" is not valid UTF-8
goroutine 28 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
/tmp/portage/app-metrics/node_exporter-1.1.1/homedir/go/pkg/mod/github.com/prometheus/client_golang@v1.9.0/prometheus/value.go:107
github.com/prometheus/node_exporter/collector.(*powerSupplyClassCollector).Update(0xc00035e5a0, 0xc00007f6e0, 0x10afc20, 0x0)
/tmp/portage/app-metrics/node_exporter-1.1.1/work/node_exporter-1.1.1/collector/powersupplyclass.go:166 +0x27f9
github.com/prometheus/node_exporter/collector.execute(0xbf759e, 0x10, 0xccec60, 0xc00035e5a0, 0xc00007f6e0, 0xcce520, 0xc0001219b0)
Analysis
This is not strictly a bug in node_exporter, but since it crashes (due to the intentional panic in MustNewConstMetric) I figured I report it here anyway. Let me know if it's better suited for client_golang. It's not clear that anybody except the hardware is at fault here.
The problem is caused by garbage bytes returned by a sysfs entry and can be reduced to inspecting the model_name attribute of the battery in question:
$cat /sys/class/power_supply/BAT0/model_name
LNV-5B10W13894��
I traced this back into the drivers/acpi/battery driver and indeed the two extraneous 0xc0 bytes are returned by the hardware (see here; battery is populated in line 204), so AFAICT it's not just random byte garbage due to an uninitilaized char[] or something like that. The two bytes are always the same.
My XFCE desktop environment can display the value correctly in xfce-power-manager, presumably because it normalizes the string value (probably by accident).
Possible Mitigations
It's not clear how to 'best' fix this since this seems a problem with this particular battery model and/or my BIOS (1.16, updated a few days ago). I don't know whether it happened before.
My first idea was to sanitize all characters always and everywhere, but realized that would be overly drastic, regardless of whether they are sanitized in the kernel, in client_golang or node_exporter.
Instead I belive it makes more sense to be less strinct in client_golang when reading an attribute value in NewConstMetric() (called by MustNewConstMetric here). If the "invalid UTF-8" exception can be avoided - possibly by catching it and simply ignoring the bogus characters - the (completely unnecessary) performance impact of unconditional sanitation would be mitigated. I don't know enough about Go character set conversions to tell whether that is feasible.