Add StorCli text collector example script by mattbostock · Pull Request #320 · prometheus/node_exporter

mattbostock · 2016-10-04T08:45:53Z

Collect metrics from the StorCLI utility on the health of MegaRAID
hardware RAID controllers and write them to stdout so that they can be
used by the textfile collector.

We parse the JSON output that StorCLI provides.

Script must be run as root or with appropriate capabilities for storcli
to access the RAID card.

Designed to run under Python 2.7, using the system Python provided with
many Linux distributions.

The metrics look like this:

mbostock@host:~$ sudo ./storcli.py
megaraid_status_code 0
megaraid_controllers_count 1
megaraid_emergency_hot_spare{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
megaraid_scheduled_patrol_read{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
megaraid_virtual_drives{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
megaraid_drive_groups{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
megaraid_virtual_drives_optimal{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
megaraid_degraded{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 0
megaraid_battery_backup_healthy{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
megaraid_ports{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 8
megaraid_failed{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 0
megaraid_drive_groups_optimal{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
megaraid_healthy{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
megaraid_physical_drives{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 24
mbostock@host:~$

I don't code regularly in Python, so any suggestions for improvement are welcome.
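
For readers skimming the thread, the flow described above (parse JSON, then print one sample per controller in the text exposition format) can be sketched roughly as follows. The payload here is a simplified stand-in, not captured StorCLI output, and its key names are assumptions. `render_metrics` is a hypothetical helper for illustration, not the function in storcli.py.

```python
from __future__ import print_function

import json

# Simplified stand-in for the utility's JSON output. The key names are
# assumptions for illustration; real StorCLI output is structured differently.
SAMPLE_JSON = """
{"controllers": [
  {"index": 0, "ports": 8, "physical_drives": 24, "degraded": false}
]}
"""


def render_metrics(raw_json, prefix="megaraid_"):
    """Return text-exposition lines for each controller in raw_json."""
    data = json.loads(raw_json)
    controllers = data.get("controllers", [])
    lines = ["{0}controllers_count {1}".format(prefix, len(controllers))]
    for ctrl in controllers:
        labels = '{{controller="{0}"}}'.format(ctrl["index"])
        for key in sorted(ctrl):
            if key == "index":
                continue
            # Booleans become 0/1; other values are assumed to be numeric.
            value = int(ctrl[key])
            lines.append("{0}{1}{2} {3}".format(prefix, key, labels, value))
    return lines


if __name__ == "__main__":
    print("\n".join(render_metrics(SAMPLE_JSON)))
```

An empty controller list still yields a `megaraid_controllers_count 0` line, so a host without a RAID card produces valid output rather than an error.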

brian-brazil · 2016-10-04T09:15:36Z

text_collector_examples/storcli.py

This path needs to be configurable

I left it hardcoded as I figured someone using this would likely copy and adapt it (as an example), but I agree it's nicer to make it configurable. Would you prefer a command-line flag?

Yes, a binary-path flag would be great.

brian-brazil · 2016-10-04T09:17:26Z

text_collector_examples/storcli.py

This only writes to stdout; it should take care of creating the file and atomically replacing it.

This is intentional, as I didn't want the script to have knowledge of where it should put the metrics. Since that information will likely be repeated in each script, it seems better to put that logic in the cronjob or whatever calls the script.

I agree, I think we should have a standard interface for all of these scripts. I also would prefer stdout and not have to implement atomic writing in every script, and have a simple atomic write wrapper that can be used to put the metrics files in the correct place.
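
A simple atomic write wrapper of the kind suggested here could look something like the sketch below. It is an illustration under assumptions, not code from this PR: `write_atomically` is a hypothetical name, and the approach relies on POSIX `rename` semantics, where replacing a file within one filesystem is atomic.

```python
import os
import tempfile


def write_atomically(content, dest_path):
    """Write content to dest_path so that readers never see a partial file."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Create the temp file in the destination directory: rename is only
    # atomic when source and target are on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())
        # mkstemp creates the file with mode 0600; widen it so that a
        # collector running as another user can read the result.
        os.chmod(tmp_path, 0o644)
        os.rename(tmp_path, dest_path)
    except Exception:
        # Clean up the temp file if anything failed before the rename.
        os.unlink(tmp_path)
        raise
```

A small wrapper script could call `write_atomically(sys.stdin.read(), sys.argv[1])` and be placed at the end of a pipeline in the cronjob, which keeps each collector script writing plain stdout. The `.tmp` suffix keeps the intermediate file from matching the `*.prom` pattern that the textfile collector reads.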

brian-brazil · 2016-10-04T09:32:21Z

Also, model doesn't belong as a label; the controller number is sufficient to identify the controller. I'd suggest using the machine role approach for it.

mattbostock · 2016-10-04T09:57:17Z

I hesitated to rely on the controller number as I didn't know how stable it is. Happy to remove the model label.

brian-brazil · 2016-10-04T12:05:57Z

If it's not stable, the model likely doesn't help you as the chances are you have identical models in a given machine.

SuperQ · 2016-10-05T13:08:08Z

I would recommend a megaraid_controller_info metric with the controller and model labels. This allows for annotation without having the model label on every metric.
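
To illustrate the info-metric pattern being recommended, here is a minimal sketch: ordinary metrics carry only the `controller` label, and a single constant-value `megaraid_controller_info` series carries the `model` label. `render_controller` is a hypothetical helper and the input dictionary is an assumed shape, not the script's actual data structure.

```python
def render_controller(index, model, values):
    """Render per-controller metrics plus one info series carrying the model."""
    labels = '{{controller="{0}"}}'.format(index)
    lines = [
        "megaraid_{0}{1} {2}".format(name, labels, values[name])
        for name in sorted(values)
    ]
    # The info metric always has the value 1; its labels hold the metadata.
    lines.append(
        'megaraid_controller_info{{controller="{0}", model="{1}"}} 1'.format(
            index, model
        )
    )
    return lines
```

At query time the model can then be attached only where it is wanted, for example with a PromQL join along the lines of `megaraid_degraded * on (instance, controller) group_left (model) megaraid_controller_info`.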

SuperQ · 2016-11-27T11:12:34Z

Ping, any progress on finishing this?

mattbostock · 2016-12-06T23:04:24Z

Sorry for the delay, will get to this soon.

discordianfish

Please address comments

Collect metrics from the StorCLI utility on the health of MegaRAID
hardware RAID controllers and write them to stdout so that they can be
used by the textfile collector.

We parse the JSON output that StorCLI provides.

Script must be run as root or with appropriate capabilities for storcli
to access the RAID card.

Designed to run under Python 2.7, using the system Python provided with
many Linux distributions.

The metrics look like this:

mbostock@host:~$ sudo ./storcli.py
megaraid_status_code 0
megaraid_controllers_count 1
megaraid_emergency_hot_spare{controller="0"} 1
megaraid_scheduled_patrol_read{controller="0"} 1
megaraid_virtual_drives{controller="0"} 1
megaraid_drive_groups{controller="0"} 1
megaraid_virtual_drives_optimal{controller="0"} 1
megaraid_degraded{controller="0"} 0
megaraid_battery_backup_healthy{controller="0"} 1
megaraid_ports{controller="0"} 8
megaraid_failed{controller="0"} 0
megaraid_drive_groups_optimal{controller="0"} 1
megaraid_healthy{controller="0"} 1
megaraid_physical_drives{controller="0"} 24
megaraid_controller_info{controller="0", model="AVAGOMegaRAIDSASPCIExpressROMB"} 1
mbostock@host:~$

mattbostock · 2016-12-22T23:04:21Z

@discordianfish @brian-brazil @SuperQ: I've amended the PR to address the comments:

added a --storcli_path option to set the path to storcli
moved the model label to a megaraid_controller_info metric

I've also added --help and --version flags, and changed the logic so that the script fails more gracefully on machines where no MegaRAID cards are installed.
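
As a rough illustration of the flags listed above, a parser built with the standard-library `argparse` module (available in Python 2.7) might look like this. The default path and version string are placeholders assumed for the example, not values taken from the script.

```python
import argparse

# Both constants are assumptions for illustration only.
DEFAULT_STORCLI_PATH = "/opt/MegaRAID/storcli/storcli64"
VERSION = "0.0.0-example"


def build_parser():
    """Build a parser exposing --storcli_path, --version and --help."""
    parser = argparse.ArgumentParser(
        description="Print MegaRAID controller metrics to stdout."
    )
    parser.add_argument(
        "--storcli_path",
        default=DEFAULT_STORCLI_PATH,
        help="path to the StorCLI binary",
    )
    # argparse adds -h/--help automatically; --version needs an explicit action.
    parser.add_argument(
        "--version", action="version", version="%(prog)s " + VERSION
    )
    return parser
```

Calling `build_parser().parse_args()` returns a namespace whose `storcli_path` attribute holds either the supplied value or the default.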

Any suggestions for changes welcome.

discordianfish

Great, LGTM!

Correctly handle powercap name strings by splitting on "-" rather than assuming the index is a single digit number. prometheus#1808 Signed-off-by: Ben Kochie <superq@gmail.com>

mattbostock changed the title ~~Add StorClI text collector example script~~ Add StorCli text collector example script Oct 4, 2016

mattbostock force-pushed the add_storcli branch from 3268c69 to 9fa8ee3 Compare October 4, 2016 08:46

mattbostock mentioned this pull request Oct 4, 2016

Megacli collector should move to textfile collector #101

Closed

brian-brazil reviewed Oct 4, 2016

discordianfish requested changes Dec 22, 2016

mattbostock added 2 commits December 22, 2016 22:55

Add text_collector_examples README

004bdca

mattbostock force-pushed the add_storcli branch from 9fa8ee3 to 004bdca Compare December 22, 2016 23:00

discordianfish approved these changes Dec 25, 2016

SuperQ approved these changes Dec 25, 2016

discordianfish merged commit ad1befe into prometheus:master Dec 26, 2016

SuperQ mentioned this pull request Jan 15, 2017

Release v0.14.0-rc.1. #423

Merged

uniemimu mentioned this pull request Aug 20, 2020

Fix invalid metric name panic reading powercap dir #1800

Closed

4 participants