brian-brazil
left a comment
Shouldn't there be a change in the end-to-end output too?
collector/edac_linux.go
Outdated
func NewEdacCollector() (Collector, error) {
	return &edacCollector{
		ceCount: prometheus.NewDesc(
			prometheus.BuildFQName(Namespace, edacSubsystem, "ce_count"),
_count is the suffix for Summaries/Histograms. This is probably a counter, so the name should end in _total.
I was translating the names directly without thinking about it too hard.
What about correctable_errors_total, and similar for the other names?
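As a rough illustration of the rename discussed here, the sketch below maps the terse names to the spelled-out counter names. The helper buildFQName is a hypothetical stand-in for prometheus.BuildFQName (it joins the non-empty parts with underscores), and the "node" namespace and "edac" subsystem values are assumptions, not taken from the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// buildFQName is a hypothetical stand-in for prometheus.BuildFQName:
// it joins the non-empty components with underscores.
func buildFQName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	parts := []string{}
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

// renamed maps the original terse names to the counter names
// proposed in this review thread.
var renamed = map[string]string{
	"ce_count": "correctable_errors_total",
	"ue_count": "uncorrectable_errors_total",
}

func main() {
	// "node" and "edac" are assumed values for Namespace and edacSubsystem.
	for _, old := range []string{"ce_count", "ue_count"} {
		fmt.Println(old, "->", buildFQName("node", "edac", renamed[old]))
	}
}
```

The _total suffix marks both metrics as counters, which is the point of the comment above.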
collector/edac_linux.go
Outdated
			[]string{"controller"}, nil,
		),
		ueCount: prometheus.NewDesc(
			prometheus.BuildFQName(Namespace, edacSubsystem, "ue_count"),
spell out "uncorrectable"
Yes, it should be in the end-to-end output. I'm not sure why it isn't.
collector/edac_linux.go
Outdated
			[]string{"controller"}, nil,
		),
		ueNoinfoCount: prometheus.NewDesc(
			prometheus.BuildFQName(Namespace, edacSubsystem, "no_csrow_uncorrectable_errors_total"),
Might we want to lump these into csrow_uncorrectable_errors_total with a label like "unknown"?
Hmm. That depends on how well the kernel implements this data. In theory, it should be possible to aggregate the unknown-row count with the per-csrow counts. Then we only need two metrics, one for correctable and one for uncorrectable errors.
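To show what folding the unknown-row count into the csrow metric would buy, here is a minimal sketch. The sample type, the label value "unknown", and the numbers are all made up for illustration. It only shows that a per-controller total falls out of a plain sum once unattributed errors share the metric.

```go
package main

import "fmt"

// sample is a hypothetical flattened reading: one counter value per
// (controller, csrow) pair. The csrow value "unknown" holds errors the
// kernel could not attribute to a specific row.
type sample struct {
	controller string
	csrow      string
	value      float64
}

// totalPerController sums every csrow value, including "unknown",
// which is what a sum-by-controller query would do on the single metric.
func totalPerController(samples []sample) map[string]float64 {
	totals := make(map[string]float64)
	for _, s := range samples {
		totals[s.controller] += s.value
	}
	return totals
}

func main() {
	// Made-up readings for one memory controller.
	uncorrectable := []sample{
		{controller: "0", csrow: "0", value: 3},
		{controller: "0", csrow: "1", value: 1},
		{controller: "0", csrow: "unknown", value: 2},
	}
	fmt.Println(totalPerController(uncorrectable))
}
```

Whether this is safe depends, as noted above, on the kernel not double counting unattributed errors in the per-row files.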
cc3eb99 to dd3dc45
@SuperQ What is the state of this? Is it ready to get reviewed/merged?
19efeed to e8b92d3
@discordianfish Ok, finally fixed up the end-to-end test. This is ready to go.
2b36015 to 374e060
discordianfish
left a comment
Looks good besides the regex question. Also needs rebasing.
collector/edac_linux.go
Outdated
Why are the regexes needed? The globbing should already limit the matches to the same files, no?
The regexp is used to extract the controller number from the directory name.
Ah, I see, makes sense. In general I slightly prefer doing such basic parsing manually. Then again, that's possibly just a personal preference, so fine with me!
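For comparison, a minimal sketch of the two approaches mentioned in this thread: extracting the controller number with a regexp versus parsing the directory name by hand. The directory layout (.../edac/mc/mc<N>), the pattern, and both function names are assumptions for illustration, not the code under review.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

// controllerRE assumes controller directories look like .../edac/mc/mc<N>.
var controllerRE = regexp.MustCompile(`/edac/mc/mc([0-9]+)$`)

// controllerByRegexp returns the controller number captured by the pattern.
func controllerByRegexp(dir string) (string, bool) {
	m := controllerRE.FindStringSubmatch(dir)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// controllerManually strips the "mc" prefix from the last path element
// and checks that only digits remain.
func controllerManually(dir string) (string, bool) {
	base := filepath.Base(dir)
	num := strings.TrimPrefix(base, "mc")
	if num == base || num == "" {
		return "", false
	}
	for _, r := range num {
		if r < '0' || r > '9' {
			return "", false
		}
	}
	return num, true
}

func main() {
	dir := "/sys/devices/system/edac/mc/mc0" // assumed example path
	fmt.Println(controllerByRegexp(dir))
	fmt.Println(controllerManually(dir))
}
```

Both variants reject directories that do not end in mc followed by digits, so either one also acts as a second filter on top of the glob.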
collector/edac_linux.go
Outdated
csrowCeCount -> csRowCECount
csrowUeCount -> csRowUECount
👍 otherwise
d793aed to 87fd930
Collect "Error detection and correction" metrics from memory controllers.
* Supported on Linux only.
* Add basic fixtures.
* Enabled by default.
87fd930 to e4566d0
e4566d0 to 38a4a36
Signed-off-by: prombot <prometheus-team@googlegroups.com>
Collect "Error detection and correction" metrics from memory
controllers.