Handle small backwards jumps in CPU idle#2067

Merged
SuperQ merged 1 commit into master from superq/idle_jump
Jul 7, 2021

@SuperQ
Member

@SuperQ SuperQ commented Jul 4, 2021

The Linux CPU idle stat can also jump backwards slightly in some cases.

Fixes: #1903

Signed-off-by: Ben Kochie <superq@gmail.com>

@SuperQ SuperQ requested a review from discordianfish July 4, 2021 08:49
@SuperQ SuperQ force-pushed the superq/idle_jump branch 2 times, most recently from 5dc3919 to bde48ea Compare July 4, 2021 10:45
@roidelapluie
Member

Thanks!
I would rather go with a fixed value (e.g. 3s?) than a percentage.

@SuperQ
Member Author

SuperQ commented Jul 5, 2021

A fixed value would also work.

@roidelapluie
Member

From the logs in the issue, even 1s would be enough.

@SuperQ SuperQ force-pushed the superq/idle_jump branch from bde48ea to 922f2ee Compare July 5, 2021 12:14
Member

@discordianfish discordianfish left a comment

Just a minor nit, but LGTM

@SuperQ SuperQ force-pushed the superq/idle_jump branch from 922f2ee to 8bbd4dc Compare July 6, 2021 15:16
@roidelapluie
Member

Okay, tests are failing. Otherwise, 👍

@SuperQ SuperQ force-pushed the superq/idle_jump branch from 8bbd4dc to 657088f Compare July 7, 2021 08:53
@roidelapluie
Member

collector/cpu_linux.go:56:2: jumpBackDebugMessage is unused (deadcode)
jumpBackDebugMessage = fmt.Sprintf("CPU Idle counter jumped backwards more than %f seconds, possible hotplug event, resetting CPU stats", jumpBackSeconds)

@roidelapluie
Member

You are passing the seconds value as the log message instead of the error message.

@SuperQ
Member Author

SuperQ commented Jul 7, 2021

Oops!

The Linux CPU idle stat can also jump backwards slightly in some cases.
Allow the jump back up to 3 seconds before we attempt to reset the CPU
counter cache.

Fixes: #1903

Signed-off-by: Ben Kochie <superq@gmail.com>
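
For readers following the thread, here is a minimal sketch of the behavior the commit message describes. It is not the code merged in this PR. The names `jumpBackSeconds` and `jumpBackDebugMessage` and the message text come from the lint output quoted above; the `cpuTimes` type, the `updateCached` function, and the choice to treat a drop of exactly 3 seconds as a reset are assumptions made for illustration.

```go
package main

import "fmt"

// jumpBackSeconds is the fixed tolerance agreed on in the review: the idle
// counter may drop by less than this many seconds without a cache reset.
const jumpBackSeconds = 3.0

// jumpBackDebugMessage is the text to log when the cache is reset. The review
// noted that this message, not the bare threshold, is what should be logged.
var jumpBackDebugMessage = fmt.Sprintf(
	"CPU Idle counter jumped backwards more than %f seconds, possible hotplug event, resetting CPU stats",
	jumpBackSeconds,
)

// cpuTimes is a hypothetical stand-in for one CPU's counters, in seconds.
type cpuTimes struct {
	Idle float64
	User float64
}

// updateCached merges a fresh reading into the cached counters. It returns
// the new cache and the message to log, which is empty when nothing was reset.
func updateCached(cached, fresh cpuTimes) (cpuTimes, string) {
	msg := ""
	// A large backwards jump in idle is treated as a possible hotplug event:
	// drop the cached values so the fresh reading is taken as-is.
	// Assumption: a drop of exactly jumpBackSeconds also triggers the reset.
	if cached.Idle-fresh.Idle >= jumpBackSeconds {
		cached = cpuTimes{}
		msg = jumpBackDebugMessage
	}
	// Counters only move forward; a small backwards jump keeps the old value.
	if fresh.Idle >= cached.Idle {
		cached.Idle = fresh.Idle
	}
	if fresh.User >= cached.User {
		cached.User = fresh.User
	}
	return cached, msg
}
```

With a cached idle value of 100 and a fresh reading of 98.5, the sketch keeps 100 and returns no message. With a fresh reading of 10, it resets the cache, adopts the fresh values, and returns `jumpBackDebugMessage`.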
@SuperQ SuperQ force-pushed the superq/idle_jump branch from 657088f to 73c9a10 Compare July 7, 2021 10:24
@roidelapluie
Member

LGTM, thanks!

@SuperQ SuperQ merged commit 2510378 into master Jul 7, 2021
@SuperQ SuperQ deleted the superq/idle_jump branch July 7, 2021 11:27
SuperQ added a commit that referenced this pull request Jul 12, 2021
NOTE: Ignoring invalid network speed will be the default in 2.x
NOTE: Filesystem collector flags have been renamed. `--collector.filesystem.ignored-mount-points` is now `--collector.filesystem.mount-points-exclude` and `--collector.filesystem.ignored-fs-types` is now `--collector.filesystem.fs-types-exclude`. The old flags will be removed in 2.x.

* [CHANGE] Rename filesystem collector flags to match other collectors #2012
* [CHANGE] Make node_exporter print usage to STDOUT #2039
* [FEATURE] Add conntrack statistics metrics #1155
* [FEATURE] Add ethtool stats collector #1832
* [FEATURE] Add flag to ignore network speed if it is unknown #1989
* [FEATURE] Add tapestats collector for Linux #2044
* [ENHANCEMENT] Add ErrorLog plumbing to promhttp #1887
* [ENHANCEMENT] Add time zone offset metric #2060
* [BUGFIX] Add ErrorLog plumbing to promhttp #1887
* [BUGFIX] Handle errors from disabled PSI subsystem #1983
* [BUGFIX] Fix panic when using backwards compatible flags #2000
* [BUGFIX] Only initiate collectors once #2048
* [BUGFIX] Handle small backwards jumps in CPU idle #2067

Signed-off-by: Ben Kochie <superq@gmail.com>
@SuperQ SuperQ mentioned this pull request Jul 12, 2021
SuperQ added a commit that referenced this pull request Jul 15, 2021
NOTE: Ignoring invalid network speed will be the default in 2.x
NOTE: Filesystem collector flags have been renamed. `--collector.filesystem.ignored-mount-points` is now `--collector.filesystem.mount-points-exclude` and `--collector.filesystem.ignored-fs-types` is now `--collector.filesystem.fs-types-exclude`. The old flags will be removed in 2.x.

* [CHANGE] Rename filesystem collector flags to match other collectors #2012
* [CHANGE] Make node_exporter print usage to STDOUT #2039
* [FEATURE] Add conntrack statistics metrics #1155
* [FEATURE] Add ethtool stats collector #1832
* [FEATURE] Add flag to ignore network speed if it is unknown #1989
* [FEATURE] Add tapestats collector for Linux #2044
* [FEATURE] Add nvme collector #2062
* [ENHANCEMENT] Add ErrorLog plumbing to promhttp #1887
* [ENHANCEMENT] Add more Infiniband counters #2019
* [ENHANCEMENT] netclass: retrieve interface names and filter before parsing #2033
* [ENHANCEMENT] Add time zone offset metric #2060
* [BUGFIX] Handle errors from disabled PSI subsystem #1983
* [BUGFIX] Fix panic when using backwards compatible flags #2000
* [BUGFIX] Fix wrong value for OpenBSD memory buffer cache #2015
* [BUGFIX] Only initiate collectors once #2048
* [BUGFIX] Handle small backwards jumps in CPU idle #2067

Signed-off-by: Ben Kochie <superq@gmail.com>
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
NOTE: Ignoring invalid network speed will be the default in 2.x
NOTE: Filesystem collector flags have been renamed. `--collector.filesystem.ignored-mount-points` is now `--collector.filesystem.mount-points-exclude` and `--collector.filesystem.ignored-fs-types` is now `--collector.filesystem.fs-types-exclude`. The old flags will be removed in 2.x.

* [CHANGE] Rename filesystem collector flags to match other collectors prometheus#2012
* [CHANGE] Make node_exporter print usage to STDOUT prometheus#2039
* [FEATURE] Add conntrack statistics metrics prometheus#1155
* [FEATURE] Add ethtool stats collector prometheus#1832
* [FEATURE] Add flag to ignore network speed if it is unknown prometheus#1989
* [FEATURE] Add tapestats collector for Linux prometheus#2044
* [FEATURE] Add nvme collector prometheus#2062
* [ENHANCEMENT] Add ErrorLog plumbing to promhttp prometheus#1887
* [ENHANCEMENT] Add more Infiniband counters prometheus#2019
* [ENHANCEMENT] netclass: retrieve interface names and filter before parsing prometheus#2033
* [ENHANCEMENT] Add time zone offset metric prometheus#2060
* [BUGFIX] Handle errors from disabled PSI subsystem prometheus#1983
* [BUGFIX] Fix panic when using backwards compatible flags prometheus#2000
* [BUGFIX] Fix wrong value for OpenBSD memory buffer cache prometheus#2015
* [BUGFIX] Only initiate collectors once prometheus#2048
* [BUGFIX] Handle small backwards jumps in CPU idle prometheus#2067

Signed-off-by: Ben Kochie <superq@gmail.com>
Development

Successfully merging this pull request may close these issues.

Gracefully handle small "idle" backward counter jumps
