Fix timer GC delays in the Linux filesystem collector#2169
Fix timer GC delays in the Linux filesystem collector#2169SuperQ merged 2 commits intoprometheus:masterfrom
Conversation
Signed-off-by: bawenmao <bawenmao@sogou-inc.com>
5a6a7cb to
a289a6a
Compare
|
Hi, not sure I understand the implications of this change. If you have 500k mountpoints, having 500k timer objects that disappear again after the timeout is expected. And from what I can tell, with your change this would still be the case, right? |
89f3c96 to
8a87b71
Compare
cdcb117 to
ce7799d
Compare
|
@SuperQ hi, why |
ce7799d to
a8ee98a
Compare
|
I think the issue with CircleCI is because you've selected to "Follow" the node_exporter fork in your CircleCI account. This triggers CircleCI to think that it should build under your account, rather than the Prometheus account. If you click the "Unfollow Project" for node_exporter in your CircleCI account, we can try to re-trigger builds. |
Okay, maybe this is my problem; |
|
You might need to make a commit to trigger the build. Maybe squash up the commit history with |
a8ee98a to
95c1241
Compare
time.NewTimer instead of using time.After that maybe cause memory leak.
Signed-off-by: bawenmao <jordy1024@163.com> Use Timer instead of using time.After that can avoid memory leak. Signed-off-by: bawenmao <jordy1024@163.com> Avoid time.after memory leak when calling time.After Signed-off-by: bawenmao <jordy1024@163.com> Use Timer instead of using time.After that maybe cause memory leak. Signed-off-by: bawenmao <bawenmao@sogou-inc.com>
95c1241 to
d028a85
Compare
Okay,Thanks. |
SuperQ
left a comment
There was a problem hiding this comment.
LGTM, for the record if you read the Go docs for time.After() this is explicitly mentioned.
After waits for the duration to elapse and then sends the current time on the returned channel. It is equivalent to NewTimer(d).C. The underlying Timer is not recovered by the garbage collector until the timer fires. If efficiency is a concern, use NewTimer instead and call Timer.Stop if the timer is no longer needed.
So, yes, it's much better to explicitly call Stop() when there's nothing to wait for anymore.
time.NewTimer instead of using time.After that maybe cause memory leak.|
Awesome findings! |
NOTE: In order to support globs in the textfile collector path, filenames exposed by
`node_textfile_mtime_seconds` now contain the full path name.
* [CHANGE] Add path label to rapl collector #2146
* [CHANGE] Exclude filesystems under /run/credentials #2157
* [FEATURE] Add support for monitoring GPUs on Linux #1998
* [FEATURE] Add Darwin thermal collector #2032
* [FEATURE] Add os release collector #2094
* [FEATURE] Add netdev.address-info collector #2105
* [ENHANCEMENT] Support glob textfile collector directories #1985
* [ENHANCEMENT] ethtool: Expose node_ethtool_info metric #2080
* [ENHANCEMENT] Use include/exclude flags for ethtool filtering #2165
* [ENHANCEMENT] Add flag to disable guest CPU metrics #2123
* [ENHANCEMENT] Add DMI collector #2131
* [ENHANCEMENT] Add threads metrics to processes collector #2164
* [ENHANCMMENT] Reduce timer GC delays in the Linux filesystem collector #2169
* [BUGFIX] ethtool: Sanitize metric names #2093
* [BUGFIX] Fix ethtool collector for multiple interfaces #2126
* [BUGFIX] Fix possible panic on macOS #2133
* [BUGFIX] Collect flag_info and bug_info only for one core #2156
Signed-off-by: Ben Kochie <superq@gmail.com>
NOTE: In order to support globs in the textfile collector path, filenames exposed by
`node_textfile_mtime_seconds` now contain the full path name.
* [CHANGE] Add path label to rapl collector #2146
* [CHANGE] Exclude filesystems under /run/credentials #2157
* [FEATURE] Add darwin powersupply collector #1777
* [FEATURE] Add support for monitoring GPUs on Linux #1998
* [FEATURE] Add Darwin thermal collector #2032
* [FEATURE] Add os release collector #2094
* [FEATURE] Add netdev.address-info collector #2105
* [ENHANCEMENT] Support glob textfile collector directories #1985
* [ENHANCEMENT] ethtool: Expose node_ethtool_info metric #2080
* [ENHANCEMENT] Use include/exclude flags for ethtool filtering #2165
* [ENHANCEMENT] Add flag to disable guest CPU metrics #2123
* [ENHANCEMENT] Add DMI collector #2131
* [ENHANCEMENT] Add threads metrics to processes collector #2164
* [ENHANCMMENT] Reduce timer GC delays in the Linux filesystem collector #2169
* [BUGFIX] ethtool: Sanitize metric names #2093
* [BUGFIX] Fix ethtool collector for multiple interfaces #2126
* [BUGFIX] Fix possible panic on macOS #2133
* [BUGFIX] Collect flag_info and bug_info only for one core #2156
Signed-off-by: Ben Kochie <superq@gmail.com>
NOTE: In order to support globs in the textfile collector path, filenames exposed by
`node_textfile_mtime_seconds` now contain the full path name.
* [CHANGE] Add path label to rapl collector #2146
* [CHANGE] Exclude filesystems under /run/credentials #2157
* [FEATURE] Add lnstat collector for metrics from /proc/net/stat/ #1771
* [FEATURE] Add darwin powersupply collector #1777
* [FEATURE] Add support for monitoring GPUs on Linux #1998
* [FEATURE] Add Darwin thermal collector #2032
* [FEATURE] Add os release collector #2094
* [FEATURE] Add netdev.address-info collector #2105
* [ENHANCEMENT] Support glob textfile collector directories #1985
* [ENHANCEMENT] ethtool: Expose node_ethtool_info metric #2080
* [ENHANCEMENT] Use include/exclude flags for ethtool filtering #2165
* [ENHANCEMENT] Add flag to disable guest CPU metrics #2123
* [ENHANCEMENT] Add DMI collector #2131
* [ENHANCEMENT] Add threads metrics to processes collector #2164
* [ENHANCMMENT] Reduce timer GC delays in the Linux filesystem collector #2169
* [BUGFIX] ethtool: Sanitize metric names #2093
* [BUGFIX] Fix ethtool collector for multiple interfaces #2126
* [BUGFIX] Fix possible panic on macOS #2133
* [BUGFIX] Collect flag_info and bug_info only for one core #2156
Signed-off-by: Ben Kochie <superq@gmail.com>
NOTE: In order to support globs in the textfile collector path, filenames exposed by
`node_textfile_mtime_seconds` now contain the full path name.
* [CHANGE] Add path label to rapl collector #2146
* [CHANGE] Exclude filesystems under /run/credentials #2157
* [FEATURE] Add lnstat collector for metrics from /proc/net/stat/ #1771
* [FEATURE] Add darwin powersupply collector #1777
* [FEATURE] Add support for monitoring GPUs on Linux #1998
* [FEATURE] Add Darwin thermal collector #2032
* [FEATURE] Add os release collector #2094
* [FEATURE] Add netdev.address-info collector #2105
* [ENHANCEMENT] Support glob textfile collector directories #1985
* [ENHANCEMENT] ethtool: Expose node_ethtool_info metric #2080
* [ENHANCEMENT] Use include/exclude flags for ethtool filtering #2165
* [ENHANCEMENT] Add flag to disable guest CPU metrics #2123
* [ENHANCEMENT] Add DMI collector #2131
* [ENHANCEMENT] Add threads metrics to processes collector #2164
* [ENHANCMMENT] Reduce timer GC delays in the Linux filesystem collector #2169
* [BUGFIX] ethtool: Sanitize metric names #2093
* [BUGFIX] Fix ethtool collector for multiple interfaces #2126
* [BUGFIX] Fix possible panic on macOS #2133
* [BUGFIX] Collect flag_info and bug_info only for one core #2156
Signed-off-by: Ben Kochie <superq@gmail.com>
NOTE: In order to support globs in the textfile collector path, filenames exposed by
`node_textfile_mtime_seconds` now contain the full path name.
* [CHANGE] Add path label to rapl collector #2146
* [CHANGE] Exclude filesystems under /run/credentials #2157
* [FEATURE] Add lnstat collector for metrics from /proc/net/stat/ #1771
* [FEATURE] Add darwin powersupply collector #1777
* [FEATURE] Add support for monitoring GPUs on Linux #1998
* [FEATURE] Add Darwin thermal collector #2032
* [FEATURE] Add os release collector #2094
* [FEATURE] Add netdev.address-info collector #2105
* [ENHANCEMENT] Support glob textfile collector directories #1985
* [ENHANCEMENT] ethtool: Expose node_ethtool_info metric #2080
* [ENHANCEMENT] Use include/exclude flags for ethtool filtering #2165
* [ENHANCEMENT] Add flag to disable guest CPU metrics #2123
* [ENHANCEMENT] Add DMI collector #2131
* [ENHANCEMENT] Add threads metrics to processes collector #2164
* [ENHANCMMENT] Reduce timer GC delays in the Linux filesystem collector #2169
* [BUGFIX] ethtool: Sanitize metric names #2093
* [BUGFIX] Fix ethtool collector for multiple interfaces #2126
* [BUGFIX] Fix possible panic on macOS #2133
* [BUGFIX] Collect flag_info and bug_info only for one core #2156
Signed-off-by: Ben Kochie <superq@gmail.com>
Use `time.NewTimer()` and explicit `Stop()` to avoid memory bloat / GC problems with `time.After()` in the Linux filesystem collector timeout handling. Signed-off-by: bawenmao <bawenmao@sogou-inc.com>
NOTE: In order to support globs in the textfile collector path, filenames exposed by
`node_textfile_mtime_seconds` now contain the full path name.
* [CHANGE] Add path label to rapl collector prometheus#2146
* [CHANGE] Exclude filesystems under /run/credentials prometheus#2157
* [FEATURE] Add lnstat collector for metrics from /proc/net/stat/ prometheus#1771
* [FEATURE] Add darwin powersupply collector prometheus#1777
* [FEATURE] Add support for monitoring GPUs on Linux prometheus#1998
* [FEATURE] Add Darwin thermal collector prometheus#2032
* [FEATURE] Add os release collector prometheus#2094
* [FEATURE] Add netdev.address-info collector prometheus#2105
* [ENHANCEMENT] Support glob textfile collector directories prometheus#1985
* [ENHANCEMENT] ethtool: Expose node_ethtool_info metric prometheus#2080
* [ENHANCEMENT] Use include/exclude flags for ethtool filtering prometheus#2165
* [ENHANCEMENT] Add flag to disable guest CPU metrics prometheus#2123
* [ENHANCEMENT] Add DMI collector prometheus#2131
* [ENHANCEMENT] Add threads metrics to processes collector prometheus#2164
* [ENHANCMMENT] Reduce timer GC delays in the Linux filesystem collector prometheus#2169
* [BUGFIX] ethtool: Sanitize metric names prometheus#2093
* [BUGFIX] Fix ethtool collector for multiple interfaces prometheus#2126
* [BUGFIX] Fix possible panic on macOS prometheus#2133
* [BUGFIX] Collect flag_info and bug_info only for one core prometheus#2156
Signed-off-by: Ben Kochie <superq@gmail.com>
Use `time.NewTimer()` and explicit `Stop()` to avoid memory bloat / GC problems with `time.After()` in the Linux filesystem collector timeout handling. Signed-off-by: bawenmao <bawenmao@sogou-inc.com>
NOTE: In order to support globs in the textfile collector path, filenames exposed by
`node_textfile_mtime_seconds` now contain the full path name.
* [CHANGE] Add path label to rapl collector prometheus#2146
* [CHANGE] Exclude filesystems under /run/credentials prometheus#2157
* [FEATURE] Add lnstat collector for metrics from /proc/net/stat/ prometheus#1771
* [FEATURE] Add darwin powersupply collector prometheus#1777
* [FEATURE] Add support for monitoring GPUs on Linux prometheus#1998
* [FEATURE] Add Darwin thermal collector prometheus#2032
* [FEATURE] Add os release collector prometheus#2094
* [FEATURE] Add netdev.address-info collector prometheus#2105
* [ENHANCEMENT] Support glob textfile collector directories prometheus#1985
* [ENHANCEMENT] ethtool: Expose node_ethtool_info metric prometheus#2080
* [ENHANCEMENT] Use include/exclude flags for ethtool filtering prometheus#2165
* [ENHANCEMENT] Add flag to disable guest CPU metrics prometheus#2123
* [ENHANCEMENT] Add DMI collector prometheus#2131
* [ENHANCEMENT] Add threads metrics to processes collector prometheus#2164
* [ENHANCMMENT] Reduce timer GC delays in the Linux filesystem collector prometheus#2169
* [BUGFIX] ethtool: Sanitize metric names prometheus#2093
* [BUGFIX] Fix ethtool collector for multiple interfaces prometheus#2126
* [BUGFIX] Fix possible panic on macOS prometheus#2133
* [BUGFIX] Collect flag_info and bug_info only for one core prometheus#2156
Signed-off-by: Ben Kochie <superq@gmail.com>
Hello,I found a potential hazard in
filesystem_linux.gothat may cause memory leaks,the details are as follows:function
GetStatswill loop through mount points, according to the number of all mount points.Recently I encountered a problem: the number of mount points of a certain system in the production environment is abnormal, as many as more than 500,000 rows; mainly because the for loop is blocked, and then the
Timer Heap objectsin the statementfor {select case->time.Aftercause accumulation, and eventually a memory leak occurs.It may be a better way to control the number of mount points to be checked, but this may require more changes; therefore, I only made a slight optimization for this problem: change
time.Aftertotime.NewTimer.Expect to be accepted or answered.