Consider implementing timeout for collectors #244

@powerman

I've just noticed that Prometheus has failed to get metrics from node_exporter for the last 2 days (my workstation uptime is 3 days, so node_exporter only worked for about a day after the last reboot). The node_exporter log complains about "too many open files", and it really has hit the 1024 open file limit:

# ls /proc/$(pidof node_exporter)/fd | wc -l
   1023

But most of them are sockets:

# ls -l /proc/$(pidof node_exporter)/fd | grep socket | wc -l
   1018

And I believe all of them except one are leaked, because there are no related connections:

# ss -anp | grep node_exporter
tcp    LISTEN     0      128    127.0.0.1:9100                  *:*                   users:(("node_exporter",pid=9801,fd=4))
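
The same check can be done from Go by reading the symlinks under `/proc/<pid>/fd`. This is only a minimal sketch, assuming Linux and its `socket:[inode]` link-target format; `isSocketLink` and `countFDs` are hypothetical helper names and are not part of node_exporter.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// isSocketLink reports whether a /proc/<pid>/fd symlink target names a socket.
// On Linux such targets look like "socket:[12345]".
func isSocketLink(target string) bool {
	return strings.HasPrefix(target, "socket:[")
}

// countFDs returns the number of open descriptors listed in fdDir
// (for example /proc/self/fd) and how many of them are sockets.
func countFDs(fdDir string) (total, sockets int, err error) {
	entries, err := os.ReadDir(fdDir)
	if err != nil {
		return 0, 0, err
	}
	for _, e := range entries {
		target, linkErr := os.Readlink(filepath.Join(fdDir, e.Name()))
		if linkErr != nil {
			continue // descriptor was closed between listing and readlink
		}
		total++
		if isSocketLink(target) {
			sockets++
		}
	}
	return total, sockets, nil
}

func main() {
	// Inspect this process; pass /proc/<pid>/fd to inspect another one.
	total, sockets, err := countFDs("/proc/self/fd")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read fd directory:", err)
		os.Exit(1)
	}
	fmt.Printf("open fds: %d, sockets: %d\n", total, sockets)
}
```

Logging these two numbers periodically would show whether the socket count grows steadily towards the limit.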

I also noticed that node_exporter uses too much memory. I'm not sure whether this is related to this issue, but it shouldn't be the second-largest process by memory usage:

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 1638 powerman   20   0 2514M 1476M 77384 S  0.5 18.6  1h44:02 /usr/bin/firefox
 9801 root       20   0 12.5G  432M  2764 S  0.0  5.4 10:53.37 /home/powerman/gocode/bin/node_exporter

I'm running node_exporter as a service under runit using this run file:

#!/bin/sh
exec 2>&1
exec /home/powerman/gocode/bin/node_exporter -web.listen-address 127.0.0.1:9100 \
    -collectors.enabled "conntrack,diskstats,entropy,filefd,filesystem,loadavg,mdadm,meminfo,netdev,netstat,sockstat,stat,textfile,time,uname,version,vmstat,runit,tcpstat"

I'm now using node_exporter d890b63 (about a month old), and at a glance the later commits don't mention anything related to this issue, so I suppose it's still relevant. I will try to avoid restarting it for the next couple of days in case you need more details from the live process, but if it continues to eat RAM I may have to restart it.
