Description
I've just noticed that Prometheus has failed to get metrics from node_exporter for the last 2 days (my workstation's uptime is 3 days, so node_exporter actually worked for only about a day after the last reboot). The node_exporter log complains about "too many open files", and it really has hit the 1024 open file limit:
# ls /proc/$(pidof node_exporter)/fd | wc -l
1023
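To compare the descriptor count against the limit that actually applies to the process (rather than assuming 1024), something like the following sketch could be used. It assumes a Linux `/proc` filesystem; `fd_usage` is a hypothetical helper name, and it is run here against the current shell (`$$`) only so that it works without node_exporter running.

```shell
#!/bin/sh
# Print "<open fds> / <soft limit>" for a given PID (Linux /proc assumed).
fd_usage() {
    pid="$1"
    # Count the entries in the process's fd directory.
    open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # The limits line looks like: "Max open files  <soft>  <hard>  files",
    # so the soft limit is the fourth whitespace-separated field.
    soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "$open / $soft"
}

# Demo against the current shell; replace with "$(pidof node_exporter)".
fd_usage "$$"
```

Reading another user's process this way generally needs root, as in the commands above.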
But most of them are sockets:
# ls -l /proc/$(pidof node_exporter)/fd | grep socket | wc -l
1018
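For a fuller breakdown than grepping for "socket", the descriptor targets can be grouped by kind. This is only a sketch assuming Linux `/proc`; `fd_summary` is a hypothetical helper, and every target that is a filesystem path is lumped together as `file`.

```shell
#!/bin/sh
# Count a process's open descriptors by kind (socket, pipe, anon_inode, file).
fd_summary() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # Each fd is a symlink such as "socket:[12345]" or "/dev/null".
        readlink "$fd" 2>/dev/null
    done |
        # Strip the ":[...]" suffix and collapse all paths into "file".
        sed -e 's/:\[.*\]$//' -e 's|^/.*|file|' |
        sort | uniq -c | sort -rn
}

# Demo against the current shell; replace with "$(pidof node_exporter)".
fd_summary "$$"
```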
And I believe all of them except one are leaked, because there are no related connections:
# ss -anp | grep node_exporter
tcp LISTEN 0 128 127.0.0.1:9100 *:* users:(("node_exporter",pid=9801,fd=4))
I also noticed that node_exporter uses too much memory. I'm not sure whether this is related to this issue, but it shouldn't be the second-largest process by memory usage in top:
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
1638 powerman 20 0 2514M 1476M 77384 S 0.5 18.6 1h44:02 /usr/bin/firefox
9801 root 20 0 12.5G 432M 2764 S 0.0 5.4 10:53.37 /home/powerman/gocode/bin/node_exporter
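To see whether the memory growth tracks the descriptor count, both could be sampled together over time. A minimal sketch, assuming Linux `/proc`; `sample` is a hypothetical helper name and the output format is made up for illustration.

```shell
#!/bin/sh
# Print one line with a Unix timestamp, the open fd count, and resident memory in kB.
sample() {
    pid="$1"
    fds=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l | tr -d ' ')
    # VmRSS in /proc/<pid>/status is reported in kB.
    rss=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status")
    echo "$(date +%s) fds=$fds rss_kb=$rss"
}

# Demo against the current shell; replace with "$(pidof node_exporter)".
sample "$$"
```

Running it from a loop such as `while sleep 60; do sample "$(pidof node_exporter)"; done` would give a rough timeline to compare the two values.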
I'm running node_exporter as a service under runit using this run file:
#!/bin/sh
exec 2>&1
exec /home/powerman/gocode/bin/node_exporter -web.listen-address 127.0.0.1:9100 \
-collectors.enabled "conntrack,diskstats,entropy,filefd,filesystem,loadavg,mdadm,meminfo,netdev,netstat,sockstat,stat,textfile,time,uname,version,vmstat,runit,tcpstat"
I'm now using node_exporter d890b63 (about a month old), and at a glance the later commits don't mention anything related to this issue, so I suppose it's still relevant. I will try to avoid restarting it for the next couple of days in case you need more details from the live process, but if it continues to eat RAM I may have to restart it.