NFS server disconnection causes high host load and high node_exporter VIRT size

### Host operating system: output of `uname -a`
Linux  3.10.0-862.6.3.el7.x86_64 #1 SMP Tue Jun 26 16:32:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux


### node_exporter version: output of `node_exporter --version`


node_exporter, version 0.16.0 (branch: HEAD, revision: d42bd70f4363dced6b77d8fc311ea57b63387e4f)
  build user:       root@a67a9bc13a69
  build date:       20180515-15:52:42
  go version:       go1.9.6


### node_exporter command line flags

/usr/local/prometheus/node_exporter-0.16.0.linux-amd64/node_exporter


### Are you running node_exporter in Docker?

No, not in Docker

### What did you do that produced an error?
The server has an NFS volume mounted on a local directory. When the NFS server went down, the server's load average increased to about 330.
After I restarted the node_exporter process, everything returned to normal.


### What did you expect to see?
This should not happen. node_exporter should detect that the NFS server is down.
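
To illustrate the kind of detection meant here, below is a minimal sketch of a time-limited probe for a mount point. It is not how node_exporter works internally; the function name `mount_responds` and the 5-second default are assumptions made for illustration, and it relies on GNU coreutils `timeout` and `stat`. Whether a process blocked on a dead NFS server can be killed at all depends on the kernel and the mount options, so treat this as a best-effort check.

```shell
# Return 0 if a filesystem stat on the given path completes within the
# time limit, non-zero if it fails or the probe has to be killed.
mount_responds() {
  path=$1
  limit=${2:-5}   # seconds to wait before giving up (assumed default)
  # SIGKILL is sent on timeout; the result is discarded, only the
  # exit status matters.
  timeout -s KILL "$limit" stat -f -- "$path" >/dev/null 2>&1
}
```

For example, `mount_responds /mnt/nfs 5 || echo "mount not responding"`, where `/mnt/nfs` is a placeholder for the actual mount point.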

### What did you see instead?

> top - 16:41:00 up 103 days, 20:04,  2 users,  load average: 331.40, 330.96, 302.72
> Tasks: 222 total,   1 running, 221 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  1.4 us,  1.1 sy,  0.0 ni, 96.8 id,  0.7 wa,  0.0 hi,  0.0 si,  0.0 st
> KiB Mem : 65975208 total,   487048 free, 33177676 used, 32310484 buff/cache
> KiB Swap: 16777212 total, 16654076 free,   123136 used. 31770544 avail Mem
> 
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 19263 root      20   0       0      0      0 D   0.0  0.0   0:00.00 172.16.89.197-m
> 21651 nobody    20   0 10.924g 200524   5236 D   0.0  0.3   0:25.59 node_exporter

As shown above, the VIRT of node_exporter is about 10.9g and the server's load average rose to 331, both of which are very high.
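
`top` prints one line per process, while the Linux load average counts every task (thread) that is runnable or in uninterruptible sleep (state D), so one process with many blocked threads can account for most of the load. A minimal sketch for counting D-state threads per command name follows; the function name `count_d_threads` is illustrative, and the usage line assumes a procps `ps` that supports `-L`.

```shell
# Count threads in uninterruptible sleep (state D), grouped by command name.
# Reads "STATE COMMAND" pairs on stdin, so it works on live ps output or a
# saved file. Only the first word of the command name is used.
count_d_threads() {
  awk '$1 ~ /^D/ { n[$2]++ }
       END { for (c in n) printf "%d %s\n", n[c], c }' | sort -rn
}

# Usage (one line per thread, headers suppressed):
#   ps -eL -o state= -o comm= | count_d_threads
```

If node_exporter dominates this list while the NFS server is unreachable, the load is coming from its blocked threads rather than from CPU work.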

After I killed the node_exporter process with `kill -9`, the load dropped quickly.


> top - 16:41:06 up 103 days, 20:04,  2 users,  load average: 305.02, 325.50, 301.11
> Tasks: 223 total,   2 running, 221 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  2.2 us,  1.2 sy,  0.0 ni, 94.8 id,  1.8 wa,  0.0 hi,  0.0 si,  0.0 st
> KiB Mem : 65975208 total,   919604 free, 32944664 used, 32110940 buff/cache
> KiB Swap: 16777212 total, 16654076 free,   123136 used. 32004128 avail Mem
> 
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 19263 root      20   0       0      0      0 D   0.0  0.0   0:00.00 172.16.89.197-m


About 4 minutes later, everything was back to normal:

```
top -b -n 1 | awk '
  NR <= 7   { print; next }       # keep the summary lines and column header
  $8 == "D" { print; count++ }    # column 8 is the process state (S)
  END       { print "Total status D: " (count + 0) }'
```

> top - 16:45:40 up 103 days, 20:09,  2 users,  load average: 5.06, 132.79, 225.54
> Tasks: 221 total,   1 running, 220 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  1.4 us,  1.2 sy,  0.0 ni, 95.4 id,  1.9 wa,  0.0 hi,  0.0 si,  0.1 st
> KiB Mem : 65975208 total,   841576 free, 32951132 used, 32182500 buff/cache
> KiB Swap: 16777212 total, 16654076 free,   123136 used. 31997280 avail Mem
> 
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 19263 root      20   0       0      0      0 D   0.0  0.0   0:00.00 172.16.89.197-m
> Total status D: 1


The VIRT size (VSZ) of the restarted node_exporter process:
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> nobody   25360  0.4  0.0 1263600 12488 ?       Ssl  16:41   0:46 /usr/local/prometheus/node_exporter-0.16.0.linux-amd64/node_exporter
