Throughput decreases and CPU usage increases significantly when more VMs are connected to the same socket_vmnet daemon.
Tested using:
- host: running `iperf3 -c ...`
- server VM: running `iperf3 -s`
- 1-4 additional idle VMs
| VMs | Bitrate (Gbits/sec) | CPU (%) |
|-----|---------------------|---------|
| 1   | 3.52                | 51.23   |
| 2   | 2.42                | 58.17   |
| 3   | 1.22                | 81.28   |
| 4   | 0.81                | 93.07   |
## Expected behavior

- Performance and CPU usage should remain the same when adding more idle VMs
- Packets sent to one VM should not be forwarded to other VMs
- Packets should be copied directly to the vz datagram socket in socket_vmnet, bypassing limactl
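The last expectation (a single copy from socket_vmnet straight into the vz datagram socket) can be sketched with a plain Unix datagram socketpair standing in for the real vz file descriptor. This is an illustration of the intended data path, not socket_vmnet's current code:

```python
import socket

# Hypothetical sketch of the expected direct path: socket_vmnet writes a
# frame straight into a vz-style unix datagram socket, with no limactl hop.
# A socketpair stands in for the real vz file descriptor here.
vmnet_side, vz_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

# fake ethernet frame: dst MAC, src MAC, payload
frame = b"\x02\x00\x00\x00\x00\x01" + b"\x02\x00\x00\x00\x00\x02" + b"payload"
vmnet_side.send(frame)                  # single copy into the datagram socket
assert vz_side.recv(2048) == frame      # guest side reads the whole frame
```

Datagram semantics preserve frame boundaries, so each `send` corresponds to exactly one frame on the guest side.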
## Why it happens

When multiple VMs are connected to socket_vmnet:

- every packet sent from the vmnet interface is forwarded to every VM, instead of only to the VM with the matching MAC address
- every packet sent from any VM is forwarded to all other VMs and to the vmnet interface, instead of to a single VM or only the vmnet interface
- when a packet is forwarded to a VM, it is copied to the vz datagram socket via a socket pair in limactl
- packets forwarded from limactl to vz are copied and processed in the guest, where they are dropped (since they are not addressed to that guest)
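The expected alternative to this broadcast behavior is MAC-based dispatch, as in a learning switch: unicast frames go only to the port that owns the destination MAC, while broadcast/multicast and unknown destinations still flood. A minimal sketch (hypothetical illustration, not socket_vmnet's actual code):

```python
# Minimal learning-switch sketch: forward unicast frames only to the port
# that owns the destination MAC; flood broadcast/multicast and unknown
# destinations. Hypothetical illustration, not socket_vmnet's actual code.

class LearningSwitch:
    def __init__(self):
        self.mac_to_port = {}  # learned source MAC -> port

    def forward(self, frame: bytes, in_port: int, all_ports: list[int]) -> list[int]:
        """Return the ports this frame should be copied to."""
        dst, src = frame[0:6], frame[6:12]
        self.mac_to_port[src] = in_port          # learn the sender's MAC
        if dst[0] & 1 or dst not in self.mac_to_port:
            # broadcast/multicast (low bit of first octet) or unknown
            # destination: flood to every other port
            return [p for p in all_ports if p != in_port]
        return [self.mac_to_port[dst]]           # unicast: one copy only
```

With such a table, an iperf3 frame destined for the server VM would be copied once, and the idle VMs would only ever see genuine broadcast traffic.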
## Flow when receiving a packet from vmnet with 4 VMs

```
host iperf3 -> host kernel -> vmnet -> socket_vmnet ->

  -> host kernel -> limactl -> host kernel -> vz -> guest kernel -> guest iperf3
  -> host kernel -> limactl -> host kernel -> vz -> guest kernel (drop)
  -> host kernel -> limactl -> host kernel -> vz -> guest kernel (drop)
  -> host kernel -> limactl -> host kernel -> vz -> guest kernel (drop)
```
## Flow when receiving a packet from a VM

```
guest iperf3 -> guest kernel -> vz -> host kernel -> limactl -> host kernel -> socket_vmnet ->

  -> vmnet -> host kernel -> host iperf3
  -> host kernel -> limactl -> host kernel -> vz -> guest kernel (drop)
  -> host kernel -> limactl -> host kernel -> vz -> guest kernel (drop)
  -> host kernel -> limactl -> host kernel -> vz -> guest kernel (drop)
```
## CPU usage for all VM processes

Looking at the CPU usage of socket_vmnet, the VM service processes, and the limactl processes, we see extreme CPU usage spent processing partly or completely unrelated packets:

| command          | %cpu  | related |
|------------------|-------|---------|
| com.apple.Virtua | 136.9 | yes     |
| limactl          | 121.4 | yes     |
| iperf3-darwin    | 13.7  | yes     |
| socket_vmnet     | 106.6 | partly  |
| kernel_task      | 39.1  | partly  |
| com.apple.Virtua | 83.5  | no      |
| com.apple.Virtua | 81.0  | no      |
| com.apple.Virtua | 77.4  | no      |
| limactl          | 67.1  | no      |
| limactl          | 65.6  | no      |
| limactl          | 62.9  | no      |
Total CPU usage:

| work      | %cpu  |
|-----------|-------|
| related   | 272.0 |
| partly    | 145.7 |
| unrelated | 437.5 |
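The totals are just the column sums from the 4-VM per-process table, grouped by the "related" column; a quick check:

```python
# Recompute the "Total CPU usage" rows from the 4-VM per-process table.
related   = [136.9, 121.4, 13.7]                    # server VM's vz + limactl, iperf3
partly    = [106.6, 39.1]                           # socket_vmnet, kernel_task
unrelated = [83.5, 81.0, 77.4, 67.1, 65.6, 62.9]    # three idle VMs' vz + limactl

print(f"related={sum(related):.1f} partly={sum(partly):.1f} unrelated={sum(unrelated):.1f}")
# prints: related=272.0 partly=145.7 unrelated=437.5
```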
Tested on M1 Pro (8 performance cores, 2 efficiency cores)
## Full results

### 1 VM

```
% caffeinate -d iperf3-darwin -c 192.168.105.58 -l 1m -t 10
Connecting to host 192.168.105.58, port 5201
[ 5] local 192.168.105.1 port 60990 connected to 192.168.105.58 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd RTT
[ 5] 0.00-1.00 sec 460 MBytes 3.86 Gbits/sec 0 8.00 MBytes 9ms
[ 5] 1.00-2.00 sec 421 MBytes 3.53 Gbits/sec 0 8.00 MBytes 9ms
[ 5] 2.00-3.00 sec 435 MBytes 3.65 Gbits/sec 0 8.00 MBytes 10ms
[ 5] 3.00-4.00 sec 411 MBytes 3.45 Gbits/sec 0 8.00 MBytes 14ms
[ 5] 4.00-5.00 sec 317 MBytes 2.66 Gbits/sec 0 8.00 MBytes 9ms
[ 5] 5.00-6.00 sec 430 MBytes 3.61 Gbits/sec 0 8.00 MBytes 9ms
[ 5] 6.00-7.00 sec 423 MBytes 3.55 Gbits/sec 0 8.00 MBytes 9ms
[ 5] 7.00-8.00 sec 433 MBytes 3.63 Gbits/sec 0 8.00 MBytes 10ms
[ 5] 8.00-9.00 sec 437 MBytes 3.67 Gbits/sec 0 8.00 MBytes 9ms
[ 5] 9.00-10.00 sec 430 MBytes 3.61 Gbits/sec 0 8.00 MBytes 9ms
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 4.10 GBytes 3.52 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 4.10 GBytes 3.52 Gbits/sec receiver
```

CPU usage:

```
CPU usage: 20.3% user, 31.19% sys, 48.77% idle
PID COMMAND %CPU #TH
49183 com.apple.Virtua 166.3 19/3
49173 limactl 100.0 16/2
48954 socket_vmnet 64.4 5/1
0 kernel_task 57.8 561/10
54694 iperf3-darwin 18.6 1/1
```
### 2 VMs

```
% caffeinate -d iperf3-darwin -c 192.168.105.58 -l 1m -t 10
Connecting to host 192.168.105.58, port 5201
[ 5] local 192.168.105.1 port 60997 connected to 192.168.105.58 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd RTT
[ 5] 0.00-1.00 sec 269 MBytes 2.26 Gbits/sec 0 8.00 MBytes 13ms
[ 5] 1.00-2.00 sec 299 MBytes 2.51 Gbits/sec 0 8.00 MBytes 14ms
[ 5] 2.00-3.00 sec 263 MBytes 2.21 Gbits/sec 0 8.00 MBytes 15ms
[ 5] 3.00-4.00 sec 296 MBytes 2.48 Gbits/sec 0 8.00 MBytes 13ms
[ 5] 4.00-5.00 sec 298 MBytes 2.50 Gbits/sec 0 8.00 MBytes 12ms
[ 5] 5.00-6.00 sec 284 MBytes 2.38 Gbits/sec 0 8.00 MBytes 13ms
[ 5] 6.00-7.00 sec 299 MBytes 2.51 Gbits/sec 0 8.00 MBytes 14ms
[ 5] 7.00-8.00 sec 298 MBytes 2.50 Gbits/sec 0 8.00 MBytes 14ms
[ 5] 8.00-9.00 sec 285 MBytes 2.39 Gbits/sec 0 8.00 MBytes 13ms
[ 5] 9.00-10.00 sec 298 MBytes 2.50 Gbits/sec 0 8.00 MBytes 12ms
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 2.82 GBytes 2.42 Gbits/sec 0 sender
[ 5] 0.00-10.01 sec 2.82 GBytes 2.42 Gbits/sec receiver
```

CPU usage:

```
CPU usage: 20.84% user, 37.32% sys, 41.83% idle
PID COMMAND %CPU #TH
49183 com.apple.Virtua 132.9 18/2
49173 limactl 92.2 16/3
48954 socket_vmnet 77.0 6/1
49905 com.apple.Virtua 74.2 18/1
49900 limactl 57.3 16/1
0 kernel_task 41.4 561/12
54259 iperf3-darwin 22.1 1/1
```
### 3 VMs

```
% caffeinate -d iperf3-darwin -c 192.168.105.58 -l 1m -t 10
Connecting to host 192.168.105.58, port 5201
[ 5] local 192.168.105.1 port 61004 connected to 192.168.105.58 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd RTT
[ 5] 0.00-1.00 sec 161 MBytes 1.35 Gbits/sec 0 2.91 MBytes 21ms
[ 5] 1.00-2.00 sec 138 MBytes 1.16 Gbits/sec 0 3.05 MBytes 17ms
[ 5] 2.00-3.00 sec 143 MBytes 1.20 Gbits/sec 0 3.15 MBytes 44ms
[ 5] 3.00-4.00 sec 139 MBytes 1.17 Gbits/sec 0 3.24 MBytes 19ms
[ 5] 4.00-5.00 sec 138 MBytes 1.16 Gbits/sec 0 3.30 MBytes 25ms
[ 5] 5.00-6.00 sec 144 MBytes 1.21 Gbits/sec 0 3.34 MBytes 22ms
[ 5] 6.00-7.00 sec 154 MBytes 1.29 Gbits/sec 0 3.37 MBytes 23ms
[ 5] 7.00-8.00 sec 145 MBytes 1.21 Gbits/sec 0 3.38 MBytes 15ms
[ 5] 8.00-9.00 sec 142 MBytes 1.19 Gbits/sec 0 3.39 MBytes 17ms
[ 5] 9.00-10.00 sec 154 MBytes 1.29 Gbits/sec 0 3.39 MBytes 23ms
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.42 GBytes 1.22 Gbits/sec 0 sender
[ 5] 0.00-10.01 sec 1.42 GBytes 1.22 Gbits/sec receiver
```

CPU usage:

```
CPU usage: 24.13% user, 57.13% sys, 18.72% idle
PID COMMAND %CPU #TH
49183 com.apple.Virtua 145.8 18/2
49173 limactl 120.5 15/2
48954 socket_vmnet 99.8 7/2
49905 com.apple.Virtua 82.9 18/1
50380 com.apple.Virtua 82.1 18/1
50375 limactl 63.4 16/1
49900 limactl 61.7 16/1
0 kernel_task 43.4 561/11
53677 iperf3-darwin 15.2 1/1
```
### 4 VMs

```
% caffeinate -d iperf3-darwin -c 192.168.105.58 -l 1m -t 10
Connecting to host 192.168.105.58, port 5201
[ 5] local 192.168.105.1 port 61014 connected to 192.168.105.58 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd RTT
[ 5] 0.00-1.00 sec 99.8 MBytes 837 Mbits/sec 0 2.90 MBytes 26ms
[ 5] 1.00-2.00 sec 98.3 MBytes 824 Mbits/sec 0 2.53 MBytes 25ms
[ 5] 2.00-3.00 sec 98.2 MBytes 823 Mbits/sec 0 3.03 MBytes 69ms
[ 5] 3.00-4.00 sec 99.7 MBytes 836 Mbits/sec 0 3.04 MBytes 30ms
[ 5] 4.00-5.00 sec 103 MBytes 860 Mbits/sec 0 3.03 MBytes 22ms
[ 5] 5.00-6.00 sec 91.2 MBytes 765 Mbits/sec 0 3.03 MBytes 27ms
[ 5] 6.00-7.00 sec 100 MBytes 842 Mbits/sec 0 3.03 MBytes 61ms
[ 5] 7.00-8.00 sec 102 MBytes 858 Mbits/sec 0 3.04 MBytes 33ms
[ 5] 8.00-9.00 sec 98.2 MBytes 823 Mbits/sec 0 3.04 MBytes 31ms
[ 5] 9.00-10.00 sec 103 MBytes 862 Mbits/sec 0 3.04 MBytes 28ms
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 993 MBytes 833 Mbits/sec 0 sender
[ 5] 0.00-10.02 sec 991 MBytes 830 Mbits/sec receiver
```

CPU usage:

```
CPU usage: 25.28% user, 67.77% sys, 6.93% idle
PID COMMAND %CPU #TH
49183 com.apple.Virtua 136.9 18/2
49173 limactl 121.4 15/2
48954 socket_vmnet 106.6 8/1
50380 com.apple.Virtua 83.5 18/2
50731 com.apple.Virtua 81.0 18/1
49905 com.apple.Virtua 77.4 18/2
50375 limactl 67.1 16/1
50726 limactl 65.6 16/1
49900 limactl 62.9 16/1
0 kernel_task 39.1 561/10
53126 iperf3-darwin 13.7 1
```