
Improve dispatch queue usage #69

Merged: AkihiroSuda merged 1 commit into lima-vm:master from nirs:queues on Nov 20, 2024
Conversation

@nirs (Member) commented Nov 8, 2024

The way we used dispatch queues was not optimal, leading to the
creation of too many threads, which can cause unneeded context
switching and lower performance.

We now use 2 queues:

  • vms_queue - a concurrent queue used to run accepted connection
    threads. This queue must be concurrent since we want to process
    requests from VMs in parallel.

  • host_queue - a serial queue for processing events from vmnet. It is
    used for the start and stop callbacks and for the events callback.

With this change we use a consistent number of threads: the main
thread, 1 vmnet thread, and 1 thread per connected VM.

Testing shows no change in performance.

| hardware | os     | test          | before (Gb/s) | after (Gb/s) |
|----------|--------|---------------|---------------|--------------|
| M1 Pro   | 14.7.1 | host-to-vm    |         3.252 |        3.219 |
| M1 Pro   | 14.7.1 | host-to-vm-2  |         1.902 |        1.880 |
| M1 Pro   | 14.7.1 | vm-to-vm      |         1.493 |        1.507 |
| M2 Max   | 15.1   | host-to-vm    |         2.335 |        2.359 |
| M2 Max   | 15.1   | host-to-vm-2  |         1.798 |        1.802 |
| M2 Max   | 15.1   | vm-to-vm      |         1.275 |        1.266 |

Results are the average send throughput over 10 runs, using the
performance tests from #66.

We have a performance regression on macOS 15.1. With 14.7 on the same
M2 Max machine it was up to 1.2 times faster; with 15.1 it is up to 1.4
times slower.

@AkihiroSuda (Member)

Thanks, do you have a benchmark result?

@nirs (Member, Author) commented Nov 8, 2024

> Thanks, do you have a benchmark result?

I used both configurations when collecting data for #58 and I don't remember any difference. I can do another run on macOS 15 and M2 Max.

@nirs (Member, Author) commented Nov 8, 2024

Rebased. I'll do a quick test run to validate the lack of performance improvement and update the commit message.

The way we used dispatch queues was not optimal, leading to the
creation of too many threads, which can cause unneeded context
switching and lower performance.

We now use 2 queues:

- vms_queue - a concurrent queue used to run accepted connection
  threads. This queue must be concurrent since we want to process
  requests from VMs in parallel.

- host_queue - a serial queue for processing events from vmnet. It is
  used for the start and stop callbacks and for the events callback.

With this change we use a consistent number of threads: the main
thread, 1 vmnet thread, and 1 thread per connected VM.

Testing shows no change in performance.

| hardware | os     | test          | before (Gb/s) | after (Gb/s) |
|----------|--------|---------------|---------------|--------------|
| M1 Pro   | 14.7.1 | host-to-vm    |         3.252 |        3.219 |
| M1 Pro   | 14.7.1 | host-to-vm-2  |         1.902 |        1.880 |
| M1 Pro   | 14.7.1 | vm-to-vm      |         1.493 |        1.507 |
| M2 Max   | 15.1   | host-to-vm    |         2.335 |        2.359 |
| M2 Max   | 15.1   | host-to-vm-2  |         1.798 |        1.802 |
| M2 Max   | 15.1   | vm-to-vm      |         1.275 |        1.266 |

Results are the average send throughput over 10 runs, using the
performance tests from lima-vm#66.

We have a performance regression on macOS 15.1. With 14.7 on the same
M2 Max machine it was up to 1.2 times faster; with 15.1 it is up to 1.4
times slower.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
@nirs nirs marked this pull request as ready for review November 9, 2024 20:23
@nirs (Member, Author) commented Nov 9, 2024

@AkihiroSuda added test results, confirming that there is no improvement, and also showing that we have a performance regression on macOS 15.1 (not related to this change).

@AkihiroSuda (Member)

> @AkihiroSuda added test results, confirming that there is no improvement, and also showing that we have a performance regression in macOS 15.1 (not related to this change).

What is the benefit of this PR?

@nirs (Member, Author) commented Nov 18, 2024

> What is the benefit of this PR?

  • Avoiding unneeded queues for initialization and shutdown - we use the single serial host queue.
  • Avoiding the unwanted case where additional threads are created for serving vmnet traffic. Testing shows this is rare enough that we don't see any effect on performance.
  • Improving the documentation and structure of the program.

We can also use the serial queue later to simplify locking - if we dispatch add/remove connection to the serial queue, no locking is needed. However, this does not solve the need for locking when forwarding packets from one VM to other VMs, since that is done in one of the VM queues.

I've been using this for a few months without any issues.

@AkihiroSuda AkihiroSuda requested a review from a team November 19, 2024 07:41
@AkihiroSuda AkihiroSuda added this to the v1.2.0 milestone Nov 20, 2024
@AkihiroSuda (Member) left a comment

Thanks

@AkihiroSuda AkihiroSuda merged commit 87ad870 into lima-vm:master Nov 20, 2024
@nirs nirs deleted the queues branch November 20, 2024 17:11
