Closed
Description
Expected behaviour
PMDA should run with "best performance" out of the box.
At a minimum, the docs should be clear about what one has to do.
The dask docs recommend distributed for GIL-bound code, so in the following I use it as the default, but we should benchmark the schedulers on a single machine:
- thread
- processes
- distributed
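A rough sketch of how such a comparison could be set up, not a PMDA benchmark: `busy_sum` is a hypothetical pure-Python stand-in for a GIL-bound per-block analysis, and `time_scheduler` is an illustrative helper. Only the single-machine schedulers are timed here by passing `scheduler=` to `dask.compute`; the distributed scheduler would additionally need a running `Client`.

```python
import time

import dask
from dask import delayed


def busy_sum(n):
    """Pure-Python, GIL-bound stand-in for a per-block analysis (hypothetical)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_scheduler(scheduler, n_tasks=4, n=200_000):
    """Run n_tasks delayed calls under the named scheduler.

    Returns (list of results, wall-clock seconds).
    """
    tasks = [delayed(busy_sum)(n) for _ in range(n_tasks)]
    start = time.perf_counter()
    results = dask.compute(*tasks, scheduler=scheduler)
    elapsed = time.perf_counter() - start
    return list(results), elapsed


if __name__ == "__main__":
    # "synchronous" serves as the serial baseline for the other two.
    for name in ("synchronous", "threads", "processes"):
        _, seconds = time_scheduler(name)
        print(f"{name:12s} {seconds:.3f} s")
```

Timings depend on the machine and on task size, so the numbers are only meaningful relative to the synchronous baseline measured in the same run.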
For using distributed:
import dask
from dask.distributed import Client
client = Client()
import pmda
...
dask.config.set(scheduler='distributed')
Actual behaviour
Now that PMDA uses dask's preferred way to select a scheduler (#66), we default to dask's default scheduler. For delayed(), this is the threaded scheduler, which does not work well with our Python-based code: the GIL serializes the tasks, so I expect performance to be poor out of the box.