updated docs with explicit scheduler setting example #78

@orbeckst

Expected behaviour

PMDA should run with "best performance" out of the box.

At a minimum, the docs should make clear what one has to do.

The dask docs recommend distributed for GIL-bound code, so I use it as the default in the following, but we should benchmark the schedulers that can run on a single machine:

  • threads
  • processes
  • distributed
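A benchmark along these lines could start from the sketch below. It is not PMDA code: `busy` is a hypothetical pure-Python stand-in for a GIL-bound per-block analysis, and `time_scheduler` is an illustrative helper. It only covers the threads and processes schedulers; timing distributed would additionally need a running `Client`.

```python
import time

import dask
from dask import delayed


def busy(n):
    """Hypothetical GIL-bound workload: a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_scheduler(scheduler, n_tasks=4, n=200_000):
    """Run n_tasks copies of busy(n) with the named scheduler.

    Returns (elapsed wall time in seconds, tuple of task results).
    """
    tasks = [delayed(busy)(n) for _ in range(n_tasks)]
    start = time.perf_counter()
    results = dask.compute(*tasks, scheduler=scheduler)
    return time.perf_counter() - start, results


if __name__ == "__main__":
    # The guard matters for "processes": worker processes may re-import
    # this module when they start.
    for name in ("threads", "processes"):
        elapsed, _ = time_scheduler(name)
        print(f"{name:>10s}: {elapsed:.3f} s")
```

The task size and count are arbitrary; for a meaningful comparison they would need to be scaled to something resembling a real trajectory analysis, since process start-up overhead can dominate small workloads.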

To use distributed, create a Client before running PMDA:

from dask.distributed import Client
client = Client()

import pmda
...

or configure the scheduler explicitly:

import dask

dask.config.set(scheduler='distributed')
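For the docs, it may help to show that Dask accepts the scheduler choice at several levels. The following is a minimal sketch with plain `dask.delayed` rather than a PMDA analysis class; `square`, `build_graph`, `run_with_keyword`, and `run_with_context` are hypothetical names used only for illustration.

```python
import dask
from dask import delayed


def square(x):
    return x * x


def build_graph(n=5):
    """Build a small delayed graph: the sum of squares of 0..n-1."""
    return delayed(sum)([delayed(square)(i) for i in range(n)])


def run_with_keyword(scheduler):
    """Select the scheduler for a single compute() call."""
    return build_graph().compute(scheduler=scheduler)


def run_with_context(scheduler):
    """Select the scheduler for everything inside a with-block."""
    with dask.config.set(scheduler=scheduler):
        return build_graph().compute()


if __name__ == "__main__":
    # "processes" sidesteps the GIL on a single machine without a Client.
    print(run_with_keyword("processes"))
    print(run_with_context("processes"))
```

Calling `dask.config.set(scheduler=...)` outside a `with` block applies the setting globally for the rest of the session, which is the form used in the snippet above.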

Actual behaviour

Since PMDA uses Dask's preferred way to select a scheduler (#66), we now fall back to Dask's default scheduler. For delayed(), this is the threads scheduler, which does not work well with our Python-based code: the GIL serializes the tasks, so I expect performance to be poor out of the box.
