Incorrect arguments to Worker/LocalCluster gives a confusing error #4640

@gjoseph92

On 2.30.0, passing an invalid argument to Worker raised an error that told you what you did wrong. This mistake is easy to make via LocalCluster/Client (for example, passing num_workers instead of n_workers, which is exactly what I mistyped):

# used to get this
In [1]: import distributed

In [2]: distributed.__version__
Out[2]: '2.30.0'

In [3]: distributed.Worker("tcp://localhost", 8000, foo=1)
TypeError: __init__() got an unexpected keyword argument 'foo'

On main (likely anything after #4365), the error message is less helpful:

# now get this
In [1]: import distributed

In [2]: distributed.__version__
Out[2]: '2021.03.0+36.g8c2c6738.dirty'

In [3]: distributed.Worker("tcp://localhost", 8000, foo=1)
TypeError: object.__init__() takes exactly one argument (the instance to initialize)

This is coming from Server.__init__ here:

super().__init__(**kwargs)

because now that Scheduler uses multiple inheritance, we wanted the base classes to be able to play nicely with each other and propagate kwargs up the hierarchy. However, because none of the base classes "bottoms out" by assuming that it is, in fact, the base of the MRO, we lose the opportunity to decide when some kwargs are actually wrong, versus meant for the class above us.
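A minimal sketch of the effect, using hypothetical stand-in classes (Base, Middle, and Leaf are made up for illustration and are not distributed's real classes). Every level forwards its leftover kwargs, so the stray foo is only rejected by object.__init__, which does not say which keyword was the problem:

```python
# Hypothetical stand-ins for a Server / ServerNode / Worker style hierarchy.
# These are NOT the real distributed classes.
class Base:
    def __init__(self, **kwargs):
        # Cooperative style: forward whatever is left to the next class in the MRO.
        super().__init__(**kwargs)


class Middle(Base):
    def __init__(self, port=None, **kwargs):
        self.port = port
        super().__init__(**kwargs)


class Leaf(Middle):
    def __init__(self, address=None, **kwargs):
        self.address = address
        super().__init__(**kwargs)


message = None
try:
    # 'foo' is not accepted by any class in the hierarchy.
    Leaf(address="tcp://localhost", port=8000, foo=1)
except TypeError as exc:
    # The error is raised by object.__init__, far from the caller's mistake.
    message = str(exc)

print(message)
```

The printed message comes from object.__init__ and does not mention foo, which matches the symptom reported above.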

FWIW, Server does (at the moment) appear to be a safe place to neither take **kwargs nor call super().__init__, since it comes last in the MRO (just before object) for all of its subclasses:

In [8]: distributed.Worker.__mro__
Out[8]: 
(distributed.worker.Worker,
 distributed.node.ServerNode,
 distributed.core.Server,
 object)

In [9]: distributed.Scheduler.__mro__
Out[9]: 
(distributed.scheduler.Scheduler,
 distributed.scheduler.SchedulerState,
 distributed.node.ServerNode,
 distributed.core.Server,
 object)

In [10]: distributed.node.ServerNode.__mro__
Out[10]: (distributed.node.ServerNode, distributed.core.Server, object)

However, it could be brittle to assume this will always be the case. And since all we really care about is just presenting a better error message, maybe there's a different approach to take here?
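One possible direction, sketched under assumptions rather than proposed as the fix: the base class could check at runtime whether it really is last in the MRO before object, raise a descriptive TypeError for leftover kwargs only in that case, and keep forwarding otherwise. All class names here (Base, Leaf, Mixin, Combined) and the error wording are hypothetical and not part of distributed:

```python
class Base:
    """Hypothetical stand-in for a base such as Server; not the real class."""

    def __init__(self, **kwargs):
        mro = type(self).__mro__
        # The class that super().__init__ would dispatch to from Base.
        next_cls = mro[mro.index(Base) + 1]
        if next_cls is object:
            # Base really is the end of the chain, so leftovers are mistakes.
            if kwargs:
                names = ", ".join(repr(name) for name in sorted(kwargs))
                raise TypeError(
                    f"{type(self).__name__}() got unexpected keyword argument(s): {names}"
                )
            super().__init__()
        else:
            # Some other class follows Base in the MRO; keep cooperating.
            super().__init__(**kwargs)


class Leaf(Base):
    def __init__(self, address=None, **kwargs):
        self.address = address
        super().__init__(**kwargs)


class Mixin:
    def __init__(self, extra=None, **kwargs):
        self.extra = extra
        super().__init__(**kwargs)


class Combined(Leaf, Mixin):
    """MRO: Combined, Leaf, Base, Mixin, object."""


error_text = None
try:
    Leaf(address="tcp://localhost", foo=1)
except TypeError as exc:
    error_text = str(exc)  # names the offending keyword

combined = Combined(address="tcp://localhost", extra=1)  # kwargs still propagate
```

With this shape, Leaf reports the offending keyword by name, while Combined still passes extra through Base to Mixin. The limitation is that a class placed after Base in the MRO would still need its own check to avoid the unhelpful object.__init__ error.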

There is already some discussion of this in https://github.com/dask/distributed/pull/4365/files/2212edc11b424c17f2f656eb73b2ca206742705a#r557630488

cc @jakirkham

Environment:

  • Dask version: bf9ddab
  • Python version: 3.9
  • Operating System: macOS
  • Install method (conda, pip, source): source
