Skip to content

"Bug" introduced in 2022.04.2 requiring explicit cluster close #6255

@evamaxfield

Description

@evamaxfield

What happened:

If you attempt to exit a Python process while a LocalCluster (threaded) is running, an error is raised.

What you expected to happen:

No error raises, the Python process exists with no error exit code.

Minimal Complete Verifiable Example:

Step-by-step:
distributed==2022.04.1
distributed==2022.04.2 (bug)
distributed==2022.04.2 (fixed)

2022.04.1

pip install distributed==2022.04.1
python
from distributed import LocalCluster
cluster = LocalCluster(processes=False)
exit()

no error exit code

2022.04.2 (bug)

pip install distributed==2022.04.2
python
from distributed import LocalCluster
cluster = LocalCluster(processes=False)
exit()
raised error / exit code (expand to see full traceback):
cannot schedule new futures after interpreter shutdown
Traceback (most recent call last):
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 759, in wrapper
    return await func(*args, **kwargs)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/worker.py", line 1604, in close
    await to_thread(_close)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/asyncio/base_events.py", line 819, in run_in_executor
    executor.submit(func, *args), loop=self)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/concurrent/futures/thread.py", line 169, in submit
    raise RuntimeError('cannot schedule new futures after '
RuntimeError: cannot schedule new futures after interpreter shutdown
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/deploy/spec.py", line 654, in close_clusters
    cluster.close(timeout=10)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/deploy/cluster.py", line 194, in close
    return self.sync(self._close, callback_timeout=timeout)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 318, in sync
    return sync(
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 385, in sync
    raise exc.with_traceback(tb)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 358, in f
    result = yield future
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/asyncio/tasks.py", line 479, in wait_for
    return fut.result()
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/deploy/spec.py", line 420, in _close
    assert w.status == Status.closed, w.status
AssertionError: Status.closing
Exception ignored in: <coroutine object Worker.close at 0x7f8e31f3cdc0>
Traceback (most recent call last):
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 759, in wrapper
    return await func(*args, **kwargs)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 779, in __exit__
    frame = stack[self.unroll_stack]
IndexError: list index out of range
Task was destroyed but it is pending!
task: <Task pending name='Task-68' coro=<Worker.close() done, defined at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py:757> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f8e31f37eb0>()]> cb=[IOLoop.add_future.<locals>.<lambda>() at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/tornado/ioloop.py:688]>
Exception ignored in: <coroutine object Worker.close at 0x7f8e31f3cd40>
Traceback (most recent call last):
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 759, in wrapper
    return await func(*args, **kwargs)
  File "/Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/utils.py", line 779, in __exit__
    frame = stack[self.unroll_stack]
IndexError: list index out of range
Task was destroyed but it is pending!
task: <Task pending name='Task-20' coro=<InProcListener._listen() running at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/comm/inproc.py:265> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f8e31e6cfa0>()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-30' coro=<Worker.handle_scheduler() running at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/distributed/worker.py:169> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f8e31e74730>()]> cb=[IOLoop.add_future.<locals>.<lambda>() at /Users/maxfield/miniconda3/envs/test/lib/python3.9/site-packages/tornado/ioloop.py:688]>

2022.04.2 (fixed)

pip install distributed==2022.04.2
python
from distributed import LocalCluster
cluster = LocalCluster(processes=False)
cluster.close()
exit()

no error exit code

Anything else we need to know?:

Environment:

  • Python version: 3.9.12
  • Operating System: macOS 11.6.1
  • Install method (conda, pip, source): pip

🤷 Technically it's not really a bug but definitely caused a lot of my CRON jobs to be marked as failing because I wasn't explicitly closing my clusters. If this is the desired behavior, then great, but it didn't seem like this was reported in the changelog outside of this PR: #6091

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions