test_creating_multiple_universe_without_offset fails spuriously #4540

@orbeckst

Description

Expected behavior

Running our test suite should not fail randomly.

Actual behavior

The test suite occasionally fails because test_creating_multiple_universe_without_offset hangs and hits the 200 s timeout (see the log below).

=================================== FAILURES ===================================
________________ test_creating_multiple_universe_without_offset ________________
[gw3] linux -- Python 3.12.2 /home/runner/micromamba/envs/mda/bin/python3.12

temp_xtc = PosixPath('/tmp/pytest-of-runner/pytest-0/popen-gw3/test_creating_multiple_univers0/testing.xtc')
ncopies = 3

    def test_creating_multiple_universe_without_offset(temp_xtc, ncopies=3):
        #  test if they can be created without generating
        #  the offset simultaneously.
        #  The tested XTC file is way too short to induce a race scenario but the
        #  test is included as documentation for the scenario that used to create
        #  a problem (see PR #3375 and issues #3230, #1988)
    
        args = (GRO, str(temp_xtc))
>       with multiprocessing.Pool(2) as p:

/home/runner/work/mdanalysis/mdanalysis/testsuite/MDAnalysisTests/parallelism/test_multiprocessing.py:159: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/runner/micromamba/envs/mda/lib/python3.12/multiprocessing/pool.py:739: in __exit__
    self.terminate()
/home/runner/micromamba/envs/mda/lib/python3.12/multiprocessing/pool.py:657: in terminate
    self._terminate()
/home/runner/micromamba/envs/mda/lib/python3.12/multiprocessing/util.py:227: in __call__
    res = self._callback(*self._args, **self._kwargs)
/home/runner/micromamba/envs/mda/lib/python3.12/multiprocessing/pool.py:732: in _terminate_pool
    p.join()
/home/runner/micromamba/envs/mda/lib/python3.12/multiprocessing/process.py:149: in join
    res = self._popen.wait(timeout)
/home/runner/micromamba/envs/mda/lib/python3.12/multiprocessing/popen_fork.py:43: in wait
    return self.poll(os.WNOHANG if timeout == 0.0 else 0)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <multiprocessing.popen_fork.Popen object at 0x7f1dd0786420>, flag = 0

    def poll(self, flag=os.WNOHANG):
        if self.returncode is None:
            try:
>               pid, sts = os.waitpid(self.pid, flag)
E               Failed: Timeout >200.0s

/home/runner/micromamba/envs/mda/lib/python3.12/multiprocessing/popen_fork.py:27: Failed
----------------------------- Captured stdout call -----------------------------
+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++
~~~~~~~~~~~~~~~~~~~~~ Stack of Thread-2 (139766194525760) ~~~~~~~~~~~~~~~~~~~~~~
  File "/home/runner/micromamba/envs/mda/lib/python3.12/threading.py", line 1030, in _bootstrap
    self._bootstrap_inner()
  File "/home/runner/micromamba/envs/mda/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
    self.run()
  File "/home/runner/micromamba/envs/mda/lib/python3.12/site-packages/tqdm/_monitor.py", line 60, in run
    self.was_killed.wait(self.sleep_interval)
  File "/home/runner/micromamba/envs/mda/lib/python3.12/threading.py", line 655, in wait
    signaled = self._cond.wait(timeout)
  File "/home/runner/micromamba/envs/mda/lib/python3.12/threading.py", line 359, in wait
    gotit = waiter.acquire(True, timeout)
~~~~~~~~~~~~~~~~~~~~~ Stack of <unknown> (139766989321792) ~~~~~~~~~~~~~~~~~~~~~
  File "/home/runner/micromamba/envs/mda/lib/python3.12/site-packages/execnet/gateway_base.py", line 361, in _perform_spawn
    reply.run()
  File "/home/runner/micromamba/envs/mda/lib/python3.12/site-packages/execnet/gateway_base.py", line 296, in run
    self._result = func(*args, **kwargs)
  File "/home/runner/micromamba/envs/mda/lib/python3.12/site-packages/execnet/gateway_base.py", line 1049, in _thread_receiver
    msg = Message.from_io(io)
  File "/home/runner/micromamba/envs/mda/lib/python3.12/site-packages/execnet/gateway_base.py", line 507, in from_io
    header = io.read(9)  # type 1, channel 4, payload 4
  File "/home/runner/micromamba/envs/mda/lib/python3.12/site-packages/execnet/gateway_base.py", line 474, in read
    data = self._read(numbytes - len(buf))
+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++

Solution

Looking at the test


    def test_creating_multiple_universe_without_offset(temp_xtc, ncopies=3):
        #  test if they can be created without generating
        #  the offset simultaneously.
        #  The tested XTC file is way too short to induce a race scenario but the
        #  test is included as documentation for the scenario that used to create
        #  a problem (see PR #3375 and issues #3230, #1988)

        args = (GRO, str(temp_xtc))
        with multiprocessing.Pool(2) as p:
            universes = [p.apply_async(mda.Universe, args) for i in range(ncopies)]
            universes = [universe.get() for universe in universes]

        assert_equal(universes[0].trajectory._xdr.offsets,
                     universes[1].trajectory._xdr.offsets)

we should really first understand whether our underlying code is broken or the test is inherently flaky.
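
One possible way to gather evidence either way is to rerun the single test many times under a watchdog and count how often it hangs. The following is only a rough sketch using the standard library: `count_hangs` and `TEST_CMD` are hypothetical names, the repeat count and timeout are arbitrary, and the test path is taken from the traceback above, relative to the repository root.

```python
import subprocess
import sys

# Hypothetical command, meant to be run from the repository root.
TEST_CMD = [
    sys.executable, "-m", "pytest",
    "testsuite/MDAnalysisTests/parallelism/test_multiprocessing.py",
    "-k", "test_creating_multiple_universe_without_offset",
]


def count_hangs(cmd, repeats=20, timeout=60.0):
    """Run cmd `repeats` times and return how many runs exceeded `timeout` seconds."""
    hangs = 0
    for _ in range(repeats):
        try:
            # Output is discarded; only whether the run finishes in time matters here.
            subprocess.run(
                cmd,
                timeout=timeout,
                check=False,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before re-raising the timeout.
            hangs += 1
    return hangs
```

Calling `count_hangs(TEST_CMD, repeats=100)` would then give a rough hang frequency. The CI failure occurred in a pytest-xdist worker (`[gw3]`), so appending `-n` with a worker count to the command may be needed to reproduce the same conditions; that is an assumption, not something the log proves.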

If the underlying code is broken (check issues #3230, #1988 and PR #3375) then we need to fix the code.

If this is an issue with the test then we could add a much shorter timeout using the pytest timeout plugin (which we are already using).

Current version of MDAnalysis

  • Which version are you using? (run python -c "import MDAnalysis as mda; print(mda.__version__)") develop 2.8.0-dev on GH CI
  • Which version of Python (python -V)? any tested (I think...)
  • Which operating system? Linux (macOS??)
