Skip to content

Conversation

@pentschev
Copy link
Member

By marking all UCX tests as asyncio we ensure that they all use the fixture that properly releases UCX resources once the event loop is closed.

This should fix CI failures that have appeared in rapidsai/ucx-py#804, where a subset of Distributed tests run.

By marking all UCX tests as asyncio we ensure that they all use the
fixture that properly releases UCX resources once the event loop is
closed.
@jakirkham
Copy link
Member

Looks like distributed.comm.tests.test_ucx.test_stress is failing on CI:

    def reset():
        """Resets the UCX library by shutting down all of UCX.
    
        The library is initiated at next API call.
        """
        global _ctx
        if _ctx is not None:
            weakref_ctx = weakref.ref(_ctx)
            _ctx = None
            gc.collect()
            if weakref_ctx() is not None:
                msg = (
                    "Trying to reset UCX but not all Endpoints and/or Listeners "
                    "are closed(). The following objects are still referencing "
                    "ApplicationContext: "
                )
                for o in gc.get_referrers(weakref_ctx()):
                    msg += "\n  %s" % str(o)
>               raise UCXError(msg)
E               ucp.exceptions.UCXError: Trying to reset UCX but not all Endpoints and/or Listeners are closed(). The following objects are still referencing ApplicationContext: 
E                 {'_ep': <ucp._libs.ucx_api.UCXEndpoint object at 0x7fe4d2cb5900>, '_ctx': <ucp.core.ApplicationContext object at 0x7fe4d2cb6460>, '_send_count': 1714, '_recv_count': 214, '_finished_recv_count': 214, '_shutting_down_peer': False, '_close_after_n_recv': None, '_tags': {'msg_send': 17641360423509091357, 'msg_recv': 858844837446935544, 'ctrl_send': 1644723418746821278, 'ctrl_recv': 523734172212257047}}

@pentschev
Copy link
Member Author

Thanks @jakirkham , should be fixed now. Failing test shouldn't be related.

cc @quasiben

@jakirkham
Copy link
Member

Thanks Peter! 😄

There seem to be other test failures. However they don't appear to be related to UCX (and therefore not related to the changes here). Going to go ahead and merge

@pentschev
Copy link
Member Author

Thanks for reviewing and merging, John!

@pentschev pentschev deleted the update-ucx-asyncio-tests branch November 2, 2021 20:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants