Skip to content

Reduce danger of accidental exception suppression #30

@Nikratio

Description

@Nikratio

Consider the following code. It consists of an echo server with a bug, and a unit test for the server:

#!/usr/bin/env python3

import pytest
import pytest_trio
import trio
import logging

log = logging.getLogger(__name__)
PORT = 12346
BUFSIZE = 16384
async def echo_server(server_sock):
    try:
        with server_sock:
            await server_sock.recv(BUFSIZE)
            raise RuntimeError('Damn!')
    except Exception as exc:
        await trio.sleep(0.1) # make problem deterministic
        log.exception('Things went wrong...')

async def echo_listener(nursery):
    with trio.socket.socket() as listen_sock:
        await listen_sock.bind(("127.0.0.1", PORT))
        listen_sock.listen()
        while True:
            server_sock, _ = await listen_sock.accept()
            nursery.start_soon(echo_server, server_sock)

async def parent():
    async with trio.open_nursery() as nursery:
        nursery.start_soon(echo_listener, nursery)

@pytest.mark.trio
async def test_server(nursery):
    nursery.start_soon(parent)
    await trio.sleep(0.5) # wait for server to listen
    with trio.socket.socket() as client_sock:
        await client_sock.connect(("127.0.0.1", PORT))
        await client_sock.send(b'hello, world')
        resp = await client_sock.recv(BUFSIZE)
        assert resp == b'hello, world'

As expected, the unittest fails - but the root cause is nowhere to be seen:

$ python3 -m pytest tests/lograce.py 
==================================== test session starts =====================================
platform linux -- Python 3.6.4, pytest-3.3.2, py-1.5.2, pluggy-0.6.0 -- /home/nikratio/consulting/iofabric/venv/bin/python3
cachedir: tests/.cache
rootdir: /home/nikratio/consulting/iofabric/gruagach/tests, inifile: pytest.ini
plugins: trio-0.3.0
collected 1 item                                                                             

tests/lograce.py::test_server FAILED                                                   [100%]

========================================== FAILURES ==========================================
________________________________________ test_server _________________________________________
Traceback (most recent call last):
  File "/home/nikratio/consulting/iofabric/gruagach/tests/lograce.py", line 41, in test_server
    assert resp == b'hello, world'
AssertionError: assert b'' == b'hello, world'
  Right contains more items, first extra item: 104
  Full diff:
  - b''
  + b'hello, world'
================================== 1 failed in 0.63 seconds ==================================

If we force the test to wait a moment before failing...

@pytest.mark.trio
async def test_server(nursery):
    nursery.start_soon(parent)
    await trio.sleep(0.5)  # wait for server to listen
    try:
        with trio.socket.socket() as client_sock:
            await client_sock.connect(("127.0.0.1", PORT))
            await client_sock.send(b'hello, world')
            resp = await client_sock.recv(BUFSIZE)
            assert resp == b'hello, world'
    finally:
        await trio.sleep(0.5)

...we get much better results:

$ python3 -m pytest tests/lograce.py 
[...]
collected 1 item                                                                             

tests/lograce.py::test_server FAILED                                                   [100%]

========================================== FAILURES ==========================================
________________________________________ test_server _________________________________________
Traceback (most recent call last):
  File "/home/nikratio/consulting/iofabric/gruagach/tests/lograce.py", line 42, in test_server
    assert resp == b'hello, world'
AssertionError: assert b'' == b'hello, world'
  Right contains more items, first extra item: 104
  Full diff:
  - b''
  + b'hello, world'
------------------------------------- Captured log call --------------------------------------
21:55:13.383 ERROR    lograce.echo_server: Things went wrong...
Traceback (most recent call last):
  File "/home/nikratio/consulting/iofabric/gruagach/tests/lograce.py", line 15, in echo_server
    raise RuntimeError('Damn!')
RuntimeError: Damn!
================================== 1 failed in 1.13 seconds ==================================

I believe the problem is that the failing test causes trio to cancel the server tasks before it has a chance to log its exception. Note that the call to trio.wait in echo_server is not necessay for the problem to occur, it just makes it much more likely.

I am not sure how to best fix this. Maybe the trio test runner could, if the test fails, give the other nursery task a short grace period in which they have a chance to report any errors before cancelling them?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions