-
Notifications
You must be signed in to change notification settings - Fork 3.3k
Description
- Package Name: azure-eventhub
- Package Version: 5.0.0
- Operating System: macOS Catalina 10.15.2 (19C57)
- Python Version: Python 3.6.6
Describe the bug
If a connection to Azure EventHub is established and messages are sent, but then the connection is severed, the send path's error handling deadlocks on itself when trying to close the connection.
To Reproduce
Steps to reproduce the behavior:
- Modify https://gist.github.com/ellieayla/649dfd2e8f883a8af3653a0f5c3f444a with an eventhub connection string & name.
- Run script under step debugger, with a breakpoint in
_producer.pyinEventHubProducer.close(). - Break the network connection (eg, disable wifi on MacbookPro).
- Observe breakpoint hit.
- Observe stack.
close, _producer.py:272
_handle_exception, exceptions.py:167
_handle_exception, _client_base.py:390
_do_retryable_operation, _client_base.py:412
_send_event_data_with_retry, _producer.py:173
send, _producer.py:264
send_batch, _producer_client.py:217
<module>, debug_eventhub_sender.py:59
- Observe lock mentioned in close() is the same lock aquired in send().
- Step debugger forward.
- Observe execution hangs (forever?) trying to aquire the lock.
Expected behavior
During network connection failure, the Python azure.eventhub module's attempts to send() an event are subject to timeout and raise exceptions. The send() doesn't block forever.
Logs
DEBUG:uamqp.client:Opening client connection.
DEBUG:uamqp:Initializing platform.
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.ERROR: 14> to <ConnectionState.START: 0>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.START: 0> to <ConnectionState.START: 0>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.START: 0> to <ConnectionState.HDR_SENT: 2>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.HDR_SENT: 2> to <ConnectionState.HDR_EXCH: 3>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.HDR_EXCH: 3> to <ConnectionState.OPEN_SENT: 7>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.OPEN_SENT: 7> to <ConnectionState.OPENED: 9>
DEBUG:uamqp.c_uamqp:Destroying cSource
DEBUG:uamqp.c_uamqp:Destroying cTarget
...
INFO:uamqp.c_uamqp:CBS for connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' completed opening with status: 0
INFO:uamqp.c_uamqp:Token put complete with result: 0, status: 202, description: b'Accepted', connection: b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'
DEBUG:uamqp.c_uamqp:Wrapping value type: <AMQPType.CompositeType: 22>
DEBUG:uamqp.c_uamqp:Wrapping value type: <AMQPType.CompositeType: 22>
DEBUG:uamqp.c_uamqp:Deallocating 'DictValue'
DEBUG:uamqp.c_uamqp:Destroying 'DictValue'
INFO:uamqp.sender:Message sender b'sender-link-ab898990-b117-4c6c-98a5-e77fc95f11c7' state changed from <MessageSenderState.Idle: 0> to <MessageSenderState.Opening: 1> on connection: b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'
...
Sending message: 0: {'timestamp': 1583599668, 'metric': 'debug.eventhub.sender', 'value': 0}
...
Sending message: 1: {'timestamp': 1583599674, 'metric': 'debug.eventhub.sender', 'value': 1}
...
Sever the network connection. I am doing this by disabling wifi on a MacbookPro.
Observe following is logged on next loop.
DEBUG:uamqp.c_uamqp:Destroying cMessage
DEBUG:uamqp.c_uamqp:Destroying cMessage
Sending message: 8: {'timestamp': 1583599683, 'metric': 'debug.eventhub.sender', 'value': 8}
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.OPENED: 9> to <ConnectionState.DISCARDING: 12>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.DISCARDING: 12> to <ConnectionState.DISCARDING: 12>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.DISCARDING: 12> to <ConnectionState.DISCARDING: 12>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.DISCARDING: 12> to <ConnectionState.DISCARDING: 12>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.DISCARDING: 12> to <ConnectionState.DISCARDING: 12>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.DISCARDING: 12> to <ConnectionState.DISCARDING: 12>
...
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.DISCARDING: 12> to <ConnectionState.DISCARDING: 12>
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.DISCARDING: 12> to <ConnectionState.DISCARDING: 12>
DEBUG:uamqp.client:Message sent: <MessageSendResult.Timeout: 2>, None
DEBUG:uamqp.c_uamqp:Deallocating cMessage
DEBUG:uamqp.c_uamqp:Destroying cMessage
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.DISCARDING: 12> to <ConnectionState.DISCARDING: 12>
And then the script hangs. Interesting.
Run it again with a debugger attached (eg IntelliJ IDEA w/ pydev) and put a breakpoint in _producer.py in EventHubProducer.close().
def close(self):
# type:() -> None
"""
Close down the handler. If the handler has already closed,
this will be a no op.
"""
with self._lock: # BREAKPOINT HERE
super(EventHubProducer, self).close()At this point, the debugger reveals a stack trace of following:
close, _producer.py:272
_handle_exception, exceptions.py:167
_handle_exception, _client_base.py:390
_do_retryable_operation, _client_base.py:412
_send_event_data_with_retry, _producer.py:173
send, _producer.py:264
send_batch, _producer_client.py:217
<module>, debug_eventhub_sender.py:59
Print the repr(self._lock): it's '<locked _thread.lock object at 0x1080a7a58>' here. This is the same lock taken 6 frames earlier in _producer.py in EventHubProducer.send().
with self._lock: # The lock taken here
with send_context_manager() as child:
self._check_closed()
wrapper_event_data = self._wrap_eventdata(event_data, child, partition_key)
self._unsent_events = [wrapper_event_data.message]
if child:
self._client._add_span_request_attributes( # pylint: disable=protected-access
child
)
self._send_event_data_with_retry(timeout=timeout)Step the debugger forward once. The attempt to aquire the lock blocks as predicted. Leave the debugger sitting here for 5 minutes.
Hit the stop button in debugger, which interrupts stuff. Following is logged.
DEBUG:uamqp.c_uamqp:Destroying cMessageSender
INFO:uamqp.sender:Message sender b'sender-link-ab898990-b117-4c6c-98a5-e77fc95f11c7' state changed from <MessageSenderState.Open: 2> to <MessageSenderState.Closing: 3> on connection: b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'
INFO:uamqp.c_uamqp:b'Connection not open' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/connection.c':b'connection_encode_frame':2048)
INFO:uamqp.c_uamqp:b'Sending detach frame failed in session send' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'send_detach':236)
INFO:uamqp.c_uamqp:b'Sending detach frame failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'link_detach':1316)
INFO:uamqp.c_uamqp:b'Detaching link failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/message_sender.c':b'messagesender_close':814)
INFO:uamqp.sender:Message sender b'sender-link-ab898990-b117-4c6c-98a5-e77fc95f11c7' state changed from <MessageSenderState.Closing: 3> to <MessageSenderState.Error: 4> on connection: b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'
DEBUG:uamqp.c_uamqp:Destroying cLink
INFO:uamqp.c_uamqp:b'Connection not open' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/connection.c':b'connection_encode_frame':2048)
INFO:uamqp.c_uamqp:b'Sending detach frame failed in session send' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'send_detach':236)
INFO:uamqp.c_uamqp:b'Sending detach frame failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'link_detach':1316)
DEBUG:uamqp.c_uamqp:Deallocating 'CompositeValue'
DEBUG:uamqp.c_uamqp:Destroying 'CompositeValue'
DEBUG:uamqp.c_uamqp:Deallocating 'CompositeValue'
DEBUG:uamqp.c_uamqp:Destroying 'CompositeValue'
DEBUG:uamqp.c_uamqp:Deallocating cLink
DEBUG:uamqp.c_uamqp:Deallocating cMessageSender
DEBUG:uamqp.client:CBS session pending.
DEBUG:uamqp.client:Closing exclusive connection.
DEBUG:uamqp.connection:Unlocked connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' to close.
INFO:uamqp.connection:Shutting down connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'.
INFO:uamqp.authentication.cbs_auth:Shutting down CBS session on connection: b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'.
DEBUG:uamqp.authentication.cbs_auth:Unlocked CBS to close on connection: b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'.
DEBUG:uamqp.c_uamqp:Destroying CBSTokenAuth for connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'
INFO:uamqp.c_uamqp:b'Connection not open' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/connection.c':b'connection_encode_frame':2048)
INFO:uamqp.c_uamqp:b'Sending detach frame failed in session send' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'send_detach':236)
INFO:uamqp.c_uamqp:b'Sending detach frame failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'link_detach':1316)
INFO:uamqp.c_uamqp:b'Detaching link failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/message_sender.c':b'messagesender_close':814)
INFO:uamqp.c_uamqp:CBS error occured on connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'.
INFO:uamqp.c_uamqp:b'messagesender_close failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/amqp_management.c':b'amqp_management_close':1050)
INFO:uamqp.c_uamqp:b'Connection not open' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/connection.c':b'connection_encode_frame':2048)
INFO:uamqp.c_uamqp:b'Sending detach frame failed in session send' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'send_detach':236)
INFO:uamqp.c_uamqp:b'Sending detach frame failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'link_detach':1316)
INFO:uamqp.c_uamqp:b'link detach failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/message_receiver.c':b'messagereceiver_close':420)
INFO:uamqp.c_uamqp:b'messagereceiver_close failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/amqp_management.c':b'amqp_management_close':1057)
INFO:uamqp.c_uamqp:b'Connection not open' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/connection.c':b'connection_encode_frame':2048)
INFO:uamqp.c_uamqp:b'Sending detach frame failed in session send' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'send_detach':236)
INFO:uamqp.c_uamqp:b'Sending detach frame failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'link_detach':1316)
INFO:uamqp.c_uamqp:b'Connection not open' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/connection.c':b'connection_encode_frame':2048)
INFO:uamqp.c_uamqp:b'Sending detach frame failed in session send' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'send_detach':236)
INFO:uamqp.c_uamqp:b'Sending detach frame failed' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/link.c':b'link_detach':1316)
INFO:uamqp.authentication.cbs_auth:Auth closed, destroying session on connection: b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'.
DEBUG:uamqp.c_uamqp:Destroying cSession
INFO:uamqp.c_uamqp:b'Connection not open' (b'/Users/runner/runners/2.164.8/work/1/s/src/vendor/azure-uamqp-c/src/connection.c':b'connection_encode_frame':2048)
INFO:uamqp.authentication.cbs_auth:Finished shutting down CBS session on connection: b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'.
DEBUG:uamqp.c_uamqp:Destroying Connection
INFO:uamqp.connection:Connection b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923' state changed from <ConnectionState.DISCARDING: 12> to <ConnectionState.END: 13>
DEBUG:uamqp.c_uamqp:Destroying XIO
DEBUG:uamqp.c_uamqp:Destroying XIO
DEBUG:uamqp.c_uamqp:Destroying SASLMechanism
INFO:uamqp.connection:Connection shutdown complete b'EHProducer-64bfa046-007a-40b6-a69d-edcb5a3c8923'.
DEBUG:uamqp:Deinitializing platform.
Got here! # This is the print() in the example script's `finally` block.
Additional context
I note that simply force-releasing the lock in EventHubProducer.close() appears to work sanely. This results in a azure.eventhub.exceptions.OperationTimeoutError or azure.eventhub.exceptions.ConnectionLostError (depending on timing?) raised all the way up to the calling script once retries are exhausted, which appears to be intended.
def close(self):
# type:() -> None
"""
Close down the handler. If the handler has already closed,
this will be a no op.
"""
try:
self._lock.release()
except RuntimeError:
pass # wasn't held
with self._lock:
super(EventHubProducer, self).close()