From ab6096ae0ef95dc7994a505ac314d28029f6df86 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 29 May 2025 20:54:26 +0000 Subject: [PATCH 01/28] Initial plan for issue From 35d2dc92e2dfbe82db85b717286edd215cc57444 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 29 May 2025 21:03:38 +0000 Subject: [PATCH 02/28] Add comprehensive TROUBLESHOOTING.md for Azure Service Bus Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 422 ++++++++++++++++++ 1 file changed, 422 insertions(+) create mode 100644 sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md new file mode 100644 index 000000000000..a82f9ad1469d --- /dev/null +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -0,0 +1,422 @@ +# Troubleshoot Azure Service Bus client library issues + +This troubleshooting guide contains instructions to diagnose frequently encountered issues while using the Azure Service Bus client library for Python. 
+ +## Table of contents + +* [General troubleshooting](#general-troubleshooting) + * [Enable client logging](#enable-client-logging) + * [Common exceptions](#common-exceptions) + * [Timeouts](#timeouts) +* [Troubleshooting authentication issues](#troubleshooting-authentication-issues) + * [Authentication errors](#authentication-errors) + * [Authorization errors](#authorization-errors) + * [Connection string issues](#connection-string-issues) +* [Troubleshooting connectivity issues](#troubleshooting-connectivity-issues) + * [Connection errors](#connection-errors) + * [Firewall and proxy issues](#firewall-and-proxy-issues) + * [Service busy errors](#service-busy-errors) +* [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) + * [Message lock issues](#message-lock-issues) + * [Message size issues](#message-size-issues) + * [Message settlement issues](#message-settlement-issues) + * [Dead letter queue issues](#dead-letter-queue-issues) +* [Troubleshooting session handling issues](#troubleshooting-session-handling-issues) + * [Session lock issues](#session-lock-issues) + * [Session cannot be locked](#session-cannot-be-locked) +* [Troubleshooting quota and capacity issues](#troubleshooting-quota-and-capacity-issues) + * [Quota exceeded errors](#quota-exceeded-errors) + * [Entity not found errors](#entity-not-found-errors) +* [Frequently asked questions](#frequently-asked-questions) +* [Get additional help](#get-additional-help) + +## General troubleshooting + +Azure Service Bus client library will raise exceptions defined in [Azure Core](https://aka.ms/azsdk/python/core/docs#module-azure.core.exceptions) and [azure.servicebus.exceptions](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions). + +### Enable client logging + +This library uses the standard [logging](https://docs.python.org/3/library/logging.html) library for logging. + +Basic information about HTTP sessions (URLs, headers, etc.) 
is logged at `INFO` level. + +Detailed `DEBUG` level logging, including request/response bodies and **unredacted** headers, can be enabled on the client or per-operation with the `logging_enable` keyword argument. + +To enable client logging and AMQP frame level trace: + +```python +import logging +import sys + +handler = logging.StreamHandler(stream=sys.stdout) +log_fmt = logging.Formatter(fmt="%(asctime)s | %(threadName)s | %(levelname)s | %(name)s | %(message)s") +handler.setFormatter(log_fmt) +logger = logging.getLogger('azure.servicebus') +logger.setLevel(logging.DEBUG) +logger.addHandler(handler) + +# Enable AMQP frame level trace +from azure.servicebus import ServiceBusClient + +client = ServiceBusClient(connection_string, logging_enable=True) +``` + +See full Python SDK logging documentation with examples [here](https://learn.microsoft.com/azure/developer/python/azure-sdk-logging). + +### Common exceptions + +The Service Bus APIs generate the following exceptions in `azure.servicebus.exceptions`: + +#### Connection and Authentication Exceptions + +- **ServiceBusConnectionError:** An error occurred in the connection to the service. This may have been caused by a transient network issue or service problem. It is recommended to retry. + +- **ServiceBusAuthenticationError:** An error occurred when authenticating the connection to the service. This may have been caused by the credentials being incorrect. It is recommended to check the credentials. + +- **ServiceBusAuthorizationError:** An error occurred when authorizing the connection to the service. This may have been caused by the credentials not having the right permission to perform the operation. It is recommended to check the permission of the credentials. + +#### Operation and Timeout Exceptions + +- **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. 
The service may or may not have successfully completed the request; the status is not known. It is recommended to attempt to verify the current state and retry if necessary. + +- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. + +#### Message Handling Exceptions + +- **MessageSizeExceededError:** This indicates that the message content is larger than the Service Bus frame size. This could happen when too many Service Bus messages are sent in a batch or the content passed into the body of a `ServiceBusMessage` is too large. It is recommended to reduce the count of messages being sent in a batch or the size of content being passed into a single `ServiceBusMessage`. + +- **MessageAlreadySettled:** This indicates failure to settle the message. This could happen when trying to settle an already-settled message. + +- **MessageLockLostError:** The lock on the message has expired and it has been released back to the queue. It will need to be received again in order to settle it. You should be aware of the lock duration of a message and keep renewing the lock before expiration in case of long processing time. `AutoLockRenewer` can help keep the message lock renewed automatically. + +- **MessageNotFoundError:** An attempt to receive a message with a particular sequence number failed because the message wasn't found. Make sure the message hasn't been received already. Check the dead-letter queue to see if the message has been dead-lettered. + +#### Session Handling Exceptions + +- **SessionLockLostError:** The lock on the session has expired. All unsettled messages that have been received can no longer be settled. It is recommended to reconnect to the session and receive the messages again if necessary. 
You should be aware of the lock duration of a session and keep renewing the lock before expiration in case of long processing time. `AutoLockRenewer` can help keep the session lock renewed automatically. + +- **SessionCannotBeLockedError:** An attempt was made to connect to a session with a specific session ID, but the session is currently locked by another client. Make sure the session is unlocked by other clients. + +#### Service and Entity Exceptions + +- **ServiceBusQuotaExceededError:** The messaging entity has reached its maximum allowable size, or the maximum number of connections to a namespace has been exceeded. Create space in the entity by receiving messages from the entity or its subqueues. + +- **ServiceBusServerBusyError:** Service isn't able to process the request at this time. Client can wait for a period of time, then retry the operation. + +- **MessagingEntityNotFoundError:** Entity associated with the operation doesn't exist or it has been deleted. Please make sure the entity exists. + +- **MessagingEntityDisabledError:** Request for a runtime operation on a disabled entity. Please activate the entity. + +#### Auto Lock Renewal Exceptions + +- **AutoLockRenewFailed:** An attempt to renew a lock on a message or session in the background has failed. This could happen when the receiver used by `AutoLockRenewer` is closed or the lock of the renewable has expired. It is recommended to re-register the renewable message or session by receiving the message again or reconnecting to the session-enabled entity. + +- **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. You can re-register the object for automatic lock renewal or extend the timeout in advance. + +### Timeouts + +There are various timeouts a user should be aware of within the library: + +- **10-minute service-side link closure:** A link, once opened, will be closed after being idle for 10 minutes to protect the service against resource leakage. 
This should largely be transparent to a user, but if you notice a reconnect occurring after such a duration, this is why. Performing any operations, including management operations, on the link will extend this timeout. + +- **max_wait_time:** Provided on creation of a receiver or when calling `receive_messages()`, this is the time after which receiving halts if no messages arrive. This applies both to the imperative `receive_messages()` function and to how long a generator-style receive runs before exiting when there are no messages. Passing `None` (the default) waits indefinitely, up to the 10-minute threshold if no other action is taken. + +> **NOTE:** If processing of a message or session takes long enough to cause timeouts, you can use the `AutoLockRenewer` functionality as an alternative to calling `receiver.renew_message_lock`/`receiver.session.renew_lock` manually. + +## Troubleshooting authentication issues + +### Authentication errors + +Authentication errors typically occur when the credentials provided are incorrect or have expired. + +**Common causes:** +- Incorrect connection string +- Expired SAS token +- Invalid managed identity configuration +- Wrong credential type being used + +**Resolution:** +1. Verify your connection string is correct and complete +2. If using SAS tokens, check that they haven't expired +3. For managed identity, ensure the identity is properly configured and has the necessary permissions +4. 
Test connectivity using a simple connection string first + +```python +# Example of proper authentication +from azure.servicebus import ServiceBusClient + +# Using connection string +client = ServiceBusClient.from_connection_string("your_connection_string") + +# Using Azure Identity +from azure.identity import DefaultAzureCredential +credential = DefaultAzureCredential() +client = ServiceBusClient("your_namespace.servicebus.windows.net", credential) +``` + +### Authorization errors + +Authorization errors occur when the authenticated identity doesn't have sufficient permissions. + +**Required permissions for Service Bus operations:** +- **Send:** Required to send messages to queues/topics +- **Listen:** Required to receive messages from queues/subscriptions +- **Manage:** Required for management operations (create/delete entities) + +**Resolution:** +1. Check the Access Control (IAM) settings in Azure portal +2. Ensure the identity has the appropriate Service Bus roles: + - `Azure Service Bus Data Owner` + - `Azure Service Bus Data Sender` + - `Azure Service Bus Data Receiver` +3. For connection strings, verify the SAS policy has the correct permissions + +### Connection string issues + +**Common connection string problems:** +- Missing required components (Endpoint, SharedAccessKeyName, SharedAccessKey) +- Incorrect namespace or entity names +- URL encoding issues with special characters + +**Example of correct connection string format:** +``` +Endpoint=sb://your-namespace.servicebus.windows.net/;SharedAccessKeyName=your-policy;SharedAccessKey=your-key +``` + +## Troubleshooting connectivity issues + +### Connection errors + +Connection errors can occur due to network issues, firewall restrictions, or service problems. + +**Common causes:** +- Network connectivity issues +- DNS resolution problems +- Firewall or proxy blocking connections +- Service Bus namespace not accessible from current location + +**Resolution:** +1. 
Test basic network connectivity to `your-namespace.servicebus.windows.net` on port 5671 (AMQP) or 443 (AMQP over WebSockets) +2. Try using AMQP over WebSockets if regular AMQP is blocked: + +```python +from azure.servicebus import ServiceBusClient, TransportType + +client = ServiceBusClient.from_connection_string( + connection_string, + transport_type=TransportType.AmqpOverWebsocket +) +``` + +### Firewall and proxy issues + +If your environment has strict firewall rules or requires proxy configuration: + +**For firewall:** +- Allow outbound connections to `*.servicebus.windows.net` on ports 5671-5672 (AMQP) and 443 (HTTPS/WebSockets) +- Consider using AMQP over WebSockets (port 443) if AMQP ports are blocked + +**For proxy:** +- Service Bus supports HTTP CONNECT proxy for AMQP over WebSockets +- Configure proxy settings in your environment variables or application + +### Service busy errors + +`ServiceBusServerBusyError` indicates the service is temporarily overloaded. + +**Resolution:** +1. Implement exponential backoff retry logic +2. Reduce the frequency of requests +3. Consider scaling up your Service Bus tier if errors persist + +## Troubleshooting message handling issues + +### Message lock issues + +Messages in Service Bus have a lock duration during which they must be settled (completed, abandoned, etc.). + +**MessageLockLostError resolution:** +1. Process messages faster or increase lock duration +2. Use `AutoLockRenewer` for long-running processing: + +```python +from azure.servicebus import AutoLockRenewer + +renewer = AutoLockRenewer() +with receiver: + received_msgs = receiver.receive_messages(max_message_count=10) + for message in received_msgs: + renewer.register(receiver, message, max_lock_renewal_duration=300) + # Process message + receiver.complete_message(message) +``` + +3. 
Handle lock lost errors gracefully by catching the exception and potentially re-receiving the message + +### Message size issues + +**MessageSizeExceededError resolution:** +1. Reduce message payload size +2. Use message properties and application properties for metadata instead of body +3. For batch operations, reduce the number of messages in the batch +4. Consider splitting large messages across multiple smaller messages + +**Service Bus message size limits:** +- Standard tier: 256 KB per message +- Premium tier: 1 MB per message + +### Message settlement issues + +**MessageAlreadySettled resolution:** +1. Ensure you're not trying to settle the same message multiple times +2. Check your application logic for race conditions +3. Use `try`/`except` blocks when settling messages + +```python +from azure.servicebus.exceptions import MessageAlreadySettled + +try: + receiver.complete_message(message) +except MessageAlreadySettled: + # Message was already settled, this is expected in some scenarios + pass +``` + +### Dead letter queue issues + +Messages can be moved to the dead letter queue for various reasons: + +**Common reasons:** +- Message TTL expired +- Max delivery count exceeded +- Message was explicitly dead lettered +- Message processing failed repeatedly + +**Debugging dead letter messages:** +```python +from azure.servicebus import ServiceBusSubQueue + +# Receive from dead letter queue +dlq_receiver = servicebus_client.get_queue_receiver( + queue_name="your_queue", + sub_queue=ServiceBusSubQueue.DEAD_LETTER +) + +with dlq_receiver: + messages = dlq_receiver.receive_messages(max_message_count=10) + for message in messages: + print(f"Dead letter reason: {message.dead_letter_reason}") + print(f"Dead letter description: {message.dead_letter_error_description}") +``` + +## Troubleshooting session handling issues + +### Session lock issues + +Session-enabled entities require proper session management. + +**SessionLockLostError resolution:** +1. Renew session locks before they expire +2. Use `AutoLockRenewer` for automatic session lock renewal +3. 
Handle session lock lost errors by reconnecting to the session + +```python +from azure.servicebus import AutoLockRenewer + +renewer = AutoLockRenewer() +with receiver: + session = receiver.session + renewer.register(receiver, session, max_lock_renewal_duration=300) + # Process messages in session +``` + +### Session cannot be locked + +**SessionCannotBeLockedError resolution:** +1. Ensure no other clients are already connected to the same session +2. Wait for the current session lock to expire before reconnecting +3. Use a different session ID if specific session is not required + +## Troubleshooting quota and capacity issues + +### Quota exceeded errors + +**ServiceBusQuotaExceededError resolution:** +1. **For message count limits:** Receive and process messages to reduce queue/subscription size +2. **For size limits:** Remove old messages or increase entity size limits +3. **For connection limits:** Close unused connections or consider scaling to Premium tier + +### Entity not found errors + +**MessagingEntityNotFoundError resolution:** +1. Verify the queue/topic/subscription name is spelled correctly +2. Ensure the entity exists in the Service Bus namespace +3. Check if the entity was deleted and needs to be recreated +4. Verify you're connecting to the correct namespace + +## Frequently asked questions + +### Q: Why am I getting connection timeout errors? + +**A:** Connection timeouts can occur due to: +- Network connectivity issues +- Firewall blocking AMQP ports (5671-5672) +- DNS resolution problems + +Try using AMQP over WebSockets (port 443) or check your network configuration. + +### Q: How do I handle transient errors? + +**A:** Implement retry logic with exponential backoff for transient errors like: +- `ServiceBusConnectionError` +- `OperationTimeoutError` +- `ServiceBusServerBusyError` + +### Q: Why are my messages going to the dead letter queue? 
+ +**A:** Common reasons include: +- Message TTL expiration +- Maximum delivery count exceeded +- Explicit dead lettering in message processing logic +- Poison message detection + +Check the `dead_letter_reason` and `dead_letter_error_description` properties on dead lettered messages. + +### Q: How do I process messages faster? + +**A:** Consider: +- Receiving messages in batches +- Using concurrent message processing +- Optimizing your message processing logic +- Using `prefetch_count` to pre-fetch messages +- Scaling out with multiple receivers + +### Q: What's the difference between `complete_message()` and `abandon_message()`? + +**A:** +- `complete_message()`: Removes the message from the queue/subscription (successful processing) +- `abandon_message()`: Returns the message to the queue/subscription for reprocessing + +## Get additional help + +Additional information on ways to reach out for support can be found in the [SUPPORT.md](https://github.com/Azure/azure-sdk-for-python/blob/main/SUPPORT.md) at the root of the repo. + +### Filing GitHub issues + +When filing GitHub issues for Service Bus, please include: + +1. **Environment details:** + - Python version + - Azure Service Bus SDK version + - Operating system + +2. **Service Bus configuration:** + - Namespace tier (Basic, Standard, Premium) + - Entity configuration (queue, topic, subscription settings) + - Session-enabled or not + +3. **Code sample:** Minimal reproducible code sample + +4. **Logs:** DEBUG level logs with transport logging enabled (see [Enable client logging](#enable-client-logging)) + +5. **Error details:** Complete exception stack trace and error messages + +The more information provided, the faster we can help resolve your issue. + +Please view the [exceptions reference docs](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions) for detailed descriptions of our common Exception types. 
\ No newline at end of file From efb3b47030f0b17e48399d8c9f025bfbeafcb3c3 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 29 May 2025 23:09:07 +0000 Subject: [PATCH 03/28] Replace connection string with Azure Identity credential auth in logging example Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index a82f9ad1469d..cb653c6212c8 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -57,8 +57,10 @@ logger.addHandler(handler) # Enable AMQP frame level trace from azure.servicebus import ServiceBusClient +from azure.identity import DefaultAzureCredential -client = ServiceBusClient(connection_string, logging_enable=True) +credential = DefaultAzureCredential() +client = ServiceBusClient(fully_qualified_namespace, credential, logging_enable=True) ``` See full Python SDK logging documentation with examples [here](https://learn.microsoft.com/azure/developer/python/azure-sdk-logging). 
From 604c3e76fefefd78cdb508bd6278a50924c2741f Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 30 May 2025 13:14:18 +0000 Subject: [PATCH 04/28] Update TROUBLESHOOTING.md based on PR feedback Co-authored-by: kashifkhan <361477+kashifkhan@users.noreply.github.com> --- sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index cb653c6212c8..dd90c6e4088f 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -267,6 +267,8 @@ with receiver: - Standard tier: 256 KB per message - Premium tier: 1 MB per message +For the most up-to-date information on Service Bus limits, refer to the [Azure Service Bus quotas and limits](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) documentation. + ### Message settlement issues **MessageAlreadySettled resolution:** @@ -383,7 +385,6 @@ Check the `dead_letter_reason` and `dead_letter_error_description` properties on ### Q: How do I process messages faster? **A:** Consider: -- Receiving messages in batches - Using concurrent message processing - Optimizing your message processing logic - Using `prefetch_count` to pre-fetch messages @@ -421,4 +422,4 @@ When filing GitHub issues for Service Bus, please include: The more information provided, the faster we can help resolve your issue. -Please view the [exceptions reference docs](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions) for detailed descriptions of our common Exception types. \ No newline at end of file +Please view the [exceptions reference docs](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions) for detailed descriptions of our common Exception types. 
From d60ef1d3a84fff4891a31c28d0c2ca50a8e542c3 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Fri, 30 May 2025 13:16:27 +0000 Subject: [PATCH 05/28] Update WebSockets example to explicitly show connection string usage Co-authored-by: kashifkhan <361477+kashifkhan@users.noreply.github.com> --- sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index dd90c6e4088f..595221b4d5a6 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -204,8 +204,9 @@ Connection errors can occur due to network issues, firewall restrictions, or ser ```python from azure.servicebus import ServiceBusClient, TransportType +# Using connection string with WebSockets client = ServiceBusClient.from_connection_string( - connection_string, + "your_connection_string", transport_type=TransportType.AmqpOverWebsocket ) ``` From aec77c9df53a6daf2f31daf680a7bf27310cb77f Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 5 Jun 2025 17:02:14 +0000 Subject: [PATCH 06/28] Update WebSockets example to use Azure Identity instead of connection string Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 595221b4d5a6..fa6f53f34ffe 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -203,10 +203,13 @@ Connection errors can occur due to network issues, firewall restrictions, or ser ```python from azure.servicebus import 
ServiceBusClient, TransportType +from azure.identity import DefaultAzureCredential -# Using connection string with WebSockets -client = ServiceBusClient.from_connection_string( - "your_connection_string", +# Using Azure Identity with WebSockets +credential = DefaultAzureCredential() +client = ServiceBusClient( + "your_namespace.servicebus.windows.net", + credential, transport_type=TransportType.AmqpOverWebsocket ) ``` From 03cbdbe5a18a7c4b79cb0e523c8f706dc245e92f Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 5 Jun 2025 19:05:16 +0000 Subject: [PATCH 07/28] Add comprehensive content sections to TROUBLESHOOTING.md Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 549 ++++++++++++++++++ 1 file changed, 549 insertions(+) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index fa6f53f34ffe..318d4fd9cc4b 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -24,9 +24,21 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Troubleshooting session handling issues](#troubleshooting-session-handling-issues) * [Session lock issues](#session-lock-issues) * [Session cannot be locked](#session-cannot-be-locked) +* [Troubleshooting sender issues](#troubleshooting-sender-issues) + * [Cannot send batch with multiple partition keys](#cannot-send-batch-with-multiple-partition-keys) + * [Batch fails to send](#batch-fails-to-send) + * [Message encoding issues](#message-encoding-issues) +* [Troubleshooting receiver issues](#troubleshooting-receiver-issues) + * [Number of messages returned doesn't match number requested](#number-of-messages-returned-doesnt-match-number-requested) + * [Message completion behavior](#message-completion-behavior) + * [Receive operation 
hangs](#receive-operation-hangs) + * [Messages not being received](#messages-not-being-received) * [Troubleshooting quota and capacity issues](#troubleshooting-quota-and-capacity-issues) * [Quota exceeded errors](#quota-exceeded-errors) * [Entity not found errors](#entity-not-found-errors) +* [Threading and concurrency issues](#threading-and-concurrency-issues) + * [Thread safety limitations](#thread-safety-limitations) + * [Async/await best practices](#asyncawait-best-practices) * [Frequently asked questions](#frequently-asked-questions) * [Get additional help](#get-additional-help) @@ -341,6 +353,406 @@ with receiver: 2. Wait for the current session lock to expire before reconnecting 3. Use a different session ID if specific session is not required +## Troubleshooting sender issues + +### Cannot send batch with multiple partition keys + +When sending to a partition-enabled entity, all messages included in a single send operation must have the same `session_id` if the entity is session-enabled, or the same `partition_key` otherwise. + +**Error symptoms:** +- Messages are rejected or go to different partitions than expected +- Inconsistent message ordering + +**Resolution:** +1. **For session-enabled entities, ensure all messages in a batch have the same session ID:** +```python +from azure.servicebus import ServiceBusMessage + +# Correct: All messages have the same session_id +messages = [ + ServiceBusMessage("Message 1", session_id="session1"), + ServiceBusMessage("Message 2", session_id="session1"), + ServiceBusMessage("Message 3", session_id="session1") +] + +with sender: + sender.send_messages(messages) +``` + +2. 
**For partitioned entities, group messages by partition key:** +```python +# Group messages by partition key before sending. +# The partition is selected by `partition_key`, not by application properties. +partition1_messages = [ + ServiceBusMessage("Message 1", partition_key="east"), + ServiceBusMessage("Message 2", partition_key="east") +] + +partition2_messages = [ + ServiceBusMessage("Message 3", partition_key="west"), + ServiceBusMessage("Message 4", partition_key="west") +] + +# Send each group separately +with sender: + sender.send_messages(partition1_messages) + sender.send_messages(partition2_messages) +``` + +### Batch fails to send + +The Service Bus service has size limits for message batches and individual messages. + +**Error symptoms:** +- `MessageSizeExceededError` when sending batches +- Messages larger than expected failing to send + +**Resolution:** +1. **Reduce batch size or message payload:** +```python +from azure.servicebus import ServiceBusMessage +from azure.servicebus.exceptions import MessageSizeExceededError +import json + +def send_large_dataset(sender, data_list, max_batch_size=100): + """Send large datasets in smaller batches""" + for i in range(0, len(data_list), max_batch_size): + batch = data_list[i:i + max_batch_size] + messages = [ServiceBusMessage(json.dumps(item)) for item in batch] + + try: + sender.send_messages(messages) + except MessageSizeExceededError: + # If batch is still too large, send individually + for message in messages: + sender.send_messages(message) +``` + +2. **Check message size limits:** + - Standard tier: 256 KB per message + - Premium tier: 1 MB per message + - Batch limit: 1 MB regardless of tier + +3. 
**Use message properties for metadata instead of body:** +```python +# Instead of including metadata in message body +large_message = ServiceBusMessage(json.dumps({ + "data": large_data_payload, + "metadata": {"source": "app1", "timestamp": "2023-01-01"} +})) + +# Use application properties for metadata +optimized_message = ServiceBusMessage(large_data_payload) +optimized_message.application_properties = { + "source": "app1", + "timestamp": "2023-01-01" +} +``` + +### Message encoding issues + +Python string encoding can cause issues when sending messages with special characters. + +**Error symptoms:** +- Messages appear corrupted on the receiver side +- Encoding/decoding exceptions + +**Resolution:** +1. **Explicitly handle string encoding:** +```python +import json +from azure.servicebus import ServiceBusMessage + +# For text messages, ensure proper UTF-8 encoding +text_data = "Message with special characters: ñáéíóú" +message = ServiceBusMessage(text_data.encode('utf-8')) + +# For JSON data, use explicit encoding +json_data = {"message": "Data with unicode: ñáéíóú"} +json_string = json.dumps(json_data, ensure_ascii=False) +message = ServiceBusMessage(json_string.encode('utf-8')) + +# Set content type to help receivers +message.content_type = "application/json; charset=utf-8" +``` + +2. **Handle binary data correctly:** +```python +# For binary data, pass bytes directly +binary_data = b"\x00\x01\x02\x03" +message = ServiceBusMessage(binary_data) +message.content_type = "application/octet-stream" +``` + +## Troubleshooting receiver issues + +### Number of messages returned doesn't match number requested + +When attempting to receive multiple messages using `receive_messages()` with `max_message_count` greater than 1, you're not guaranteed to receive the exact number requested. 
+ +**Why this happens:** +- Service Bus optimizes for throughput and latency +- After the first message is received, the receiver waits only a short time (typically 20ms) for additional messages +- The `max_wait_time` controls how long to wait for the **first** message, not subsequent ones + +**Resolution:** +1. **Don't assume all available messages will be received in one call:** +```python +import time +from azure.servicebus.exceptions import MessagingEntityNotFoundError, MessagingEntityDisabledError + +def receive_all_available_messages(receiver, total_expected=None): + """Receive all available messages from a queue/subscription""" + all_messages = [] + + while True: + # Receive in batches + messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) + + if not messages: + break # No more messages available + + all_messages.extend(messages) + + # Process messages immediately to avoid lock expiration + for message in messages: + try: + # Process message logic here + print(f"Processing: {message}") + receiver.complete_message(message) + except Exception as e: + print(f"Error processing message: {e}") + receiver.abandon_message(message) + + return all_messages +``` + +2. 
**Use continuous receiving for stream processing:** +```python +import time + +def continuous_message_processing(receiver): + """Continuously process messages as they arrive""" + while True: + try: + messages = receiver.receive_messages(max_message_count=1, max_wait_time=60) + + for message in messages: + # Process immediately + try: + process_message(message) + receiver.complete_message(message) + except Exception as e: + print(f"Processing failed: {e}") + receiver.abandon_message(message) + + except KeyboardInterrupt: + break + except Exception as e: + print(f"Receive error: {e}") + time.sleep(5) # Brief pause before retry +``` + +### Message completion behavior + +**Important limitation:** The Pure Python AMQP implementation used by the Azure Service Bus Python SDK does not currently wait for dispositions from the service to acknowledge message completion operations. + +**What this means:** +- When you call `complete_message()`, `abandon_message()`, or `dead_letter_message()`, the operation returns immediately +- The SDK does not wait for confirmation from the Service Bus service that the message was actually settled +- This can lead to scenarios where the local operation succeeds but the service operation fails + +**Implications:** +1. **Message state uncertainty:** +```python +# This operation may succeed locally but fail on the service +try: + receiver.complete_message(message) + print("Message completed successfully") # This may be misleading +except Exception as e: + print(f"Local completion failed: {e}") + # But even if no exception, service operation might have failed +``` + +2. **Potential message redelivery:** +- If the service doesn't receive the completion acknowledgment, the message may be redelivered +- This can lead to duplicate processing if not handled properly + +**Mitigation strategies:** +1. 
**Implement idempotent message processing:** +```python +import hashlib + +processed_messages = set() + +def process_message_idempotently(receiver, message): + """Process messages in an idempotent manner""" + # Create a unique identifier for the message + message_id = message.message_id or hashlib.md5(str(message.body).encode()).hexdigest() + + if message_id in processed_messages: + print(f"Message {message_id} already processed, skipping") + receiver.complete_message(message) + return + + try: + # Your message processing logic here + result = process_business_logic(message) + + # Record successful processing before completing + processed_messages.add(message_id) + receiver.complete_message(message) + + return result + except Exception as e: + print(f"Processing failed for message {message_id}: {e}") + receiver.abandon_message(message) + raise +``` + +2. **Use external tracking for critical operations:** +```python +import logging + +def track_message_completion(receiver, message, tracking_store): + """Track message completion in external store""" + message_id = message.message_id + + try: + # Process the message + result = process_message(message) + + # Store completion in external tracking system + tracking_store.mark_completed(message_id, result) + + # Complete the message in Service Bus + receiver.complete_message(message) + + logging.info(f"Message {message_id} processed and completed successfully") + + except Exception as e: + logging.error(f"Failed to process message {message_id}: {e}") + + # Check if we should retry or dead letter + if should_retry(message, e): + receiver.abandon_message(message) + else: + receiver.dead_letter_message(message, reason="ProcessingFailed", error_description=str(e)) +``` + +3. 
**Monitor for redelivered messages:**
+```python
+import logging
+
+def handle_potential_redelivery(receiver, message):
+    """Handle messages that might be redelivered due to completion uncertainty"""
+    delivery_count = message.delivery_count
+
+    if delivery_count > 1:
+        logging.warning(f"Message has been delivered {delivery_count} times. "
+                        f"This might indicate completion acknowledgment issues.")
+
+    # Process with extra caution for high delivery count messages
+    if delivery_count > 3:
+        # Consider different processing logic or dead lettering
+        logging.error(f"Message delivery count too high ({delivery_count}), dead lettering")
+        receiver.dead_letter_message(message,
+                                     reason="HighDeliveryCount",
+                                     error_description=f"Delivered {delivery_count} times")
+        return
+
+    # Normal processing
+    process_message_idempotently(receiver, message)
+```
+
+### Receive operation hangs
+
+Receive operations may appear to hang when no messages are available: if `max_wait_time` is not set, `receive_messages()` blocks until a message arrives.
+
+**Symptoms:**
+- `receive_messages()` doesn't return for extended periods
+- Application appears unresponsive
+
+**Resolution:**
+1. **Set appropriate timeouts:**
+```python
+import time
+
+# Don't wait indefinitely for messages
+messages = receiver.receive_messages(max_message_count=5, max_wait_time=30)
+
+# For polling scenarios, use shorter timeouts
+def poll_for_messages(receiver):
+    while True:
+        messages = receiver.receive_messages(max_message_count=10, max_wait_time=5)
+
+        if messages:
+            for message in messages:
+                process_message(message)
+                receiver.complete_message(message)
+        else:
+            print("No messages available, waiting...")
+            time.sleep(1)
+```
+
+2. 
**Use async operations with proper cancellation:** +```python +import asyncio + +async def receive_with_cancellation(receiver): + try: + # Use asyncio timeout for better control + messages = await asyncio.wait_for( + receiver.receive_messages(max_message_count=10, max_wait_time=30), + timeout=35 # Slightly longer than max_wait_time + ) + return messages + except asyncio.TimeoutError: + print("Receive operation timed out") + return [] +``` + +### Messages not being received + +Messages might not be received due to various configuration or state issues. + +**Common causes and resolutions:** + +1. **Check entity state:** +```python +# Verify the queue/subscription exists and is active +try: + # This will fail if entity doesn't exist + receiver = client.get_queue_receiver(queue_name) + messages = receiver.receive_messages(max_message_count=1, max_wait_time=5) + + if not messages: + print("No messages available - check if messages are being sent") + +except MessagingEntityNotFoundError: + print("Queue/subscription does not exist") +except MessagingEntityDisabledError: + print("Queue/subscription is disabled") +``` + +2. **Verify message filters (for subscriptions):** +```python +# For topic subscriptions, check if messages match subscription filters +from azure.servicebus.management import ServiceBusAdministrationClient + +admin_client = ServiceBusAdministrationClient.from_connection_string(connection_string) + +# Check subscription rules +rules = admin_client.list_rules(topic_name, subscription_name) +for rule in rules: + print(f"Rule: {rule.name}, Filter: {rule.filter}") +``` + +3. 
**Check for competing consumers:** +```python +# Multiple receivers on the same queue will compete for messages +# Ensure this is intended behavior or use topic/subscription pattern + +# For debugging, temporarily use peek to see if messages exist +messages = receiver.peek_messages(max_message_count=10) +print(f"Found {len(messages)} messages in queue without receiving them") +``` + ## Troubleshooting quota and capacity issues ### Quota exceeded errors @@ -358,6 +770,143 @@ with receiver: 3. Check if the entity was deleted and needs to be recreated 4. Verify you're connecting to the correct namespace +## Threading and concurrency issues + +### Thread safety limitations + +**Important:** The Azure Service Bus Python SDK is **not thread-safe or coroutine-safe**. Using the same client instances across multiple threads or tasks without proper synchronization can lead to: + +- Connection errors and unexpected exceptions +- Message corruption or loss +- Deadlocks and race conditions +- Unpredictable behavior + +**Best practices:** + +1. **Use separate client instances per thread/task:** +```python +import threading +from azure.servicebus import ServiceBusClient + +def worker_thread(connection_string, queue_name): + # Create a separate client instance for each thread + client = ServiceBusClient.from_connection_string(connection_string) + with client: + sender = client.get_queue_sender(queue_name) + with sender: + # Perform operations... + pass + +# Start multiple threads with separate clients +threads = [] +for i in range(5): + t = threading.Thread(target=worker_thread, args=(connection_string, queue_name)) + threads.append(t) + t.start() + +for t in threads: + t.join() +``` + +2. 
**Use connection pooling patterns when needed:** +```python +# For high-throughput scenarios, consider using a thread-safe queue +# to manage client instances +import queue +import threading + +client_pool = queue.Queue() + +def get_client(): + try: + return client_pool.get_nowait() + except queue.Empty: + return ServiceBusClient.from_connection_string(connection_string) + +def return_client(client): + try: + client_pool.put_nowait(client) + except queue.Full: + client.close() +``` + +3. **Avoid sharing clients across async tasks:** +```python +# DON'T DO THIS +client = ServiceBusClient.from_connection_string(connection_string) + +async def bad_async_pattern(): + # Multiple tasks sharing the same client can cause issues + sender = client.get_queue_sender(queue_name) + # This can lead to race conditions + +# DO THIS INSTEAD +async def good_async_pattern(): + # Each async function should use its own client + async with ServiceBusClient.from_connection_string(connection_string) as client: + sender = client.get_queue_sender(queue_name) + async with sender: + # Perform operations safely + pass +``` + +### Async/await best practices + +When using the async APIs in the Python Service Bus SDK: + +1. **Always use async context managers properly:** +```python +async def proper_async_usage(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage("Hello World") + await sender.send_messages(message) + + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + for message in messages: + await receiver.complete_message(message) +``` + +2. 
**Don't mix sync and async code without proper handling:**
+```python
+# Avoid mixing sync and async incorrectly
+from azure.servicebus.aio import ServiceBusClient  # async client
+
+async def mixed_code_example():
+    # Don't use the sync client (azure.servicebus.ServiceBusClient) here: its blocking calls stall the event loop
+    # Instead, create the async client within the async context
+    async with ServiceBusClient.from_connection_string(conn_str) as client:
+        pass
+```
+
+3. **Handle async exceptions properly:**
+```python
+import asyncio
+from azure.servicebus.exceptions import ServiceBusError
+
+async def handle_async_errors():
+    try:
+        async with ServiceBusClient.from_connection_string(connection_string) as client:
+            async with client.get_queue_receiver(queue_name) as receiver:
+                messages = await receiver.receive_messages(max_message_count=1, max_wait_time=5)
+                # Process messages...
+    except ServiceBusError as e:
+        print(f"Service Bus error: {e}")
+    except asyncio.TimeoutError:
+        print("Operation timed out")
+    except Exception as e:
+        print(f"Unexpected error: {e}")
+```
+
+**Common threading/concurrency mistakes to avoid:**
+
+- Sharing `ServiceBusClient`, `ServiceBusSender`, or `ServiceBusReceiver` instances across threads
+- Not properly closing clients and their resources in multi-threaded scenarios
+- Using the same connection string with too many concurrent clients (can hit connection limits)
+- Mixing blocking and non-blocking operations incorrectly
+- Not handling connection failures in multi-threaded scenarios
+
 ## Frequently asked questions
 
 ### Q: Why am I getting connection timeout errors?
From 8930ce819889ea16f5cf99b8eaa9393f8c3bf1b3 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 5 Jun 2025 19:07:40 +0000 Subject: [PATCH 08/28] Complete TROUBLESHOOTING.md with async operations and enhanced content Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 363 +++++++++++++++++- 1 file changed, 360 insertions(+), 3 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 318d4fd9cc4b..91843b0eebd1 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -39,6 +39,10 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Threading and concurrency issues](#threading-and-concurrency-issues) * [Thread safety limitations](#thread-safety-limitations) * [Async/await best practices](#asyncawait-best-practices) +* [Troubleshooting async operations](#troubleshooting-async-operations) + * [Event loop issues](#event-loop-issues) + * [Async context manager problems](#async-context-manager-problems) + * [Mixing sync and async code](#mixing-sync-and-async-code) * [Frequently asked questions](#frequently-asked-questions) * [Get additional help](#get-additional-help) @@ -127,6 +131,34 @@ The Service Bus APIs generate the following exceptions in `azure.servicebus.exce - **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. You could re-register the object that wants be auto lock renewed or extend the timeout in advance. +#### Python-Specific Considerations + +- **ImportError/ModuleNotFoundError:** Common when Azure Service Bus dependencies are not properly installed. 
Ensure you have installed the correct package version: +```bash +pip install azure-servicebus +``` + +- **TypeError:** Often occurs when passing incorrect data types to Service Bus methods: +```python +# Incorrect: passing string instead of ServiceBusMessage +sender.send_messages("Hello World") # This will fail + +# Correct: create ServiceBusMessage objects +from azure.servicebus import ServiceBusMessage +message = ServiceBusMessage("Hello World") +sender.send_messages(message) +``` + +- **ConnectionError/socket.gaierror:** Network-level errors that may require checking DNS resolution and network connectivity: +```python +import socket +try: + # Test DNS resolution + socket.gethostbyname("your-namespace.servicebus.windows.net") +except socket.gaierror as e: + print(f"DNS resolution failed: {e}") +``` + ### Timeouts There are various timeouts a user should be aware of within the library: @@ -907,6 +939,249 @@ async def handle_async_errors(): - Mixing blocking and non-blocking operations incorrectly - Not handling connection failures in multi-threaded scenarios +## Troubleshooting async operations + +### Event loop issues + +Python's asyncio event loop can cause issues when not properly managed in Service Bus async operations. + +**Common symptoms:** +- `RuntimeError: no running event loop` +- `RuntimeError: cannot be called from a running event loop` +- Async operations hanging indefinitely + +**Resolution:** + +1. **Proper event loop management:** +```python +import asyncio +from azure.servicebus.aio import ServiceBusClient + +async def main(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage("Hello async world") + await sender.send_messages(message) + +# Correct way to run async Service Bus code +if __name__ == "__main__": + asyncio.run(main()) +``` + +2. 
**Handling existing event loops (e.g., in Jupyter notebooks):** +```python +import asyncio +import nest_asyncio + +# In environments like Jupyter where an event loop is already running +nest_asyncio.apply() + +async def notebook_friendly_function(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + # Your async Service Bus operations + pass + +# Can be called directly in Jupyter +await notebook_friendly_function() +``` + +3. **Event loop in multi-threaded applications:** +```python +import asyncio +import threading +from concurrent.futures import ThreadPoolExecutor + +def run_async_in_thread(connection_string, queue_name): + """Run async Service Bus operations in a separate thread""" + async def async_operations(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + for message in messages: + print(f"Received: {message}") + await receiver.complete_message(message) + + # Create new event loop for this thread + asyncio.run(async_operations()) + +# Use ThreadPoolExecutor for better management +with ThreadPoolExecutor(max_workers=3) as executor: + futures = [ + executor.submit(run_async_in_thread, connection_string, f"queue_{i}") + for i in range(3) + ] + + for future in futures: + future.result() # Wait for completion +``` + +### Async context manager problems + +Improper use of async context managers can lead to resource leaks and connection issues. + +**Common mistakes:** + +1. 
**Not using async context managers:** +```python +# DON'T DO THIS +client = ServiceBusClient.from_connection_string(connection_string) +sender = client.get_queue_sender(queue_name) +await sender.send_messages(message) +# Resources not properly closed + +# DO THIS INSTEAD +async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + await sender.send_messages(message) +``` + +2. **Improper exception handling in async context:** +```python +async def proper_exception_handling(): + """Handle exceptions properly in async context managers""" + try: + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + + for message in messages: + try: + # Process message + await process_message_async(message) + await receiver.complete_message(message) + except Exception as processing_error: + print(f"Processing failed: {processing_error}") + await receiver.abandon_message(message) + + except ServiceBusError as sb_error: + print(f"Service Bus error: {sb_error}") + except Exception as general_error: + print(f"Unexpected error: {general_error}") +``` + +3. 
**Resource cleanup in long-running async operations:** +```python +import asyncio +from contextlib import AsyncExitStack + +async def long_running_processor(): + """Properly manage resources in long-running async operations""" + async with AsyncExitStack() as stack: + client = await stack.enter_async_context( + ServiceBusClient.from_connection_string(connection_string) + ) + receiver = await stack.enter_async_context( + client.get_queue_receiver(queue_name) + ) + + # Long-running processing loop + while True: + try: + messages = await receiver.receive_messages( + max_message_count=10, + max_wait_time=30 + ) + + if not messages: + await asyncio.sleep(1) + continue + + # Process messages with proper error handling + await process_messages_batch(receiver, messages) + + except KeyboardInterrupt: + print("Shutting down gracefully...") + break + except Exception as e: + print(f"Error in processing loop: {e}") + await asyncio.sleep(5) # Brief pause before retry + +async def process_messages_batch(receiver, messages): + """Process a batch of messages with individual error handling""" + for message in messages: + try: + await process_single_message(message) + await receiver.complete_message(message) + except Exception as e: + print(f"Failed to process message {message.message_id}: {e}") + await receiver.abandon_message(message) +``` + +### Mixing sync and async code + +Mixing synchronous and asynchronous Service Bus operations can cause issues. + +**Common problems:** + +1. **Calling async methods without await:** +```python +# WRONG - This returns a coroutine, doesn't actually send +client = ServiceBusClient.from_connection_string(connection_string) +sender = client.get_queue_sender(queue_name) +sender.send_messages(message) # Missing 'await' + +# CORRECT +async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + await sender.send_messages(message) +``` + +2. 
**Using sync and async clients together:** +```python +# Avoid mixing sync and async clients in the same application +# Choose one pattern and stick with it + +# Option 1: Pure async +async def async_pattern(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + # All operations are async + pass + +# Option 2: Pure sync +def sync_pattern(): + with ServiceBusClient.from_connection_string(connection_string) as client: + # All operations are sync + pass +``` + +3. **Proper integration with async frameworks (FastAPI, aiohttp, etc.):** +```python +# Example with FastAPI +from fastapi import FastAPI, BackgroundTasks +from azure.servicebus.aio import ServiceBusClient + +app = FastAPI() + +# Global client for reuse (properly managed) +class ServiceBusManager: + def __init__(self): + self.client = None + + async def start(self): + self.client = ServiceBusClient.from_connection_string(connection_string) + + async def stop(self): + if self.client: + await self.client.close() + +sb_manager = ServiceBusManager() + +@app.on_event("startup") +async def startup_event(): + await sb_manager.start() + +@app.on_event("shutdown") +async def shutdown_event(): + await sb_manager.stop() + +@app.post("/send-message") +async def send_message(message_content: str): + async with sb_manager.client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage(message_content) + await sender.send_messages(message) + return {"status": "sent"} +``` + ## Frequently asked questions ### Q: Why am I getting connection timeout errors? @@ -938,10 +1213,12 @@ Check the `dead_letter_reason` and `dead_letter_error_description` properties on ### Q: How do I process messages faster? 
**A:** Consider: -- Using concurrent message processing +- Using concurrent message processing (with separate client instances per thread/task) - Optimizing your message processing logic -- Using `prefetch_count` to pre-fetch messages -- Scaling out with multiple receivers +- Using `prefetch_count` to pre-fetch messages (use with caution - see note below) +- Scaling out with multiple receivers (on different clients) + +**Note on prefetch_count:** Be careful when using `prefetch_count` as it can cause message lock expiration if processing takes too long. The client cannot extend locks for prefetched messages. ### Q: What's the difference between `complete_message()` and `abandon_message()`? @@ -949,6 +1226,86 @@ Check the `dead_letter_reason` and `dead_letter_error_description` properties on - `complete_message()`: Removes the message from the queue/subscription (successful processing) - `abandon_message()`: Returns the message to the queue/subscription for reprocessing +**Important:** Due to Python AMQP implementation limitations, these operations return immediately without waiting for service acknowledgment. Implement idempotent processing to handle potential redelivery. + +### Q: How do I handle message ordering? + +**A:** +- Use **sessions** for guaranteed message ordering within a session +- For partitioned entities, messages with the same partition key maintain order +- Regular queues do not guarantee strict FIFO ordering + +```python +# Using sessions for ordered processing +with client.get_queue_receiver(queue_name, session_id="order_123") as session_receiver: + messages = session_receiver.receive_messages(max_message_count=10) + + # Messages within this session are processed in order + for message in messages: + process_message_in_order(message) + session_receiver.complete_message(message) +``` + +### Q: How do I implement retry logic for transient failures? 
+
+**A:**
+```python
+import time
+import random
+from azure.servicebus import exceptions as sb_errors
+
+# Transient errors worth retrying; any other exception propagates immediately
+RETRYABLE_ERRORS = (
+    sb_errors.OperationTimeoutError,
+    sb_errors.ServiceBusCommunicationError,
+    sb_errors.ServiceBusConnectionError,
+    sb_errors.ServiceBusServerBusyError,
+)
+
+def exponential_backoff_retry(operation, max_retries=3):
+    """Implement exponential backoff retry for Service Bus operations"""
+    for attempt in range(max_retries + 1):
+        try:
+            return operation()
+        except RETRYABLE_ERRORS as e:
+            if attempt == max_retries:
+                raise
+
+            # Calculate backoff delay with random jitter
+            delay = (2 ** attempt) + random.uniform(0, 1)
+            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.2f} seconds...")
+            time.sleep(delay)
+
+# Usage example
+def send_with_retry(sender, message):
+    return exponential_backoff_retry(lambda: sender.send_messages(message))
+```
+
+### Q: How do I monitor message processing performance?
+
+**A:**
+```python
+import time
+import logging
+from contextlib import contextmanager
+
+@contextmanager
+def message_processing_timer(message_id):
+    """Context manager to time message processing"""
+    start_time = time.perf_counter()
+    try:
+        yield
+    finally:
+        processing_time = time.perf_counter() - start_time
+        logging.info(f"Message {message_id} processed in {processing_time:.3f}s")
+
+# Usage
+def process_with_monitoring(receiver, message):
+    with message_processing_timer(message.message_id):
+        # Your processing logic
+        result = process_message(message)
+        receiver.complete_message(message)
+        return result
+```
+
 ## Get additional help
 
 Additional information on ways to reach out for support can be found in the [SUPPORT.md](https://github.com/Azure/azure-sdk-for-python/blob/main/SUPPORT.md) at the root of the repo.
From 0cb1fd7548b5634e567c1fa630caac89177707e4 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 04:15:15 +0000 Subject: [PATCH 09/28] Address multiple review comments - remove sections, move sections, update links Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 611 +++++------------- 1 file changed, 164 insertions(+), 447 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 91843b0eebd1..de421acc7370 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -97,7 +97,7 @@ The Service Bus APIs generate the following exceptions in `azure.servicebus.exce - **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The service may or may not have successfully completed the request; the status is not known. It is recommended to attempt to verify the current state and retry if necessary. -- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. +- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. 
For information about required ports, see [What ports do I need to open on the firewall?](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). #### Message Handling Exceptions @@ -169,6 +169,143 @@ There are various timeouts a user should be aware of within the library: > **NOTE:** If processing of a message or session is sufficiently long as to cause timeouts, as an alternative to calling `receiver.renew_message_lock`/`receiver.session.renew_lock` manually, one can leverage the `AutoLockRenewer` functionality. +## Threading and concurrency issues + +### Thread safety limitations + +**Important:** The Azure Service Bus Python SDK is **not thread-safe or coroutine-safe**. Using the same client instances across multiple threads or tasks without proper synchronization can lead to: + +- Connection errors and unexpected exceptions +- Message corruption or loss +- Deadlocks and race conditions +- Unpredictable behavior + +**Best practices:** + +1. **Use separate client instances per thread/task:** +```python +import threading +from azure.servicebus import ServiceBusClient + +def worker_thread(connection_string, queue_name): + # Create a separate client instance for each thread + client = ServiceBusClient.from_connection_string(connection_string) + with client: + sender = client.get_queue_sender(queue_name) + with sender: + # Perform operations... + pass + +# Start multiple threads with separate clients +threads = [] +for i in range(5): + t = threading.Thread(target=worker_thread, args=(connection_string, queue_name)) + threads.append(t) + t.start() + +for t in threads: + t.join() +``` + +2. 
**Use connection pooling patterns when needed:** +```python +# For high-throughput scenarios, consider using a thread-safe queue +# to manage client instances +import queue +import threading + +client_pool = queue.Queue() + +def get_client(): + try: + return client_pool.get_nowait() + except queue.Empty: + return ServiceBusClient.from_connection_string(connection_string) + +def return_client(client): + try: + client_pool.put_nowait(client) + except queue.Full: + client.close() +``` + +3. **Avoid sharing clients across async tasks:** +```python +# DON'T DO THIS +client = ServiceBusClient.from_connection_string(connection_string) + +async def bad_async_pattern(): + # Multiple tasks sharing the same client can cause issues + sender = client.get_queue_sender(queue_name) + # This can lead to race conditions + +# DO THIS INSTEAD +async def good_async_pattern(): + # Each async function should use its own client + async with ServiceBusClient.from_connection_string(connection_string) as client: + sender = client.get_queue_sender(queue_name) + async with sender: + # Perform operations safely + pass +``` + +### Async/await best practices + +When using the async APIs in the Python Service Bus SDK: + +1. **Always use async context managers properly:** +```python +async def proper_async_usage(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage("Hello World") + await sender.send_messages(message) + + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + for message in messages: + await receiver.complete_message(message) +``` + +2. 
**Don't mix sync and async code without proper handling:** +```python +# Avoid mixing sync and async incorrectly +async def mixed_code_example(): + # Don't call synchronous methods from async context without wrapping + # client = ServiceBusClient.from_connection_string(conn_str) # This is sync + + # Instead, create clients within async context or use proper wrapping + async with ServiceBusClient.from_connection_string(conn_str) as client: + pass +``` + +3. **Handle async exceptions properly:** +```python +import asyncio +from azure.servicebus import ServiceBusError + +async def handle_async_errors(): + try: + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=1, max_wait_time=5) + # Process messages... + except ServiceBusError as e: + print(f"Service Bus error: {e}") + except asyncio.TimeoutError: + print("Operation timed out") + except Exception as e: + print(f"Unexpected error: {e}") +``` + +**Common threading/concurrency mistakes to avoid:** + +- Sharing `ServiceBusClient`, `ServiceBusSender`, or `ServiceBusReceiver` instances across threads +- Not properly closing clients and their resources in multi-threaded scenarios +- Using the same connection string with too many concurrent clients (can hit connection limits) +- Mixing blocking and non-blocking operations incorrectly +- Not handling connection failures in multi-threaded scenarios + ## Troubleshooting authentication issues ### Authentication errors @@ -267,7 +404,7 @@ If your environment has strict firewall rules or requires proxy configuration: - Consider using AMQP over WebSockets (port 443) if AMQP ports are blocked **For proxy:** -- Service Bus supports HTTP CONNECT proxy for AMQP over WebSockets +- Service Bus supports HTTP proxy for AMQP over WebSockets - Configure proxy settings in your environment variables or application ### Service busy errors 
@@ -332,31 +469,6 @@ except MessageAlreadySettled: pass ``` -### Dead letter queue issues - -Messages can be moved to the dead letter queue for various reasons: - -**Common reasons:** -- Message TTL expired -- Max delivery count exceeded -- Message was explicitly dead lettered -- Message processing failed repeatedly - -**Debugging dead letter messages:** -```python -# Receive from dead letter queue -dlq_receiver = servicebus_client.get_queue_receiver( - queue_name="your_queue", - sub_queue=ServiceBusSubQueue.DEAD_LETTER -) - -with dlq_receiver: - messages = dlq_receiver.receive_messages(max_message_count=10) - for message in messages: - print(f"Dead letter reason: {message.dead_letter_reason}") - print(f"Dead letter description: {message.dead_letter_error_description}") -``` - ## Troubleshooting session handling issues ### Session lock issues @@ -387,134 +499,6 @@ with receiver: ## Troubleshooting sender issues -### Cannot send batch with multiple partition keys - -When sending to a partition-enabled entity, all messages included in a single send operation must have the same `session_id` if the entity is session-enabled, or the same custom properties that determine partitioning. - -**Error symptoms:** -- Messages are rejected or go to different partitions than expected -- Inconsistent message ordering - -**Resolution:** -1. **For session-enabled entities, ensure all messages in a batch have the same session ID:** -```python -from azure.servicebus import ServiceBusMessage - -# Correct: All messages have the same session_id -messages = [ - ServiceBusMessage("Message 1", session_id="session1"), - ServiceBusMessage("Message 2", session_id="session1"), - ServiceBusMessage("Message 3", session_id="session1") -] - -with sender: - sender.send_messages(messages) -``` - -2. 
**For partitioned entities, group messages by partition key:** -```python -# Group messages by partition key before sending -partition1_messages = [ - ServiceBusMessage("Message 1", application_properties={"region": "east"}), - ServiceBusMessage("Message 2", application_properties={"region": "east"}) -] - -partition2_messages = [ - ServiceBusMessage("Message 3", application_properties={"region": "west"}), - ServiceBusMessage("Message 4", application_properties={"region": "west"}) -] - -# Send each group separately -with sender: - sender.send_messages(partition1_messages) - sender.send_messages(partition2_messages) -``` - -### Batch fails to send - -The Service Bus service has size limits for message batches and individual messages. - -**Error symptoms:** -- `MessageSizeExceededError` when sending batches -- Messages larger than expected failing to send - -**Resolution:** -1. **Reduce batch size or message payload:** -```python -from azure.servicebus import ServiceBusMessage -from azure.servicebus.exceptions import MessageSizeExceededError -import json - -def send_large_dataset(sender, data_list, max_batch_size=100): - """Send large datasets in smaller batches""" - for i in range(0, len(data_list), max_batch_size): - batch = data_list[i:i + max_batch_size] - messages = [ServiceBusMessage(json.dumps(item)) for item in batch] - - try: - sender.send_messages(messages) - except MessageSizeExceededError: - # If batch is still too large, send individually - for message in messages: - sender.send_messages(message) -``` - -2. **Check message size limits:** - - Standard tier: 256 KB per message - - Premium tier: 1 MB per message - - Batch limit: 1 MB regardless of tier - -3. 
**Use message properties for metadata instead of body:** -```python -# Instead of including metadata in message body -large_message = ServiceBusMessage(json.dumps({ - "data": large_data_payload, - "metadata": {"source": "app1", "timestamp": "2023-01-01"} -})) - -# Use application properties for metadata -optimized_message = ServiceBusMessage(large_data_payload) -optimized_message.application_properties = { - "source": "app1", - "timestamp": "2023-01-01" -} -``` - -### Message encoding issues - -Python string encoding can cause issues when sending messages with special characters. - -**Error symptoms:** -- Messages appear corrupted on the receiver side -- Encoding/decoding exceptions - -**Resolution:** -1. **Explicitly handle string encoding:** -```python -import json -from azure.servicebus import ServiceBusMessage - -# For text messages, ensure proper UTF-8 encoding -text_data = "Message with special characters: ñáéíóú" -message = ServiceBusMessage(text_data.encode('utf-8')) - -# For JSON data, use explicit encoding -json_data = {"message": "Data with unicode: ñáéíóú"} -json_string = json.dumps(json_data, ensure_ascii=False) -message = ServiceBusMessage(json_string.encode('utf-8')) - -# Set content type to help receivers -message.content_type = "application/json; charset=utf-8" -``` - -2. **Handle binary data correctly:** -```python -# For binary data, pass bytes directly -binary_data = b"\x00\x01\x02\x03" -message = ServiceBusMessage(binary_data) -message.content_type = "application/octet-stream" -``` - ## Troubleshooting receiver issues ### Number of messages returned doesn't match number requested @@ -584,161 +568,6 @@ def continuous_message_processing(receiver): time.sleep(5) # Brief pause before retry ``` -### Message completion behavior - -**Important limitation:** The Pure Python AMQP implementation used by the Azure Service Bus Python SDK does not currently wait for dispositions from the service to acknowledge message completion operations. 
- -**What this means:** -- When you call `complete_message()`, `abandon_message()`, or `dead_letter_message()`, the operation returns immediately -- The SDK does not wait for confirmation from the Service Bus service that the message was actually settled -- This can lead to scenarios where the local operation succeeds but the service operation fails - -**Implications:** -1. **Message state uncertainty:** -```python -# This operation may succeed locally but fail on the service -try: - receiver.complete_message(message) - print("Message completed successfully") # This may be misleading -except Exception as e: - print(f"Local completion failed: {e}") - # But even if no exception, service operation might have failed -``` - -2. **Potential message redelivery:** -- If the service doesn't receive the completion acknowledgment, the message may be redelivered -- This can lead to duplicate processing if not handled properly - -**Mitigation strategies:** -1. **Implement idempotent message processing:** -```python -import hashlib - -processed_messages = set() - -def process_message_idempotently(receiver, message): - """Process messages in an idempotent manner""" - # Create a unique identifier for the message - message_id = message.message_id or hashlib.md5(str(message.body).encode()).hexdigest() - - if message_id in processed_messages: - print(f"Message {message_id} already processed, skipping") - receiver.complete_message(message) - return - - try: - # Your message processing logic here - result = process_business_logic(message) - - # Record successful processing before completing - processed_messages.add(message_id) - receiver.complete_message(message) - - return result - except Exception as e: - print(f"Processing failed for message {message_id}: {e}") - receiver.abandon_message(message) - raise -``` - -2. 
**Use external tracking for critical operations:** -```python -import logging - -def track_message_completion(receiver, message, tracking_store): - """Track message completion in external store""" - message_id = message.message_id - - try: - # Process the message - result = process_message(message) - - # Store completion in external tracking system - tracking_store.mark_completed(message_id, result) - - # Complete the message in Service Bus - receiver.complete_message(message) - - logging.info(f"Message {message_id} processed and completed successfully") - - except Exception as e: - logging.error(f"Failed to process message {message_id}: {e}") - - # Check if we should retry or dead letter - if should_retry(message, e): - receiver.abandon_message(message) - else: - receiver.dead_letter_message(message, reason="ProcessingFailed", error_description=str(e)) -``` - -3. **Monitor for redelivered messages:** -```python -def handle_potential_redelivery(receiver, message): - """Handle messages that might be redelivered due to completion uncertainty""" - delivery_count = message.delivery_count - - if delivery_count > 1: - logging.warning(f"Message has been delivered {delivery_count} times. " - f"This might indicate completion acknowledgment issues.") - - # Process with extra caution for high delivery count messages - if delivery_count > 3: - # Consider different processing logic or dead lettering - logging.error(f"Message delivery count too high ({delivery_count}), dead lettering") - receiver.dead_letter_message(message, - reason="HighDeliveryCount", - error_description=f"Delivered {delivery_count} times") - return - - # Normal processing - process_message_idempotently(receiver, message) -``` - -### Receive operation hangs - -Receive operations may appear to hang when no messages are available. - -**Symptoms:** -- `receive_messages()` doesn't return for extended periods -- Application appears unresponsive - -**Resolution:** -1. 
**Set appropriate timeouts:** -```python -# Don't wait indefinitely for messages -messages = receiver.receive_messages(max_message_count=5, max_wait_time=30) - -# For polling scenarios, use shorter timeouts -def poll_for_messages(receiver): - while True: - messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) - - if messages: - for message in messages: - process_message(message) - receiver.complete_message(message) - else: - print("No messages available, waiting...") - time.sleep(1) -``` - -2. **Use async operations with proper cancellation:** -```python -import asyncio - -async def receive_with_cancellation(receiver): - try: - # Use asyncio timeout for better control - messages = await asyncio.wait_for( - receiver.receive_messages(max_message_count=10, max_wait_time=30), - timeout=35 # Slightly longer than max_wait_time - ) - return messages - except asyncio.TimeoutError: - print("Receive operation timed out") - return [] -``` - ### Messages not being received Messages might not be received due to various configuration or state issues. 
@@ -785,6 +614,31 @@ messages = receiver.peek_messages(max_message_count=10) print(f"Found {len(messages)} messages in queue without receiving them") ``` +### Dead letter queue issues + +Messages can be moved to the dead letter queue for various reasons: + +**Common reasons:** +- Message TTL expired +- Max delivery count exceeded +- Message was explicitly dead lettered +- Message processing failed repeatedly + +**Debugging dead letter messages:** +```python +# Receive from dead letter queue +dlq_receiver = servicebus_client.get_queue_receiver( + queue_name="your_queue", + sub_queue=ServiceBusSubQueue.DEAD_LETTER +) + +with dlq_receiver: + messages = dlq_receiver.receive_messages(max_message_count=10) + for message in messages: + print(f"Dead letter reason: {message.dead_letter_reason}") + print(f"Dead letter description: {message.dead_letter_error_description}") +``` + ## Troubleshooting quota and capacity issues ### Quota exceeded errors @@ -802,143 +656,6 @@ print(f"Found {len(messages)} messages in queue without receiving them") 3. Check if the entity was deleted and needs to be recreated 4. Verify you're connecting to the correct namespace -## Threading and concurrency issues - -### Thread safety limitations - -**Important:** The Azure Service Bus Python SDK is **not thread-safe or coroutine-safe**. Using the same client instances across multiple threads or tasks without proper synchronization can lead to: - -- Connection errors and unexpected exceptions -- Message corruption or loss -- Deadlocks and race conditions -- Unpredictable behavior - -**Best practices:** - -1. 
**Use separate client instances per thread/task:** -```python -import threading -from azure.servicebus import ServiceBusClient - -def worker_thread(connection_string, queue_name): - # Create a separate client instance for each thread - client = ServiceBusClient.from_connection_string(connection_string) - with client: - sender = client.get_queue_sender(queue_name) - with sender: - # Perform operations... - pass - -# Start multiple threads with separate clients -threads = [] -for i in range(5): - t = threading.Thread(target=worker_thread, args=(connection_string, queue_name)) - threads.append(t) - t.start() - -for t in threads: - t.join() -``` - -2. **Use connection pooling patterns when needed:** -```python -# For high-throughput scenarios, consider using a thread-safe queue -# to manage client instances -import queue -import threading - -client_pool = queue.Queue() - -def get_client(): - try: - return client_pool.get_nowait() - except queue.Empty: - return ServiceBusClient.from_connection_string(connection_string) - -def return_client(client): - try: - client_pool.put_nowait(client) - except queue.Full: - client.close() -``` - -3. **Avoid sharing clients across async tasks:** -```python -# DON'T DO THIS -client = ServiceBusClient.from_connection_string(connection_string) - -async def bad_async_pattern(): - # Multiple tasks sharing the same client can cause issues - sender = client.get_queue_sender(queue_name) - # This can lead to race conditions - -# DO THIS INSTEAD -async def good_async_pattern(): - # Each async function should use its own client - async with ServiceBusClient.from_connection_string(connection_string) as client: - sender = client.get_queue_sender(queue_name) - async with sender: - # Perform operations safely - pass -``` - -### Async/await best practices - -When using the async APIs in the Python Service Bus SDK: - -1. 
**Always use async context managers properly:** -```python -async def proper_async_usage(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage("Hello World") - await sender.send_messages(message) - - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - for message in messages: - await receiver.complete_message(message) -``` - -2. **Don't mix sync and async code without proper handling:** -```python -# Avoid mixing sync and async incorrectly -async def mixed_code_example(): - # Don't call synchronous methods from async context without wrapping - # client = ServiceBusClient.from_connection_string(conn_str) # This is sync - - # Instead, create clients within async context or use proper wrapping - async with ServiceBusClient.from_connection_string(conn_str) as client: - pass -``` - -3. **Handle async exceptions properly:** -```python -import asyncio -from azure.servicebus import ServiceBusError - -async def handle_async_errors(): - try: - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=1, max_wait_time=5) - # Process messages... 
- except ServiceBusError as e: - print(f"Service Bus error: {e}") - except asyncio.TimeoutError: - print("Operation timed out") - except Exception as e: - print(f"Unexpected error: {e}") -``` - -**Common threading/concurrency mistakes to avoid:** - -- Sharing `ServiceBusClient`, `ServiceBusSender`, or `ServiceBusReceiver` instances across threads -- Not properly closing clients and their resources in multi-threaded scenarios -- Using the same connection string with too many concurrent clients (can hit connection limits) -- Mixing blocking and non-blocking operations incorrectly -- Not handling connection failures in multi-threaded scenarios - ## Troubleshooting async operations ### Event loop issues From e974f59942c63d4046fef71b902b9a9948c0325e Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 04:20:34 +0000 Subject: [PATCH 10/28] Complete review feedback implementation - remove async sections, update threading guidance Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 393 ++---------------- 1 file changed, 46 insertions(+), 347 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index de421acc7370..f1193d68bd8e 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -180,132 +180,70 @@ There are various timeouts a user should be aware of within the library: - Deadlocks and race conditions - Unpredictable behavior -**Best practices:** +It is up to the running application to use these classes in a thread-safe and coroutine-safe manner. Note: If sending concurrently, ensure that locks are used. -1. 
**Use separate client instances per thread/task:** +**Synchronous concurrent sending with locks:** ```python import threading -from azure.servicebus import ServiceBusClient +from azure.servicebus import ServiceBusClient, ServiceBusMessage + +FULLY_QUALIFIED_NAMESPACE = "your-namespace.servicebus.windows.net" +QUEUE_NAME = "your-queue" +CONNECTION_STRING = "your-connection-string" -def worker_thread(connection_string, queue_name): - # Create a separate client instance for each thread - client = ServiceBusClient.from_connection_string(connection_string) - with client: - sender = client.get_queue_sender(queue_name) - with sender: - # Perform operations... - pass +# Lock for thread-safe operations +lock = threading.Lock() -# Start multiple threads with separate clients +def send_messages_sync(connection_string, queue_name): + """Send messages using synchronous API with locks""" + with lock: + with ServiceBusClient.from_connection_string(connection_string) as client: + with client.get_queue_sender(queue_name) as sender: + messages = [ServiceBusMessage(f"Message {i}") for i in range(10)] + sender.send_messages(messages) + +# Create and start multiple threads threads = [] -for i in range(5): - t = threading.Thread(target=worker_thread, args=(connection_string, queue_name)) +for i in range(3): + t = threading.Thread(target=send_messages_sync, args=(CONNECTION_STRING, QUEUE_NAME)) threads.append(t) t.start() +# Wait for all threads to complete for t in threads: t.join() ``` -2. **Use connection pooling patterns when needed:** -```python -# For high-throughput scenarios, consider using a thread-safe queue -# to manage client instances -import queue -import threading - -client_pool = queue.Queue() - -def get_client(): - try: - return client_pool.get_nowait() - except queue.Empty: - return ServiceBusClient.from_connection_string(connection_string) - -def return_client(client): - try: - client_pool.put_nowait(client) - except queue.Full: - client.close() -``` - -3. 
**Avoid sharing clients across async tasks:** +**Asynchronous concurrent sending with locks:** ```python -# DON'T DO THIS -client = ServiceBusClient.from_connection_string(connection_string) - -async def bad_async_pattern(): - # Multiple tasks sharing the same client can cause issues - sender = client.get_queue_sender(queue_name) - # This can lead to race conditions - -# DO THIS INSTEAD -async def good_async_pattern(): - # Each async function should use its own client - async with ServiceBusClient.from_connection_string(connection_string) as client: - sender = client.get_queue_sender(queue_name) - async with sender: - # Perform operations safely - pass -``` +import asyncio +from azure.identity.aio import DefaultAzureCredential +from azure.servicebus import ServiceBusMessage +from azure.servicebus.aio import ServiceBusClient -### Async/await best practices +FULLY_QUALIFIED_NAMESPACE = "your-namespace.servicebus.windows.net" +QUEUE_NAME = "your-queue" -When using the async APIs in the Python Service Bus SDK: +lock = asyncio.Lock() -1. **Always use async context managers properly:** -```python -async def proper_async_usage(): - async with ServiceBusClient.from_connection_string(connection_string) as client: +async def send_messages_async(client, queue_name): + """Send messages using async API with locks""" + async with lock: async with client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage("Hello World") - await sender.send_messages(message) - - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - for message in messages: - await receiver.complete_message(message) -``` + messages = [ServiceBusMessage(f"Hello {i}") for i in range(10)] + await sender.send_messages(messages) -2. 
**Don't mix sync and async code without proper handling:** -```python -# Avoid mixing sync and async incorrectly -async def mixed_code_example(): - # Don't call synchronous methods from async context without wrapping - # client = ServiceBusClient.from_connection_string(conn_str) # This is sync - - # Instead, create clients within async context or use proper wrapping - async with ServiceBusClient.from_connection_string(conn_str) as client: - pass -``` - -3. **Handle async exceptions properly:** -```python -import asyncio -from azure.servicebus import ServiceBusError +async def main(): + credential = DefaultAzureCredential() + async with ServiceBusClient(FULLY_QUALIFIED_NAMESPACE, credential) as client: + # Use asyncio.gather for concurrent execution + tasks = [send_messages_async(client, QUEUE_NAME) for _ in range(3)] + await asyncio.gather(*tasks) -async def handle_async_errors(): - try: - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=1, max_wait_time=5) - # Process messages... 
- except ServiceBusError as e: - print(f"Service Bus error: {e}") - except asyncio.TimeoutError: - print("Operation timed out") - except Exception as e: - print(f"Unexpected error: {e}") +if __name__ == "__main__": + asyncio.run(main()) ``` -**Common threading/concurrency mistakes to avoid:** - -- Sharing `ServiceBusClient`, `ServiceBusSender`, or `ServiceBusReceiver` instances across threads -- Not properly closing clients and their resources in multi-threaded scenarios -- Using the same connection string with too many concurrent clients (can hit connection limits) -- Mixing blocking and non-blocking operations incorrectly -- Not handling connection failures in multi-threaded scenarios - ## Troubleshooting authentication issues ### Authentication errors @@ -639,6 +577,10 @@ with dlq_receiver: print(f"Dead letter description: {message.dead_letter_error_description}") ``` +### Mixing sync and async code + +Mixing synchronous and asynchronous Service Bus operations can cause issues such as async operations hanging indefinitely due to the event loop being blocked. Ensure that no blocking calls are made while receiving or processing messages. + ## Troubleshooting quota and capacity issues ### Quota exceeded errors @@ -656,249 +598,6 @@ with dlq_receiver: 3. Check if the entity was deleted and needs to be recreated 4. Verify you're connecting to the correct namespace -## Troubleshooting async operations - -### Event loop issues - -Python's asyncio event loop can cause issues when not properly managed in Service Bus async operations. - -**Common symptoms:** -- `RuntimeError: no running event loop` -- `RuntimeError: cannot be called from a running event loop` -- Async operations hanging indefinitely - -**Resolution:** - -1.
**Proper event loop management:** -```python -import asyncio -from azure.servicebus.aio import ServiceBusClient - -async def main(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage("Hello async world") - await sender.send_messages(message) - -# Correct way to run async Service Bus code -if __name__ == "__main__": - asyncio.run(main()) -``` - -2. **Handling existing event loops (e.g., in Jupyter notebooks):** -```python -import asyncio -import nest_asyncio - -# In environments like Jupyter where an event loop is already running -nest_asyncio.apply() - -async def notebook_friendly_function(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - # Your async Service Bus operations - pass - -# Can be called directly in Jupyter -await notebook_friendly_function() -``` - -3. **Event loop in multi-threaded applications:** -```python -import asyncio -import threading -from concurrent.futures import ThreadPoolExecutor - -def run_async_in_thread(connection_string, queue_name): - """Run async Service Bus operations in a separate thread""" - async def async_operations(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - for message in messages: - print(f"Received: {message}") - await receiver.complete_message(message) - - # Create new event loop for this thread - asyncio.run(async_operations()) - -# Use ThreadPoolExecutor for better management -with ThreadPoolExecutor(max_workers=3) as executor: - futures = [ - executor.submit(run_async_in_thread, connection_string, f"queue_{i}") - for i in range(3) - ] - - for future in futures: - future.result() # Wait for completion -``` - -### Async context manager problems - -Improper use of async context managers can 
lead to resource leaks and connection issues. - -**Common mistakes:** - -1. **Not using async context managers:** -```python -# DON'T DO THIS -client = ServiceBusClient.from_connection_string(connection_string) -sender = client.get_queue_sender(queue_name) -await sender.send_messages(message) -# Resources not properly closed - -# DO THIS INSTEAD -async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - await sender.send_messages(message) -``` - -2. **Improper exception handling in async context:** -```python -async def proper_exception_handling(): - """Handle exceptions properly in async context managers""" - try: - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - - for message in messages: - try: - # Process message - await process_message_async(message) - await receiver.complete_message(message) - except Exception as processing_error: - print(f"Processing failed: {processing_error}") - await receiver.abandon_message(message) - - except ServiceBusError as sb_error: - print(f"Service Bus error: {sb_error}") - except Exception as general_error: - print(f"Unexpected error: {general_error}") -``` - -3. 
**Resource cleanup in long-running async operations:** -```python -import asyncio -from contextlib import AsyncExitStack - -async def long_running_processor(): - """Properly manage resources in long-running async operations""" - async with AsyncExitStack() as stack: - client = await stack.enter_async_context( - ServiceBusClient.from_connection_string(connection_string) - ) - receiver = await stack.enter_async_context( - client.get_queue_receiver(queue_name) - ) - - # Long-running processing loop - while True: - try: - messages = await receiver.receive_messages( - max_message_count=10, - max_wait_time=30 - ) - - if not messages: - await asyncio.sleep(1) - continue - - # Process messages with proper error handling - await process_messages_batch(receiver, messages) - - except KeyboardInterrupt: - print("Shutting down gracefully...") - break - except Exception as e: - print(f"Error in processing loop: {e}") - await asyncio.sleep(5) # Brief pause before retry - -async def process_messages_batch(receiver, messages): - """Process a batch of messages with individual error handling""" - for message in messages: - try: - await process_single_message(message) - await receiver.complete_message(message) - except Exception as e: - print(f"Failed to process message {message.message_id}: {e}") - await receiver.abandon_message(message) -``` - -### Mixing sync and async code - -Mixing synchronous and asynchronous Service Bus operations can cause issues. - -**Common problems:** - -1. **Calling async methods without await:** -```python -# WRONG - This returns a coroutine, doesn't actually send -client = ServiceBusClient.from_connection_string(connection_string) -sender = client.get_queue_sender(queue_name) -sender.send_messages(message) # Missing 'await' - -# CORRECT -async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - await sender.send_messages(message) -``` - -2. 
**Using sync and async clients together:** -```python -# Avoid mixing sync and async clients in the same application -# Choose one pattern and stick with it - -# Option 1: Pure async -async def async_pattern(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - # All operations are async - pass - -# Option 2: Pure sync -def sync_pattern(): - with ServiceBusClient.from_connection_string(connection_string) as client: - # All operations are sync - pass -``` - -3. **Proper integration with async frameworks (FastAPI, aiohttp, etc.):** -```python -# Example with FastAPI -from fastapi import FastAPI, BackgroundTasks -from azure.servicebus.aio import ServiceBusClient - -app = FastAPI() - -# Global client for reuse (properly managed) -class ServiceBusManager: - def __init__(self): - self.client = None - - async def start(self): - self.client = ServiceBusClient.from_connection_string(connection_string) - - async def stop(self): - if self.client: - await self.client.close() - -sb_manager = ServiceBusManager() - -@app.on_event("startup") -async def startup_event(): - await sb_manager.start() - -@app.on_event("shutdown") -async def shutdown_event(): - await sb_manager.stop() - -@app.post("/send-message") -async def send_message(message_content: str): - async with sb_manager.client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage(message_content) - await sender.send_messages(message) - return {"status": "sent"} -``` - ## Frequently asked questions ### Q: Why am I getting connection timeout errors? 
From b01806beb883f04ce47115091bb02da34b0e4fd8 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 05:02:41 +0000 Subject: [PATCH 11/28] Address remaining review comments - remove unnecessary sections and reorganize structure Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 20 ++++--------------- 1 file changed, 4 insertions(+), 16 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index f1193d68bd8e..310610bd1993 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -8,6 +8,8 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Enable client logging](#enable-client-logging) * [Common exceptions](#common-exceptions) * [Timeouts](#timeouts) +* [Threading and concurrency issues](#threading-and-concurrency-issues) + * [Thread safety limitations](#thread-safety-limitations) * [Troubleshooting authentication issues](#troubleshooting-authentication-issues) * [Authentication errors](#authentication-errors) * [Authorization errors](#authorization-errors) @@ -20,29 +22,17 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Message lock issues](#message-lock-issues) * [Message size issues](#message-size-issues) * [Message settlement issues](#message-settlement-issues) - * [Dead letter queue issues](#dead-letter-queue-issues) * [Troubleshooting session handling issues](#troubleshooting-session-handling-issues) * [Session lock issues](#session-lock-issues) * [Session cannot be locked](#session-cannot-be-locked) -* [Troubleshooting sender issues](#troubleshooting-sender-issues) - * [Cannot send batch with multiple partition keys](#cannot-send-batch-with-multiple-partition-keys) - * [Batch fails to send](#batch-fails-to-send) - 
* [Message encoding issues](#message-encoding-issues) * [Troubleshooting receiver issues](#troubleshooting-receiver-issues) * [Number of messages returned doesn't match number requested](#number-of-messages-returned-doesnt-match-number-requested) - * [Message completion behavior](#message-completion-behavior) - * [Receive operation hangs](#receive-operation-hangs) * [Messages not being received](#messages-not-being-received) + * [Dead letter queue issues](#dead-letter-queue-issues) + * [Mixing sync and async code](#mixing-sync-and-async-code) * [Troubleshooting quota and capacity issues](#troubleshooting-quota-and-capacity-issues) * [Quota exceeded errors](#quota-exceeded-errors) * [Entity not found errors](#entity-not-found-errors) -* [Threading and concurrency issues](#threading-and-concurrency-issues) - * [Thread safety limitations](#thread-safety-limitations) - * [Async/await best practices](#asyncawait-best-practices) -* [Troubleshooting async operations](#troubleshooting-async-operations) - * [Event loop issues](#event-loop-issues) - * [Async context manager problems](#async-context-manager-problems) - * [Mixing sync and async code](#mixing-sync-and-async-code) * [Frequently asked questions](#frequently-asked-questions) * [Get additional help](#get-additional-help) @@ -435,8 +425,6 @@ with receiver: 2. Wait for the current session lock to expire before reconnecting 3. 
Use a different session ID if specific session is not required -## Troubleshooting sender issues - ## Troubleshooting receiver issues ### Number of messages returned doesn't match number requested From 6df3eb4f0cd89a0f2a7c4927a111993a083e23fe Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 05:18:10 +0000 Subject: [PATCH 12/28] Fix async example to match exact format requested in review comments Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 310610bd1993..5582dcaac161 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -211,24 +211,20 @@ from azure.identity.aio import DefaultAzureCredential from azure.servicebus import ServiceBusMessage from azure.servicebus.aio import ServiceBusClient -FULLY_QUALIFIED_NAMESPACE = "your-namespace.servicebus.windows.net" -QUEUE_NAME = "your-queue" +FULLY_QUALIFIED_NAMESPACE = ".servicebus.windows.net" +QUEUE_NAME = "" lock = asyncio.Lock() -async def send_messages_async(client, queue_name): - """Send messages using async API with locks""" +async def send_messages(client): async with lock: - async with client.get_queue_sender(queue_name) as sender: - messages = [ServiceBusMessage(f"Hello {i}") for i in range(10)] - await sender.send_messages(messages) + async with client.get_queue_sender(QUEUE_NAME) as sender: + await asyncio.gather(*(sender.send_messages(ServiceBusMessage("hello")) for _ in range(10))) async def main(): credential = DefaultAzureCredential() async with ServiceBusClient(FULLY_QUALIFIED_NAMESPACE, credential) as client: - # Use asyncio.gather for concurrent execution - tasks = [send_messages_async(client, QUEUE_NAME) 
for _ in range(3)] - await asyncio.gather(*tasks) + await send_messages(client) if __name__ == "__main__": asyncio.run(main()) From bafbcb213f6cd89e8d3e1ae3a227513b89073937 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 05:37:29 +0000 Subject: [PATCH 13/28] Fix TROUBLESHOOTING.md formatting and content issues based on review feedback Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 228 ++++-------------- 1 file changed, 50 insertions(+), 178 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 5582dcaac161..211f249a12b4 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -38,15 +38,15 @@ This troubleshooting guide contains instructions to diagnose frequently encounte ## General troubleshooting -Azure Service Bus client library will raise exceptions defined in [Azure Core](https://aka.ms/azsdk/python/core/docs#module-azure.core.exceptions) and [azure.servicebus.exceptions](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions). +Azure Service Bus client library will raise exceptions defined in [Azure Core](https://aka.ms/azsdk/python/core/docs#module-azure.core.exceptions) and Service Bus-specific exceptions in `azure.servicebus.exceptions`. ### Enable client logging This library uses the standard [logging](https://docs.python.org/3/library/logging.html) library for logging. -Basic information about HTTP sessions (URLs, headers, etc.) is logged at `INFO` level. +Basic information about AMQP operations (connections, links, etc.) is logged at `INFO` level. -Detailed `DEBUG` level logging, including request/response bodies and **unredacted** headers, can be enabled on the client or per-operation with the `logging_enable` keyword argument. 
+Detailed `DEBUG` level logging, including AMQP frame tracing and **unredacted** headers, can be enabled on the client or per-operation with the `logging_enable` keyword argument. To enable client logging and AMQP frame level trace: @@ -73,81 +73,42 @@ See full Python SDK logging documentation with examples [here](https://learn.mic ### Common exceptions -The Service Bus APIs generate the following exceptions in `azure.servicebus.exceptions`: +The Service Bus client library raises the following exceptions defined in `azure.servicebus.exceptions`: #### Connection and Authentication Exceptions -- **ServiceBusConnectionError:** An error occurred in the connection to the service. This may have been caused by a transient network issue or service problem. It is recommended to retry. - -- **ServiceBusAuthenticationError:** An error occurred when authenticating the connection to the service. This may have been caused by the credentials being incorrect. It is recommended to check the credentials. - -- **ServiceBusAuthorizationError:** An error occurred when authorizing the connection to the service. This may have been caused by the credentials not having the right permission to perform the operation. It is recommended to check the permission of the credentials. +- **ServiceBusConnectionError:** Connection to the service failed. Check network connectivity and retry. +- **ServiceBusAuthenticationError:** Authentication failed. Verify credentials are correct. +- **ServiceBusAuthorizationError:** Authorization failed. Check that credentials have the required permissions. #### Operation and Timeout Exceptions -- **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The service may or may not have successfully completed the request; the status is not known. 
It is recommended to attempt to verify the current state and retry if necessary. - -- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. For information about required ports, see [What ports do I need to open on the firewall?](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). +- **OperationTimeoutError:** Service did not respond within the expected time. Retry the operation. +- **ServiceBusCommunicationError:** Unable to establish connection to Service Bus. Check network connectivity and firewall settings. For firewall configuration, see [What ports do I need to open on the firewall?](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). #### Message Handling Exceptions -- **MessageSizeExceededError:** This indicates that the message content is larger than the service bus frame size. This could happen when too many service bus messages are sent in a batch or the content passed into the body of a `Message` is too large. It is recommended to reduce the count of messages being sent in a batch or the size of content being passed into a single `ServiceBusMessage`. - -- **MessageAlreadySettled:** This indicates failure to settle the message. This could happen when trying to settle an already-settled message. - -- **MessageLockLostError:** The lock on the message has expired and it has been released back to the queue. It will need to be received again in order to settle it. You should be aware of the lock duration of a message and keep renewing the lock before expiration in case of long processing time. 
`AutoLockRenewer` could help on keeping the lock of the message automatically renewed. - -- **MessageNotFoundError:** Attempt to receive a message with a particular sequence number. This message isn't found. Make sure the message hasn't been received already. Check the deadletter queue to see if the message has been deadlettered. +- **MessageSizeExceededError:** Message content exceeds size limits. Reduce message size or batch count. +- **MessageAlreadySettled:** Attempt to settle an already-settled message. +- **MessageLockLostError:** Message lock expired. Use `AutoLockRenewer` or process messages faster. +- **MessageNotFoundError:** Message with specified sequence number not found. Check if message was already processed. #### Session Handling Exceptions -- **SessionLockLostError:** The lock on the session has expired. All unsettled messages that have been received can no longer be settled. It is recommended to reconnect to the session if receive messages again if necessary. You should be aware of the lock duration of a session and keep renewing the lock before expiration in case of long processing time. `AutoLockRenewer` could help on keeping the lock of the session automatically renewed. - -- **SessionCannotBeLockedError:** Attempt to connect to a session with a specific session ID, but the session is currently locked by another client. Make sure the session is unlocked by other clients. +- **SessionLockLostError:** Session lock expired. Reconnect to the session or use `AutoLockRenewer`. +- **SessionCannotBeLockedError:** Session is locked by another client. Wait for lock to expire. #### Service and Entity Exceptions -- **ServiceBusQuotaExceededError:** The messaging entity has reached its maximum allowable size, or the maximum number of connections to a namespace has been exceeded. Create space in the entity by receiving messages from the entity or its subqueues. - -- **ServiceBusServerBusyError:** Service isn't able to process the request at this time. 
Client can wait for a period of time, then retry the operation. - -- **MessagingEntityNotFoundError:** Entity associated with the operation doesn't exist or it has been deleted. Please make sure the entity exists. - -- **MessagingEntityDisabledError:** Request for a runtime operation on a disabled entity. Please activate the entity. +- **ServiceBusQuotaExceededError:** Entity has reached maximum size or connection limit. Create space by receiving messages. +- **ServiceBusServerBusyError:** Service is temporarily overloaded. Implement exponential backoff retry. +- **MessagingEntityNotFoundError:** Entity does not exist or has been deleted. +- **MessagingEntityDisabledError:** Entity is disabled. Enable the entity to perform operations. #### Auto Lock Renewal Exceptions -- **AutoLockRenewFailed:** An attempt to renew a lock on a message or session in the background has failed. This could happen when the receiver used by `AutoLockRenewer` is closed or the lock of the renewable has expired. It is recommended to re-register the renewable message or session by receiving the message or connect to the sessionful entity again. - -- **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. You could re-register the object that wants be auto lock renewed or extend the timeout in advance. - -#### Python-Specific Considerations - -- **ImportError/ModuleNotFoundError:** Common when Azure Service Bus dependencies are not properly installed. 
Ensure you have installed the correct package version: -```bash -pip install azure-servicebus -``` - -- **TypeError:** Often occurs when passing incorrect data types to Service Bus methods: -```python -# Incorrect: passing string instead of ServiceBusMessage -sender.send_messages("Hello World") # This will fail - -# Correct: create ServiceBusMessage objects -from azure.servicebus import ServiceBusMessage -message = ServiceBusMessage("Hello World") -sender.send_messages(message) -``` - -- **ConnectionError/socket.gaierror:** Network-level errors that may require checking DNS resolution and network connectivity: -```python -import socket -try: - # Test DNS resolution - socket.gethostbyname("your-namespace.servicebus.windows.net") -except socket.gaierror as e: - print(f"DNS resolution failed: {e}") -``` +- **AutoLockRenewFailed:** Lock renewal failed. Re-register the renewable message or session. +- **AutoLockRenewTimeout:** Lock renewal timeout exceeded. Extend timeout or re-register the object. ### Timeouts @@ -425,140 +386,53 @@ with receiver: ### Number of messages returned doesn't match number requested -When attempting to receive multiple messages using `receive_messages()` with `max_message_count` greater than 1, you're not guaranteed to receive the exact number requested. - -**Why this happens:** -- Service Bus optimizes for throughput and latency -- After the first message is received, the receiver waits only a short time (typically 20ms) for additional messages -- The `max_wait_time` controls how long to wait for the **first** message, not subsequent ones - -**Resolution:** -1. 
**Don't assume all available messages will be received in one call:** -```python -import time -from azure.servicebus.exceptions import MessagingEntityNotFoundError, MessagingEntityDisabledError - -def receive_all_available_messages(receiver, total_expected=None): - """Receive all available messages from a queue/subscription""" - all_messages = [] - - while True: - # Receive in batches - messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) - - if not messages: - break # No more messages available - - all_messages.extend(messages) - - # Process messages immediately to avoid lock expiration - for message in messages: - try: - # Process message logic here - print(f"Processing: {message}") - receiver.complete_message(message) - except Exception as e: - print(f"Error processing message: {e}") - receiver.abandon_message(message) - - return all_messages -``` - -2. **Use continuous receiving for stream processing:** -```python -import time +When calling `receive_messages()` with `max_message_count` > 1, you may receive fewer messages than requested. -def continuous_message_processing(receiver): - """Continuously process messages as they arrive""" - while True: - try: - messages = receiver.receive_messages(max_message_count=1, max_wait_time=60) - - for message in messages: - # Process immediately - try: - process_message(message) - receiver.complete_message(message) - except Exception as e: - print(f"Processing failed: {e}") - receiver.abandon_message(message) - - except KeyboardInterrupt: - break - except Exception as e: - print(f"Receive error: {e}") - time.sleep(5) # Brief pause before retry -``` - -### Messages not being received - -Messages might not be received due to various configuration or state issues. +**Cause:** Service Bus waits briefly (20ms) for additional messages after the first. `max_wait_time` only applies to the first message. 
-**Common causes and resolutions:** +**Resolution:** Call `receive_messages()` in a loop to get all available messages: -1. **Check entity state:** ```python -# Verify the queue/subscription exists and is active -try: - # This will fail if entity doesn't exist - receiver = client.get_queue_receiver(queue_name) - messages = receiver.receive_messages(max_message_count=1, max_wait_time=5) - +all_messages = [] +while True: + messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) if not messages: - print("No messages available - check if messages are being sent") - -except MessagingEntityNotFoundError: - print("Queue/subscription does not exist") -except MessagingEntityDisabledError: - print("Queue/subscription is disabled") + break + all_messages.extend(messages) + for message in messages: + receiver.complete_message(message) ``` -2. **Verify message filters (for subscriptions):** -```python -# For topic subscriptions, check if messages match subscription filters -from azure.servicebus.management import ServiceBusAdministrationClient - -admin_client = ServiceBusAdministrationClient.from_connection_string(connection_string) - -# Check subscription rules -rules = admin_client.list_rules(topic_name, subscription_name) -for rule in rules: - print(f"Rule: {rule.name}, Filter: {rule.filter}") -``` +### Messages not being received -3. **Check for competing consumers:** -```python -# Multiple receivers on the same queue will compete for messages -# Ensure this is intended behavior or use topic/subscription pattern +**Common causes:** +- Entity doesn't exist or is disabled +- No messages in the queue/subscription +- Message filters excluding messages (subscriptions) +- Lock duration too short -# For debugging, temporarily use peek to see if messages exist -messages = receiver.peek_messages(max_message_count=10) -print(f"Found {len(messages)} messages in queue without receiving them") -``` +**Resolution:** +1. 
Verify entity exists: `ServiceBusAdministrationClient.get_queue()` or `get_subscription()` +2. Check entity is enabled +3. For subscriptions, verify filter rules allow your messages +4. Increase `max_wait_time` or check message count ### Dead letter queue issues -Messages can be moved to the dead letter queue for various reasons: - -**Common reasons:** -- Message TTL expired -- Max delivery count exceeded -- Message was explicitly dead lettered -- Message processing failed repeatedly +Messages move to the dead letter queue due to TTL expiration, max delivery count exceeded, or explicit dead lettering. -**Debugging dead letter messages:** +**Resolution:** ```python -# Receive from dead letter queue -dlq_receiver = servicebus_client.get_queue_receiver( +# Access dead letter queue to inspect messages +dlq_receiver = client.get_queue_receiver( queue_name="your_queue", sub_queue=ServiceBusSubQueue.DEAD_LETTER ) -with dlq_receiver: - messages = dlq_receiver.receive_messages(max_message_count=10) - for message in messages: - print(f"Dead letter reason: {message.dead_letter_reason}") - print(f"Dead letter description: {message.dead_letter_error_description}") +messages = dlq_receiver.receive_messages(max_message_count=10) +for message in messages: + print(f"Reason: {message.dead_letter_reason}") + print(f"Description: {message.dead_letter_error_description}") ``` ### Mixing sync and async code @@ -731,5 +605,3 @@ When filing GitHub issues for Service Bus, please include: 5. **Error details:** Complete exception stack trace and error messages The more information provided, the faster we can help resolve your issue. - -Please view the [exceptions reference docs](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions) for detailed descriptions of our common Exception types. 
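The receive loop described under "Number of messages returned doesn't match number requested" can be wrapped as a small helper that settles each message as it is handled. This is a minimal sketch, not the library's own method: `drain_queue`, `handle`, and `FakeReceiver` are hypothetical names introduced for illustration. The only receiver methods it relies on are `receive_messages`, `complete_message`, and `abandon_message`, which appear elsewhere in this guide.

```python
def drain_queue(receiver, handle, batch_size=10, max_wait_time=5):
    """Receive until an empty batch comes back, settling each message as it is handled."""
    completed, abandoned = 0, 0
    while True:
        batch = receiver.receive_messages(
            max_message_count=batch_size, max_wait_time=max_wait_time
        )
        if not batch:
            break  # Nothing arrived within max_wait_time.
        for message in batch:
            try:
                handle(message)
                receiver.complete_message(message)
                completed += 1
            except Exception:
                # On a real entity, an abandoned message becomes available for redelivery.
                receiver.abandon_message(message)
                abandoned += 1
    return completed, abandoned


class FakeReceiver:
    """Hypothetical in-memory receiver used only to exercise drain_queue."""

    def __init__(self, messages):
        self._pending = list(messages)
        self.completed = []
        self.abandoned = []

    def receive_messages(self, max_message_count=1, max_wait_time=None):
        batch = self._pending[:max_message_count]
        self._pending = self._pending[max_message_count:]
        return batch

    def complete_message(self, message):
        self.completed.append(message)

    def abandon_message(self, message):
        self.abandoned.append(message)


def handle(message):
    # Hypothetical handler: fails on one specific payload.
    if message == "bad":
        raise ValueError("cannot process")


receiver = FakeReceiver(["a", "b", "bad", "c"])
counts = drain_queue(receiver, handle, batch_size=3)
```

Settling inside the loop, rather than collecting every batch first, keeps the time between receive and settle short, which matters when the lock duration is small.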
From 6708ca5b121fb6de909a7db08402805a940669a8 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 08:06:13 +0000 Subject: [PATCH 14/28] Revert previous 5 commits as requested Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 974 +++++++++++++++--- 1 file changed, 851 insertions(+), 123 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 211f249a12b4..91843b0eebd1 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -8,8 +8,6 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Enable client logging](#enable-client-logging) * [Common exceptions](#common-exceptions) * [Timeouts](#timeouts) -* [Threading and concurrency issues](#threading-and-concurrency-issues) - * [Thread safety limitations](#thread-safety-limitations) * [Troubleshooting authentication issues](#troubleshooting-authentication-issues) * [Authentication errors](#authentication-errors) * [Authorization errors](#authorization-errors) @@ -22,31 +20,43 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Message lock issues](#message-lock-issues) * [Message size issues](#message-size-issues) * [Message settlement issues](#message-settlement-issues) + * [Dead letter queue issues](#dead-letter-queue-issues) * [Troubleshooting session handling issues](#troubleshooting-session-handling-issues) * [Session lock issues](#session-lock-issues) * [Session cannot be locked](#session-cannot-be-locked) +* [Troubleshooting sender issues](#troubleshooting-sender-issues) + * [Cannot send batch with multiple partition keys](#cannot-send-batch-with-multiple-partition-keys) + * [Batch fails to send](#batch-fails-to-send) + * [Message encoding 
issues](#message-encoding-issues) * [Troubleshooting receiver issues](#troubleshooting-receiver-issues) * [Number of messages returned doesn't match number requested](#number-of-messages-returned-doesnt-match-number-requested) + * [Message completion behavior](#message-completion-behavior) + * [Receive operation hangs](#receive-operation-hangs) * [Messages not being received](#messages-not-being-received) - * [Dead letter queue issues](#dead-letter-queue-issues) - * [Mixing sync and async code](#mixing-sync-and-async-code) * [Troubleshooting quota and capacity issues](#troubleshooting-quota-and-capacity-issues) * [Quota exceeded errors](#quota-exceeded-errors) * [Entity not found errors](#entity-not-found-errors) +* [Threading and concurrency issues](#threading-and-concurrency-issues) + * [Thread safety limitations](#thread-safety-limitations) + * [Async/await best practices](#asyncawait-best-practices) +* [Troubleshooting async operations](#troubleshooting-async-operations) + * [Event loop issues](#event-loop-issues) + * [Async context manager problems](#async-context-manager-problems) + * [Mixing sync and async code](#mixing-sync-and-async-code) * [Frequently asked questions](#frequently-asked-questions) * [Get additional help](#get-additional-help) ## General troubleshooting -Azure Service Bus client library will raise exceptions defined in [Azure Core](https://aka.ms/azsdk/python/core/docs#module-azure.core.exceptions) and Service Bus-specific exceptions in `azure.servicebus.exceptions`. +Azure Service Bus client library will raise exceptions defined in [Azure Core](https://aka.ms/azsdk/python/core/docs#module-azure.core.exceptions) and [azure.servicebus.exceptions](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions). ### Enable client logging This library uses the standard [logging](https://docs.python.org/3/library/logging.html) library for logging. -Basic information about AMQP operations (connections, links, etc.) 
is logged at `INFO` level. +Basic information about HTTP sessions (URLs, headers, etc.) is logged at `INFO` level. -Detailed `DEBUG` level logging, including AMQP frame tracing and **unredacted** headers, can be enabled on the client or per-operation with the `logging_enable` keyword argument. +Detailed `DEBUG` level logging, including request/response bodies and **unredacted** headers, can be enabled on the client or per-operation with the `logging_enable` keyword argument. To enable client logging and AMQP frame level trace: @@ -73,123 +83,91 @@ See full Python SDK logging documentation with examples [here](https://learn.mic ### Common exceptions -The Service Bus client library raises the following exceptions defined in `azure.servicebus.exceptions`: +The Service Bus APIs generate the following exceptions in `azure.servicebus.exceptions`: #### Connection and Authentication Exceptions -- **ServiceBusConnectionError:** Connection to the service failed. Check network connectivity and retry. -- **ServiceBusAuthenticationError:** Authentication failed. Verify credentials are correct. -- **ServiceBusAuthorizationError:** Authorization failed. Check that credentials have the required permissions. +- **ServiceBusConnectionError:** An error occurred in the connection to the service. This may have been caused by a transient network issue or service problem. It is recommended to retry. -#### Operation and Timeout Exceptions +- **ServiceBusAuthenticationError:** An error occurred when authenticating the connection to the service. This may have been caused by the credentials being incorrect. It is recommended to check the credentials. -- **OperationTimeoutError:** Service did not respond within the expected time. Retry the operation. -- **ServiceBusCommunicationError:** Unable to establish connection to Service Bus. Check network connectivity and firewall settings. 
For firewall configuration, see [What ports do I need to open on the firewall?](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). +- **ServiceBusAuthorizationError:** An error occurred when authorizing the connection to the service. This may have been caused by the credentials not having the right permission to perform the operation. It is recommended to check the permission of the credentials. -#### Message Handling Exceptions +#### Operation and Timeout Exceptions -- **MessageSizeExceededError:** Message content exceeds size limits. Reduce message size or batch count. -- **MessageAlreadySettled:** Attempt to settle an already-settled message. -- **MessageLockLostError:** Message lock expired. Use `AutoLockRenewer` or process messages faster. -- **MessageNotFoundError:** Message with specified sequence number not found. Check if message was already processed. +- **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The service may or may not have successfully completed the request; the status is not known. It is recommended to attempt to verify the current state and retry if necessary. -#### Session Handling Exceptions +- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. -- **SessionLockLostError:** Session lock expired. Reconnect to the session or use `AutoLockRenewer`. -- **SessionCannotBeLockedError:** Session is locked by another client. Wait for lock to expire. 
+#### Message Handling Exceptions -#### Service and Entity Exceptions +- **MessageSizeExceededError:** This indicates that the message content is larger than the service bus frame size. This could happen when too many service bus messages are sent in a batch or the content passed into the body of a `Message` is too large. It is recommended to reduce the count of messages being sent in a batch or the size of content being passed into a single `ServiceBusMessage`. -- **ServiceBusQuotaExceededError:** Entity has reached maximum size or connection limit. Create space by receiving messages. -- **ServiceBusServerBusyError:** Service is temporarily overloaded. Implement exponential backoff retry. -- **MessagingEntityNotFoundError:** Entity does not exist or has been deleted. -- **MessagingEntityDisabledError:** Entity is disabled. Enable the entity to perform operations. +- **MessageAlreadySettled:** This indicates failure to settle the message. This could happen when trying to settle an already-settled message. -#### Auto Lock Renewal Exceptions +- **MessageLockLostError:** The lock on the message has expired and it has been released back to the queue. It will need to be received again in order to settle it. You should be aware of the lock duration of a message and keep renewing the lock before expiration in case of long processing time. `AutoLockRenewer` could help on keeping the lock of the message automatically renewed. -- **AutoLockRenewFailed:** Lock renewal failed. Re-register the renewable message or session. -- **AutoLockRenewTimeout:** Lock renewal timeout exceeded. Extend timeout or re-register the object. +- **MessageNotFoundError:** Attempt to receive a message with a particular sequence number. This message isn't found. Make sure the message hasn't been received already. Check the deadletter queue to see if the message has been deadlettered. 
-### Timeouts - -There are various timeouts a user should be aware of within the library: +#### Session Handling Exceptions -- **10 minute service side link closure:** A link, once opened, will be closed after 10 minutes idle to protect the service against resource leakage. This should largely be transparent to a user, but if you notice a reconnect occurring after such a duration, this is why. Performing any operations, including management operations, on the link will extend this timeout. +- **SessionLockLostError:** The lock on the session has expired. All unsettled messages that have been received can no longer be settled. It is recommended to reconnect to the session if receive messages again if necessary. You should be aware of the lock duration of a session and keep renewing the lock before expiration in case of long processing time. `AutoLockRenewer` could help on keeping the lock of the session automatically renewed. -- **max_wait_time:** Provided on creation of a receiver or when calling `receive_messages()`, the time after which receiving messages will halt after no traffic. This applies both to the imperative `receive_messages()` function as well as the length a generator-style receive will run for before exiting if there are no messages. Passing None (default) will wait forever, up until the 10 minute threshold if no other action is taken. +- **SessionCannotBeLockedError:** Attempt to connect to a session with a specific session ID, but the session is currently locked by another client. Make sure the session is unlocked by other clients. -> **NOTE:** If processing of a message or session is sufficiently long as to cause timeouts, as an alternative to calling `receiver.renew_message_lock`/`receiver.session.renew_lock` manually, one can leverage the `AutoLockRenewer` functionality. 
+#### Service and Entity Exceptions -## Threading and concurrency issues +- **ServiceBusQuotaExceededError:** The messaging entity has reached its maximum allowable size, or the maximum number of connections to a namespace has been exceeded. Create space in the entity by receiving messages from the entity or its subqueues. -### Thread safety limitations +- **ServiceBusServerBusyError:** Service isn't able to process the request at this time. Client can wait for a period of time, then retry the operation. -**Important:** The Azure Service Bus Python SDK is **not thread-safe or coroutine-safe**. Using the same client instances across multiple threads or tasks without proper synchronization can lead to: +- **MessagingEntityNotFoundError:** Entity associated with the operation doesn't exist or it has been deleted. Please make sure the entity exists. -- Connection errors and unexpected exceptions -- Message corruption or loss -- Deadlocks and race conditions -- Unpredictable behavior +- **MessagingEntityDisabledError:** Request for a runtime operation on a disabled entity. Please activate the entity. -It is up to the running application to use these classes in a thread-safe and coroutine-safe manner. Note: If sending concurrently, ensure that locks are used. +#### Auto Lock Renewal Exceptions -**Synchronous concurrent sending with locks:** -```python -import threading -from azure.servicebus import ServiceBusClient, ServiceBusMessage +- **AutoLockRenewFailed:** An attempt to renew a lock on a message or session in the background has failed. This could happen when the receiver used by `AutoLockRenewer` is closed or the lock of the renewable has expired. It is recommended to re-register the renewable message or session by receiving the message or connecting to the sessionful entity again.
-FULLY_QUALIFIED_NAMESPACE = "your-namespace.servicebus.windows.net" -QUEUE_NAME = "your-queue" -CONNECTION_STRING = "your-connection-string" +- **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. You could re-register the object that needs to be auto lock renewed or extend the timeout in advance. -# Lock for thread-safe operations -lock = threading.Lock() +#### Python-Specific Considerations -def send_messages_sync(connection_string, queue_name): - """Send messages using synchronous API with locks""" - with lock: - with ServiceBusClient.from_connection_string(connection_string) as client: - with client.get_queue_sender(queue_name) as sender: - messages = [ServiceBusMessage(f"Message {i}") for i in range(10)] - sender.send_messages(messages) +- **ImportError/ModuleNotFoundError:** Common when Azure Service Bus dependencies are not properly installed. Ensure you have installed the correct package version: +```bash +pip install azure-servicebus +``` -# Create and start multiple threads -threads = [] -for i in range(3): - t = threading.Thread(target=send_messages_sync, args=(CONNECTION_STRING, QUEUE_NAME)) - threads.append(t) - t.start() +- **TypeError:** Often occurs when passing incorrect data types to Service Bus methods: +```python +# Incorrect: passing string instead of ServiceBusMessage +sender.send_messages("Hello World") # This will fail -# Wait for all threads to complete -for t in threads: - t.join() +# Correct: create ServiceBusMessage objects +from azure.servicebus import ServiceBusMessage +message = ServiceBusMessage("Hello World") +sender.send_messages(message) ``` -**Asynchronous concurrent sending with locks:** +- **ConnectionError/socket.gaierror:** Network-level errors that may require checking DNS resolution and network connectivity: ```python -import asyncio -from azure.identity.aio import DefaultAzureCredential -from azure.servicebus import ServiceBusMessage -from azure.servicebus.aio import ServiceBusClient
+import socket +try: + # Test DNS resolution + socket.gethostbyname("your-namespace.servicebus.windows.net") +except socket.gaierror as e: + print(f"DNS resolution failed: {e}") +``` -FULLY_QUALIFIED_NAMESPACE = ".servicebus.windows.net" -QUEUE_NAME = "" +### Timeouts -lock = asyncio.Lock() +There are various timeouts a user should be aware of within the library: -async def send_messages(client): - async with lock: - async with client.get_queue_sender(QUEUE_NAME) as sender: - await asyncio.gather(*(sender.send_messages(ServiceBusMessage("hello")) for _ in range(10))) +- **10 minute service side link closure:** A link, once opened, will be closed after 10 minutes idle to protect the service against resource leakage. This should largely be transparent to a user, but if you notice a reconnect occurring after such a duration, this is why. Performing any operations, including management operations, on the link will extend this timeout. -async def main(): - credential = DefaultAzureCredential() - async with ServiceBusClient(FULLY_QUALIFIED_NAMESPACE, credential) as client: - await send_messages(client) +- **max_wait_time:** Provided on creation of a receiver or when calling `receive_messages()`, the time after which receiving messages will halt after no traffic. This applies both to the imperative `receive_messages()` function as well as the length a generator-style receive will run for before exiting if there are no messages. Passing None (default) will wait forever, up until the 10 minute threshold if no other action is taken. -if __name__ == "__main__": - asyncio.run(main()) -``` +> **NOTE:** If processing of a message or session is sufficiently long as to cause timeouts, as an alternative to calling `receiver.renew_message_lock`/`receiver.session.renew_lock` manually, one can leverage the `AutoLockRenewer` functionality. 
## Troubleshooting authentication issues @@ -289,7 +267,7 @@ If your environment has strict firewall rules or requires proxy configuration: - Consider using AMQP over WebSockets (port 443) if AMQP ports are blocked **For proxy:** -- Service Bus supports HTTP proxy for AMQP over WebSockets +- Service Bus supports HTTP CONNECT proxy for AMQP over WebSockets - Configure proxy settings in your environment variables or application ### Service busy errors @@ -354,6 +332,31 @@ except MessageAlreadySettled: pass ``` +### Dead letter queue issues + +Messages can be moved to the dead letter queue for various reasons: + +**Common reasons:** +- Message TTL expired +- Max delivery count exceeded +- Message was explicitly dead lettered +- Message processing failed repeatedly + +**Debugging dead letter messages:** +```python +# Receive from dead letter queue +dlq_receiver = servicebus_client.get_queue_receiver( + queue_name="your_queue", + sub_queue=ServiceBusSubQueue.DEAD_LETTER +) + +with dlq_receiver: + messages = dlq_receiver.receive_messages(max_message_count=10) + for message in messages: + print(f"Dead letter reason: {message.dead_letter_reason}") + print(f"Dead letter description: {message.dead_letter_error_description}") +``` + ## Troubleshooting session handling issues ### Session lock issues @@ -382,62 +385,405 @@ with receiver: 2. Wait for the current session lock to expire before reconnecting 3. Use a different session ID if specific session is not required +## Troubleshooting sender issues + +### Cannot send batch with multiple partition keys + +When sending to a partition-enabled entity, all messages included in a single send operation must have the same `session_id` if the entity is session-enabled, or the same custom properties that determine partitioning. + +**Error symptoms:** +- Messages are rejected or go to different partitions than expected +- Inconsistent message ordering + +**Resolution:** +1. 
**For session-enabled entities, ensure all messages in a batch have the same session ID:** +```python +from azure.servicebus import ServiceBusMessage + +# Correct: All messages have the same session_id +messages = [ + ServiceBusMessage("Message 1", session_id="session1"), + ServiceBusMessage("Message 2", session_id="session1"), + ServiceBusMessage("Message 3", session_id="session1") +] + +with sender: + sender.send_messages(messages) +``` + +2. **For partitioned entities, group messages by partition key:** +```python +# Group messages by partition key before sending +partition1_messages = [ + ServiceBusMessage("Message 1", application_properties={"region": "east"}), + ServiceBusMessage("Message 2", application_properties={"region": "east"}) +] + +partition2_messages = [ + ServiceBusMessage("Message 3", application_properties={"region": "west"}), + ServiceBusMessage("Message 4", application_properties={"region": "west"}) +] + +# Send each group separately +with sender: + sender.send_messages(partition1_messages) + sender.send_messages(partition2_messages) +``` + +### Batch fails to send + +The Service Bus service has size limits for message batches and individual messages. + +**Error symptoms:** +- `MessageSizeExceededError` when sending batches +- Messages larger than expected failing to send + +**Resolution:** +1. **Reduce batch size or message payload:** +```python +from azure.servicebus import ServiceBusMessage +from azure.servicebus.exceptions import MessageSizeExceededError +import json + +def send_large_dataset(sender, data_list, max_batch_size=100): + """Send large datasets in smaller batches""" + for i in range(0, len(data_list), max_batch_size): + batch = data_list[i:i + max_batch_size] + messages = [ServiceBusMessage(json.dumps(item)) for item in batch] + + try: + sender.send_messages(messages) + except MessageSizeExceededError: + # If batch is still too large, send individually + for message in messages: + sender.send_messages(message) +``` + +2. 
**Check message size limits:** + - Standard tier: 256 KB per message + - Premium tier: 1 MB per message + - Batch limit: 1 MB regardless of tier + +3. **Use message properties for metadata instead of body:** +```python +# Instead of including metadata in message body +large_message = ServiceBusMessage(json.dumps({ + "data": large_data_payload, + "metadata": {"source": "app1", "timestamp": "2023-01-01"} +})) + +# Use application properties for metadata +optimized_message = ServiceBusMessage(large_data_payload) +optimized_message.application_properties = { + "source": "app1", + "timestamp": "2023-01-01" +} +``` + +### Message encoding issues + +Python string encoding can cause issues when sending messages with special characters. + +**Error symptoms:** +- Messages appear corrupted on the receiver side +- Encoding/decoding exceptions + +**Resolution:** +1. **Explicitly handle string encoding:** +```python +import json +from azure.servicebus import ServiceBusMessage + +# For text messages, ensure proper UTF-8 encoding +text_data = "Message with special characters: ñáéíóú" +message = ServiceBusMessage(text_data.encode('utf-8')) + +# For JSON data, use explicit encoding +json_data = {"message": "Data with unicode: ñáéíóú"} +json_string = json.dumps(json_data, ensure_ascii=False) +message = ServiceBusMessage(json_string.encode('utf-8')) + +# Set content type to help receivers +message.content_type = "application/json; charset=utf-8" +``` + +2. **Handle binary data correctly:** +```python +# For binary data, pass bytes directly +binary_data = b"\x00\x01\x02\x03" +message = ServiceBusMessage(binary_data) +message.content_type = "application/octet-stream" +``` + ## Troubleshooting receiver issues ### Number of messages returned doesn't match number requested -When calling `receive_messages()` with `max_message_count` > 1, you may receive fewer messages than requested. 
+When attempting to receive multiple messages using `receive_messages()` with `max_message_count` greater than 1, you're not guaranteed to receive the exact number requested. + +**Why this happens:** +- Service Bus optimizes for throughput and latency +- After the first message is received, the receiver waits only a short time (typically 20ms) for additional messages +- The `max_wait_time` controls how long to wait for the **first** message, not subsequent ones + +**Resolution:** +1. **Don't assume all available messages will be received in one call:** +```python +import time +from azure.servicebus.exceptions import MessagingEntityNotFoundError, MessagingEntityDisabledError + +def receive_all_available_messages(receiver, total_expected=None): + """Receive all available messages from a queue/subscription""" + all_messages = [] + + while True: + # Receive in batches + messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) + + if not messages: + break # No more messages available + + all_messages.extend(messages) + + # Process messages immediately to avoid lock expiration + for message in messages: + try: + # Process message logic here + print(f"Processing: {message}") + receiver.complete_message(message) + except Exception as e: + print(f"Error processing message: {e}") + receiver.abandon_message(message) + + return all_messages +``` + +2. **Use continuous receiving for stream processing:** +```python +import time -**Cause:** Service Bus waits briefly (20ms) for additional messages after the first. `max_wait_time` only applies to the first message. 
+def continuous_message_processing(receiver): + """Continuously process messages as they arrive""" + while True: + try: + messages = receiver.receive_messages(max_message_count=1, max_wait_time=60) + + for message in messages: + # Process immediately + try: + process_message(message) + receiver.complete_message(message) + except Exception as e: + print(f"Processing failed: {e}") + receiver.abandon_message(message) + + except KeyboardInterrupt: + break + except Exception as e: + print(f"Receive error: {e}") + time.sleep(5) # Brief pause before retry +``` -**Resolution:** Call `receive_messages()` in a loop to get all available messages: +### Message completion behavior +**Important limitation:** The Pure Python AMQP implementation used by the Azure Service Bus Python SDK does not currently wait for dispositions from the service to acknowledge message completion operations. + +**What this means:** +- When you call `complete_message()`, `abandon_message()`, or `dead_letter_message()`, the operation returns immediately +- The SDK does not wait for confirmation from the Service Bus service that the message was actually settled +- This can lead to scenarios where the local operation succeeds but the service operation fails + +**Implications:** +1. **Message state uncertainty:** ```python -all_messages = [] -while True: - messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) - if not messages: - break - all_messages.extend(messages) - for message in messages: +# This operation may succeed locally but fail on the service +try: + receiver.complete_message(message) + print("Message completed successfully") # This may be misleading +except Exception as e: + print(f"Local completion failed: {e}") + # But even if no exception, service operation might have failed +``` + +2. 
**Potential message redelivery:** +- If the service doesn't receive the completion acknowledgment, the message may be redelivered +- This can lead to duplicate processing if not handled properly + +**Mitigation strategies:** +1. **Implement idempotent message processing:** +```python +import hashlib + +processed_messages = set() + +def process_message_idempotently(receiver, message): + """Process messages in an idempotent manner""" + # Create a unique identifier for the message + message_id = message.message_id or hashlib.md5(str(message.body).encode()).hexdigest() + + if message_id in processed_messages: + print(f"Message {message_id} already processed, skipping") + receiver.complete_message(message) + return + + try: + # Your message processing logic here + result = process_business_logic(message) + + # Record successful processing before completing + processed_messages.add(message_id) receiver.complete_message(message) + + return result + except Exception as e: + print(f"Processing failed for message {message_id}: {e}") + receiver.abandon_message(message) + raise ``` -### Messages not being received +2. 
**Use external tracking for critical operations:** +```python +import logging -**Common causes:** -- Entity doesn't exist or is disabled -- No messages in the queue/subscription -- Message filters excluding messages (subscriptions) -- Lock duration too short +def track_message_completion(receiver, message, tracking_store): + """Track message completion in external store""" + message_id = message.message_id + + try: + # Process the message + result = process_message(message) + + # Store completion in external tracking system + tracking_store.mark_completed(message_id, result) + + # Complete the message in Service Bus + receiver.complete_message(message) + + logging.info(f"Message {message_id} processed and completed successfully") + + except Exception as e: + logging.error(f"Failed to process message {message_id}: {e}") + + # Check if we should retry or dead letter + if should_retry(message, e): + receiver.abandon_message(message) + else: + receiver.dead_letter_message(message, reason="ProcessingFailed", error_description=str(e)) +``` -**Resolution:** -1. Verify entity exists: `ServiceBusAdministrationClient.get_queue()` or `get_subscription()` -2. Check entity is enabled -3. For subscriptions, verify filter rules allow your messages -4. Increase `max_wait_time` or check message count +3. **Monitor for redelivered messages:** +```python +def handle_potential_redelivery(receiver, message): + """Handle messages that might be redelivered due to completion uncertainty""" + delivery_count = message.delivery_count + + if delivery_count > 1: + logging.warning(f"Message has been delivered {delivery_count} times. 
" + f"This might indicate completion acknowledgment issues.") + + # Process with extra caution for high delivery count messages + if delivery_count > 3: + # Consider different processing logic or dead lettering + logging.error(f"Message delivery count too high ({delivery_count}), dead lettering") + receiver.dead_letter_message(message, + reason="HighDeliveryCount", + error_description=f"Delivered {delivery_count} times") + return + + # Normal processing + process_message_idempotently(receiver, message) +``` -### Dead letter queue issues +### Receive operation hangs -Messages move to the dead letter queue due to TTL expiration, max delivery count exceeded, or explicit dead lettering. +Receive operations may appear to hang when no messages are available. + +**Symptoms:** +- `receive_messages()` doesn't return for extended periods +- Application appears unresponsive **Resolution:** +1. **Set appropriate timeouts:** ```python -# Access dead letter queue to inspect messages -dlq_receiver = client.get_queue_receiver( - queue_name="your_queue", - sub_queue=ServiceBusSubQueue.DEAD_LETTER -) +# Don't wait indefinitely for messages +messages = receiver.receive_messages(max_message_count=5, max_wait_time=30) + +# For polling scenarios, use shorter timeouts +def poll_for_messages(receiver): + while True: + messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) + + if messages: + for message in messages: + process_message(message) + receiver.complete_message(message) + else: + print("No messages available, waiting...") + time.sleep(1) +``` -messages = dlq_receiver.receive_messages(max_message_count=10) -for message in messages: - print(f"Reason: {message.dead_letter_reason}") - print(f"Description: {message.dead_letter_error_description}") +2. 
**Use async operations with proper cancellation:** +```python +import asyncio + +async def receive_with_cancellation(receiver): + try: + # Use asyncio timeout for better control + messages = await asyncio.wait_for( + receiver.receive_messages(max_message_count=10, max_wait_time=30), + timeout=35 # Slightly longer than max_wait_time + ) + return messages + except asyncio.TimeoutError: + print("Receive operation timed out") + return [] ``` -### Mixing sync and async code +### Messages not being received + +Messages might not be received due to various configuration or state issues. + +**Common causes and resolutions:** + +1. **Check entity state:** +```python +# Verify the queue/subscription exists and is active +try: + # This will fail if entity doesn't exist + receiver = client.get_queue_receiver(queue_name) + messages = receiver.receive_messages(max_message_count=1, max_wait_time=5) + + if not messages: + print("No messages available - check if messages are being sent") + +except MessagingEntityNotFoundError: + print("Queue/subscription does not exist") +except MessagingEntityDisabledError: + print("Queue/subscription is disabled") +``` + +2. **Verify message filters (for subscriptions):** +```python +# For topic subscriptions, check if messages match subscription filters +from azure.servicebus.management import ServiceBusAdministrationClient -Mixing synchronous and asynchronous Service Bus operations can cause issues such as async operations hanging indefinitely due to the event loop being blocked. Ensure that blocking calls are not made when receiving and message processing. +admin_client = ServiceBusAdministrationClient.from_connection_string(connection_string) + +# Check subscription rules +rules = admin_client.list_rules(topic_name, subscription_name) +for rule in rules: + print(f"Rule: {rule.name}, Filter: {rule.filter}") +``` + +3. 
**Check for competing consumers:** +```python +# Multiple receivers on the same queue will compete for messages +# Ensure this is intended behavior or use topic/subscription pattern + +# For debugging, temporarily use peek to see if messages exist +messages = receiver.peek_messages(max_message_count=10) +print(f"Found {len(messages)} messages in queue without receiving them") +``` ## Troubleshooting quota and capacity issues @@ -456,6 +802,386 @@ Mixing synchronous and asynchronous Service Bus operations can cause issues such 3. Check if the entity was deleted and needs to be recreated 4. Verify you're connecting to the correct namespace +## Threading and concurrency issues + +### Thread safety limitations + +**Important:** The Azure Service Bus Python SDK is **not thread-safe or coroutine-safe**. Using the same client instances across multiple threads or tasks without proper synchronization can lead to: + +- Connection errors and unexpected exceptions +- Message corruption or loss +- Deadlocks and race conditions +- Unpredictable behavior + +**Best practices:** + +1. **Use separate client instances per thread/task:** +```python +import threading +from azure.servicebus import ServiceBusClient + +def worker_thread(connection_string, queue_name): + # Create a separate client instance for each thread + client = ServiceBusClient.from_connection_string(connection_string) + with client: + sender = client.get_queue_sender(queue_name) + with sender: + # Perform operations... + pass + +# Start multiple threads with separate clients +threads = [] +for i in range(5): + t = threading.Thread(target=worker_thread, args=(connection_string, queue_name)) + threads.append(t) + t.start() + +for t in threads: + t.join() +``` + +2. 
**Use connection pooling patterns when needed:** +```python +# For high-throughput scenarios, consider using a thread-safe queue +# to manage client instances +import queue +import threading + +client_pool = queue.Queue() + +def get_client(): + try: + return client_pool.get_nowait() + except queue.Empty: + return ServiceBusClient.from_connection_string(connection_string) + +def return_client(client): + try: + client_pool.put_nowait(client) + except queue.Full: + client.close() +``` + +3. **Avoid sharing clients across async tasks:** +```python +# DON'T DO THIS +client = ServiceBusClient.from_connection_string(connection_string) + +async def bad_async_pattern(): + # Multiple tasks sharing the same client can cause issues + sender = client.get_queue_sender(queue_name) + # This can lead to race conditions + +# DO THIS INSTEAD +async def good_async_pattern(): + # Each async function should use its own client + async with ServiceBusClient.from_connection_string(connection_string) as client: + sender = client.get_queue_sender(queue_name) + async with sender: + # Perform operations safely + pass +``` + +### Async/await best practices + +When using the async APIs in the Python Service Bus SDK: + +1. **Always use async context managers properly:** +```python +async def proper_async_usage(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage("Hello World") + await sender.send_messages(message) + + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + for message in messages: + await receiver.complete_message(message) +``` + +2. 
**Don't mix sync and async code without proper handling:** +```python +# Avoid mixing sync and async incorrectly +async def mixed_code_example(): + # Don't call synchronous methods from async context without wrapping + # client = ServiceBusClient.from_connection_string(conn_str) # This is sync + + # Instead, create clients within async context or use proper wrapping + async with ServiceBusClient.from_connection_string(conn_str) as client: + pass +``` + +3. **Handle async exceptions properly:** +```python +import asyncio +from azure.servicebus.exceptions import ServiceBusError + +async def handle_async_errors(): + try: + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=1, max_wait_time=5) + # Process messages... + except ServiceBusError as e: + print(f"Service Bus error: {e}") + except asyncio.TimeoutError: + print("Operation timed out") + except Exception as e: + print(f"Unexpected error: {e}") +``` + +**Common threading/concurrency mistakes to avoid:** + +- Sharing `ServiceBusClient`, `ServiceBusSender`, or `ServiceBusReceiver` instances across threads +- Not properly closing clients and their resources in multi-threaded scenarios +- Using the same connection string with too many concurrent clients (can hit connection limits) +- Mixing blocking and non-blocking operations incorrectly +- Not handling connection failures in multi-threaded scenarios + +## Troubleshooting async operations + +### Event loop issues + +Python's asyncio event loop can cause issues when not properly managed in Service Bus async operations. + +**Common symptoms:** +- `RuntimeError: no running event loop` +- `RuntimeError: cannot be called from a running event loop` +- Async operations hanging indefinitely + +**Resolution:** + +1.
**Proper event loop management:** +```python +import asyncio +from azure.servicebus.aio import ServiceBusClient + +async def main(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage("Hello async world") + await sender.send_messages(message) + +# Correct way to run async Service Bus code +if __name__ == "__main__": + asyncio.run(main()) +``` + +2. **Handling existing event loops (e.g., in Jupyter notebooks):** +```python +import asyncio +import nest_asyncio + +# In environments like Jupyter where an event loop is already running +nest_asyncio.apply() + +async def notebook_friendly_function(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + # Your async Service Bus operations + pass + +# Can be called directly in Jupyter +await notebook_friendly_function() +``` + +3. **Event loop in multi-threaded applications:** +```python +import asyncio +import threading +from concurrent.futures import ThreadPoolExecutor + +def run_async_in_thread(connection_string, queue_name): + """Run async Service Bus operations in a separate thread""" + async def async_operations(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + for message in messages: + print(f"Received: {message}") + await receiver.complete_message(message) + + # Create new event loop for this thread + asyncio.run(async_operations()) + +# Use ThreadPoolExecutor for better management +with ThreadPoolExecutor(max_workers=3) as executor: + futures = [ + executor.submit(run_async_in_thread, connection_string, f"queue_{i}") + for i in range(3) + ] + + for future in futures: + future.result() # Wait for completion +``` + +### Async context manager problems + +Improper use of async context managers can 
lead to resource leaks and connection issues. + +**Common mistakes:** + +1. **Not using async context managers:** +```python +# DON'T DO THIS +client = ServiceBusClient.from_connection_string(connection_string) +sender = client.get_queue_sender(queue_name) +await sender.send_messages(message) +# Resources not properly closed + +# DO THIS INSTEAD +async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + await sender.send_messages(message) +``` + +2. **Improper exception handling in async context:** +```python +async def proper_exception_handling(): + """Handle exceptions properly in async context managers""" + try: + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + + for message in messages: + try: + # Process message + await process_message_async(message) + await receiver.complete_message(message) + except Exception as processing_error: + print(f"Processing failed: {processing_error}") + await receiver.abandon_message(message) + + except ServiceBusError as sb_error: + print(f"Service Bus error: {sb_error}") + except Exception as general_error: + print(f"Unexpected error: {general_error}") +``` + +3. 
**Resource cleanup in long-running async operations:** +```python +import asyncio +from contextlib import AsyncExitStack + +async def long_running_processor(): + """Properly manage resources in long-running async operations""" + async with AsyncExitStack() as stack: + client = await stack.enter_async_context( + ServiceBusClient.from_connection_string(connection_string) + ) + receiver = await stack.enter_async_context( + client.get_queue_receiver(queue_name) + ) + + # Long-running processing loop + while True: + try: + messages = await receiver.receive_messages( + max_message_count=10, + max_wait_time=30 + ) + + if not messages: + await asyncio.sleep(1) + continue + + # Process messages with proper error handling + await process_messages_batch(receiver, messages) + + except KeyboardInterrupt: + print("Shutting down gracefully...") + break + except Exception as e: + print(f"Error in processing loop: {e}") + await asyncio.sleep(5) # Brief pause before retry + +async def process_messages_batch(receiver, messages): + """Process a batch of messages with individual error handling""" + for message in messages: + try: + await process_single_message(message) + await receiver.complete_message(message) + except Exception as e: + print(f"Failed to process message {message.message_id}: {e}") + await receiver.abandon_message(message) +``` + +### Mixing sync and async code + +Mixing synchronous and asynchronous Service Bus operations can cause issues. + +**Common problems:** + +1. **Calling async methods without await:** +```python +# WRONG - This returns a coroutine, doesn't actually send +client = ServiceBusClient.from_connection_string(connection_string) +sender = client.get_queue_sender(queue_name) +sender.send_messages(message) # Missing 'await' + +# CORRECT +async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + await sender.send_messages(message) +``` + +2. 
**Using sync and async clients together:** +```python +# Avoid mixing sync and async clients in the same application +# Choose one pattern and stick with it + +# Option 1: Pure async +async def async_pattern(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + # All operations are async + pass + +# Option 2: Pure sync +def sync_pattern(): + with ServiceBusClient.from_connection_string(connection_string) as client: + # All operations are sync + pass +``` + +3. **Proper integration with async frameworks (FastAPI, aiohttp, etc.):** +```python +# Example with FastAPI +from fastapi import FastAPI, BackgroundTasks +from azure.servicebus.aio import ServiceBusClient + +app = FastAPI() + +# Global client for reuse (properly managed) +class ServiceBusManager: + def __init__(self): + self.client = None + + async def start(self): + self.client = ServiceBusClient.from_connection_string(connection_string) + + async def stop(self): + if self.client: + await self.client.close() + +sb_manager = ServiceBusManager() + +@app.on_event("startup") +async def startup_event(): + await sb_manager.start() + +@app.on_event("shutdown") +async def shutdown_event(): + await sb_manager.stop() + +@app.post("/send-message") +async def send_message(message_content: str): + async with sb_manager.client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage(message_content) + await sender.send_messages(message) + return {"status": "sent"} +``` + ## Frequently asked questions ### Q: Why am I getting connection timeout errors? @@ -605,3 +1331,5 @@ When filing GitHub issues for Service Bus, please include: 5. **Error details:** Complete exception stack trace and error messages The more information provided, the faster we can help resolve your issue. + +Please view the [exceptions reference docs](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions) for detailed descriptions of our common Exception types. 
From 444956c89839d5f7fb0e09c2cb4953d00c2621c9 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 18:29:56 +0000 Subject: [PATCH 15/28] Significantly reduce TROUBLESHOOTING.md length and focus on essential troubleshooting Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 841 +----------------- 1 file changed, 28 insertions(+), 813 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 91843b0eebd1..213d4d33745b 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -36,13 +36,6 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Troubleshooting quota and capacity issues](#troubleshooting-quota-and-capacity-issues) * [Quota exceeded errors](#quota-exceeded-errors) * [Entity not found errors](#entity-not-found-errors) -* [Threading and concurrency issues](#threading-and-concurrency-issues) - * [Thread safety limitations](#thread-safety-limitations) - * [Async/await best practices](#asyncawait-best-practices) -* [Troubleshooting async operations](#troubleshooting-async-operations) - * [Event loop issues](#event-loop-issues) - * [Async context manager problems](#async-context-manager-problems) - * [Mixing sync and async code](#mixing-sync-and-async-code) * [Frequently asked questions](#frequently-asked-questions) * [Get additional help](#get-additional-help) @@ -391,399 +384,66 @@ with receiver: When sending to a partition-enabled entity, all messages included in a single send operation must have the same `session_id` if the entity is session-enabled, or the same custom properties that determine partitioning. -**Error symptoms:** -- Messages are rejected or go to different partitions than expected -- Inconsistent message ordering - **Resolution:** -1. 
**For session-enabled entities, ensure all messages in a batch have the same session ID:** -```python -from azure.servicebus import ServiceBusMessage - -# Correct: All messages have the same session_id -messages = [ - ServiceBusMessage("Message 1", session_id="session1"), - ServiceBusMessage("Message 2", session_id="session1"), - ServiceBusMessage("Message 3", session_id="session1") -] - -with sender: - sender.send_messages(messages) -``` - -2. **For partitioned entities, group messages by partition key:** -```python -# Group messages by partition key before sending -partition1_messages = [ - ServiceBusMessage("Message 1", application_properties={"region": "east"}), - ServiceBusMessage("Message 2", application_properties={"region": "east"}) -] - -partition2_messages = [ - ServiceBusMessage("Message 3", application_properties={"region": "west"}), - ServiceBusMessage("Message 4", application_properties={"region": "west"}) -] - -# Send each group separately -with sender: - sender.send_messages(partition1_messages) - sender.send_messages(partition2_messages) -``` +1. For session-enabled entities, ensure all messages in a batch have the same session ID +2. For partitioned entities, group messages by partition key before sending them in separate batches ### Batch fails to send -The Service Bus service has size limits for message batches and individual messages. - -**Error symptoms:** -- `MessageSizeExceededError` when sending batches -- Messages larger than expected failing to send - -**Resolution:** -1. 
**Reduce batch size or message payload:** -```python -from azure.servicebus import ServiceBusMessage -from azure.servicebus.exceptions import MessageSizeExceededError -import json - -def send_large_dataset(sender, data_list, max_batch_size=100): - """Send large datasets in smaller batches""" - for i in range(0, len(data_list), max_batch_size): - batch = data_list[i:i + max_batch_size] - messages = [ServiceBusMessage(json.dumps(item)) for item in batch] - - try: - sender.send_messages(messages) - except MessageSizeExceededError: - # If batch is still too large, send individually - for message in messages: - sender.send_messages(message) -``` - -2. **Check message size limits:** - - Standard tier: 256 KB per message - - Premium tier: 1 MB per message - - Batch limit: 1 MB regardless of tier - -3. **Use message properties for metadata instead of body:** -```python -# Instead of including metadata in message body -large_message = ServiceBusMessage(json.dumps({ - "data": large_data_payload, - "metadata": {"source": "app1", "timestamp": "2023-01-01"} -})) - -# Use application properties for metadata -optimized_message = ServiceBusMessage(large_data_payload) -optimized_message.application_properties = { - "source": "app1", - "timestamp": "2023-01-01" -} -``` +**MessageSizeExceededError resolution:** +1. Reduce batch size or message payload +2. Check message size limits: Standard tier (256 KB), Premium tier (1 MB), Batch limit (1 MB regardless of tier) +3. Use message properties for metadata instead of including everything in the message body ### Message encoding issues -Python string encoding can cause issues when sending messages with special characters. - -**Error symptoms:** -- Messages appear corrupted on the receiver side -- Encoding/decoding exceptions - **Resolution:** -1. 
**Explicitly handle string encoding:** -```python -import json -from azure.servicebus import ServiceBusMessage - -# For text messages, ensure proper UTF-8 encoding -text_data = "Message with special characters: ñáéíóú" -message = ServiceBusMessage(text_data.encode('utf-8')) - -# For JSON data, use explicit encoding -json_data = {"message": "Data with unicode: ñáéíóú"} -json_string = json.dumps(json_data, ensure_ascii=False) -message = ServiceBusMessage(json_string.encode('utf-8')) - -# Set content type to help receivers -message.content_type = "application/json; charset=utf-8" -``` - -2. **Handle binary data correctly:** -```python -# For binary data, pass bytes directly -binary_data = b"\x00\x01\x02\x03" -message = ServiceBusMessage(binary_data) -message.content_type = "application/octet-stream" -``` +1. Explicitly handle string encoding using UTF-8 +2. For JSON data, use `json.dumps()` with proper encoding +3. For binary data, pass bytes directly and set appropriate content type ## Troubleshooting receiver issues ### Number of messages returned doesn't match number requested -When attempting to receive multiple messages using `receive_messages()` with `max_message_count` greater than 1, you're not guaranteed to receive the exact number requested. +When using `receive_messages()` with `max_message_count` > 1, you may not receive the exact number requested. **Why this happens:** - Service Bus optimizes for throughput and latency -- After the first message is received, the receiver waits only a short time (typically 20ms) for additional messages -- The `max_wait_time` controls how long to wait for the **first** message, not subsequent ones +- After the first message, the receiver waits only briefly for additional messages +- `max_wait_time` controls how long to wait for the **first** message, not subsequent ones **Resolution:** -1. 
**Don't assume all available messages will be received in one call:** -```python -import time -from azure.servicebus.exceptions import MessagingEntityNotFoundError, MessagingEntityDisabledError - -def receive_all_available_messages(receiver, total_expected=None): - """Receive all available messages from a queue/subscription""" - all_messages = [] - - while True: - # Receive in batches - messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) - - if not messages: - break # No more messages available - - all_messages.extend(messages) - - # Process messages immediately to avoid lock expiration - for message in messages: - try: - # Process message logic here - print(f"Processing: {message}") - receiver.complete_message(message) - except Exception as e: - print(f"Error processing message: {e}") - receiver.abandon_message(message) - - return all_messages -``` - -2. **Use continuous receiving for stream processing:** -```python -import time - -def continuous_message_processing(receiver): - """Continuously process messages as they arrive""" - while True: - try: - messages = receiver.receive_messages(max_message_count=1, max_wait_time=60) - - for message in messages: - # Process immediately - try: - process_message(message) - receiver.complete_message(message) - except Exception as e: - print(f"Processing failed: {e}") - receiver.abandon_message(message) - - except KeyboardInterrupt: - break - except Exception as e: - print(f"Receive error: {e}") - time.sleep(5) # Brief pause before retry -``` +Don't assume all available messages will be received in one call. Use loops to receive all available messages or implement continuous receiving patterns. ### Message completion behavior -**Important limitation:** The Pure Python AMQP implementation used by the Azure Service Bus Python SDK does not currently wait for dispositions from the service to acknowledge message completion operations. 
+**Important limitation:** The Python AMQP implementation does not wait for dispositions from the service to acknowledge message completion operations. **What this means:** -- When you call `complete_message()`, `abandon_message()`, or `dead_letter_message()`, the operation returns immediately -- The SDK does not wait for confirmation from the Service Bus service that the message was actually settled -- This can lead to scenarios where the local operation succeeds but the service operation fails - -**Implications:** -1. **Message state uncertainty:** -```python -# This operation may succeed locally but fail on the service -try: - receiver.complete_message(message) - print("Message completed successfully") # This may be misleading -except Exception as e: - print(f"Local completion failed: {e}") - # But even if no exception, service operation might have failed -``` - -2. **Potential message redelivery:** -- If the service doesn't receive the completion acknowledgment, the message may be redelivered -- This can lead to duplicate processing if not handled properly +- `complete_message()`, `abandon_message()`, or `dead_letter_message()` return immediately +- The SDK does not wait for confirmation from Service Bus that the message was actually settled +- This can lead to scenarios where local operation succeeds but service operation fails **Mitigation strategies:** -1. 
**Implement idempotent message processing:** -```python -import hashlib - -processed_messages = set() - -def process_message_idempotently(receiver, message): - """Process messages in an idempotent manner""" - # Create a unique identifier for the message - message_id = message.message_id or hashlib.md5(str(message.body).encode()).hexdigest() - - if message_id in processed_messages: - print(f"Message {message_id} already processed, skipping") - receiver.complete_message(message) - return - - try: - # Your message processing logic here - result = process_business_logic(message) - - # Record successful processing before completing - processed_messages.add(message_id) - receiver.complete_message(message) - - return result - except Exception as e: - print(f"Processing failed for message {message_id}: {e}") - receiver.abandon_message(message) - raise -``` - -2. **Use external tracking for critical operations:** -```python -import logging - -def track_message_completion(receiver, message, tracking_store): - """Track message completion in external store""" - message_id = message.message_id - - try: - # Process the message - result = process_message(message) - - # Store completion in external tracking system - tracking_store.mark_completed(message_id, result) - - # Complete the message in Service Bus - receiver.complete_message(message) - - logging.info(f"Message {message_id} processed and completed successfully") - - except Exception as e: - logging.error(f"Failed to process message {message_id}: {e}") - - # Check if we should retry or dead letter - if should_retry(message, e): - receiver.abandon_message(message) - else: - receiver.dead_letter_message(message, reason="ProcessingFailed", error_description=str(e)) -``` - -3. 
**Monitor for redelivered messages:** -```python -def handle_potential_redelivery(receiver, message): - """Handle messages that might be redelivered due to completion uncertainty""" - delivery_count = message.delivery_count - - if delivery_count > 1: - logging.warning(f"Message has been delivered {delivery_count} times. " - f"This might indicate completion acknowledgment issues.") - - # Process with extra caution for high delivery count messages - if delivery_count > 3: - # Consider different processing logic or dead lettering - logging.error(f"Message delivery count too high ({delivery_count}), dead lettering") - receiver.dead_letter_message(message, - reason="HighDeliveryCount", - error_description=f"Delivered {delivery_count} times") - return - - # Normal processing - process_message_idempotently(receiver, message) -``` +1. Implement idempotent message processing to handle potential redelivery +2. Use external tracking for critical operations +3. Monitor for redelivered messages (check `delivery_count` property) ### Receive operation hangs -Receive operations may appear to hang when no messages are available. - -**Symptoms:** -- `receive_messages()` doesn't return for extended periods -- Application appears unresponsive - **Resolution:** -1. **Set appropriate timeouts:** -```python -# Don't wait indefinitely for messages -messages = receiver.receive_messages(max_message_count=5, max_wait_time=30) - -# For polling scenarios, use shorter timeouts -def poll_for_messages(receiver): - while True: - messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) - - if messages: - for message in messages: - process_message(message) - receiver.complete_message(message) - else: - print("No messages available, waiting...") - time.sleep(1) -``` - -2. 
**Use async operations with proper cancellation:** -```python -import asyncio - -async def receive_with_cancellation(receiver): - try: - # Use asyncio timeout for better control - messages = await asyncio.wait_for( - receiver.receive_messages(max_message_count=10, max_wait_time=30), - timeout=35 # Slightly longer than max_wait_time - ) - return messages - except asyncio.TimeoutError: - print("Receive operation timed out") - return [] -``` +1. Set appropriate timeouts: `max_wait_time=30` instead of None +2. For polling scenarios, use shorter timeouts with retry loops +3. Use async operations with proper cancellation for better control ### Messages not being received -Messages might not be received due to various configuration or state issues. - **Common causes and resolutions:** - -1. **Check entity state:** -```python -# Verify the queue/subscription exists and is active -try: - # This will fail if entity doesn't exist - receiver = client.get_queue_receiver(queue_name) - messages = receiver.receive_messages(max_message_count=1, max_wait_time=5) - - if not messages: - print("No messages available - check if messages are being sent") - -except MessagingEntityNotFoundError: - print("Queue/subscription does not exist") -except MessagingEntityDisabledError: - print("Queue/subscription is disabled") -``` - -2. **Verify message filters (for subscriptions):** -```python -# For topic subscriptions, check if messages match subscription filters -from azure.servicebus.management import ServiceBusAdministrationClient - -admin_client = ServiceBusAdministrationClient.from_connection_string(connection_string) - -# Check subscription rules -rules = admin_client.list_rules(topic_name, subscription_name) -for rule in rules: - print(f"Rule: {rule.name}, Filter: {rule.filter}") -``` - -3. 
**Check for competing consumers:** -```python -# Multiple receivers on the same queue will compete for messages -# Ensure this is intended behavior or use topic/subscription pattern - -# For debugging, temporarily use peek to see if messages exist -messages = receiver.peek_messages(max_message_count=10) -print(f"Found {len(messages)} messages in queue without receiving them") -``` +1. Check entity state - verify queue/subscription exists and is active +2. For subscriptions, verify message filters and subscription rules +3. Check for competing consumers on the same queue/subscription +4. Use `peek_messages()` to see if messages exist without receiving them ## Troubleshooting quota and capacity issues @@ -802,385 +462,7 @@ print(f"Found {len(messages)} messages in queue without receiving them") 3. Check if the entity was deleted and needs to be recreated 4. Verify you're connecting to the correct namespace -## Threading and concurrency issues - -### Thread safety limitations - -**Important:** The Azure Service Bus Python SDK is **not thread-safe or coroutine-safe**. Using the same client instances across multiple threads or tasks without proper synchronization can lead to: -- Connection errors and unexpected exceptions -- Message corruption or loss -- Deadlocks and race conditions -- Unpredictable behavior - -**Best practices:** - -1. **Use separate client instances per thread/task:** -```python -import threading -from azure.servicebus import ServiceBusClient - -def worker_thread(connection_string, queue_name): - # Create a separate client instance for each thread - client = ServiceBusClient.from_connection_string(connection_string) - with client: - sender = client.get_queue_sender(queue_name) - with sender: - # Perform operations... 
- pass - -# Start multiple threads with separate clients -threads = [] -for i in range(5): - t = threading.Thread(target=worker_thread, args=(connection_string, queue_name)) - threads.append(t) - t.start() - -for t in threads: - t.join() -``` - -2. **Use connection pooling patterns when needed:** -```python -# For high-throughput scenarios, consider using a thread-safe queue -# to manage client instances -import queue -import threading - -client_pool = queue.Queue() - -def get_client(): - try: - return client_pool.get_nowait() - except queue.Empty: - return ServiceBusClient.from_connection_string(connection_string) - -def return_client(client): - try: - client_pool.put_nowait(client) - except queue.Full: - client.close() -``` - -3. **Avoid sharing clients across async tasks:** -```python -# DON'T DO THIS -client = ServiceBusClient.from_connection_string(connection_string) - -async def bad_async_pattern(): - # Multiple tasks sharing the same client can cause issues - sender = client.get_queue_sender(queue_name) - # This can lead to race conditions - -# DO THIS INSTEAD -async def good_async_pattern(): - # Each async function should use its own client - async with ServiceBusClient.from_connection_string(connection_string) as client: - sender = client.get_queue_sender(queue_name) - async with sender: - # Perform operations safely - pass -``` - -### Async/await best practices - -When using the async APIs in the Python Service Bus SDK: - -1. **Always use async context managers properly:** -```python -async def proper_async_usage(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage("Hello World") - await sender.send_messages(message) - - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - for message in messages: - await receiver.complete_message(message) -``` - -2. 
**Don't mix sync and async code without proper handling:** -```python -# Avoid mixing sync and async incorrectly -async def mixed_code_example(): - # Don't call synchronous methods from async context without wrapping - # client = ServiceBusClient.from_connection_string(conn_str) # This is sync - - # Instead, create clients within async context or use proper wrapping - async with ServiceBusClient.from_connection_string(conn_str) as client: - pass -``` - -3. **Handle async exceptions properly:** -```python -import asyncio -from azure.servicebus import ServiceBusError - -async def handle_async_errors(): - try: - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=1, max_wait_time=5) - # Process messages... - except ServiceBusError as e: - print(f"Service Bus error: {e}") - except asyncio.TimeoutError: - print("Operation timed out") - except Exception as e: - print(f"Unexpected error: {e}") -``` - -**Common threading/concurrency mistakes to avoid:** - -- Sharing `ServiceBusClient`, `ServiceBusSender`, or `ServiceBusReceiver` instances across threads -- Not properly closing clients and their resources in multi-threaded scenarios -- Using the same connection string with too many concurrent clients (can hit connection limits) -- Mixing blocking and non-blocking operations incorrectly -- Not handling connection failures in multi-threaded scenarios - -## Troubleshooting async operations - -### Event loop issues - -Python's asyncio event loop can cause issues when not properly managed in Service Bus async operations. - -**Common symptoms:** -- `RuntimeError: no running event loop` -- `RuntimeError: cannot be called from a running event loop` -- Async operations hanging indefinitely - -**Resolution:** - -1. 
**Proper event loop management:** -```python -import asyncio -from azure.servicebus.aio import ServiceBusClient - -async def main(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage("Hello async world") - await sender.send_messages(message) - -# Correct way to run async Service Bus code -if __name__ == "__main__": - asyncio.run(main()) -``` - -2. **Handling existing event loops (e.g., in Jupyter notebooks):** -```python -import asyncio -import nest_asyncio - -# In environments like Jupyter where an event loop is already running -nest_asyncio.apply() - -async def notebook_friendly_function(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - # Your async Service Bus operations - pass - -# Can be called directly in Jupyter -await notebook_friendly_function() -``` - -3. **Event loop in multi-threaded applications:** -```python -import asyncio -import threading -from concurrent.futures import ThreadPoolExecutor - -def run_async_in_thread(connection_string, queue_name): - """Run async Service Bus operations in a separate thread""" - async def async_operations(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - for message in messages: - print(f"Received: {message}") - await receiver.complete_message(message) - - # Create new event loop for this thread - asyncio.run(async_operations()) - -# Use ThreadPoolExecutor for better management -with ThreadPoolExecutor(max_workers=3) as executor: - futures = [ - executor.submit(run_async_in_thread, connection_string, f"queue_{i}") - for i in range(3) - ] - - for future in futures: - future.result() # Wait for completion -``` - -### Async context manager problems - -Improper use of async context managers can 
lead to resource leaks and connection issues. - -**Common mistakes:** - -1. **Not using async context managers:** -```python -# DON'T DO THIS -client = ServiceBusClient.from_connection_string(connection_string) -sender = client.get_queue_sender(queue_name) -await sender.send_messages(message) -# Resources not properly closed - -# DO THIS INSTEAD -async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - await sender.send_messages(message) -``` - -2. **Improper exception handling in async context:** -```python -async def proper_exception_handling(): - """Handle exceptions properly in async context managers""" - try: - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - - for message in messages: - try: - # Process message - await process_message_async(message) - await receiver.complete_message(message) - except Exception as processing_error: - print(f"Processing failed: {processing_error}") - await receiver.abandon_message(message) - - except ServiceBusError as sb_error: - print(f"Service Bus error: {sb_error}") - except Exception as general_error: - print(f"Unexpected error: {general_error}") -``` - -3. 
**Resource cleanup in long-running async operations:** -```python -import asyncio -from contextlib import AsyncExitStack - -async def long_running_processor(): - """Properly manage resources in long-running async operations""" - async with AsyncExitStack() as stack: - client = await stack.enter_async_context( - ServiceBusClient.from_connection_string(connection_string) - ) - receiver = await stack.enter_async_context( - client.get_queue_receiver(queue_name) - ) - - # Long-running processing loop - while True: - try: - messages = await receiver.receive_messages( - max_message_count=10, - max_wait_time=30 - ) - - if not messages: - await asyncio.sleep(1) - continue - - # Process messages with proper error handling - await process_messages_batch(receiver, messages) - - except KeyboardInterrupt: - print("Shutting down gracefully...") - break - except Exception as e: - print(f"Error in processing loop: {e}") - await asyncio.sleep(5) # Brief pause before retry - -async def process_messages_batch(receiver, messages): - """Process a batch of messages with individual error handling""" - for message in messages: - try: - await process_single_message(message) - await receiver.complete_message(message) - except Exception as e: - print(f"Failed to process message {message.message_id}: {e}") - await receiver.abandon_message(message) -``` - -### Mixing sync and async code - -Mixing synchronous and asynchronous Service Bus operations can cause issues. - -**Common problems:** - -1. **Calling async methods without await:** -```python -# WRONG - This returns a coroutine, doesn't actually send -client = ServiceBusClient.from_connection_string(connection_string) -sender = client.get_queue_sender(queue_name) -sender.send_messages(message) # Missing 'await' - -# CORRECT -async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - await sender.send_messages(message) -``` - -2. 
**Using sync and async clients together:** -```python -# Avoid mixing sync and async clients in the same application -# Choose one pattern and stick with it - -# Option 1: Pure async -async def async_pattern(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - # All operations are async - pass - -# Option 2: Pure sync -def sync_pattern(): - with ServiceBusClient.from_connection_string(connection_string) as client: - # All operations are sync - pass -``` - -3. **Proper integration with async frameworks (FastAPI, aiohttp, etc.):** -```python -# Example with FastAPI -from fastapi import FastAPI, BackgroundTasks -from azure.servicebus.aio import ServiceBusClient - -app = FastAPI() - -# Global client for reuse (properly managed) -class ServiceBusManager: - def __init__(self): - self.client = None - - async def start(self): - self.client = ServiceBusClient.from_connection_string(connection_string) - - async def stop(self): - if self.client: - await self.client.close() - -sb_manager = ServiceBusManager() - -@app.on_event("startup") -async def startup_event(): - await sb_manager.start() - -@app.on_event("shutdown") -async def shutdown_event(): - await sb_manager.stop() - -@app.post("/send-message") -async def send_message(message_content: str): - async with sb_manager.client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage(message_content) - await sender.send_messages(message) - return {"status": "sent"} -``` ## Frequently asked questions @@ -1235,76 +517,9 @@ Check the `dead_letter_reason` and `dead_letter_error_description` properties on - For partitioned entities, messages with the same partition key maintain order - Regular queues do not guarantee strict FIFO ordering -```python -# Using sessions for ordered processing -with client.get_queue_receiver(queue_name, session_id="order_123") as session_receiver: - messages = session_receiver.receive_messages(max_message_count=10) - - # Messages within this session are 
processed in order - for message in messages: - process_message_in_order(message) - session_receiver.complete_message(message) -``` - ### Q: How do I implement retry logic for transient failures? -**A:** -```python -import time -import random -from azure.servicebus.exceptions import ServiceBusError - -def exponential_backoff_retry(operation, max_retries=3): - """Implement exponential backoff retry for Service Bus operations""" - for attempt in range(max_retries + 1): - try: - return operation() - except ServiceBusError as e: - if attempt == max_retries: - raise - - # Check if error is retryable - if hasattr(e, 'reason'): - retryable_reasons = ['ServiceTimeout', 'ServerBusy', 'ServiceCommunicationProblem'] - if e.reason not in retryable_reasons: - raise - - # Calculate backoff delay - delay = (2 ** attempt) + random.uniform(0, 1) - print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.2f} seconds...") - time.sleep(delay) - -# Usage example -def send_with_retry(sender, message): - return exponential_backoff_retry(lambda: sender.send_messages(message)) -``` - -### Q: How do I monitor message processing performance? - -**A:** -```python -import time -import logging -from contextlib import contextmanager - -@contextmanager -def message_processing_timer(message_id): - """Context manager to time message processing""" - start_time = time.time() - try: - yield - finally: - processing_time = time.time() - start_time - logging.info(f"Message {message_id} processed in {processing_time:.3f}s") - -# Usage -def process_with_monitoring(receiver, message): - with message_processing_timer(message.message_id): - # Your processing logic - result = process_message(message) - receiver.complete_message(message) - return result -``` +**A:** Implement exponential backoff retry for Service Bus operations. Check if errors are retryable (ServiceTimeout, ServerBusy, ServiceCommunicationProblem) before retrying with increasing delays. 
## Get additional help From d212153374775b1a7a6b5d8d1ababb1713d3f8e8 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 18:45:52 +0000 Subject: [PATCH 16/28] Remove sections entirely rather than condensing - address review feedback Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 112 +----------------- 1 file changed, 4 insertions(+), 108 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 213d4d33745b..9849e734917b 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -25,18 +25,18 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Session lock issues](#session-lock-issues) * [Session cannot be locked](#session-cannot-be-locked) * [Troubleshooting sender issues](#troubleshooting-sender-issues) - * [Cannot send batch with multiple partition keys](#cannot-send-batch-with-multiple-partition-keys) + * [Batch fails to send](#batch-fails-to-send) - * [Message encoding issues](#message-encoding-issues) + * [Troubleshooting receiver issues](#troubleshooting-receiver-issues) - * [Number of messages returned doesn't match number requested](#number-of-messages-returned-doesnt-match-number-requested) + * [Message completion behavior](#message-completion-behavior) * [Receive operation hangs](#receive-operation-hangs) * [Messages not being received](#messages-not-being-received) * [Troubleshooting quota and capacity issues](#troubleshooting-quota-and-capacity-issues) * [Quota exceeded errors](#quota-exceeded-errors) * [Entity not found errors](#entity-not-found-errors) -* [Frequently asked questions](#frequently-asked-questions) + * [Get additional help](#get-additional-help) ## General troubleshooting @@ -124,33 +124,7 @@ The Service Bus APIs generate 
the following exceptions in `azure.servicebus.exce - **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. You could re-register the object that wants be auto lock renewed or extend the timeout in advance. -#### Python-Specific Considerations - -- **ImportError/ModuleNotFoundError:** Common when Azure Service Bus dependencies are not properly installed. Ensure you have installed the correct package version: -```bash -pip install azure-servicebus -``` - -- **TypeError:** Often occurs when passing incorrect data types to Service Bus methods: -```python -# Incorrect: passing string instead of ServiceBusMessage -sender.send_messages("Hello World") # This will fail - -# Correct: create ServiceBusMessage objects -from azure.servicebus import ServiceBusMessage -message = ServiceBusMessage("Hello World") -sender.send_messages(message) -``` -- **ConnectionError/socket.gaierror:** Network-level errors that may require checking DNS resolution and network connectivity: -```python -import socket -try: - # Test DNS resolution - socket.gethostbyname("your-namespace.servicebus.windows.net") -except socket.gaierror as e: - print(f"DNS resolution failed: {e}") -``` ### Timeouts @@ -380,13 +354,7 @@ with receiver: ## Troubleshooting sender issues -### Cannot send batch with multiple partition keys -When sending to a partition-enabled entity, all messages included in a single send operation must have the same `session_id` if the entity is session-enabled, or the same custom properties that determine partitioning. - -**Resolution:** -1. For session-enabled entities, ensure all messages in a batch have the same session ID -2. For partitioned entities, group messages by partition key before sending them in separate batches ### Batch fails to send @@ -395,26 +363,10 @@ When sending to a partition-enabled entity, all messages included in a single se 2. 
Check message size limits: Standard tier (256 KB), Premium tier (1 MB), Batch limit (1 MB regardless of tier) 3. Use message properties for metadata instead of including everything in the message body -### Message encoding issues - -**Resolution:** -1. Explicitly handle string encoding using UTF-8 -2. For JSON data, use `json.dumps()` with proper encoding -3. For binary data, pass bytes directly and set appropriate content type ## Troubleshooting receiver issues -### Number of messages returned doesn't match number requested - -When using `receive_messages()` with `max_message_count` > 1, you may not receive the exact number requested. -**Why this happens:** -- Service Bus optimizes for throughput and latency -- After the first message, the receiver waits only briefly for additional messages -- `max_wait_time` controls how long to wait for the **first** message, not subsequent ones - -**Resolution:** -Don't assume all available messages will be received in one call. Use loops to receive all available messages or implement continuous receiving patterns. ### Message completion behavior @@ -435,7 +387,6 @@ Don't assume all available messages will be received in one call. Use loops to r **Resolution:** 1. Set appropriate timeouts: `max_wait_time=30` instead of None 2. For polling scenarios, use shorter timeouts with retry loops -3. Use async operations with proper cancellation for better control ### Messages not being received @@ -464,62 +415,7 @@ Don't assume all available messages will be received in one call. Use loops to r -## Frequently asked questions - -### Q: Why am I getting connection timeout errors? - -**A:** Connection timeouts can occur due to: -- Network connectivity issues -- Firewall blocking AMQP ports (5671-5672) -- DNS resolution problems - -Try using AMQP over WebSockets (port 443) or check your network configuration. - -### Q: How do I handle transient errors? 
- -**A:** Implement retry logic with exponential backoff for transient errors like: -- `ServiceBusConnectionError` -- `OperationTimeoutError` -- `ServiceBusServerBusyError` - -### Q: Why are my messages going to the dead letter queue? - -**A:** Common reasons include: -- Message TTL expiration -- Maximum delivery count exceeded -- Explicit dead lettering in message processing logic -- Poison message detection - -Check the `dead_letter_reason` and `dead_letter_error_description` properties on dead lettered messages. - -### Q: How do I process messages faster? - -**A:** Consider: -- Using concurrent message processing (with separate client instances per thread/task) -- Optimizing your message processing logic -- Using `prefetch_count` to pre-fetch messages (use with caution - see note below) -- Scaling out with multiple receivers (on different clients) - -**Note on prefetch_count:** Be careful when using `prefetch_count` as it can cause message lock expiration if processing takes too long. The client cannot extend locks for prefetched messages. - -### Q: What's the difference between `complete_message()` and `abandon_message()`? - -**A:** -- `complete_message()`: Removes the message from the queue/subscription (successful processing) -- `abandon_message()`: Returns the message to the queue/subscription for reprocessing - -**Important:** Due to Python AMQP implementation limitations, these operations return immediately without waiting for service acknowledgment. Implement idempotent processing to handle potential redelivery. - -### Q: How do I handle message ordering? - -**A:** -- Use **sessions** for guaranteed message ordering within a session -- For partitioned entities, messages with the same partition key maintain order -- Regular queues do not guarantee strict FIFO ordering - -### Q: How do I implement retry logic for transient failures? -**A:** Implement exponential backoff retry for Service Bus operations. 
Check if errors are retryable (ServiceTimeout, ServerBusy, ServiceCommunicationProblem) before retrying with increasing delays. ## Get additional help From a7219a55b2c99703df0995c3efeb41478bd29b83 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 18:48:51 +0000 Subject: [PATCH 17/28] Fix formatting issues and clean up extra blank lines Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 18 ++---------------- 1 file changed, 2 insertions(+), 16 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 9849e734917b..b2afa32cb91e 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -25,18 +25,14 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Session lock issues](#session-lock-issues) * [Session cannot be locked](#session-cannot-be-locked) * [Troubleshooting sender issues](#troubleshooting-sender-issues) - * [Batch fails to send](#batch-fails-to-send) - * [Troubleshooting receiver issues](#troubleshooting-receiver-issues) - * [Message completion behavior](#message-completion-behavior) * [Receive operation hangs](#receive-operation-hangs) * [Messages not being received](#messages-not-being-received) * [Troubleshooting quota and capacity issues](#troubleshooting-quota-and-capacity-issues) * [Quota exceeded errors](#quota-exceeded-errors) * [Entity not found errors](#entity-not-found-errors) - * [Get additional help](#get-additional-help) ## General troubleshooting @@ -124,8 +120,6 @@ The Service Bus APIs generate the following exceptions in `azure.servicebus.exce - **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. 
You could re-register the object that wants be auto lock renewed or extend the timeout in advance. - - ### Timeouts There are various timeouts a user should be aware of within the library: @@ -173,7 +167,7 @@ Authorization errors occur when the authenticated identity doesn't have sufficie **Required permissions for Service Bus operations:** - **Send:** Required to send messages to queues/topics -- **Listen:** Required to receive messages from queues/subscriptions +- **Listen:** Required to receive messages from queues/subscriptions - **Manage:** Required for management operations (create/delete entities) **Resolution:** @@ -313,7 +307,7 @@ Messages can be moved to the dead letter queue for various reasons: ```python # Receive from dead letter queue dlq_receiver = servicebus_client.get_queue_receiver( - queue_name="your_queue", + queue_name="your_queue", sub_queue=ServiceBusSubQueue.DEAD_LETTER ) @@ -354,8 +348,6 @@ with receiver: ## Troubleshooting sender issues - - ### Batch fails to send **MessageSizeExceededError resolution:** @@ -366,8 +358,6 @@ with receiver: ## Troubleshooting receiver issues - - ### Message completion behavior **Important limitation:** The Python AMQP implementation does not wait for dispositions from the service to acknowledge message completion operations. @@ -413,10 +403,6 @@ with receiver: 3. Check if the entity was deleted and needs to be recreated 4. Verify you're connecting to the correct namespace - - - - ## Get additional help Additional information on ways to reach out for support can be found in the [SUPPORT.md](https://github.com/Azure/azure-sdk-for-python/blob/main/SUPPORT.md) at the root of the repo. 
From 348f86a17c516942da8ffdf55ef571d2a6c82d6f Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 9 Jun 2025 20:20:02 +0000 Subject: [PATCH 18/28] Undo last 3 commits - restore TROUBLESHOOTING.md to previous state Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 941 +++++++++++++++++- 1 file changed, 922 insertions(+), 19 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index b2afa32cb91e..91843b0eebd1 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -25,14 +25,25 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Session lock issues](#session-lock-issues) * [Session cannot be locked](#session-cannot-be-locked) * [Troubleshooting sender issues](#troubleshooting-sender-issues) + * [Cannot send batch with multiple partition keys](#cannot-send-batch-with-multiple-partition-keys) * [Batch fails to send](#batch-fails-to-send) + * [Message encoding issues](#message-encoding-issues) * [Troubleshooting receiver issues](#troubleshooting-receiver-issues) + * [Number of messages returned doesn't match number requested](#number-of-messages-returned-doesnt-match-number-requested) * [Message completion behavior](#message-completion-behavior) * [Receive operation hangs](#receive-operation-hangs) * [Messages not being received](#messages-not-being-received) * [Troubleshooting quota and capacity issues](#troubleshooting-quota-and-capacity-issues) * [Quota exceeded errors](#quota-exceeded-errors) * [Entity not found errors](#entity-not-found-errors) +* [Threading and concurrency issues](#threading-and-concurrency-issues) + * [Thread safety limitations](#thread-safety-limitations) + * [Async/await best practices](#asyncawait-best-practices) +* [Troubleshooting async 
operations](#troubleshooting-async-operations) + * [Event loop issues](#event-loop-issues) + * [Async context manager problems](#async-context-manager-problems) + * [Mixing sync and async code](#mixing-sync-and-async-code) +* [Frequently asked questions](#frequently-asked-questions) * [Get additional help](#get-additional-help) ## General troubleshooting @@ -120,6 +131,34 @@ The Service Bus APIs generate the following exceptions in `azure.servicebus.exce - **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. You could re-register the object that wants be auto lock renewed or extend the timeout in advance. +#### Python-Specific Considerations + +- **ImportError/ModuleNotFoundError:** Common when Azure Service Bus dependencies are not properly installed. Ensure you have installed the correct package version: +```bash +pip install azure-servicebus +``` + +- **TypeError:** Often occurs when passing incorrect data types to Service Bus methods: +```python +# Incorrect: passing string instead of ServiceBusMessage +sender.send_messages("Hello World") # This will fail + +# Correct: create ServiceBusMessage objects +from azure.servicebus import ServiceBusMessage +message = ServiceBusMessage("Hello World") +sender.send_messages(message) +``` + +- **ConnectionError/socket.gaierror:** Network-level errors that may require checking DNS resolution and network connectivity: +```python +import socket +try: + # Test DNS resolution + socket.gethostbyname("your-namespace.servicebus.windows.net") +except socket.gaierror as e: + print(f"DNS resolution failed: {e}") +``` + ### Timeouts There are various timeouts a user should be aware of within the library: @@ -167,7 +206,7 @@ Authorization errors occur when the authenticated identity doesn't have sufficie **Required permissions for Service Bus operations:** - **Send:** Required to send messages to queues/topics -- **Listen:** Required to receive messages from queues/subscriptions +- **Listen:** 
Required to receive messages from queues/subscriptions - **Manage:** Required for management operations (create/delete entities) **Resolution:** @@ -307,7 +346,7 @@ Messages can be moved to the dead letter queue for various reasons: ```python # Receive from dead letter queue dlq_receiver = servicebus_client.get_queue_receiver( - queue_name="your_queue", + queue_name="your_queue", sub_queue=ServiceBusSubQueue.DEAD_LETTER ) @@ -348,43 +387,403 @@ with receiver: ## Troubleshooting sender issues +### Cannot send batch with multiple partition keys + +When sending to a partition-enabled entity, all messages included in a single send operation must have the same `session_id` if the entity is session-enabled, or the same custom properties that determine partitioning. + +**Error symptoms:** +- Messages are rejected or go to different partitions than expected +- Inconsistent message ordering + +**Resolution:** +1. **For session-enabled entities, ensure all messages in a batch have the same session ID:** +```python +from azure.servicebus import ServiceBusMessage + +# Correct: All messages have the same session_id +messages = [ + ServiceBusMessage("Message 1", session_id="session1"), + ServiceBusMessage("Message 2", session_id="session1"), + ServiceBusMessage("Message 3", session_id="session1") +] + +with sender: + sender.send_messages(messages) +``` + +2. 
**For partitioned entities, group messages by partition key:** +```python +# Group messages by partition key before sending +partition1_messages = [ + ServiceBusMessage("Message 1", application_properties={"region": "east"}), + ServiceBusMessage("Message 2", application_properties={"region": "east"}) +] + +partition2_messages = [ + ServiceBusMessage("Message 3", application_properties={"region": "west"}), + ServiceBusMessage("Message 4", application_properties={"region": "west"}) +] + +# Send each group separately +with sender: + sender.send_messages(partition1_messages) + sender.send_messages(partition2_messages) +``` + ### Batch fails to send -**MessageSizeExceededError resolution:** -1. Reduce batch size or message payload -2. Check message size limits: Standard tier (256 KB), Premium tier (1 MB), Batch limit (1 MB regardless of tier) -3. Use message properties for metadata instead of including everything in the message body +The Service Bus service has size limits for message batches and individual messages. +**Error symptoms:** +- `MessageSizeExceededError` when sending batches +- Messages larger than expected failing to send + +**Resolution:** +1. **Reduce batch size or message payload:** +```python +from azure.servicebus import ServiceBusMessage +from azure.servicebus.exceptions import MessageSizeExceededError +import json + +def send_large_dataset(sender, data_list, max_batch_size=100): + """Send large datasets in smaller batches""" + for i in range(0, len(data_list), max_batch_size): + batch = data_list[i:i + max_batch_size] + messages = [ServiceBusMessage(json.dumps(item)) for item in batch] + + try: + sender.send_messages(messages) + except MessageSizeExceededError: + # If batch is still too large, send individually + for message in messages: + sender.send_messages(message) +``` + +2. **Check message size limits:** + - Standard tier: 256 KB per message + - Premium tier: 1 MB per message + - Batch limit: 1 MB regardless of tier + +3. 
**Use message properties for metadata instead of body:** +```python +# Instead of including metadata in message body +large_message = ServiceBusMessage(json.dumps({ + "data": large_data_payload, + "metadata": {"source": "app1", "timestamp": "2023-01-01"} +})) + +# Use application properties for metadata +optimized_message = ServiceBusMessage(large_data_payload) +optimized_message.application_properties = { + "source": "app1", + "timestamp": "2023-01-01" +} +``` + +### Message encoding issues + +Python string encoding can cause issues when sending messages with special characters. + +**Error symptoms:** +- Messages appear corrupted on the receiver side +- Encoding/decoding exceptions + +**Resolution:** +1. **Explicitly handle string encoding:** +```python +import json +from azure.servicebus import ServiceBusMessage + +# For text messages, ensure proper UTF-8 encoding +text_data = "Message with special characters: ñáéíóú" +message = ServiceBusMessage(text_data.encode('utf-8')) + +# For JSON data, use explicit encoding +json_data = {"message": "Data with unicode: ñáéíóú"} +json_string = json.dumps(json_data, ensure_ascii=False) +message = ServiceBusMessage(json_string.encode('utf-8')) + +# Set content type to help receivers +message.content_type = "application/json; charset=utf-8" +``` + +2. **Handle binary data correctly:** +```python +# For binary data, pass bytes directly +binary_data = b"\x00\x01\x02\x03" +message = ServiceBusMessage(binary_data) +message.content_type = "application/octet-stream" +``` ## Troubleshooting receiver issues +### Number of messages returned doesn't match number requested + +When attempting to receive multiple messages using `receive_messages()` with `max_message_count` greater than 1, you're not guaranteed to receive the exact number requested. 
+ +**Why this happens:** +- Service Bus optimizes for throughput and latency +- After the first message is received, the receiver waits only a short time (typically 20ms) for additional messages +- The `max_wait_time` controls how long to wait for the **first** message, not subsequent ones + +**Resolution:** +1. **Don't assume all available messages will be received in one call:** +```python +import time +from azure.servicebus.exceptions import MessagingEntityNotFoundError, MessagingEntityDisabledError + +def receive_all_available_messages(receiver, total_expected=None): + """Receive all available messages from a queue/subscription""" + all_messages = [] + + while True: + # Receive in batches + messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) + + if not messages: + break # No more messages available + + all_messages.extend(messages) + + # Process messages immediately to avoid lock expiration + for message in messages: + try: + # Process message logic here + print(f"Processing: {message}") + receiver.complete_message(message) + except Exception as e: + print(f"Error processing message: {e}") + receiver.abandon_message(message) + + return all_messages +``` + +2. 
**Use continuous receiving for stream processing:** +```python +import time + +def continuous_message_processing(receiver): + """Continuously process messages as they arrive""" + while True: + try: + messages = receiver.receive_messages(max_message_count=1, max_wait_time=60) + + for message in messages: + # Process immediately + try: + process_message(message) + receiver.complete_message(message) + except Exception as e: + print(f"Processing failed: {e}") + receiver.abandon_message(message) + + except KeyboardInterrupt: + break + except Exception as e: + print(f"Receive error: {e}") + time.sleep(5) # Brief pause before retry +``` + ### Message completion behavior -**Important limitation:** The Python AMQP implementation does not wait for dispositions from the service to acknowledge message completion operations. +**Important limitation:** The Pure Python AMQP implementation used by the Azure Service Bus Python SDK does not currently wait for dispositions from the service to acknowledge message completion operations. **What this means:** -- `complete_message()`, `abandon_message()`, or `dead_letter_message()` return immediately -- The SDK does not wait for confirmation from Service Bus that the message was actually settled -- This can lead to scenarios where local operation succeeds but service operation fails +- When you call `complete_message()`, `abandon_message()`, or `dead_letter_message()`, the operation returns immediately +- The SDK does not wait for confirmation from the Service Bus service that the message was actually settled +- This can lead to scenarios where the local operation succeeds but the service operation fails + +**Implications:** +1. 
**Message state uncertainty:** +```python +# This operation may succeed locally but fail on the service +try: + receiver.complete_message(message) + print("Message completed successfully") # This may be misleading +except Exception as e: + print(f"Local completion failed: {e}") + # But even if no exception, service operation might have failed +``` + +2. **Potential message redelivery:** +- If the service doesn't receive the completion acknowledgment, the message may be redelivered +- This can lead to duplicate processing if not handled properly **Mitigation strategies:** -1. Implement idempotent message processing to handle potential redelivery -2. Use external tracking for critical operations -3. Monitor for redelivered messages (check `delivery_count` property) +1. **Implement idempotent message processing:** +```python +import hashlib + +processed_messages = set() + +def process_message_idempotently(receiver, message): + """Process messages in an idempotent manner""" + # Create a unique identifier for the message + message_id = message.message_id or hashlib.md5(str(message.body).encode()).hexdigest() + + if message_id in processed_messages: + print(f"Message {message_id} already processed, skipping") + receiver.complete_message(message) + return + + try: + # Your message processing logic here + result = process_business_logic(message) + + # Record successful processing before completing + processed_messages.add(message_id) + receiver.complete_message(message) + + return result + except Exception as e: + print(f"Processing failed for message {message_id}: {e}") + receiver.abandon_message(message) + raise +``` + +2. 
**Use external tracking for critical operations:** +```python +import logging + +def track_message_completion(receiver, message, tracking_store): + """Track message completion in external store""" + message_id = message.message_id + + try: + # Process the message + result = process_message(message) + + # Store completion in external tracking system + tracking_store.mark_completed(message_id, result) + + # Complete the message in Service Bus + receiver.complete_message(message) + + logging.info(f"Message {message_id} processed and completed successfully") + + except Exception as e: + logging.error(f"Failed to process message {message_id}: {e}") + + # Check if we should retry or dead letter + if should_retry(message, e): + receiver.abandon_message(message) + else: + receiver.dead_letter_message(message, reason="ProcessingFailed", error_description=str(e)) +``` + +3. **Monitor for redelivered messages:** +```python +def handle_potential_redelivery(receiver, message): + """Handle messages that might be redelivered due to completion uncertainty""" + delivery_count = message.delivery_count + + if delivery_count > 1: + logging.warning(f"Message has been delivered {delivery_count} times. " + f"This might indicate completion acknowledgment issues.") + + # Process with extra caution for high delivery count messages + if delivery_count > 3: + # Consider different processing logic or dead lettering + logging.error(f"Message delivery count too high ({delivery_count}), dead lettering") + receiver.dead_letter_message(message, + reason="HighDeliveryCount", + error_description=f"Delivered {delivery_count} times") + return + + # Normal processing + process_message_idempotently(receiver, message) +``` ### Receive operation hangs +Receive operations may appear to hang when no messages are available. + +**Symptoms:** +- `receive_messages()` doesn't return for extended periods +- Application appears unresponsive + **Resolution:** -1. 
Set appropriate timeouts: `max_wait_time=30` instead of None -2. For polling scenarios, use shorter timeouts with retry loops +1. **Set appropriate timeouts:** +```python +# Don't wait indefinitely for messages +messages = receiver.receive_messages(max_message_count=5, max_wait_time=30) + +# For polling scenarios, use shorter timeouts +def poll_for_messages(receiver): + while True: + messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) + + if messages: + for message in messages: + process_message(message) + receiver.complete_message(message) + else: + print("No messages available, waiting...") + time.sleep(1) +``` + +2. **Use async operations with proper cancellation:** +```python +import asyncio + +async def receive_with_cancellation(receiver): + try: + # Use asyncio timeout for better control + messages = await asyncio.wait_for( + receiver.receive_messages(max_message_count=10, max_wait_time=30), + timeout=35 # Slightly longer than max_wait_time + ) + return messages + except asyncio.TimeoutError: + print("Receive operation timed out") + return [] +``` ### Messages not being received +Messages might not be received due to various configuration or state issues. + **Common causes and resolutions:** -1. Check entity state - verify queue/subscription exists and is active -2. For subscriptions, verify message filters and subscription rules -3. Check for competing consumers on the same queue/subscription -4. Use `peek_messages()` to see if messages exist without receiving them + +1. 
**Check entity state:** +```python +# Verify the queue/subscription exists and is active +try: + # This will fail if entity doesn't exist + receiver = client.get_queue_receiver(queue_name) + messages = receiver.receive_messages(max_message_count=1, max_wait_time=5) + + if not messages: + print("No messages available - check if messages are being sent") + +except MessagingEntityNotFoundError: + print("Queue/subscription does not exist") +except MessagingEntityDisabledError: + print("Queue/subscription is disabled") +``` + +2. **Verify message filters (for subscriptions):** +```python +# For topic subscriptions, check if messages match subscription filters +from azure.servicebus.management import ServiceBusAdministrationClient + +admin_client = ServiceBusAdministrationClient.from_connection_string(connection_string) + +# Check subscription rules +rules = admin_client.list_rules(topic_name, subscription_name) +for rule in rules: + print(f"Rule: {rule.name}, Filter: {rule.filter}") +``` + +3. **Check for competing consumers:** +```python +# Multiple receivers on the same queue will compete for messages +# Ensure this is intended behavior or use topic/subscription pattern + +# For debugging, temporarily use peek to see if messages exist +messages = receiver.peek_messages(max_message_count=10) +print(f"Found {len(messages)} messages in queue without receiving them") +``` ## Troubleshooting quota and capacity issues @@ -403,6 +802,510 @@ with receiver: 3. Check if the entity was deleted and needs to be recreated 4. Verify you're connecting to the correct namespace +## Threading and concurrency issues + +### Thread safety limitations + +**Important:** The Azure Service Bus Python SDK is **not thread-safe or coroutine-safe**. 
Using the same client instances across multiple threads or tasks without proper synchronization can lead to: + +- Connection errors and unexpected exceptions +- Message corruption or loss +- Deadlocks and race conditions +- Unpredictable behavior + +**Best practices:** + +1. **Use separate client instances per thread/task:** +```python +import threading +from azure.servicebus import ServiceBusClient + +def worker_thread(connection_string, queue_name): + # Create a separate client instance for each thread + client = ServiceBusClient.from_connection_string(connection_string) + with client: + sender = client.get_queue_sender(queue_name) + with sender: + # Perform operations... + pass + +# Start multiple threads with separate clients +threads = [] +for i in range(5): + t = threading.Thread(target=worker_thread, args=(connection_string, queue_name)) + threads.append(t) + t.start() + +for t in threads: + t.join() +``` + +2. **Use connection pooling patterns when needed:** +```python +# For high-throughput scenarios, consider using a thread-safe queue +# to manage client instances +import queue +import threading + +client_pool = queue.Queue() + +def get_client(): + try: + return client_pool.get_nowait() + except queue.Empty: + return ServiceBusClient.from_connection_string(connection_string) + +def return_client(client): + try: + client_pool.put_nowait(client) + except queue.Full: + client.close() +``` + +3. 
**Avoid sharing clients across async tasks:** +```python +# DON'T DO THIS +client = ServiceBusClient.from_connection_string(connection_string) + +async def bad_async_pattern(): + # Multiple tasks sharing the same client can cause issues + sender = client.get_queue_sender(queue_name) + # This can lead to race conditions + +# DO THIS INSTEAD +async def good_async_pattern(): + # Each async function should use its own client + async with ServiceBusClient.from_connection_string(connection_string) as client: + sender = client.get_queue_sender(queue_name) + async with sender: + # Perform operations safely + pass +``` + +### Async/await best practices + +When using the async APIs in the Python Service Bus SDK: + +1. **Always use async context managers properly:** +```python +async def proper_async_usage(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage("Hello World") + await sender.send_messages(message) + + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + for message in messages: + await receiver.complete_message(message) +``` + +2. **Don't mix sync and async code without proper handling:** +```python +# Avoid mixing sync and async incorrectly +async def mixed_code_example(): + # Don't call synchronous methods from async context without wrapping + # client = ServiceBusClient.from_connection_string(conn_str) # This is sync + + # Instead, create clients within async context or use proper wrapping + async with ServiceBusClient.from_connection_string(conn_str) as client: + pass +``` + +3. 
**Handle async exceptions properly:** +```python +import asyncio +from azure.servicebus.exceptions import ServiceBusError + +async def handle_async_errors(): + try: + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=1, max_wait_time=5) + # Process messages... + except ServiceBusError as e: + print(f"Service Bus error: {e}") + except asyncio.TimeoutError: + print("Operation timed out") + except Exception as e: + print(f"Unexpected error: {e}") +``` + +**Common threading/concurrency mistakes to avoid:** + +- Sharing `ServiceBusClient`, `ServiceBusSender`, or `ServiceBusReceiver` instances across threads +- Not properly closing clients and their resources in multi-threaded scenarios +- Using the same connection string with too many concurrent clients (can hit connection limits) +- Mixing blocking and non-blocking operations incorrectly +- Not handling connection failures in multi-threaded scenarios + +## Troubleshooting async operations + +### Event loop issues + +Python's asyncio event loop can cause issues when not properly managed in Service Bus async operations. + +**Common symptoms:** +- `RuntimeError: no running event loop` +- `RuntimeError: cannot be called from a running event loop` +- Async operations hanging indefinitely + +**Resolution:** + +1. **Proper event loop management:** +```python +import asyncio +from azure.servicebus.aio import ServiceBusClient + +async def main(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage("Hello async world") + await sender.send_messages(message) + +# Correct way to run async Service Bus code +if __name__ == "__main__": + asyncio.run(main()) +``` + +2. 
**Handling existing event loops (e.g., in Jupyter notebooks):** +```python +import asyncio +import nest_asyncio + +# In environments like Jupyter where an event loop is already running +nest_asyncio.apply() + +async def notebook_friendly_function(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + # Your async Service Bus operations + pass + +# Can be called directly in Jupyter +await notebook_friendly_function() +``` + +3. **Event loop in multi-threaded applications:** +```python +import asyncio +import threading +from concurrent.futures import ThreadPoolExecutor + +def run_async_in_thread(connection_string, queue_name): + """Run async Service Bus operations in a separate thread""" + async def async_operations(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + for message in messages: + print(f"Received: {message}") + await receiver.complete_message(message) + + # Create new event loop for this thread + asyncio.run(async_operations()) + +# Use ThreadPoolExecutor for better management +with ThreadPoolExecutor(max_workers=3) as executor: + futures = [ + executor.submit(run_async_in_thread, connection_string, f"queue_{i}") + for i in range(3) + ] + + for future in futures: + future.result() # Wait for completion +``` + +### Async context manager problems + +Improper use of async context managers can lead to resource leaks and connection issues. + +**Common mistakes:** + +1. 
**Not using async context managers:** +```python +# DON'T DO THIS +client = ServiceBusClient.from_connection_string(connection_string) +sender = client.get_queue_sender(queue_name) +await sender.send_messages(message) +# Resources not properly closed + +# DO THIS INSTEAD +async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + await sender.send_messages(message) +``` + +2. **Improper exception handling in async context:** +```python +async def proper_exception_handling(): + """Handle exceptions properly in async context managers""" + try: + async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_receiver(queue_name) as receiver: + messages = await receiver.receive_messages(max_message_count=10) + + for message in messages: + try: + # Process message + await process_message_async(message) + await receiver.complete_message(message) + except Exception as processing_error: + print(f"Processing failed: {processing_error}") + await receiver.abandon_message(message) + + except ServiceBusError as sb_error: + print(f"Service Bus error: {sb_error}") + except Exception as general_error: + print(f"Unexpected error: {general_error}") +``` + +3. 
**Resource cleanup in long-running async operations:** +```python +import asyncio +from contextlib import AsyncExitStack + +async def long_running_processor(): + """Properly manage resources in long-running async operations""" + async with AsyncExitStack() as stack: + client = await stack.enter_async_context( + ServiceBusClient.from_connection_string(connection_string) + ) + receiver = await stack.enter_async_context( + client.get_queue_receiver(queue_name) + ) + + # Long-running processing loop + while True: + try: + messages = await receiver.receive_messages( + max_message_count=10, + max_wait_time=30 + ) + + if not messages: + await asyncio.sleep(1) + continue + + # Process messages with proper error handling + await process_messages_batch(receiver, messages) + + except KeyboardInterrupt: + print("Shutting down gracefully...") + break + except Exception as e: + print(f"Error in processing loop: {e}") + await asyncio.sleep(5) # Brief pause before retry + +async def process_messages_batch(receiver, messages): + """Process a batch of messages with individual error handling""" + for message in messages: + try: + await process_single_message(message) + await receiver.complete_message(message) + except Exception as e: + print(f"Failed to process message {message.message_id}: {e}") + await receiver.abandon_message(message) +``` + +### Mixing sync and async code + +Mixing synchronous and asynchronous Service Bus operations can cause issues. + +**Common problems:** + +1. **Calling async methods without await:** +```python +# WRONG - This returns a coroutine, doesn't actually send +client = ServiceBusClient.from_connection_string(connection_string) +sender = client.get_queue_sender(queue_name) +sender.send_messages(message) # Missing 'await' + +# CORRECT +async with ServiceBusClient.from_connection_string(connection_string) as client: + async with client.get_queue_sender(queue_name) as sender: + await sender.send_messages(message) +``` + +2. 
**Using sync and async clients together:** +```python +# Avoid mixing sync and async clients in the same application +# Choose one pattern and stick with it + +# Option 1: Pure async +async def async_pattern(): + async with ServiceBusClient.from_connection_string(connection_string) as client: + # All operations are async + pass + +# Option 2: Pure sync +def sync_pattern(): + with ServiceBusClient.from_connection_string(connection_string) as client: + # All operations are sync + pass +``` + +3. **Proper integration with async frameworks (FastAPI, aiohttp, etc.):** +```python +# Example with FastAPI +from fastapi import FastAPI, BackgroundTasks +from azure.servicebus.aio import ServiceBusClient + +app = FastAPI() + +# Global client for reuse (properly managed) +class ServiceBusManager: + def __init__(self): + self.client = None + + async def start(self): + self.client = ServiceBusClient.from_connection_string(connection_string) + + async def stop(self): + if self.client: + await self.client.close() + +sb_manager = ServiceBusManager() + +@app.on_event("startup") +async def startup_event(): + await sb_manager.start() + +@app.on_event("shutdown") +async def shutdown_event(): + await sb_manager.stop() + +@app.post("/send-message") +async def send_message(message_content: str): + async with sb_manager.client.get_queue_sender(queue_name) as sender: + message = ServiceBusMessage(message_content) + await sender.send_messages(message) + return {"status": "sent"} +``` + +## Frequently asked questions + +### Q: Why am I getting connection timeout errors? + +**A:** Connection timeouts can occur due to: +- Network connectivity issues +- Firewall blocking AMQP ports (5671-5672) +- DNS resolution problems + +Try using AMQP over WebSockets (port 443) or check your network configuration. + +### Q: How do I handle transient errors? 
+ +**A:** Implement retry logic with exponential backoff for transient errors like: +- `ServiceBusConnectionError` +- `OperationTimeoutError` +- `ServiceBusServerBusyError` + +### Q: Why are my messages going to the dead letter queue? + +**A:** Common reasons include: +- Message TTL expiration +- Maximum delivery count exceeded +- Explicit dead lettering in message processing logic +- Poison message detection + +Check the `dead_letter_reason` and `dead_letter_error_description` properties on dead lettered messages. + +### Q: How do I process messages faster? + +**A:** Consider: +- Using concurrent message processing (with separate client instances per thread/task) +- Optimizing your message processing logic +- Using `prefetch_count` to pre-fetch messages (use with caution - see note below) +- Scaling out with multiple receivers (on different clients) + +**Note on prefetch_count:** Be careful when using `prefetch_count` as it can cause message lock expiration if processing takes too long. The client cannot extend locks for prefetched messages. + +### Q: What's the difference between `complete_message()` and `abandon_message()`? + +**A:** +- `complete_message()`: Removes the message from the queue/subscription (successful processing) +- `abandon_message()`: Returns the message to the queue/subscription for reprocessing + +**Important:** Due to Python AMQP implementation limitations, these operations return immediately without waiting for service acknowledgment. Implement idempotent processing to handle potential redelivery. + +### Q: How do I handle message ordering? 
+ +**A:** +- Use **sessions** for guaranteed message ordering within a session +- For partitioned entities, messages with the same partition key maintain order +- Regular queues do not guarantee strict FIFO ordering + +```python +# Using sessions for ordered processing +with client.get_queue_receiver(queue_name, session_id="order_123") as session_receiver: + messages = session_receiver.receive_messages(max_message_count=10) + + # Messages within this session are processed in order + for message in messages: + process_message_in_order(message) + session_receiver.complete_message(message) +``` + +### Q: How do I implement retry logic for transient failures? + +**A:** +```python +import time +import random +from azure.servicebus.exceptions import ServiceBusError + +def exponential_backoff_retry(operation, max_retries=3): + """Implement exponential backoff retry for Service Bus operations""" + for attempt in range(max_retries + 1): + try: + return operation() + except ServiceBusError as e: + if attempt == max_retries: + raise + + # Check if error is retryable + if hasattr(e, 'reason'): + retryable_reasons = ['ServiceTimeout', 'ServerBusy', 'ServiceCommunicationProblem'] + if e.reason not in retryable_reasons: + raise + + # Calculate backoff delay + delay = (2 ** attempt) + random.uniform(0, 1) + print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.2f} seconds...") + time.sleep(delay) + +# Usage example +def send_with_retry(sender, message): + return exponential_backoff_retry(lambda: sender.send_messages(message)) +``` + +### Q: How do I monitor message processing performance? 
+ +**A:** +```python +import time +import logging +from contextlib import contextmanager + +@contextmanager +def message_processing_timer(message_id): + """Context manager to time message processing""" + start_time = time.time() + try: + yield + finally: + processing_time = time.time() - start_time + logging.info(f"Message {message_id} processed in {processing_time:.3f}s") + +# Usage +def process_with_monitoring(receiver, message): + with message_processing_timer(message.message_id): + # Your processing logic + result = process_message(message) + receiver.complete_message(message) + return result +``` + ## Get additional help Additional information on ways to reach out for support can be found in the [SUPPORT.md](https://github.com/Azure/azure-sdk-for-python/blob/main/SUPPORT.md) at the root of the repo. From a8841a33d79d01c41a8699d49c2001e1cd8211eb Mon Sep 17 00:00:00 2001 From: swathipil Date: Mon, 9 Jun 2025 22:11:00 -0700 Subject: [PATCH 19/28] address comments --- .../azure-servicebus/TROUBLESHOOTING.md | 1176 ++--------------- 1 file changed, 135 insertions(+), 1041 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 91843b0eebd1..554fc24053de 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -7,12 +7,14 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [General troubleshooting](#general-troubleshooting) * [Enable client logging](#enable-client-logging) * [Common exceptions](#common-exceptions) - * [Timeouts](#timeouts) -* [Troubleshooting authentication issues](#troubleshooting-authentication-issues) +* [Threading and concurrency issues](#threading-and-concurrency-issues) + * [Thread safety limitations](#thread-safety-limitations) + * [Async/await best practices](#asyncawait-best-practices) +* [Authentication issues](#authentication-issues) * [Authentication 
errors](#authentication-errors) * [Authorization errors](#authorization-errors) * [Connection string issues](#connection-string-issues) -* [Troubleshooting connectivity issues](#troubleshooting-connectivity-issues) +* [Connectivity issues](#troubleshooting-connectivity-issues) * [Connection errors](#connection-errors) * [Firewall and proxy issues](#firewall-and-proxy-issues) * [Service busy errors](#service-busy-errors) @@ -20,44 +22,25 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Message lock issues](#message-lock-issues) * [Message size issues](#message-size-issues) * [Message settlement issues](#message-settlement-issues) - * [Dead letter queue issues](#dead-letter-queue-issues) * [Troubleshooting session handling issues](#troubleshooting-session-handling-issues) * [Session lock issues](#session-lock-issues) * [Session cannot be locked](#session-cannot-be-locked) -* [Troubleshooting sender issues](#troubleshooting-sender-issues) - * [Cannot send batch with multiple partition keys](#cannot-send-batch-with-multiple-partition-keys) - * [Batch fails to send](#batch-fails-to-send) - * [Message encoding issues](#message-encoding-issues) * [Troubleshooting receiver issues](#troubleshooting-receiver-issues) * [Number of messages returned doesn't match number requested](#number-of-messages-returned-doesnt-match-number-requested) - * [Message completion behavior](#message-completion-behavior) - * [Receive operation hangs](#receive-operation-hangs) - * [Messages not being received](#messages-not-being-received) -* [Troubleshooting quota and capacity issues](#troubleshooting-quota-and-capacity-issues) - * [Quota exceeded errors](#quota-exceeded-errors) - * [Entity not found errors](#entity-not-found-errors) -* [Threading and concurrency issues](#threading-and-concurrency-issues) - * [Thread safety limitations](#thread-safety-limitations) - * [Async/await best practices](#asyncawait-best-practices) -* [Troubleshooting async 
operations](#troubleshooting-async-operations) - * [Event loop issues](#event-loop-issues) - * [Async context manager problems](#async-context-manager-problems) * [Mixing sync and async code](#mixing-sync-and-async-code) -* [Frequently asked questions](#frequently-asked-questions) + * [Dead letter queue issues](#dead-letter-queue-issues) +* [Quotas](#quotas) +* [Troubleshooting async operations](#troubleshooting-async-operations) * [Get additional help](#get-additional-help) ## General troubleshooting -Azure Service Bus client library will raise exceptions defined in [Azure Core](https://aka.ms/azsdk/python/core/docs#module-azure.core.exceptions) and [azure.servicebus.exceptions](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions). +Azure Service Bus client library will raise exceptions defined in [azure.core](https://aka.ms/azsdk/python/core/docs#module-azure.core.exceptions) and [azure.servicebus.exceptions](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions). ### Enable client logging This library uses the standard [logging](https://docs.python.org/3/library/logging.html) library for logging. -Basic information about HTTP sessions (URLs, headers, etc.) is logged at `INFO` level. - -Detailed `DEBUG` level logging, including request/response bodies and **unredacted** headers, can be enabled on the client or per-operation with the `logging_enable` keyword argument. - To enable client logging and AMQP frame level trace: ```python @@ -83,11 +66,21 @@ See full Python SDK logging documentation with examples [here](https://learn.mic ### Common exceptions -The Service Bus APIs generate the following exceptions in `azure.servicebus.exceptions`: +The Service Bus client library will surface exceptions when an error is encountered by a service operation or within the client. 
For scenarios specific to Service Bus, a [ServiceBusError](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions.servicebuserror?view=azure-python) will be raised; this is the most common exception type that applications will encounter. + +ServiceBusErrors often have an underlying AMQP error code which specifies whether an error should be retried. For retryable errors (ie. `amqp:connection:forced` or `amqp:link:detach-forced`), the client library will attempt to recover from these errors based on the retry options specified using the following keyword arguments when instantiating the client: + +* `retry_total`: The total number of attempts to redo a failed operation when an error occurs +* `retry_backoff_factor`: A backoff factor to apply between attempts after the second try +* `retry_backoff_max`: The maximum back off time +* `retry_mode`: The delay behavior between retry attempts. Supported values are 'fixed' or 'exponential' +When an exception is surfaced to the application, either all retries were applied unsuccessfully, or the exception was considered non-transient. #### Connection and Authentication Exceptions -- **ServiceBusConnectionError:** An error occurred in the connection to the service. This may have been caused by a transient network issue or service problem. It is recommended to retry. +- **ServiceBusConnectionError:** An error occurred in the connection to the se +rvice. This may have been caused by a transient network issue or service proble +m. It is recommended to retry. - **ServiceBusAuthenticationError:** An error occurred when authenticating the connection to the service. This may have been caused by the credentials being incorrect. It is recommended to check the credentials. @@ -97,17 +90,17 @@ The Service Bus APIs generate the following exceptions in `azure.servicebus.exce - **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. 
This may have been caused by a transient network issue or service problem. The service may or may not have successfully completed the request; the status is not known. It is recommended to attempt to verify the current state and retry if necessary. -- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. +- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. For details on which ports need to be open, see the [Azure Service Bus FAQ: What ports do I need to open on the firewall?](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). #### Message Handling Exceptions -- **MessageSizeExceededError:** This indicates that the message content is larger than the service bus frame size. This could happen when too many service bus messages are sent in a batch or the content passed into the body of a `Message` is too large. It is recommended to reduce the count of messages being sent in a batch or the size of content being passed into a single `ServiceBusMessage`. +- **MessageSizeExceededError:** This indicates that the max message size has been exceeded. The message size includes the body of the message, as well as any associated metadata and system overhead. The best approach for resolving this error is to reduce the number of messages being sent in a batch or the size of the body included in the message. 
Because size limits are subject to change, please refer to [Service Bus quotas](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) for specifics. - **MessageAlreadySettled:** This indicates failure to settle the message. This could happen when trying to settle an already-settled message. -- **MessageLockLostError:** The lock on the message has expired and it has been released back to the queue. It will need to be received again in order to settle it. You should be aware of the lock duration of a message and keep renewing the lock before expiration in case of long processing time. `AutoLockRenewer` could help on keeping the lock of the message automatically renewed. +- **MessageLockLostError:** Indicates that the lock on the message is lost. Callers should attempt to receive and process the message again. This exception only applies to entities that don't use sessions. This error occurs if processing takes longer than the lock duration and the message lock isn't renewed. This error can also occur when the link is detached due to a transient network issue or when the link is idle for 10 minutes, as enforced by the service. `AutoLockRenewer` could help on keeping the lock of the message automatically renewed. -- **MessageNotFoundError:** Attempt to receive a message with a particular sequence number. This message isn't found. Make sure the message hasn't been received already. Check the deadletter queue to see if the message has been deadlettered. +- **MessageNotFoundError:** This occurs when attempting to receive a deferred message by sequence number for a message that either doesn't exist in the entity, or is currently locked. 
#### Session Handling Exceptions @@ -117,7 +110,7 @@ The Service Bus APIs generate the following exceptions in `azure.servicebus.exce #### Service and Entity Exceptions -- **ServiceBusQuotaExceededError:** The messaging entity has reached its maximum allowable size, or the maximum number of connections to a namespace has been exceeded. Create space in the entity by receiving messages from the entity or its subqueues. +- **ServiceBusQuotaExceededError:** This typically indicates that there are too many active receive operations for a single entity. In order to avoid this error, reduce the number of potential concurrent receives. You can use batch receives to attempt to receive multiple messages per receive request. Please see [Service Bus quotas](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) for more information. - **ServiceBusServerBusyError:** Service isn't able to process the request at this time. Client can wait for a period of time, then retry the operation. @@ -131,153 +124,130 @@ The Service Bus APIs generate the following exceptions in `azure.servicebus.exce - **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. You could re-register the object that wants be auto lock renewed or extend the timeout in advance. -#### Python-Specific Considerations +## Threading and concurrency issues -- **ImportError/ModuleNotFoundError:** Common when Azure Service Bus dependencies are not properly installed. Ensure you have installed the correct package version: -```bash -pip install azure-servicebus -``` +### Thread safety limitations -- **TypeError:** Often occurs when passing incorrect data types to Service Bus methods: -```python -# Incorrect: passing string instead of ServiceBusMessage -sender.send_messages("Hello World") # This will fail +**Important:** We do not guarantee that the ServiceBusClient, ServiceBusSender, and ServiceBusReceiver are thread-safe or coroutine-safe. 
We do not recommend reusing these instances across threads or sharing them between coroutines. -# Correct: create ServiceBusMessage objects -from azure.servicebus import ServiceBusMessage -message = ServiceBusMessage("Hello World") -sender.send_messages(message) -``` +The data model type, `ServiceBusMessageBatch` is not thread-safe or coroutine-safe. It should not be shared across threads nor used concurrently with client methods. -- **ConnectionError/socket.gaierror:** Network-level errors that may require checking DNS resolution and network connectivity: -```python -import socket -try: - # Test DNS resolution - socket.gethostbyname("your-namespace.servicebus.windows.net") -except socket.gaierror as e: - print(f"DNS resolution failed: {e}") -``` +Using the same client instances across multiple threads or tasks without proper synchronization can lead to: -### Timeouts +- Connection errors and unexpected exceptions +- Message corruption or loss +- Deadlocks and race conditions +- Unpredictable behavior -There are various timeouts a user should be aware of within the library: +It is up to the running application to use these classes in a concurrency-safe manner. -- **10 minute service side link closure:** A link, once opened, will be closed after 10 minutes idle to protect the service against resource leakage. This should largely be transparent to a user, but if you notice a reconnect occurring after such a duration, this is why. Performing any operations, including management operations, on the link will extend this timeout. +For scenarios requiring concurrent sending in asyncio applications, ensure proper coroutine-safety management using mechanisms like asyncio.Lock(). -- **max_wait_time:** Provided on creation of a receiver or when calling `receive_messages()`, the time after which receiving messages will halt after no traffic. 
This applies both to the imperative `receive_messages()` function as well as the length a generator-style receive will run for before exiting if there are no messages. Passing None (default) will wait forever, up until the 10 minute threshold if no other action is taken. +```python +import asyncio +from azure.servicebus.aio import ServiceBusClient +from azure.servicebus import ServiceBusMessage +from azure.identity.aio import DefaultAzureCredential -> **NOTE:** If processing of a message or session is sufficiently long as to cause timeouts, as an alternative to calling `receiver.renew_message_lock`/`receiver.session.renew_lock` manually, one can leverage the `AutoLockRenewer` functionality. +SERVICE_BUS_NAMESPACE = ".servicebus.windows.net" +QUEUE_NAME = "" -## Troubleshooting authentication issues +lock = asyncio.Lock() -### Authentication errors +async def send_batch(sender_id, sender): + async with lock: + messages = [ServiceBusMessage(f"Message {i} from sender {sender_id}") for i in range(10)] + await sender.send_messages(messages) + print(f"Sender {sender_id} sent messages.") -Authentication errors typically occur when the credentials provided are incorrect or have expired. +credential = DefaultAzureCredential() +client = ServiceBusClient(fully_qualified_namespace=SERVICE_BUS_NAMESPACE, credential=credential) -**Common causes:** -- Incorrect connection string -- Expired SAS token -- Invalid managed identity configuration -- Wrong credential type being used +async with client: + sender = client.get_queue_sender(queue_name=QUEUE_NAME) + async with sender: + await asyncio.gather(*(send_batch(i, sender) for i in range(5))) +``` -**Resolution:** -1. Verify your connection string is correct and complete -2. Check if using SAS tokens that they haven't expired -3. For managed identity, ensure the identity is properly configured and has the necessary permissions -4. 
Test connectivity using a simple connection string first +For scenarios requiring concurrent sending from multiple threads, ensure proper thread-safety management using mechanisms like threading.Lock(). **Note:** Native async APIs should be used instead of running in a ThreadPoolExecutor, if possible. ```python -# Example of proper authentication -from azure.servicebus import ServiceBusClient +import threading +from concurrent.futures import ThreadPoolExecutor +from azure.servicebus import ServiceBusClient, ServiceBusMessage +from azure.identity import DefaultAzureCredential -# Using connection string -client = ServiceBusClient.from_connection_string("your_connection_string") +SERVICE_BUS_NAMESPACE = ".servicebus.windows.net" +QUEUE_NAME = "" + +lock = threading.Lock() + +def send_batch(sender_id, sender): + with lock: + messages = [ServiceBusMessage(f"Message {i} from sender {sender_id}") for i in range(10)] + sender.send_messages(messages) + print(f"Sender {sender_id} sent messages.") -# Using Azure Identity -from azure.identity import DefaultAzureCredential credential = DefaultAzureCredential() -client = ServiceBusClient("your_namespace.servicebus.windows.net", credential) +client = ServiceBusClient(fully_qualified_namespace=SERVICE_BUS_NAMESPACE, credential=credential) + +with client: + sender = client.get_queue_sender(queue_name=QUEUE_NAME) + with sender: + with ThreadPoolExecutor(max_workers=5) as executor: + for i in range(5): + executor.submit(send_batch, i, sender) ``` -### Authorization errors +## Authentication issues -Authorization errors occur when the authenticated identity doesn't have sufficient permissions. +Authentication errors typically occur when the credentials provided are incorrect or have expired. Authorization errors occur when the authenticated identity doesn't have sufficient permissions. 
-**Required permissions for Service Bus operations:** -- **Send:** Required to send messages to queues/topics -- **Listen:** Required to receive messages from queues/subscriptions -- **Manage:** Required for management operations (create/delete entities) +The following verification steps are recommended, depending on the type of authorization provided when constructing the client: -**Resolution:** -1. Check the Access Control (IAM) settings in Azure portal -2. Ensure the identity has the appropriate Service Bus roles: - - `Azure Service Bus Data Owner` - - `Azure Service Bus Data Sender` - - `Azure Service Bus Data Receiver` -3. For connection strings, verify the SAS policy has the correct permissions +- [Verify the connection string is correct](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quickstart-portal#get-the-connection-string) -### Connection string issues +- [Verify the SAS token was generated correctly](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-sas) -**Common connection string problems:** -- Missing required components (Endpoint, SharedAccessKeyName, SharedAccessKey) -- Incorrect namespace or entity names -- URL encoding issues with special characters +- [Verify the correct RBAC roles were granted](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-managed-service-identity) - Indicated by errors: `Send/Listen claim(s) are required to perform this operation.` In this case, ensure that the appropriate roles were assigned: `Azure Service Bus Data Owner`, `Azure Service Bus Data Sender`, or `Azure Service Bus Data Receiver`. 
-**Example of correct connection string format:** -``` -Endpoint=sb://your-namespace.servicebus.windows.net/;SharedAccessKeyName=your-policy;SharedAccessKey=your-key -``` +## Connectivity issues -## Troubleshooting connectivity issues +### Timeout when connecting to service -### Connection errors +Depending on the host environment and network, this may present to applications as timeout or operation exceptions. This most often occurs when the client cannot find a network path to the service. -Connection errors can occur due to network issues, firewall restrictions, or service problems. +To troubleshoot: -**Common causes:** -- Network connectivity issues -- DNS resolution problems -- Firewall or proxy blocking connections -- Service Bus namespace not accessible from current location +- Verify that the connection string or fully qualified domain name specified when creating the client is correct. For information on how to acquire a connection string, see: [Get a Service Bus connection string](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quickstart-portal#get-the-connection-string). -**Resolution:** -1. Test basic network connectivity to `your-namespace.servicebus.windows.net` on port 5671 (AMQP) or 443 (AMQP over WebSockets) -2. Try using AMQP over WebSockets if regular AMQP is blocked: +- Check the firewall and port permissions in your hosting environment and that the AMQP ports 5671 and 5672 are open and that the endpoint is allowed through the firewall. -```python -from azure.servicebus import ServiceBusClient, TransportType -from azure.identity import DefaultAzureCredential +- Try using the Web Socket transport option, which connects using port 443. This can be done by passing the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/en-us/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the client. 
-# Using Azure Identity with WebSockets -credential = DefaultAzureCredential() -client = ServiceBusClient( - "your_namespace.servicebus.windows.net", - credential, - transport_type=TransportType.AmqpOverWebsocket -) -``` +- See if your network is blocking specific IP addresses. For details, see: [What IP addresses do I need to allow?](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-faq#what-ip-addresses-do-i-need-to-add-to-allowlist-). -### Firewall and proxy issues +- If applicable, verify the proxy configuration. For details, see: [Proxy sample](https://github.com/Azure/azure-sdk-for-python/blob/fb9f99e09a0968e51839f8456ad69b0354837f95/sdk/servicebus/azure-servicebus/samples/sync_samples/proxy.py). -If your environment has strict firewall rules or requires proxy configuration: +### SSL handshake failures -**For firewall:** -- Allow outbound connections to `*.servicebus.windows.net` on ports 5671-5672 (AMQP) and 443 (HTTPS/WebSockets) -- Consider using AMQP over WebSockets (port 443) if AMQP ports are blocked +This error can occur when an intercepting proxy is used. To verify, it is recommended that the application be tested in the host environment with the proxy disabled. -**For proxy:** -- Service Bus supports HTTP CONNECT proxy for AMQP over WebSockets -- Configure proxy settings in your environment variables or application +### Adding components to the connection string does not work -### Service busy errors +The current generation of the Service Bus client library supports connection strings only in the form published by the Azure portal. These are intended to provide basic location and shared key information only; configuring behavior of the clients is done through its options. -`ServiceBusServerBusyError` indicates the service is temporarily overloaded. +Previous generations of the Service Bus clients allowed for some behavior to be configured by adding key/value components to a connection string. 
These components are no longer recognized and have no effect on client behavior. -**Resolution:** -1. Implement exponential backoff retry logic -2. Reduce the frequency of requests -3. Consider scaling up your Service Bus tier if errors persist +#### "TransportType.AmqpOverWebSocket" Alternative + +To configure web socket use, pass the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/en-us/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the ServiceBusClient. + +#### Azure Identity Authentication Alternative + +To authenticate with Azure Identity, see: [Client Identity Authentication](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/servicebus/azure-servicebus/samples/sync_samples/client_identity_authentication.py). + +For more information about the `azure-identity` library, see: [Azure Identity client library for Python](https://learn.microsoft.com/python/api/overview/azure/identity-readme?view=azure-python). ## Troubleshooting message handling issues @@ -287,7 +257,9 @@ Messages in Service Bus have a lock duration during which they must be settled ( **MessageLockLostError resolution:** 1. Process messages faster or increase lock duration -2. Use `AutoLockRenewer` for long-running processing: +2. If setting `prefetch_count` to a large number, consider setting it lower as it can cause message lock expiration if processing takes too long. The client cannot extend locks for prefetched messages. +3. Use `AutoLockRenewer` for long-running processing. + * When running the async AutoLockRenewer, ensure that the event loop is not blocked during message processing (e.g. replace `time.sleep(60)` with `await asyncio.sleep(60)`). Otherwise, the AutoLockRenewer will be prevented from running in the background. ```python from azure.servicebus import AutoLockRenewer @@ -307,9 +279,7 @@ with receiver: **MessageSizeExceededError resolution:** 1. Reduce message payload size -2. 
Use message properties and application properties for metadata instead of body -3. For batch operations, reduce the number of messages in the batch -4. Consider splitting large messages across multiple smaller messages +2. Consider splitting large messages across multiple smaller messages **Service Bus message size limits:** - Standard tier: 256 KB per message @@ -317,203 +287,6 @@ with receiver: For the most up-to-date information on Service Bus limits, refer to the [Azure Service Bus quotas and limits](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) documentation. -### Message settlement issues - -**MessageAlreadySettled resolution:** -1. Ensure you're not trying to settle the same message multiple times -2. Check your application logic for race conditions -3. Use try-catch blocks when settling messages - -```python -try: - receiver.complete_message(message) -except MessageAlreadySettled: - # Message was already settled, this is expected in some scenarios - pass -``` - -### Dead letter queue issues - -Messages can be moved to the dead letter queue for various reasons: - -**Common reasons:** -- Message TTL expired -- Max delivery count exceeded -- Message was explicitly dead lettered -- Message processing failed repeatedly - -**Debugging dead letter messages:** -```python -# Receive from dead letter queue -dlq_receiver = servicebus_client.get_queue_receiver( - queue_name="your_queue", - sub_queue=ServiceBusSubQueue.DEAD_LETTER -) - -with dlq_receiver: - messages = dlq_receiver.receive_messages(max_message_count=10) - for message in messages: - print(f"Dead letter reason: {message.dead_letter_reason}") - print(f"Dead letter description: {message.dead_letter_error_description}") -``` - -## Troubleshooting session handling issues - -### Session lock issues - -Session-enabled entities require proper session management. - -**SessionLockLostError resolution:** -1. Renew session locks before they expire -2. 
Use `AutoLockRenewer` for automatic session lock renewal -3. Handle session lock lost errors by reconnecting to the session - -```python -from azure.servicebus import AutoLockRenewer - -renewer = AutoLockRenewer() -with receiver: - session = receiver.session - renewer.register(receiver, session, max_lock_renewal_duration=300) - # Process messages in session -``` - -### Session cannot be locked - -**SessionCannotBeLockedError resolution:** -1. Ensure no other clients are already connected to the same session -2. Wait for the current session lock to expire before reconnecting -3. Use a different session ID if specific session is not required - -## Troubleshooting sender issues - -### Cannot send batch with multiple partition keys - -When sending to a partition-enabled entity, all messages included in a single send operation must have the same `session_id` if the entity is session-enabled, or the same custom properties that determine partitioning. - -**Error symptoms:** -- Messages are rejected or go to different partitions than expected -- Inconsistent message ordering - -**Resolution:** -1. **For session-enabled entities, ensure all messages in a batch have the same session ID:** -```python -from azure.servicebus import ServiceBusMessage - -# Correct: All messages have the same session_id -messages = [ - ServiceBusMessage("Message 1", session_id="session1"), - ServiceBusMessage("Message 2", session_id="session1"), - ServiceBusMessage("Message 3", session_id="session1") -] - -with sender: - sender.send_messages(messages) -``` - -2. 
**For partitioned entities, group messages by partition key:** -```python -# Group messages by partition key before sending -partition1_messages = [ - ServiceBusMessage("Message 1", application_properties={"region": "east"}), - ServiceBusMessage("Message 2", application_properties={"region": "east"}) -] - -partition2_messages = [ - ServiceBusMessage("Message 3", application_properties={"region": "west"}), - ServiceBusMessage("Message 4", application_properties={"region": "west"}) -] - -# Send each group separately -with sender: - sender.send_messages(partition1_messages) - sender.send_messages(partition2_messages) -``` - -### Batch fails to send - -The Service Bus service has size limits for message batches and individual messages. - -**Error symptoms:** -- `MessageSizeExceededError` when sending batches -- Messages larger than expected failing to send - -**Resolution:** -1. **Reduce batch size or message payload:** -```python -from azure.servicebus import ServiceBusMessage -from azure.servicebus.exceptions import MessageSizeExceededError -import json - -def send_large_dataset(sender, data_list, max_batch_size=100): - """Send large datasets in smaller batches""" - for i in range(0, len(data_list), max_batch_size): - batch = data_list[i:i + max_batch_size] - messages = [ServiceBusMessage(json.dumps(item)) for item in batch] - - try: - sender.send_messages(messages) - except MessageSizeExceededError: - # If batch is still too large, send individually - for message in messages: - sender.send_messages(message) -``` - -2. **Check message size limits:** - - Standard tier: 256 KB per message - - Premium tier: 1 MB per message - - Batch limit: 1 MB regardless of tier - -3. 
**Use message properties for metadata instead of body:** -```python -# Instead of including metadata in message body -large_message = ServiceBusMessage(json.dumps({ - "data": large_data_payload, - "metadata": {"source": "app1", "timestamp": "2023-01-01"} -})) - -# Use application properties for metadata -optimized_message = ServiceBusMessage(large_data_payload) -optimized_message.application_properties = { - "source": "app1", - "timestamp": "2023-01-01" -} -``` - -### Message encoding issues - -Python string encoding can cause issues when sending messages with special characters. - -**Error symptoms:** -- Messages appear corrupted on the receiver side -- Encoding/decoding exceptions - -**Resolution:** -1. **Explicitly handle string encoding:** -```python -import json -from azure.servicebus import ServiceBusMessage - -# For text messages, ensure proper UTF-8 encoding -text_data = "Message with special characters: ñáéíóú" -message = ServiceBusMessage(text_data.encode('utf-8')) - -# For JSON data, use explicit encoding -json_data = {"message": "Data with unicode: ñáéíóú"} -json_string = json.dumps(json_data, ensure_ascii=False) -message = ServiceBusMessage(json_string.encode('utf-8')) - -# Set content type to help receivers -message.content_type = "application/json; charset=utf-8" -``` - -2. **Handle binary data correctly:** -```python -# For binary data, pass bytes directly -binary_data = b"\x00\x01\x02\x03" -message = ServiceBusMessage(binary_data) -message.content_type = "application/octet-stream" -``` ## Troubleshooting receiver issues @@ -584,215 +357,40 @@ def continuous_message_processing(receiver): time.sleep(5) # Brief pause before retry ``` -### Message completion behavior - -**Important limitation:** The Pure Python AMQP implementation used by the Azure Service Bus Python SDK does not currently wait for dispositions from the service to acknowledge message completion operations. 
- -**What this means:** -- When you call `complete_message()`, `abandon_message()`, or `dead_letter_message()`, the operation returns immediately -- The SDK does not wait for confirmation from the Service Bus service that the message was actually settled -- This can lead to scenarios where the local operation succeeds but the service operation fails - -**Implications:** -1. **Message state uncertainty:** -```python -# This operation may succeed locally but fail on the service -try: - receiver.complete_message(message) - print("Message completed successfully") # This may be misleading -except Exception as e: - print(f"Local completion failed: {e}") - # But even if no exception, service operation might have failed -``` - -2. **Potential message redelivery:** -- If the service doesn't receive the completion acknowledgment, the message may be redelivered -- This can lead to duplicate processing if not handled properly - -**Mitigation strategies:** -1. **Implement idempotent message processing:** -```python -import hashlib - -processed_messages = set() - -def process_message_idempotently(receiver, message): - """Process messages in an idempotent manner""" - # Create a unique identifier for the message - message_id = message.message_id or hashlib.md5(str(message.body).encode()).hexdigest() - - if message_id in processed_messages: - print(f"Message {message_id} already processed, skipping") - receiver.complete_message(message) - return - - try: - # Your message processing logic here - result = process_business_logic(message) - - # Record successful processing before completing - processed_messages.add(message_id) - receiver.complete_message(message) - - return result - except Exception as e: - print(f"Processing failed for message {message_id}: {e}") - receiver.abandon_message(message) - raise -``` - -2. 
**Use external tracking for critical operations:** -```python -import logging - -def track_message_completion(receiver, message, tracking_store): - """Track message completion in external store""" - message_id = message.message_id - - try: - # Process the message - result = process_message(message) - - # Store completion in external tracking system - tracking_store.mark_completed(message_id, result) - - # Complete the message in Service Bus - receiver.complete_message(message) - - logging.info(f"Message {message_id} processed and completed successfully") - - except Exception as e: - logging.error(f"Failed to process message {message_id}: {e}") - - # Check if we should retry or dead letter - if should_retry(message, e): - receiver.abandon_message(message) - else: - receiver.dead_letter_message(message, reason="ProcessingFailed", error_description=str(e)) -``` - -3. **Monitor for redelivered messages:** -```python -def handle_potential_redelivery(receiver, message): - """Handle messages that might be redelivered due to completion uncertainty""" - delivery_count = message.delivery_count - - if delivery_count > 1: - logging.warning(f"Message has been delivered {delivery_count} times. " - f"This might indicate completion acknowledgment issues.") - - # Process with extra caution for high delivery count messages - if delivery_count > 3: - # Consider different processing logic or dead lettering - logging.error(f"Message delivery count too high ({delivery_count}), dead lettering") - receiver.dead_letter_message(message, - reason="HighDeliveryCount", - error_description=f"Delivered {delivery_count} times") - return - - # Normal processing - process_message_idempotently(receiver, message) -``` - -### Receive operation hangs - -Receive operations may appear to hang when no messages are available. - -**Symptoms:** -- `receive_messages()` doesn't return for extended periods -- Application appears unresponsive - -**Resolution:** -1. 
**Set appropriate timeouts:** -```python -# Don't wait indefinitely for messages -messages = receiver.receive_messages(max_message_count=5, max_wait_time=30) - -# For polling scenarios, use shorter timeouts -def poll_for_messages(receiver): - while True: - messages = receiver.receive_messages(max_message_count=10, max_wait_time=5) - - if messages: - for message in messages: - process_message(message) - receiver.complete_message(message) - else: - print("No messages available, waiting...") - time.sleep(1) -``` - -2. **Use async operations with proper cancellation:** -```python -import asyncio - -async def receive_with_cancellation(receiver): - try: - # Use asyncio timeout for better control - messages = await asyncio.wait_for( - receiver.receive_messages(max_message_count=10, max_wait_time=30), - timeout=35 # Slightly longer than max_wait_time - ) - return messages - except asyncio.TimeoutError: - print("Receive operation timed out") - return [] -``` - -### Messages not being received +### Mixing sync and async code -Messages might not be received due to various configuration or state issues. +Mixing synchronous and asynchronous Service Bus operations can cause issues such as async operations hanging indefinitely due to the event loop being blocked. Ensure that blocking calls are not made while receiving and processing messages. -**Common causes and resolutions:** -1. **Check entity state:** -```python -# Verify the queue/subscription exists and is active -try: - # This will fail if entity doesn't exist - receiver = client.get_queue_receiver(queue_name) - messages = receiver.receive_messages(max_message_count=1, max_wait_time=5) - - if not messages: - print("No messages available - check if messages are being sent") - -except MessagingEntityNotFoundError: - print("Queue/subscription does not exist") -except MessagingEntityDisabledError: - print("Queue/subscription is disabled") -``` - -2. 
**Verify message filters (for subscriptions):** -```python -# For topic subscriptions, check if messages match subscription filters -from azure.servicebus.management import ServiceBusAdministrationClient +### Dead letter queue issues -admin_client = ServiceBusAdministrationClient.from_connection_string(connection_string) +Messages can be moved to the dead letter queue for various reasons: -# Check subscription rules -rules = admin_client.list_rules(topic_name, subscription_name) -for rule in rules: - print(f"Rule: {rule.name}, Filter: {rule.filter}") -``` +**Common reasons:** +- Message TTL expired +- Max delivery count exceeded +- Message was explicitly dead lettered +- Message processing failed repeatedly -3. **Check for competing consumers:** +**Debugging dead letter messages:** ```python -# Multiple receivers on the same queue will compete for messages -# Ensure this is intended behavior or use topic/subscription pattern +# Receive from dead letter queue +dlq_receiver = servicebus_client.get_queue_receiver( + queue_name="your_queue", + sub_queue=ServiceBusSubQueue.DEAD_LETTER +) -# For debugging, temporarily use peek to see if messages exist -messages = receiver.peek_messages(max_message_count=10) -print(f"Found {len(messages)} messages in queue without receiving them") +with dlq_receiver: + messages = dlq_receiver.receive_messages(max_message_count=10) + for message in messages: + print(f"Dead letter reason: {message.dead_letter_reason}") + print(f"Dead letter description: {message.dead_letter_error_description}") ``` -## Troubleshooting quota and capacity issues -### Quota exceeded errors +## Quotas -**ServiceBusQuotaExceededError resolution:** -1. **For message count limits:** Receive and process messages to reduce queue/subscription size -2. **For size limits:** Remove old messages or increase entity size limits -3. 
**For connection limits:** Close unused connections or consider scaling to Premium tier +Information about Service Bus quotas can be found [here](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas). ### Entity not found errors @@ -802,510 +400,8 @@ print(f"Found {len(messages)} messages in queue without receiving them") 3. Check if the entity was deleted and needs to be recreated 4. Verify you're connecting to the correct namespace -## Threading and concurrency issues - -### Thread safety limitations - -**Important:** The Azure Service Bus Python SDK is **not thread-safe or coroutine-safe**. Using the same client instances across multiple threads or tasks without proper synchronization can lead to: - -- Connection errors and unexpected exceptions -- Message corruption or loss -- Deadlocks and race conditions -- Unpredictable behavior - -**Best practices:** - -1. **Use separate client instances per thread/task:** -```python -import threading -from azure.servicebus import ServiceBusClient - -def worker_thread(connection_string, queue_name): - # Create a separate client instance for each thread - client = ServiceBusClient.from_connection_string(connection_string) - with client: - sender = client.get_queue_sender(queue_name) - with sender: - # Perform operations... - pass - -# Start multiple threads with separate clients -threads = [] -for i in range(5): - t = threading.Thread(target=worker_thread, args=(connection_string, queue_name)) - threads.append(t) - t.start() - -for t in threads: - t.join() -``` - -2. 
**Use connection pooling patterns when needed:** -```python -# For high-throughput scenarios, consider using a thread-safe queue -# to manage client instances -import queue -import threading - -client_pool = queue.Queue() - -def get_client(): - try: - return client_pool.get_nowait() - except queue.Empty: - return ServiceBusClient.from_connection_string(connection_string) - -def return_client(client): - try: - client_pool.put_nowait(client) - except queue.Full: - client.close() -``` - -3. **Avoid sharing clients across async tasks:** -```python -# DON'T DO THIS -client = ServiceBusClient.from_connection_string(connection_string) - -async def bad_async_pattern(): - # Multiple tasks sharing the same client can cause issues - sender = client.get_queue_sender(queue_name) - # This can lead to race conditions - -# DO THIS INSTEAD -async def good_async_pattern(): - # Each async function should use its own client - async with ServiceBusClient.from_connection_string(connection_string) as client: - sender = client.get_queue_sender(queue_name) - async with sender: - # Perform operations safely - pass -``` - -### Async/await best practices - -When using the async APIs in the Python Service Bus SDK: - -1. **Always use async context managers properly:** -```python -async def proper_async_usage(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage("Hello World") - await sender.send_messages(message) - - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - for message in messages: - await receiver.complete_message(message) -``` - -2. 
**Don't mix sync and async code without proper handling:** -```python -# Avoid mixing sync and async incorrectly -async def mixed_code_example(): - # Don't call synchronous methods from async context without wrapping - # client = ServiceBusClient.from_connection_string(conn_str) # This is sync - - # Instead, create clients within async context or use proper wrapping - async with ServiceBusClient.from_connection_string(conn_str) as client: - pass -``` - -3. **Handle async exceptions properly:** -```python -import asyncio -from azure.servicebus import ServiceBusError - -async def handle_async_errors(): - try: - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=1, max_wait_time=5) - # Process messages... - except ServiceBusError as e: - print(f"Service Bus error: {e}") - except asyncio.TimeoutError: - print("Operation timed out") - except Exception as e: - print(f"Unexpected error: {e}") -``` - -**Common threading/concurrency mistakes to avoid:** - -- Sharing `ServiceBusClient`, `ServiceBusSender`, or `ServiceBusReceiver` instances across threads -- Not properly closing clients and their resources in multi-threaded scenarios -- Using the same connection string with too many concurrent clients (can hit connection limits) -- Mixing blocking and non-blocking operations incorrectly -- Not handling connection failures in multi-threaded scenarios - ## Troubleshooting async operations -### Event loop issues - -Python's asyncio event loop can cause issues when not properly managed in Service Bus async operations. - -**Common symptoms:** -- `RuntimeError: no running event loop` -- `RuntimeError: cannot be called from a running event loop` -- Async operations hanging indefinitely - -**Resolution:** - -1. 
**Proper event loop management:** -```python -import asyncio -from azure.servicebus.aio import ServiceBusClient - -async def main(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage("Hello async world") - await sender.send_messages(message) - -# Correct way to run async Service Bus code -if __name__ == "__main__": - asyncio.run(main()) -``` - -2. **Handling existing event loops (e.g., in Jupyter notebooks):** -```python -import asyncio -import nest_asyncio - -# In environments like Jupyter where an event loop is already running -nest_asyncio.apply() - -async def notebook_friendly_function(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - # Your async Service Bus operations - pass - -# Can be called directly in Jupyter -await notebook_friendly_function() -``` - -3. **Event loop in multi-threaded applications:** -```python -import asyncio -import threading -from concurrent.futures import ThreadPoolExecutor - -def run_async_in_thread(connection_string, queue_name): - """Run async Service Bus operations in a separate thread""" - async def async_operations(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - for message in messages: - print(f"Received: {message}") - await receiver.complete_message(message) - - # Create new event loop for this thread - asyncio.run(async_operations()) - -# Use ThreadPoolExecutor for better management -with ThreadPoolExecutor(max_workers=3) as executor: - futures = [ - executor.submit(run_async_in_thread, connection_string, f"queue_{i}") - for i in range(3) - ] - - for future in futures: - future.result() # Wait for completion -``` - -### Async context manager problems - -Improper use of async context managers can 
lead to resource leaks and connection issues. - -**Common mistakes:** - -1. **Not using async context managers:** -```python -# DON'T DO THIS -client = ServiceBusClient.from_connection_string(connection_string) -sender = client.get_queue_sender(queue_name) -await sender.send_messages(message) -# Resources not properly closed - -# DO THIS INSTEAD -async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - await sender.send_messages(message) -``` - -2. **Improper exception handling in async context:** -```python -async def proper_exception_handling(): - """Handle exceptions properly in async context managers""" - try: - async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_receiver(queue_name) as receiver: - messages = await receiver.receive_messages(max_message_count=10) - - for message in messages: - try: - # Process message - await process_message_async(message) - await receiver.complete_message(message) - except Exception as processing_error: - print(f"Processing failed: {processing_error}") - await receiver.abandon_message(message) - - except ServiceBusError as sb_error: - print(f"Service Bus error: {sb_error}") - except Exception as general_error: - print(f"Unexpected error: {general_error}") -``` - -3. 
**Resource cleanup in long-running async operations:** -```python -import asyncio -from contextlib import AsyncExitStack - -async def long_running_processor(): - """Properly manage resources in long-running async operations""" - async with AsyncExitStack() as stack: - client = await stack.enter_async_context( - ServiceBusClient.from_connection_string(connection_string) - ) - receiver = await stack.enter_async_context( - client.get_queue_receiver(queue_name) - ) - - # Long-running processing loop - while True: - try: - messages = await receiver.receive_messages( - max_message_count=10, - max_wait_time=30 - ) - - if not messages: - await asyncio.sleep(1) - continue - - # Process messages with proper error handling - await process_messages_batch(receiver, messages) - - except KeyboardInterrupt: - print("Shutting down gracefully...") - break - except Exception as e: - print(f"Error in processing loop: {e}") - await asyncio.sleep(5) # Brief pause before retry - -async def process_messages_batch(receiver, messages): - """Process a batch of messages with individual error handling""" - for message in messages: - try: - await process_single_message(message) - await receiver.complete_message(message) - except Exception as e: - print(f"Failed to process message {message.message_id}: {e}") - await receiver.abandon_message(message) -``` - -### Mixing sync and async code - -Mixing synchronous and asynchronous Service Bus operations can cause issues. - -**Common problems:** - -1. **Calling async methods without await:** -```python -# WRONG - This returns a coroutine, doesn't actually send -client = ServiceBusClient.from_connection_string(connection_string) -sender = client.get_queue_sender(queue_name) -sender.send_messages(message) # Missing 'await' - -# CORRECT -async with ServiceBusClient.from_connection_string(connection_string) as client: - async with client.get_queue_sender(queue_name) as sender: - await sender.send_messages(message) -``` - -2. 
**Using sync and async clients together:** -```python -# Avoid mixing sync and async clients in the same application -# Choose one pattern and stick with it - -# Option 1: Pure async -async def async_pattern(): - async with ServiceBusClient.from_connection_string(connection_string) as client: - # All operations are async - pass - -# Option 2: Pure sync -def sync_pattern(): - with ServiceBusClient.from_connection_string(connection_string) as client: - # All operations are sync - pass -``` - -3. **Proper integration with async frameworks (FastAPI, aiohttp, etc.):** -```python -# Example with FastAPI -from fastapi import FastAPI, BackgroundTasks -from azure.servicebus.aio import ServiceBusClient - -app = FastAPI() - -# Global client for reuse (properly managed) -class ServiceBusManager: - def __init__(self): - self.client = None - - async def start(self): - self.client = ServiceBusClient.from_connection_string(connection_string) - - async def stop(self): - if self.client: - await self.client.close() - -sb_manager = ServiceBusManager() - -@app.on_event("startup") -async def startup_event(): - await sb_manager.start() - -@app.on_event("shutdown") -async def shutdown_event(): - await sb_manager.stop() - -@app.post("/send-message") -async def send_message(message_content: str): - async with sb_manager.client.get_queue_sender(queue_name) as sender: - message = ServiceBusMessage(message_content) - await sender.send_messages(message) - return {"status": "sent"} -``` - -## Frequently asked questions - -### Q: Why am I getting connection timeout errors? - -**A:** Connection timeouts can occur due to: -- Network connectivity issues -- Firewall blocking AMQP ports (5671-5672) -- DNS resolution problems - -Try using AMQP over WebSockets (port 443) or check your network configuration. - -### Q: How do I handle transient errors? 
- -**A:** Implement retry logic with exponential backoff for transient errors like: -- `ServiceBusConnectionError` -- `OperationTimeoutError` -- `ServiceBusServerBusyError` - -### Q: Why are my messages going to the dead letter queue? - -**A:** Common reasons include: -- Message TTL expiration -- Maximum delivery count exceeded -- Explicit dead lettering in message processing logic -- Poison message detection - -Check the `dead_letter_reason` and `dead_letter_error_description` properties on dead lettered messages. - -### Q: How do I process messages faster? - -**A:** Consider: -- Using concurrent message processing (with separate client instances per thread/task) -- Optimizing your message processing logic -- Using `prefetch_count` to pre-fetch messages (use with caution - see note below) -- Scaling out with multiple receivers (on different clients) - -**Note on prefetch_count:** Be careful when using `prefetch_count` as it can cause message lock expiration if processing takes too long. The client cannot extend locks for prefetched messages. - -### Q: What's the difference between `complete_message()` and `abandon_message()`? - -**A:** -- `complete_message()`: Removes the message from the queue/subscription (successful processing) -- `abandon_message()`: Returns the message to the queue/subscription for reprocessing - -**Important:** Due to Python AMQP implementation limitations, these operations return immediately without waiting for service acknowledgment. Implement idempotent processing to handle potential redelivery. - -### Q: How do I handle message ordering? 
- -**A:** -- Use **sessions** for guaranteed message ordering within a session -- For partitioned entities, messages with the same partition key maintain order -- Regular queues do not guarantee strict FIFO ordering - -```python -# Using sessions for ordered processing -with client.get_queue_receiver(queue_name, session_id="order_123") as session_receiver: - messages = session_receiver.receive_messages(max_message_count=10) - - # Messages within this session are processed in order - for message in messages: - process_message_in_order(message) - session_receiver.complete_message(message) -``` - -### Q: How do I implement retry logic for transient failures? - -**A:** -```python -import time -import random -from azure.servicebus.exceptions import ServiceBusError - -def exponential_backoff_retry(operation, max_retries=3): - """Implement exponential backoff retry for Service Bus operations""" - for attempt in range(max_retries + 1): - try: - return operation() - except ServiceBusError as e: - if attempt == max_retries: - raise - - # Check if error is retryable - if hasattr(e, 'reason'): - retryable_reasons = ['ServiceTimeout', 'ServerBusy', 'ServiceCommunicationProblem'] - if e.reason not in retryable_reasons: - raise - - # Calculate backoff delay - delay = (2 ** attempt) + random.uniform(0, 1) - print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.2f} seconds...") - time.sleep(delay) - -# Usage example -def send_with_retry(sender, message): - return exponential_backoff_retry(lambda: sender.send_messages(message)) -``` - -### Q: How do I monitor message processing performance? 
- -**A:** -```python -import time -import logging -from contextlib import contextmanager - -@contextmanager -def message_processing_timer(message_id): - """Context manager to time message processing""" - start_time = time.time() - try: - yield - finally: - processing_time = time.time() - start_time - logging.info(f"Message {message_id} processed in {processing_time:.3f}s") - -# Usage -def process_with_monitoring(receiver, message): - with message_processing_timer(message.message_id): - # Your processing logic - result = process_message(message) - receiver.complete_message(message) - return result -``` - ## Get additional help Additional information on ways to reach out for support can be found in the [SUPPORT.md](https://github.com/Azure/azure-sdk-for-python/blob/main/SUPPORT.md) at the root of the repo. @@ -1331,5 +427,3 @@ When filing GitHub issues for Service Bus, please include: 5. **Error details:** Complete exception stack trace and error messages The more information provided, the faster we can help resolve your issue. - -Please view the [exceptions reference docs](https://docs.microsoft.com/python/api/azure-servicebus/azure.servicebus.exceptions) for detailed descriptions of our common Exception types. 
From 1991b18c8c85dcee1686bf19c16a539940f45890 Mon Sep 17 00:00:00 2001 From: swathipil Date: Mon, 9 Jun 2025 23:14:19 -0700 Subject: [PATCH 20/28] address rest of comments --- .../azure-servicebus/TROUBLESHOOTING.md | 68 ++++++++++--------- 1 file changed, 35 insertions(+), 33 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 554fc24053de..4c6bc17d01b9 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -7,31 +7,32 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [General troubleshooting](#general-troubleshooting) * [Enable client logging](#enable-client-logging) * [Common exceptions](#common-exceptions) + * [Authentication Exceptions](#authentication-exceptions) + * [Connection and Timeout Exceptions](#connection-and-timeout-exceptions) + * [Message and Session Handling Exceptions](#message-and-session-handling-exceptions) + * [Service and Entity Exceptions](#service-and-entity-exceptions) + * [Auto Lock Renewal Exceptions](#auto-lock-renewal-exceptions) * [Threading and concurrency issues](#threading-and-concurrency-issues) * [Thread safety limitations](#thread-safety-limitations) - * [Async/await best practices](#asyncawait-best-practices) -* [Authentication issues](#authentication-issues) - * [Authentication errors](#authentication-errors) - * [Authorization errors](#authorization-errors) - * [Connection string issues](#connection-string-issues) -* [Connectivity issues](#troubleshooting-connectivity-issues) - * [Connection errors](#connection-errors) - * [Firewall and proxy issues](#firewall-and-proxy-issues) - * [Service busy errors](#service-busy-errors) +* [Troubleshooting Authentication issues](#troubleshooting-authentication-issues) +* [Troubleshooting Connectivity issues](#troubleshooting-connectivity-issues) + * [Timeout when connecting to 
service](#timeout-when-connecting-to-service) + * [SSL handshake failures](#ssl-handshake-failures) + * [Adding components to the connection string does not work](#adding-components-to-the-connection-string-does-not-work) + * ["TransportType.AmqpOverWebSocket" Alternative](#transporttypeamqpoverwebsocket-alternative) + * [Azure Identity Authentication Alternative](#azure-identity-authentication-alternative) * [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) - * [Message lock issues](#message-lock-issues) + * [Message and session lock issues](#message-and-session-lock-issues) * [Message size issues](#message-size-issues) - * [Message settlement issues](#message-settlement-issues) -* [Troubleshooting session handling issues](#troubleshooting-session-handling-issues) - * [Session lock issues](#session-lock-issues) - * [Session cannot be locked](#session-cannot-be-locked) * [Troubleshooting receiver issues](#troubleshooting-receiver-issues) * [Number of messages returned doesn't match number requested](#number-of-messages-returned-doesnt-match-number-requested) * [Mixing sync and async code](#mixing-sync-and-async-code) * [Dead letter queue issues](#dead-letter-queue-issues) * [Quotas](#quotas) + * [Entity not found errors](#entity-not-found-errors) * [Troubleshooting async operations](#troubleshooting-async-operations) * [Get additional help](#get-additional-help) + * [Filing GitHub issues](#filing-github-issues) ## General troubleshooting @@ -76,38 +77,40 @@ ServiceBusErrors often have an underlying AMQP error code which specifies whethe * `retry_mode`: The delay behavior between retry attempts. Supported values are 'fixed' or 'exponential' When an exception is surfaced to the application, either all retries were applied unsuccessfully, or the exception was considered non-transient. -#### Connection and Authentication Exceptions - -- **ServiceBusConnectionError:** An error occurred in the connection to the se -rvice. 
This may have been caused by a transient network issue or service proble -m. It is recommended to retry. +#### Authentication Exceptions - **ServiceBusAuthenticationError:** An error occurred when authenticating the connection to the service. This may have been caused by the credentials being incorrect. It is recommended to check the credentials. - **ServiceBusAuthorizationError:** An error occurred when authorizing the connection to the service. This may have been caused by the credentials not having the right permission to perform the operation. It is recommended to check the permission of the credentials. -#### Operation and Timeout Exceptions +See the [Troubleshooting Authentication issues](#troubleshooting-authentication-issues) section to troubleshoot authentication/permission issues. + +#### Connection and Timeout Exceptions + +- **ServiceBusConnectionError:** An error occurred in the connection to the service. This may have been caused by a transient network issue or service problem. It is recommended to retry. - **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The service may or may not have successfully completed the request; the status is not known. It is recommended to attempt to verify the current state and retry if necessary. - **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. For details on which ports need to be open, see the [Azure Service Bus FAQ: What ports do I need to open on the firewall?](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). 
-#### Message Handling Exceptions +See the [Troubleshooting Connectivity issues](#troubleshooting-connectivity-issues) section to troubleshoot connection and timeout issues. More information on AMQP errors in Azure Service Bus can be found [here](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-amqp-troubleshoot). + +#### Message and Session Handling Exceptions - **MessageSizeExceededError:** This indicates that the max message size has been exceeded. The message size includes the body of the message, as well as any associated metadata and system overhead. The best approach for resolving this error is to reduce the number of messages being sent in a batch or the size of the body included in the message. Because size limits are subject to change, please refer to [Service Bus quotas](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) for specifics. - **MessageAlreadySettled:** This indicates failure to settle the message. This could happen when trying to settle an already-settled message. -- **MessageLockLostError:** Indicates that the lock on the message is lost. Callers should attempt to receive and process the message again. This exception only applies to entities that don't use sessions. This error occurs if processing takes longer than the lock duration and the message lock isn't renewed. This error can also occur when the link is detached due to a transient network issue or when the link is idle for 10 minutes, as enforced by the service. `AutoLockRenewer` could help on keeping the lock of the message automatically renewed. - - **MessageNotFoundError:** This occurs when attempting to receive a deferred message by sequence number for a message that either doesn't exist in the entity, or is currently locked. -#### Session Handling Exceptions +- **MessageLockLostError:** Indicates that the lock on the message is lost. Callers should attempt to receive and process the message again. 
This exception only applies to entities that don't use sessions. This error occurs if processing takes longer than the lock duration and the message lock isn't renewed. This error can also occur when the link is detached due to a transient network issue or when the link is idle for 10 minutes, as enforced by the service. `AutoLockRenewer` could help on keeping the lock of the message automatically renewed. - **SessionLockLostError:** The lock on the session has expired. All unsettled messages that have been received can no longer be settled. It is recommended to reconnect to the session if receive messages again if necessary. You should be aware of the lock duration of a session and keep renewing the lock before expiration in case of long processing time. `AutoLockRenewer` could help on keeping the lock of the session automatically renewed. - **SessionCannotBeLockedError:** Attempt to connect to a session with a specific session ID, but the session is currently locked by another client. Make sure the session is unlocked by other clients. +See the [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) section to troubleshoot message and session lock issues. + #### Service and Entity Exceptions - **ServiceBusQuotaExceededError:** This typically indicates that there are too many active receive operations for a single entity. In order to avoid this error, reduce the number of potential concurrent receives. You can use batch receives to attempt to receive multiple messages per receive request. Please see [Service Bus quotas](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) for more information. @@ -124,6 +127,8 @@ m. It is recommended to retry. - **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. You could re-register the object that wants be auto lock renewed or extend the timeout in advance. 
+See the [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) to help troubleshoot AutoLockRenewer errors. + ## Threading and concurrency issues ### Thread safety limitations @@ -199,7 +204,7 @@ with client: executor.submit(send_batch, i, sender) ``` -## Authentication issues +## Troubleshooting Authentication issues Authentication errors typically occur when the credentials provided are incorrect or have expired. Authorization errors occur when the authenticated identity doesn't have sufficient permissions. @@ -211,7 +216,7 @@ The following verification steps are recommended, depending on the type of autho - [Verify the correct RBAC roles were granted](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-managed-service-identity) - Indicated by errors: `Send/Listen claim(s) are required to perform this operation.` In this case, ensure that the appropriate roles were assigned: `Azure Service Bus Data Owner`, `Azure Service Bus Data Sender`, or `Azure Service Bus Data Receiver`. -## Connectivity issues +## Troubleshooting Connectivity issues ### Timeout when connecting to service @@ -251,11 +256,11 @@ For more information about the `azure-identity` library, see: [Azure Identity cl ## Troubleshooting message handling issues -### Message lock issues +### Message and session lock issues -Messages in Service Bus have a lock duration during which they must be settled (completed, abandoned, etc.). +Messages, sessionful and non-sessionful, in Service Bus have a lock duration during which they must be settled (completed, abandoned, etc.). -**MessageLockLostError resolution:** +**MessageLockLostError and SessionLockLostError resolution:** 1. Process messages faster or increase lock duration 2. If setting `prefetch_count` to a large number, consider setting it lower as it can cause message lock expiration if processing takes too long. The client cannot extend locks for prefetched messages. 3. 
Use `AutoLockRenewer` for long-running processing. @@ -273,8 +278,6 @@ with receiver: receiver.complete_message(message) ``` -3. Handle lock lost errors gracefully by catching the exception and potentially re-receiving the message - ### Message size issues **MessageSizeExceededError resolution:** @@ -359,8 +362,7 @@ def continuous_message_processing(receiver): ### Mixing sync and async code -Mixing synchronous and asynchronous Service Bus operations can cause issues such as async operations hanging indefinitely due to the event loop being blocked. Ensure that blocking calls are not made when receiving and message processing. - +Mixing synchronous and asynchronous Service Bus operations can cause issues such as async operations, such as the `AutoLockRenewer`, hanging indefinitely due to the event loop being blocked. Ensure that blocking calls are not made when receiving and message processing. ### Dead letter queue issues From 5e40d73782d338c33719f21df1b9e90846168a42 Mon Sep 17 00:00:00 2001 From: swathipil Date: Tue, 10 Jun 2025 09:19:36 -0700 Subject: [PATCH 21/28] fix links --- sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 4c6bc17d01b9..7b8104ac4034 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -91,9 +91,9 @@ See the [Troubleshooting Authentication issues](#troubleshooting-authentication- - **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The service may or may not have successfully completed the request; the status is not known. It is recommended to attempt to verify the current state and retry if necessary. 
-- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. For details on which ports need to be open, see the [Azure Service Bus FAQ: What ports do I need to open on the firewall?](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). +- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. For details on which ports need to be open, see the [Azure Service Bus FAQ: What ports do I need to open on the firewall?](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). -See the [Troubleshooting Connectivity issues](#troubleshooting-connectivity-issues) section to troubleshoot connection and timeout issues. More information on AMQP errors in Azure Service Bus can be found [here](https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-amqp-troubleshoot). +See the [Troubleshooting Connectivity issues](#troubleshooting-connectivity-issues) section to troubleshoot connection and timeout issues. More information on AMQP errors in Azure Service Bus can be found [here](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-amqp-troubleshoot). #### Message and Session Handling Exceptions @@ -228,7 +228,7 @@ To troubleshoot: - Check the firewall and port permissions in your hosting environment and that the AMQP ports 5671 and 5672 are open and that the endpoint is allowed through the firewall. 
-- Try using the Web Socket transport option, which connects using port 443. This can be done by passing the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/en-us/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the client. +- Try using the Web Socket transport option, which connects using port 443. This can be done by passing the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the client. - See if your network is blocking specific IP addresses. For details, see: [What IP addresses do I need to allow?](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-faq#what-ip-addresses-do-i-need-to-add-to-allowlist-). @@ -246,7 +246,7 @@ Previous generations of the Service Bus clients allowed for some behavior to be #### "TransportType.AmqpOverWebSocket" Alternative -To configure web socket use, pass the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/en-us/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the ServiceBusClient. +To configure web socket use, pass the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the ServiceBusClient. 
#### Azure Identity Authentication Alternative From eb640c518bb4f215f43f87495fb37880cf2f434e Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Tue, 10 Jun 2025 19:50:44 +0000 Subject: [PATCH 22/28] Address review comments: improve retry info, fix headers, add WebSockets suggestion, and clean up formatting Co-authored-by: jsquire <913445+jsquire@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 73 +++++++++---------- 1 file changed, 33 insertions(+), 40 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 7b8104ac4034..21da2a91fbbe 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -7,20 +7,20 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [General troubleshooting](#general-troubleshooting) * [Enable client logging](#enable-client-logging) * [Common exceptions](#common-exceptions) - * [Authentication Exceptions](#authentication-exceptions) - * [Connection and Timeout Exceptions](#connection-and-timeout-exceptions) - * [Message and Session Handling Exceptions](#message-and-session-handling-exceptions) - * [Service and Entity Exceptions](#service-and-entity-exceptions) - * [Auto Lock Renewal Exceptions](#auto-lock-renewal-exceptions) + * [Authentication exceptions](#authentication-exceptions) + * [Connection and timeout exceptions](#connection-and-timeout-exceptions) + * [Message and session handling exceptions](#message-and-session-handling-exceptions) + * [Service and entity exceptions](#service-and-entity-exceptions) + * [Auto lock renewal exceptions](#auto-lock-renewal-exceptions) * [Threading and concurrency issues](#threading-and-concurrency-issues) * [Thread safety limitations](#thread-safety-limitations) -* [Troubleshooting Authentication issues](#troubleshooting-authentication-issues) -* 
[Troubleshooting Connectivity issues](#troubleshooting-connectivity-issues) +* [Troubleshooting authentication and authorization issues](#troubleshooting-authentication-and-authorization-issues) +* [Troubleshooting connectivity issues](#troubleshooting-connectivity-issues) * [Timeout when connecting to service](#timeout-when-connecting-to-service) * [SSL handshake failures](#ssl-handshake-failures) * [Adding components to the connection string does not work](#adding-components-to-the-connection-string-does-not-work) - * ["TransportType.AmqpOverWebSocket" Alternative](#transporttypeamqpoverwebsocket-alternative) - * [Azure Identity Authentication Alternative](#azure-identity-authentication-alternative) + * [Specifying AMQP over websockets](#specifying-amqp-over-websockets) + * [Using Service Bus with Azure Identity](#using-service-bus-with-azure-identity) * [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) * [Message and session lock issues](#message-and-session-lock-issues) * [Message size issues](#message-size-issues) @@ -30,7 +30,6 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Dead letter queue issues](#dead-letter-queue-issues) * [Quotas](#quotas) * [Entity not found errors](#entity-not-found-errors) -* [Troubleshooting async operations](#troubleshooting-async-operations) * [Get additional help](#get-additional-help) * [Filing GitHub issues](#filing-github-issues) @@ -77,25 +76,25 @@ ServiceBusErrors often have an underlying AMQP error code which specifies whethe * `retry_mode`: The delay behavior between retry attempts. Supported values are 'fixed' or 'exponential' When an exception is surfaced to the application, either all retries were applied unsuccessfully, or the exception was considered non-transient. -#### Authentication Exceptions +#### Authentication exceptions - **ServiceBusAuthenticationError:** An error occurred when authenticating the connection to the service. 
This may have been caused by the credentials being incorrect. It is recommended to check the credentials. -- **ServiceBusAuthorizationError:** An error occurred when authorizing the connection to the service. This may have been caused by the credentials not having the right permission to perform the operation. It is recommended to check the permission of the credentials. +- **ServiceBusAuthorizationError:** An error occurred when authorizing the connection to the service. This may have been caused by the credentials not having the right permission to perform the operation, or could be transient due to clock skew or service issues. The client will retry these errors automatically. If you continue to see this exception, it means all configured retries were exhausted - check the permission of the credentials and consider adjusting retry configuration. See the [Troubleshooting Authentication issues](#troubleshooting-authentication-issues) section to troubleshoot authentication/permission issues. -#### Connection and Timeout Exceptions +#### Connection and timeout exceptions -- **ServiceBusConnectionError:** An error occurred in the connection to the service. This may have been caused by a transient network issue or service problem. It is recommended to retry. +- **ServiceBusConnectionError:** An error occurred in the connection to the service. This may have been caused by a transient network issue or service problem. The client automatically retries these errors - if you see this exception, all configured retries were exhausted. Consider adjusting retry configuration rather than implementing additional retry logic. -- **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The service may or may not have successfully completed the request; the status is not known. 
It is recommended to attempt to verify the current state and retry if necessary. +- **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The service may or may not have successfully completed the request; the status is not known. The client automatically retries these errors - if you see this exception, all configured retries were exhausted. Consider verifying the current state and adjusting retry configuration if necessary. -- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. For details on which ports need to be open, see the [Azure Service Bus FAQ: What ports do I need to open on the firewall?](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). +- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. For details on which ports need to be open, see the [Azure Service Bus FAQ: What ports do I need to open on the firewall?](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). You can also try setting the WebSockets transport type (`TransportType.AmqpOverWebsocket`) which often works around port/firewall issues. See the [Troubleshooting Connectivity issues](#troubleshooting-connectivity-issues) section to troubleshoot connection and timeout issues. 
More information on AMQP errors in Azure Service Bus can be found [here](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-amqp-troubleshoot). -#### Message and Session Handling Exceptions +#### Message and session handling exceptions - **MessageSizeExceededError:** This indicates that the max message size has been exceeded. The message size includes the body of the message, as well as any associated metadata and system overhead. The best approach for resolving this error is to reduce the number of messages being sent in a batch or the size of the body included in the message. Because size limits are subject to change, please refer to [Service Bus quotas](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) for specifics. @@ -111,17 +110,17 @@ See the [Troubleshooting Connectivity issues](#troubleshooting-connectivity-issu See the [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) section to troubleshoot message and session lock issues. -#### Service and Entity Exceptions +#### Service and entity exceptions - **ServiceBusQuotaExceededError:** This typically indicates that there are too many active receive operations for a single entity. In order to avoid this error, reduce the number of potential concurrent receives. You can use batch receives to attempt to receive multiple messages per receive request. Please see [Service Bus quotas](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) for more information. -- **ServiceBusServerBusyError:** Service isn't able to process the request at this time. Client can wait for a period of time, then retry the operation. +- **ServiceBusServerBusyError:** Service isn't able to process the request at this time. Client can wait for a period of time, then retry the operation. For more information about quotas and limits, see [Service Bus quotas](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas). 
- **MessagingEntityNotFoundError:** Entity associated with the operation doesn't exist or it has been deleted. Please make sure the entity exists. - **MessagingEntityDisabledError:** Request for a runtime operation on a disabled entity. Please activate the entity. -#### Auto Lock Renewal Exceptions +#### Auto lock renewal exceptions - **AutoLockRenewFailed:** An attempt to renew a lock on a message or session in the background has failed. This could happen when the receiver used by `AutoLockRenewer` is closed or the lock of the renewable has expired. It is recommended to re-register the renewable message or session by receiving the message or connect to the sessionful entity again. @@ -133,7 +132,7 @@ See the [Troubleshooting message handling issues](#troubleshooting-message-handl ### Thread safety limitations -**Important:** We do not guarantee that the ServiceBusClient, ServiceBusSender, and ServiceBusReceiver are thread-safe or coroutine-safe. We do not recommend reusing these instances across threads or sharing them between coroutines. +**Important:** We do not guarantee that the `ServiceBusClient`, `ServiceBusSender`, and `ServiceBusReceiver` are thread-safe or coroutine-safe. We do not recommend reusing these instances across threads or sharing them between coroutines. The data model type, `ServiceBusMessageBatch` is not thread-safe or coroutine-safe. It should not be shared across threads nor used concurrently with client methods. @@ -174,7 +173,9 @@ async with client: await asyncio.gather(*(send_batch(i, sender) for i in range(5))) ``` -For scenarios requiring concurrent sending from multiple threads, ensure proper thread-safety management using mechanisms like threading.Lock(). **Note:** Native async APIs should be used instead of running in a ThreadPoolExecutor, if possible. +For scenarios requiring concurrent sending from multiple threads, ensure proper thread-safety management using mechanisms like `threading.Lock()`. 
+ +> **NOTE:** Native async APIs should be used instead of running in a `ThreadPoolExecutor`, if possible. ```python import threading @@ -204,7 +205,7 @@ with client: executor.submit(send_batch, i, sender) ``` -## Troubleshooting Authentication issues +## Troubleshooting authentication and authorization issues Authentication errors typically occur when the credentials provided are incorrect or have expired. Authorization errors occur when the authenticated identity doesn't have sufficient permissions. @@ -216,7 +217,7 @@ The following verification steps are recommended, depending on the type of autho - [Verify the correct RBAC roles were granted](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-managed-service-identity) - Indicated by errors: `Send/Listen claim(s) are required to perform this operation.` In this case, ensure that the appropriate roles were assigned: `Azure Service Bus Data Owner`, `Azure Service Bus Data Sender`, or `Azure Service Bus Data Receiver`. -## Troubleshooting Connectivity issues +## Troubleshooting connectivity issues ### Timeout when connecting to service @@ -236,7 +237,7 @@ To troubleshoot: ### SSL handshake failures -This error can occur when an intercepting proxy is used. To verify, it is recommended that the application be tested in the host environment with the proxy disabled. +This error can occur when an intercepting proxy is used. To verify, it is recommended that the application be tested in the host environment with the proxy disabled. Note that intercepting proxies are not a supported scenario. ### Adding components to the connection string does not work @@ -244,11 +245,11 @@ The current generation of the Service Bus client library supports connection str Previous generations of the Service Bus clients allowed for some behavior to be configured by adding key/value components to a connection string. These components are no longer recognized and have no effect on client behavior. 
-#### "TransportType.AmqpOverWebSocket" Alternative +#### Specifying AMQP over websockets To configure web socket use, pass the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the ServiceBusClient. -#### Azure Identity Authentication Alternative +#### Using Service Bus with Azure Identity To authenticate with Azure Identity, see: [Client Identity Authentication](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/servicebus/azure-servicebus/samples/sync_samples/client_identity_authentication.py). @@ -262,9 +263,9 @@ Messages, sessionful and non-sessionful, in Service Bus have a lock duration dur **MessageLockLostError and SessionLockLostError resolution:** 1. Process messages faster or increase lock duration -2. If setting `prefetch_count` to a large number, consider setting it lower as it can cause message lock expiration if processing takes too long. The client cannot extend locks for prefetched messages. +2. If setting `prefetch_count` to a large number, consider setting it lower as the lock timer starts running when the message is fetched, even though not visible to the application and the client cannot extend locks for prefetched messages. 3. Use `AutoLockRenewer` for long-running processing. - * When running the async AutoLockRenewer, ensure that the event loop is not blocked during message processing. (e.g. `time.sleep(60)` --> `await asyncio.sleep(60)`). Otherwise, the AutoLockRenewer will be prevented from running in the background. + * When running the async `AutoLockRenewer`, ensure that the event loop is not blocked during message processing. (e.g. `time.sleep(60)` --> `await asyncio.sleep(60)`). Otherwise, the `AutoLockRenewer` will be prevented from running in the background. ```python from azure.servicebus import AutoLockRenewer @@ -284,12 +285,7 @@ with receiver: 1. Reduce message payload size 2. 
Consider splitting large messages across multiple smaller messages -**Service Bus message size limits:** -- Standard tier: 256 KB per message -- Premium tier: 1 MB per message - -For the most up-to-date information on Service Bus limits, refer to the [Azure Service Bus quotas and limits](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) documentation. - +For the most up-to-date information on Service Bus message size limits, refer to the [Azure Service Bus quotas and limits](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) documentation. ## Troubleshooting receiver issues @@ -299,7 +295,7 @@ When attempting to receive multiple messages using `receive_messages()` with `ma **Why this happens:** - Service Bus optimizes for throughput and latency -- After the first message is received, the receiver waits only a short time (typically 20ms) for additional messages +- After the first message is received, the receiver prioritizes processing it and does not attempt to build a batch of the requested size - The `max_wait_time` controls how long to wait for the **first** message, not subsequent ones **Resolution:** @@ -362,7 +358,7 @@ def continuous_message_processing(receiver): ### Mixing sync and async code -Mixing synchronous and asynchronous Service Bus operations can cause issues such as async operations, such as the `AutoLockRenewer`, hanging indefinitely due to the event loop being blocked. Ensure that blocking calls are not made when receiving and message processing. +Mixing synchronous and asynchronous Service Bus operations can cause issues such as the `AutoLockRenewer` hanging indefinitely because the event loop is blocked. Ensure that blocking calls are not made when receiving and processing messages asynchronously. 
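The effect of a blocking call on background work can be illustrated without Service Bus at all. The following is a minimal sketch using only `asyncio` from the standard library; `heartbeat`, `count_ticks`, `blocking_processing`, and `cooperative_processing` are hypothetical names, and `heartbeat` is only a stand-in for a background task such as lock renewal, not the library's actual implementation.

```python
import asyncio
import time


async def heartbeat(ticks, interval=0.01):
    # Hypothetical stand-in for a background task (for example, lock renewal).
    while True:
        await asyncio.sleep(interval)
        ticks.append(1)


async def count_ticks(process):
    # Run `process` while the heartbeat task is active and count its ticks.
    ticks = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    await process()
    count = len(ticks)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return count


async def blocking_processing():
    # Blocks the event loop: no other task can run during this call.
    time.sleep(0.2)


async def cooperative_processing():
    # Yields to the event loop: background tasks keep running.
    await asyncio.sleep(0.2)


blocked = asyncio.run(count_ticks(blocking_processing))
cooperative = asyncio.run(count_ticks(cooperative_processing))
print(f"ticks while blocked: {blocked}, ticks while cooperative: {cooperative}")
```

In this sketch the heartbeat makes no progress while `time.sleep` holds the event loop, which is the same reason an async background renewal task would stall during blocking message processing.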
### Dead letter queue issues @@ -389,7 +385,6 @@ with dlq_receiver: print(f"Dead letter description: {message.dead_letter_error_description}") ``` - ## Quotas Information about Service Bus quotas can be found [here](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas). @@ -402,8 +397,6 @@ Information about Service Bus quotas can be found [here](https://learn.microsoft 3. Check if the entity was deleted and needs to be recreated 4. Verify you're connecting to the correct namespace -## Troubleshooting async operations - ## Get additional help Additional information on ways to reach out for support can be found in the [SUPPORT.md](https://github.com/Azure/azure-sdk-for-python/blob/main/SUPPORT.md) at the root of the repo. From b684064e7088dbc36c979b55418412f9d9ed2b5f Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Wed, 11 Jun 2025 16:20:38 +0000 Subject: [PATCH 23/28] Remove threading and concurrency section - not Service Bus specific Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 79 +------------------ 1 file changed, 1 insertion(+), 78 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 21da2a91fbbe..cd544042d420 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -12,8 +12,7 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Message and session handling exceptions](#message-and-session-handling-exceptions) * [Service and entity exceptions](#service-and-entity-exceptions) * [Auto lock renewal exceptions](#auto-lock-renewal-exceptions) -* [Threading and concurrency issues](#threading-and-concurrency-issues) - * [Thread safety limitations](#thread-safety-limitations) + * [Troubleshooting authentication and authorization 
issues](#troubleshooting-authentication-and-authorization-issues) * [Troubleshooting connectivity issues](#troubleshooting-connectivity-issues) * [Timeout when connecting to service](#timeout-when-connecting-to-service) @@ -128,82 +127,6 @@ See the [Troubleshooting message handling issues](#troubleshooting-message-handl See the [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) to help troubleshoot AutoLockRenewer errors. -## Threading and concurrency issues - -### Thread safety limitations - -**Important:** We do not guarantee that the `ServiceBusClient`, `ServiceBusSender`, and `ServiceBusReceiver` are thread-safe or coroutine-safe. We do not recommend reusing these instances across threads or sharing them between coroutines. - -The data model type, `ServiceBusMessageBatch` is not thread-safe or coroutine-safe. It should not be shared across threads nor used concurrently with client methods. - -Using the same client instances across multiple threads or tasks without proper synchronization can lead to: - -- Connection errors and unexpected exceptions -- Message corruption or loss -- Deadlocks and race conditions -- Unpredictable behavior - -It is up to the running application to use these classes in a concurrency-safe manner. - -For scenarios requiring concurrent sending in asyncio applications, ensure proper coroutine-safety management using mechanisms like asyncio.Lock(). 
- -```python -import asyncio -from azure.servicebus.aio import ServiceBusClient -from azure.servicebus import ServiceBusMessage -from azure.identity.aio import DefaultAzureCredential - -SERVICE_BUS_NAMESPACE = ".servicebus.windows.net" -QUEUE_NAME = "" - -lock = asyncio.Lock() - -async def send_batch(sender_id, sender): - async with lock: - messages = [ServiceBusMessage(f"Message {i} from sender {sender_id}") for i in range(10)] - await sender.send_messages(messages) - print(f"Sender {sender_id} sent messages.") - -credential = DefaultAzureCredential() -client = ServiceBusClient(fully_qualified_namespace=SERVICE_BUS_NAMESPACE, credential=credential) - -async with client: - sender = client.get_queue_sender(queue_name=QUEUE_NAME) - async with sender: - await asyncio.gather(*(send_batch(i, sender) for i in range(5))) -``` - -For scenarios requiring concurrent sending from multiple threads, ensure proper thread-safety management using mechanisms like `threading.Lock()`. - -> **NOTE:** Native async APIs should be used instead of running in a `ThreadPoolExecutor`, if possible. 
- -```python -import threading -from concurrent.futures import ThreadPoolExecutor -from azure.servicebus import ServiceBusClient, ServiceBusMessage -from azure.identity import DefaultAzureCredential - -SERVICE_BUS_NAMESPACE = ".servicebus.windows.net" -QUEUE_NAME = "" - -lock = threading.Lock() - -def send_batch(sender_id, sender): - with lock: - messages = [ServiceBusMessage(f"Message {i} from sender {sender_id}") for i in range(10)] - sender.send_messages(messages) - print(f"Sender {sender_id} sent messages.") - -credential = DefaultAzureCredential() -client = ServiceBusClient(fully_qualified_namespace=SERVICE_BUS_NAMESPACE, credential=credential) - -with client: - sender = client.get_queue_sender(queue_name=QUEUE_NAME) - with sender: - with ThreadPoolExecutor(max_workers=5) as executor: - for i in range(5): - executor.submit(send_batch, i, sender) -``` ## Troubleshooting authentication and authorization issues From 80409e23bdcebfcd85789cdbee4ac37d4a557ff6 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Wed, 11 Jun 2025 16:25:03 +0000 Subject: [PATCH 24/28] Remove client options references from connection string troubleshooting Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index cd544042d420..0436e99e44f2 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -164,9 +164,7 @@ This error can occur when an intercepting proxy is used. To verify, it is recomm ### Adding components to the connection string does not work -The current generation of the Service Bus client library supports connection strings only in the form published by the Azure portal. 
These are intended to provide basic location and shared key information only; configuring behavior of the clients is done through its options. - -Previous generations of the Service Bus clients allowed for some behavior to be configured by adding key/value components to a connection string. These components are no longer recognized and have no effect on client behavior. +The Service Bus client library supports connection strings only in the format published by the Azure portal. Custom key/value components added to connection strings are not recognized and have no effect on client behavior. #### Specifying AMQP over websockets From 564e3239a07ca56781b1060ee290671a733675f7 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Wed, 11 Jun 2025 22:41:48 +0000 Subject: [PATCH 25/28] Undo changes from commits b684064e and 80409e23 - restore threading section and client options references Co-authored-by: swathipil <76007337+swathipil@users.noreply.github.com> --- .../azure-servicebus/TROUBLESHOOTING.md | 83 ++++++++++++++++++- 1 file changed, 81 insertions(+), 2 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 0436e99e44f2..21da2a91fbbe 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -12,7 +12,8 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Message and session handling exceptions](#message-and-session-handling-exceptions) * [Service and entity exceptions](#service-and-entity-exceptions) * [Auto lock renewal exceptions](#auto-lock-renewal-exceptions) - +* [Threading and concurrency issues](#threading-and-concurrency-issues) + * [Thread safety limitations](#thread-safety-limitations) * [Troubleshooting authentication and authorization issues](#troubleshooting-authentication-and-authorization-issues) * [Troubleshooting 
connectivity issues](#troubleshooting-connectivity-issues) * [Timeout when connecting to service](#timeout-when-connecting-to-service) @@ -127,6 +128,82 @@ See the [Troubleshooting message handling issues](#troubleshooting-message-handl See the [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) to help troubleshoot AutoLockRenewer errors. +## Threading and concurrency issues + +### Thread safety limitations + +**Important:** We do not guarantee that the `ServiceBusClient`, `ServiceBusSender`, and `ServiceBusReceiver` are thread-safe or coroutine-safe. We do not recommend reusing these instances across threads or sharing them between coroutines. + +The data model type, `ServiceBusMessageBatch` is not thread-safe or coroutine-safe. It should not be shared across threads nor used concurrently with client methods. + +Using the same client instances across multiple threads or tasks without proper synchronization can lead to: + +- Connection errors and unexpected exceptions +- Message corruption or loss +- Deadlocks and race conditions +- Unpredictable behavior + +It is up to the running application to use these classes in a concurrency-safe manner. + +For scenarios requiring concurrent sending in asyncio applications, ensure proper coroutine-safety management using mechanisms like asyncio.Lock(). 
+ +```python +import asyncio +from azure.servicebus.aio import ServiceBusClient +from azure.servicebus import ServiceBusMessage +from azure.identity.aio import DefaultAzureCredential + +SERVICE_BUS_NAMESPACE = ".servicebus.windows.net" +QUEUE_NAME = "" + +lock = asyncio.Lock() + +async def send_batch(sender_id, sender): + async with lock: + messages = [ServiceBusMessage(f"Message {i} from sender {sender_id}") for i in range(10)] + await sender.send_messages(messages) + print(f"Sender {sender_id} sent messages.") + +credential = DefaultAzureCredential() +client = ServiceBusClient(fully_qualified_namespace=SERVICE_BUS_NAMESPACE, credential=credential) + +async with client: + sender = client.get_queue_sender(queue_name=QUEUE_NAME) + async with sender: + await asyncio.gather(*(send_batch(i, sender) for i in range(5))) +``` + +For scenarios requiring concurrent sending from multiple threads, ensure proper thread-safety management using mechanisms like `threading.Lock()`. + +> **NOTE:** Native async APIs should be used instead of running in a `ThreadPoolExecutor`, if possible. 
+ +```python +import threading +from concurrent.futures import ThreadPoolExecutor +from azure.servicebus import ServiceBusClient, ServiceBusMessage +from azure.identity import DefaultAzureCredential + +SERVICE_BUS_NAMESPACE = ".servicebus.windows.net" +QUEUE_NAME = "" + +lock = threading.Lock() + +def send_batch(sender_id, sender): + with lock: + messages = [ServiceBusMessage(f"Message {i} from sender {sender_id}") for i in range(10)] + sender.send_messages(messages) + print(f"Sender {sender_id} sent messages.") + +credential = DefaultAzureCredential() +client = ServiceBusClient(fully_qualified_namespace=SERVICE_BUS_NAMESPACE, credential=credential) + +with client: + sender = client.get_queue_sender(queue_name=QUEUE_NAME) + with sender: + with ThreadPoolExecutor(max_workers=5) as executor: + for i in range(5): + executor.submit(send_batch, i, sender) +``` ## Troubleshooting authentication and authorization issues @@ -164,7 +241,9 @@ This error can occur when an intercepting proxy is used. To verify, it is recomm ### Adding components to the connection string does not work -The Service Bus client library supports connection strings only in the format published by the Azure portal. Custom key/value components added to connection strings are not recognized and have no effect on client behavior. +The current generation of the Service Bus client library supports connection strings only in the form published by the Azure portal. These are intended to provide basic location and shared key information only; configuring behavior of the clients is done through its options. + +Previous generations of the Service Bus clients allowed for some behavior to be configured by adding key/value components to a connection string. These components are no longer recognized and have no effect on client behavior. 
#### Specifying AMQP over websockets From 3659814a58c2b33c691950b8f62408ac2bc5e675 Mon Sep 17 00:00:00 2001 From: swathipil Date: Wed, 11 Jun 2025 16:07:01 -0700 Subject: [PATCH 26/28] double check Jesse's comments are addressed --- .../azure-servicebus/TROUBLESHOOTING.md | 57 +++++++------------ 1 file changed, 21 insertions(+), 36 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 21da2a91fbbe..2caf16e4a561 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -18,9 +18,9 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Troubleshooting connectivity issues](#troubleshooting-connectivity-issues) * [Timeout when connecting to service](#timeout-when-connecting-to-service) * [SSL handshake failures](#ssl-handshake-failures) - * [Adding components to the connection string does not work](#adding-components-to-the-connection-string-does-not-work) - * [Specifying AMQP over websockets](#specifying-amqp-over-websockets) - * [Using Service Bus with Azure Identity](#using-service-bus-with-azure-identity) + * [Specifying AMQP over websockets](#specifying-amqp-over-websockets) + * [Using Service Bus with Azure Identity](#using-service-bus-with-azure-identity) + * [Entity not found errors](#entity-not-found-errors) * [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) * [Message and session lock issues](#message-and-session-lock-issues) * [Message size issues](#message-size-issues) @@ -28,8 +28,6 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Number of messages returned doesn't match number requested](#number-of-messages-returned-doesnt-match-number-requested) * [Mixing sync and async code](#mixing-sync-and-async-code) * [Dead letter queue issues](#dead-letter-queue-issues) -* [Quotas](#quotas) - * [Entity not found 
errors](#entity-not-found-errors) * [Get additional help](#get-additional-help) * [Filing GitHub issues](#filing-github-issues) @@ -80,17 +78,17 @@ When an exception is surfaced to the application, either all retries were applie - **ServiceBusAuthenticationError:** An error occurred when authenticating the connection to the service. This may have been caused by the credentials being incorrect. It is recommended to check the credentials. -- **ServiceBusAuthorizationError:** An error occurred when authorizing the connection to the service. This may have been caused by the credentials not having the right permission to perform the operation, or could be transient due to clock skew or service issues. The client will retry these errors automatically. If you continue to see this exception, it means all configured retries were exhausted - check the permission of the credentials and consider adjusting retry configuration. +- **ServiceBusAuthorizationError:** An error occurred when authorizing the connection to the service. This may have been caused by the credentials not having the right permission to perform the operation, or could be transient due to clock skew or service issues. The client will retry these errors automatically. If you continue to see this exception, it means all configured retries were exhausted - check the permission of the credentials and consider adjusting [retry configuration](#common-exceptions). See the [Troubleshooting Authentication issues](#troubleshooting-authentication-issues) section to troubleshoot authentication/permission issues. #### Connection and timeout exceptions -- **ServiceBusConnectionError:** An error occurred in the connection to the service. This may have been caused by a transient network issue or service problem. The client automatically retries these errors - if you see this exception, all configured retries were exhausted. Consider adjusting retry configuration rather than implementing additional retry logic. 
+- **ServiceBusConnectionError:** An error occurred in the connection to the service. This may have been caused by a transient network issue or service problem. The client automatically retries these errors - if you see this exception, all configured retries were exhausted. Consider adjusting [retry configuration](#common-exceptions) rather than implementing additional retry logic. - **OperationTimeoutError:** This indicates that the service did not respond to an operation within the expected amount of time. This may have been caused by a transient network issue or service problem. The service may or may not have successfully completed the request; the status is not known. The client automatically retries these errors - if you see this exception, all configured retries were exhausted. Consider verifying the current state and adjusting retry configuration if necessary. -- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. For details on which ports need to be open, see the [Azure Service Bus FAQ: What ports do I need to open on the firewall?](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). You can also try setting the WebSockets transport type (`TransportType.AmqpOverWebsocket`) which often works around port/firewall issues. +- **ServiceBusCommunicationError:** Client isn't able to establish a connection to Service Bus. Make sure the supplied host name is correct and the host is reachable. If your code runs in an environment with a firewall/proxy, ensure that the traffic to the Service Bus domain/IP address and ports isn't blocked. 
For details on which ports need to be open, see the [Azure Service Bus FAQ: What ports do I need to open on the firewall?](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-faq#what-ports-do-i-need-to-open-on-the-firewall--). You can also try setting the [WebSockets transport type](#specifying-amqp-over-websockets) which often works around port/firewall issues. See the [Troubleshooting Connectivity issues](#troubleshooting-connectivity-issues) section to troubleshoot connection and timeout issues. More information on AMQP errors in Azure Service Bus can be found [here](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-amqp-troubleshoot). @@ -126,13 +124,13 @@ See the [Troubleshooting message handling issues](#troubleshooting-message-handl - **AutoLockRenewTimeout:** The time allocated to renew the message or session lock has elapsed. You could re-register the object that wants to be auto lock renewed or extend the timeout in advance. -See the [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) to help troubleshoot AutoLockRenewer errors. +See the [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) to help troubleshoot `AutoLockRenewer` errors. ## Threading and concurrency issues ### Thread safety limitations -**Important:** We do not guarantee that the `ServiceBusClient`, `ServiceBusSender`, and `ServiceBusReceiver` are thread-safe or coroutine-safe. We do not recommend reusing these instances across threads or sharing them between coroutines. +> **IMPORTANT:** We do not guarantee that the `ServiceBusClient`, `ServiceBusSender`, and `ServiceBusReceiver` are thread-safe or coroutine-safe. We do not recommend reusing these instances across threads or sharing them between coroutines. The data model type, `ServiceBusMessageBatch` is not thread-safe or coroutine-safe. It should not be shared across threads nor used concurrently with client methods. 
@@ -145,7 +143,7 @@ Using the same client instances across multiple threads or tasks without proper It is up to the running application to use these classes in a concurrency-safe manner. -For scenarios requiring concurrent sending in asyncio applications, ensure proper coroutine-safety management using mechanisms like asyncio.Lock(). +For scenarios requiring concurrent sending in asyncio applications, ensure proper coroutine-safety management using mechanisms like `asyncio.Lock()`. ```python import asyncio @@ -239,22 +237,23 @@ To troubleshoot: This error can occur when an intercepting proxy is used. To verify, it is recommended that the application be tested in the host environment with the proxy disabled. Note that intercepting proxies are not a supported scenario. -### Adding components to the connection string does not work - -The current generation of the Service Bus client library supports connection strings only in the form published by the Azure portal. These are intended to provide basic location and shared key information only; configuring behavior of the clients is done through its options. - -Previous generations of the Service Bus clients allowed for some behavior to be configured by adding key/value components to a connection string. These components are no longer recognized and have no effect on client behavior. - -#### Specifying AMQP over websockets +### Specifying AMQP over websockets -To configure web socket use, pass the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the ServiceBusClient. +To configure web socket use, pass the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the `ServiceBusClient`. 
-#### Using Service Bus with Azure Identity +### Using Service Bus with Azure Identity To authenticate with Azure Identity, see: [Client Identity Authentication](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/servicebus/azure-servicebus/samples/sync_samples/client_identity_authentication.py). For more information about the `azure-identity` library, see: [Azure Identity client library for Python](https://learn.microsoft.com/python/api/overview/azure/identity-readme?view=azure-python). +### Entity not found errors + +**MessagingEntityNotFoundError resolution:** +1. Verify the queue/topic/subscription name is spelled correctly +2. Ensure the Service Bus namespace and entity exist +3. Check if the entity was deleted and needs to be recreated + ## Troubleshooting message handling issues ### Message and session lock issues @@ -282,10 +281,8 @@ with receiver: ### Message size issues **MessageSizeExceededError resolution:** -1. Reduce message payload size -2. Consider splitting large messages across multiple smaller messages - -For the most up-to-date information on Service Bus message size limits, refer to the [Azure Service Bus quotas and limits](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) documentation. +1. Reduce message payload size. +2. Consider splitting large messages across multiple smaller messages. For the most up-to-date information on Service Bus message size limits, refer to the [Azure Service Bus quotas and limits](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas) documentation. ## Troubleshooting receiver issues @@ -385,18 +382,6 @@ with dlq_receiver: print(f"Dead letter description: {message.dead_letter_error_description}") ``` -## Quotas - -Information about Service Bus quotas can be found [here](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-quotas). - -### Entity not found errors - -**MessagingEntityNotFoundError resolution:** -1. 
Verify the queue/topic/subscription name is spelled correctly -2. Ensure the entity exists in the Service Bus namespace -3. Check if the entity was deleted and needs to be recreated -4. Verify you're connecting to the correct namespace - ## Get additional help Additional information on ways to reach out for support can be found in the [SUPPORT.md](https://github.com/Azure/azure-sdk-for-python/blob/main/SUPPORT.md) at the root of the repo. From 10714e1853532e9ae738d0913a2aaa870e65bc58 Mon Sep 17 00:00:00 2001 From: swathipil Date: Wed, 11 Jun 2025 16:24:59 -0700 Subject: [PATCH 27/28] apply jesses comments to other sb docs --- sdk/servicebus/azure-servicebus/README.md | 6 +-- .../azure-servicebus/samples/README.md | 4 +- .../samples/sync_samples/send_queue.py | 42 ++----------------- .../samples/sync_samples/send_topic.py | 39 ++--------------- 4 files changed, 12 insertions(+), 79 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/README.md b/sdk/servicebus/azure-servicebus/README.md index f52214844f76..5488bd3450fd 100644 --- a/sdk/servicebus/azure-servicebus/README.md +++ b/sdk/servicebus/azure-servicebus/README.md @@ -95,11 +95,11 @@ To interact with these resources, one should be familiar with the following SDK ### Thread safety -We do not guarantee that the ServiceBusClient, ServiceBusSender, and ServiceBusReceiver are thread-safe or coroutine-safe. We do not recommend reusing these instances across threads or sharing them between coroutines. It is up to the running application to use these classes in a concurrency-safe manner. +We do not guarantee that the `ServiceBusClient`, `ServiceBusSender`, and `ServiceBusReceiver` are thread-safe or coroutine-safe. We do not recommend reusing these instances across threads or sharing them between coroutines. It is up to the running application to use these classes in a concurrency-safe manner. The data model type, `ServiceBusMessageBatch` is not thread-safe or coroutine-safe. 
It should not be shared across threads nor used concurrently with client methods. -For scenarios requiring concurrent sending from multiple threads, ensure proper thread-safety management using mechanisms like threading.Lock(). **Note:** Native async APIs should be used instead of running in a ThreadPoolExecutor, if possible. +For scenarios requiring concurrent sending from multiple threads, ensure proper thread-safety management using mechanisms like `threading.Lock()`. **Note:** Native async APIs should be used instead of running in a `ThreadPoolExecutor`, if possible. ```python import threading from concurrent.futures import ThreadPoolExecutor @@ -128,7 +128,7 @@ with client: executor.submit(send_batch, i, sender) ``` -For scenarios requiring concurrent sending in asyncio applications, ensure proper coroutine-safety management using mechanisms like asyncio.Lock() +For scenarios requiring concurrent sending in asyncio applications, ensure proper coroutine-safety management using mechanisms like `asyncio.Lock()`. 
```python import asyncio from azure.servicebus.aio import ServiceBusClient diff --git a/sdk/servicebus/azure-servicebus/samples/README.md b/sdk/servicebus/azure-servicebus/samples/README.md index c5d3d6399674..ac90f76ee6c5 100644 --- a/sdk/servicebus/azure-servicebus/samples/README.md +++ b/sdk/servicebus/azure-servicebus/samples/README.md @@ -20,12 +20,12 @@ Both [sync version](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ - From a connection string - Enabling Logging - Send messages concurrently with proper thread/coroutine safety practices - - **Note**: ServiceBusClient, ServiceBusSender, and ServiceBusReceiver are not thread-safe or coroutine-safe + - **Note**: `ServiceBusClient`, `ServiceBusSender`, and `ServiceBusReceiver` are not thread-safe or coroutine-safe - [send_topic.py](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/servicebus/azure-servicebus/samples/sync_samples/send_topic.py) ([async version](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/servicebus/azure-servicebus/samples/async_samples/send_topic_async.py)) - Examples to send messages to a service bus topic: - From a connection string - Enabling Logging - Send messages concurrently with proper thread/coroutine safety practices - - **Note**: ServiceBusClient, ServiceBusSender, and ServiceBusReceiver are not thread-safe or coroutine-safe + - **Note**: `ServiceBusClient`, `ServiceBusSender`, and `ServiceBusReceiver` are not thread-safe or coroutine-safe - [receive_queue.py](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/servicebus/azure-servicebus/samples/sync_samples/receive_queue.py) ([async_version](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/servicebus/azure-servicebus/samples/async_samples/receive_queue_async.py)) - Examples to receive messages from a service bus queue: - Receive messages - 
[receive_subscription.py](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/servicebus/azure-servicebus/samples/sync_samples/receive_subscription.py) ([async_version](https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/servicebus/azure-servicebus/samples/async_samples/receive_subscription_async.py)) - Examples to receive messages from a service bus subscription: diff --git a/sdk/servicebus/azure-servicebus/samples/sync_samples/send_queue.py b/sdk/servicebus/azure-servicebus/samples/sync_samples/send_queue.py index c1ada7751ab1..f63479a0493a 100644 --- a/sdk/servicebus/azure-servicebus/samples/sync_samples/send_queue.py +++ b/sdk/servicebus/azure-servicebus/samples/sync_samples/send_queue.py @@ -8,13 +8,13 @@ """ Example to show sending message(s) to a Service Bus Queue. -WARNING: ServiceBusClient, ServiceBusSender, and ServiceBusMessageBatch are not thread-safe. -Do not share these instances between threads without proper thread-safe management using mechanisms like threading.Lock. +WARNING: `ServiceBusClient`, `ServiceBusSender`, and `ServiceBusMessageBatch` are not thread-safe. +Do not share these instances between threads without proper thread-safe management using mechanisms like `threading.Lock()`. +Note: Native async APIs should be used instead of running in a `ThreadPoolExecutor`, if possible. """ import os import threading -import asyncio from concurrent.futures import ThreadPoolExecutor from azure.servicebus import ServiceBusClient, ServiceBusMessage @@ -47,9 +47,6 @@ def send_batch_message(sender): sender.send_messages(batch_message) - - - def send_concurrent_with_shared_client_and_lock(): """ Example showing concurrent sending with a shared client using threading.Lock. @@ -79,32 +76,6 @@ def send_with_lock(thread_id): for future in futures: future.result() - -def send_with_run_in_executor(): - """ - Example showing how to use asyncio.run_in_executor for sync operations in async context. 
- This is useful when you need to call sync Service Bus operations from async code. - """ - async def async_main(): - loop = asyncio.get_event_loop() - - def sync_send_operation(): - credential = DefaultAzureCredential() - servicebus_client = ServiceBusClient(FULLY_QUALIFIED_NAMESPACE, credential) - with servicebus_client: - sender = servicebus_client.get_queue_sender(queue_name=QUEUE_NAME) - with sender: - message = ServiceBusMessage("Message sent via run_in_executor") - sender.send_messages(message) - return "Message sent successfully" - - # Run the synchronous operation in an executor - result = await loop.run_in_executor(None, sync_send_operation) - print(f"run_in_executor result: {result}") - - asyncio.run(async_main()) - - credential = DefaultAzureCredential() servicebus_client = ServiceBusClient(FULLY_QUALIFIED_NAMESPACE, credential, logging_enable=True) with servicebus_client: @@ -114,12 +85,7 @@ def sync_send_operation(): send_a_list_of_messages(sender) send_batch_message(sender) -print("Send message is done.") - - + print("Send message is done.") print("\nDemonstrating concurrent sending with shared client and locks...") send_concurrent_with_shared_client_and_lock() - -print("\nDemonstrating run_in_executor pattern...") -send_with_run_in_executor() diff --git a/sdk/servicebus/azure-servicebus/samples/sync_samples/send_topic.py b/sdk/servicebus/azure-servicebus/samples/sync_samples/send_topic.py index 013f129f710f..5d3b1687ee06 100644 --- a/sdk/servicebus/azure-servicebus/samples/sync_samples/send_topic.py +++ b/sdk/servicebus/azure-servicebus/samples/sync_samples/send_topic.py @@ -9,12 +9,12 @@ Example to show sending message(s) to a Service Bus Topic. WARNING: ServiceBusClient, ServiceBusSender, and ServiceBusMessageBatch are not thread-safe. -Do not share these instances between threads without proper thread-safe management using mechanisms like threading.Lock. 
+Do not share these instances between threads without proper thread-safe management using mechanisms like `threading.Lock()`. +Note: Native async APIs should be used instead of running in a `ThreadPoolExecutor`, if possible. """ import os import threading -import asyncio from concurrent.futures import ThreadPoolExecutor from azure.servicebus import ServiceBusClient, ServiceBusMessage @@ -46,9 +46,6 @@ def send_batch_message(sender): sender.send_messages(batch_message) - - - def send_concurrent_with_shared_client_and_lock(): """ Example showing concurrent sending with a shared client using threading.Lock. @@ -79,31 +76,6 @@ def send_with_lock(thread_id): future.result() -def send_with_run_in_executor(): - """ - Example showing how to use asyncio.run_in_executor for sync operations in async context. - This is useful when you need to call sync Service Bus operations from async code. - """ - async def async_main(): - loop = asyncio.get_event_loop() - - def sync_send_operation(): - credential = DefaultAzureCredential() - servicebus_client = ServiceBusClient(FULLY_QUALIFIED_NAMESPACE, credential) - with servicebus_client: - sender = servicebus_client.get_topic_sender(topic_name=TOPIC_NAME) - with sender: - message = ServiceBusMessage("Message sent via run_in_executor") - sender.send_messages(message) - return "Message sent successfully" - - # Run the synchronous operation in an executor - result = await loop.run_in_executor(None, sync_send_operation) - print(f"run_in_executor result: {result}") - - asyncio.run(async_main()) - - credential = DefaultAzureCredential() servicebus_client = ServiceBusClient(FULLY_QUALIFIED_NAMESPACE, credential, logging_enable=True) with servicebus_client: @@ -113,12 +85,7 @@ def sync_send_operation(): send_a_list_of_messages(sender) send_batch_message(sender) -print("Send message is done.") - - + print("Send message is done.") print("\nDemonstrating concurrent sending with shared client and locks...") 
send_concurrent_with_shared_client_and_lock() - -print("\nDemonstrating run_in_executor pattern...") -send_with_run_in_executor() From e380784587a99818b26ee190956d168166e00de0 Mon Sep 17 00:00:00 2001 From: swathipil Date: Thu, 12 Jun 2025 11:27:26 -0700 Subject: [PATCH 28/28] libbas comments --- sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md index 2caf16e4a561..3715cd34086b 100644 --- a/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md +++ b/sdk/servicebus/azure-servicebus/TROUBLESHOOTING.md @@ -18,7 +18,7 @@ This troubleshooting guide contains instructions to diagnose frequently encounte * [Troubleshooting connectivity issues](#troubleshooting-connectivity-issues) * [Timeout when connecting to service](#timeout-when-connecting-to-service) * [SSL handshake failures](#ssl-handshake-failures) - * [Specifying AMQP over websockets](#specifying-amqp-over-websockets) + * [Specifying AMQP over WebSockets](#specifying-amqp-over-websockets) * [Using Service Bus with Azure Identity](#using-service-bus-with-azure-identity) * [Entity not found errors](#entity-not-found-errors) * [Troubleshooting message handling issues](#troubleshooting-message-handling-issues) @@ -45,6 +45,7 @@ To enable client logging and AMQP frame level trace: import logging import sys +# Enable client level logging handler = logging.StreamHandler(stream=sys.stdout) log_fmt = logging.Formatter(fmt="%(asctime)s | %(threadName)s | %(levelname)s | %(name)s | %(message)s") handler.setFormatter(log_fmt) @@ -52,11 +53,12 @@ logger = logging.getLogger('azure.servicebus') logger.setLevel(logging.DEBUG) logger.addHandler(handler) -# Enable AMQP frame level trace from azure.servicebus import ServiceBusClient from azure.identity import DefaultAzureCredential credential = DefaultAzureCredential() + +# Enable AMQP frame level trace 
with `logging_enable=True` client = ServiceBusClient(fully_qualified_namespace, credential, logging_enable=True) ``` @@ -227,7 +229,7 @@ To troubleshoot: - Check the firewall and port permissions in your hosting environment and that the AMQP ports 5671 and 5672 are open and that the endpoint is allowed through the firewall. -- Try using the Web Socket transport option, which connects using port 443. This can be done by passing the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the client. +- Try using the WebSocket transport option, which connects using port 443. This can be done by passing the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the client. - See if your network is blocking specific IP addresses. For details, see: [What IP addresses do I need to allow?](https://learn.microsoft.com/azure/service-bus-messaging/service-bus-faq#what-ip-addresses-do-i-need-to-add-to-allowlist-). @@ -237,9 +239,9 @@ To troubleshoot: This error can occur when an intercepting proxy is used. To verify, it is recommended that the application be tested in the host environment with the proxy disabled. Note that intercepting proxies are not a supported scenario. -### Specifying AMQP over websockets +### Specifying AMQP over WebSockets -To configure web socket use, pass the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the `ServiceBusClient`. +To configure WebSocket use, pass the [`transport_type=TransportType.AmqpOverWebsocket`](https://learn.microsoft.com/python/api/azure-servicebus/azure.servicebus.transporttype?view=azure-python) to the `ServiceBusClient`. ### Using Service Bus with Azure Identity