Large messages in the audit queue cannot be processed #4185

@Jokelab

NOTE: issue edited for release

Symptoms

When a large message is sent from an audited endpoint, ingestion of the message fails and a message similar to the following is logged:

2024-05-21 08:03:46.8118|16|Fatal|ServiceControl.Audit.Auditing.AuditIngestion|OnCriticalError. 'Failed to import too many times'|System.InvalidOperationException: The link 'G6:14740510:amqps://**namespace removed**.servicebus.windows.net/-dd910ad4;0:7:8' is force detached by the broker because publisher(link3082) received a batch message with no data in it. Detach origin: Publisher.
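To check whether an audit instance log contains this symptom, a short script can look for the distinctive text. The following is a minimal sketch, not part of ServiceControl: it assumes the log line is pipe-delimited with six fields as in the sample above, and the function names and the name given to the second field are hypothetical.

```python
# Hypothetical helper for spotting the symptom in an audit instance log.
# Assumes the pipe-delimited layout seen in the sample line:
#   timestamp|<numeric field>|level|logger|message|exception detail
SYMPTOM_TEXT = "batch message with no data in it"
INGESTION_LOGGER = "ServiceControl.Audit.Auditing.AuditIngestion"


def parse_log_line(line):
    """Split one log line into named fields, or return None if it does not match."""
    parts = line.rstrip("\n").split("|", 5)
    if len(parts) < 6:
        return None  # continuation line (for example, a stack frame)
    timestamp, number, level, logger, message, detail = parts
    return {
        "timestamp": timestamp,
        "number": number,  # meaning of this field is an assumption
        "level": level,
        "logger": logger,
        "message": message,
        "detail": detail,
    }


def is_empty_batch_failure(line):
    """True if the line is a Fatal audit ingestion entry carrying the symptom text."""
    entry = parse_log_line(line)
    return (
        entry is not None
        and entry["level"] == "Fatal"
        and entry["logger"] == INGESTION_LOGGER
        and SYMPTOM_TEXT in entry["detail"]
    )
```

Running `is_empty_batch_failure` over each line of the log flags only the first line of an affected entry; the stack frames that follow are skipped because they do not contain the six fields.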

Who's affected

You are affected if you are using ServiceControl 5.0.0 or later and you send large messages from an audited endpoint.

Root cause

Describe the bug

This bug is caused by Particular/NServiceBus.Transport.AzureServiceBus#994.
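The linked transport issue is not reproduced here, so the exact mechanism is not confirmed by this report. The error text ("a batch message with no data in it") is at least consistent with a batching loop that emits a batch even when its first message is too large to fit. The sketch below illustrates that general failure mode and one possible guard. Everything in it is hypothetical: `Batch`, `naive_plan` and `guarded_plan` are toy names, not the Azure SDK or NServiceBus types.

```python
from collections import deque


class Batch:
    """Toy size-limited batch (hypothetical stand-in, not ServiceBusMessageBatch)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.messages = []

    def try_add(self, body):
        """Add the body if it still fits; report whether it was added."""
        if self.used + len(body) > self.max_bytes:
            return False
        self.messages.append(body)
        self.used += len(body)
        return True


def naive_plan(bodies, max_batch_bytes):
    """Plans sends without checking for an empty batch (the assumed failure mode)."""
    pending = deque(bodies)
    sends = []
    while pending:
        batch = Batch(max_batch_bytes)
        while pending and batch.try_add(pending[0]):
            pending.popleft()
        sends.append(batch.messages)  # may be [] when the head message is oversized
        if not batch.messages:
            break  # a broker would reject this empty batch
    return sends


def guarded_plan(bodies, max_batch_bytes):
    """Same loop, but a message that fits no batch is planned as an individual send."""
    pending = deque(bodies)
    sends = []
    while pending:
        batch = Batch(max_batch_bytes)
        while pending and batch.try_add(pending[0]):
            pending.popleft()
        if batch.messages:
            sends.append(batch.messages)
        else:
            sends.append([pending.popleft()])
    return sends
```

With a 100-byte batch limit, a 10-byte message followed by a 200-byte message makes `naive_plan` produce an empty second batch, while `guarded_plan` plans the large message on its own. Whether the actual fix takes this approach is not stated in this issue.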

Description

After upgrading from ServiceControl 5.0.4 to 5.2.0, audit message ingestion fails and messages are not forwarded to the configured audit log queue.

Expected behavior

Audit messages are ingested and forwarded to the configured audit log queue.

Actual behavior

  • The audit instance logs an exception in the FailedImports folder. (See log output)
  • An event is logged in ServicePulse: Audit Message Ingestion Process: Failed to import too many times
  • The audit queue is not processed anymore.
  • The log queue doesn't receive any messages

Versions

ServiceControl 5.2.0
Transport: Azure ServiceBus (premium). I mention the premium tier explicitly because our system sometimes needs to process messages slightly bigger than 1MB.

Our endpoints are Azure Functions using the following versions:
NServiceBus 9.0.0
NServiceBus.Transport.AzureServiceBus 4.0.0
NServiceBus.AzureFunctions.Worker.ServiceBus 5.0.0

Steps to reproduce

  • Deploy a premium Azure ServiceBus namespace
  • Install a ServiceControl and Audit instance with version 5.0.4.
    • Configure the audit instance to forward messages to an audit log queue.
    • Use the Azure ServiceBus transport and point the connection string to the premium Azure ServiceBus namespace
  • Set the max message size to 100MB
  • Make sure everything runs correctly by sending some messages.
  • Upgrade the ServiceControl and audit instance to 5.2.0
  • Send some messages

Expected: messages are ingested from the audit queue and forwarded to the audit log queue
Actual: messages stay in the audit queue and are not forwarded to the audit log queue.

Relevant log output

In the audit instance log:
2024-05-21 08:03:46.8118|16|Fatal|ServiceControl.Audit.Auditing.AuditIngestion|OnCriticalError. 'Failed to import too many times'|System.InvalidOperationException: The link 'G6:14740510:amqps://**namespace removed**.servicebus.windows.net/-dd910ad4;0:7:8' is force detached by the broker because publisher(link3082) received a batch message with no data in it. Detach origin: Publisher.
For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot.
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(AmqpMessage batchMessage, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(AmqpMessage batchMessage, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(IReadOnlyCollection`1 messages, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.<>c.<<SendBatchAsync>b__21_0>d.MoveNext()
--- End of stack trace from previous location ---
   at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.<>c__22`1.<<RunOperation>b__22_0>d.MoveNext()
--- End of stack trace from previous location ---
   at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1,TResult](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken, Boolean logTimeoutRetriesAsVerbose)
   at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1,TResult](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken, Boolean logTimeoutRetriesAsVerbose)
   at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchAsync(ServiceBusMessageBatch messageBatch, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.ServiceBusSender.SendMessagesAsync(ServiceBusMessageBatch messageBatch, CancellationToken cancellationToken)
   at NServiceBus.Transport.AzureServiceBus.MessageDispatcher.DispatchBatchForDestination(String destination, ServiceBusClient client, Transaction transaction, Queue`1 messagesToSend, CancellationToken cancellationToken) in /_/src/Transport/Sending/MessageDispatcher.cs:line 198
   at NServiceBus.Transport.AzureServiceBus.MessageDispatcher.Dispatch(TransportOperations outgoingMessages, TransportTransaction transaction, CancellationToken cancellationToken) in /_/src/Transport/Sending/MessageDispatcher.cs:line 101
   at ServiceControl.Audit.Auditing.AuditIngestor.Ingest(List`1 contexts) in /_/src/ServiceControl.Audit/Auditing/AuditIngestor.cs:line 72
   at ServiceControl.Audit.Auditing.AuditIngestion.Loop() in /_/src/ServiceControl.Audit/Auditing/AuditIngestion.cs:line 218
   at ServiceControl.Audit.Auditing.AuditIngestion.OnMessage(MessageContext messageContext, CancellationToken cancellationToken) in /_/src/ServiceControl.Audit/Auditing/AuditIngestion.cs:line 197
   at NServiceBus.Transport.AzureServiceBus.MessagePump.ProcessMessage(ServiceBusReceivedMessage message, ProcessMessageEventArgs processMessageEventArgs, String messageId, Dictionary`2 headers, BinaryData body, CancellationToken messageProcessingCancellationToken) in /_/src/Transport/Receiving/MessagePump.cs:line 285

Sample of an error message in the FailedImports folder:
Exception:
The link 'G6:14740510:amqps://**namespace removed**.servicebus.windows.net/-dd910ad4;0:7:8' is force detached by the broker because publisher(link3082) received a batch message with no data in it. Detach origin: Publisher.
For troubleshooting information, see https://aka.ms/azsdk/net/servicebus/exceptions/troubleshoot.
StackTrace:
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(AmqpMessage batchMessage, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(AmqpMessage batchMessage, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchInternalAsync(IReadOnlyCollection`1 messages, TimeSpan timeout, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.<>c.<<SendBatchAsync>b__21_0>d.MoveNext()
--- End of stack trace from previous location ---
   at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.<>c__22`1.<<RunOperation>b__22_0>d.MoveNext()
--- End of stack trace from previous location ---
   at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1,TResult](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken, Boolean logTimeoutRetriesAsVerbose)
   at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1,TResult](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken, Boolean logTimeoutRetriesAsVerbose)
   at Azure.Messaging.ServiceBus.ServiceBusRetryPolicy.RunOperation[T1](Func`4 operation, T1 t1, TransportConnectionScope scope, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.Amqp.AmqpSender.SendBatchAsync(ServiceBusMessageBatch messageBatch, CancellationToken cancellationToken)
   at Azure.Messaging.ServiceBus.ServiceBusSender.SendMessagesAsync(ServiceBusMessageBatch messageBatch, CancellationToken cancellationToken)
   at NServiceBus.Transport.AzureServiceBus.MessageDispatcher.DispatchBatchForDestination(String destination, ServiceBusClient client, Transaction transaction, Queue`1 messagesToSend, CancellationToken cancellationToken) in /_/src/Transport/Sending/MessageDispatcher.cs:line 198
   at NServiceBus.Transport.AzureServiceBus.MessageDispatcher.Dispatch(TransportOperations outgoingMessages, TransportTransaction transaction, CancellationToken cancellationToken) in /_/src/Transport/Sending/MessageDispatcher.cs:line 101
   at ServiceControl.Audit.Auditing.AuditIngestor.Ingest(List`1 contexts) in /_/src/ServiceControl.Audit/Auditing/AuditIngestor.cs:line 72
   at ServiceControl.Audit.Auditing.AuditIngestion.Loop() in /_/src/ServiceControl.Audit/Auditing/AuditIngestion.cs:line 218
   at ServiceControl.Audit.Auditing.AuditIngestion.OnMessage(MessageContext messageContext, CancellationToken cancellationToken) in /_/src/ServiceControl.Audit/Auditing/AuditIngestion.cs:line 197
   at NServiceBus.Transport.AzureServiceBus.MessagePump.ProcessMessage(ServiceBusReceivedMessage message, ProcessMessageEventArgs processMessageEventArgs, String messageId, Dictionary`2 headers, BinaryData body, CancellationToken messageProcessingCancellationToken) in /_/src/Transport/Receiving/MessagePump.cs:line 285
Source:
Azure.Messaging.ServiceBus
TargetSite:
Void MoveNext()

Additional Information

Workarounds

Reinstalling everything with ServiceControl version 5.0.4 resolved the issue.

Additional information

I tried re-importing the messages as suggested in the documentation (https://docs.particular.net/servicecontrol/import-failed-messages), but this keeps triggering the same exception.
To make sure the issue is related to this version and does not occur in 5.0.4, I installed 5.0.4 again and verified that everything worked. I then upgraded to 5.2.0 again and the issue reappeared.
