Background
This issue was raised during the review of #8167. The problem is that we fan out different types of messages for a given channel id to different subsystems that run on separate coroutines, so the messages may be processed in a different order than they were received. Without a guarantee that these messages are processed in-order, we can violate core assumptions of the specification, which can lead to hard-to-explain behavior.
It is likely that this should be dealt with via our proposed Channel Lifecycle architectural change. This would allow coroutine fanout between different channels while preserving in-order processing on a per-channel basis.
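To make the intended property concrete, here is a minimal sketch of per-channel ordered dispatch. It is not the Channel Lifecycle design and does not use lnd's real types: `ChannelID`, `Message`, `Dispatcher`, and `processAll` are hypothetical names chosen for illustration. The assumption is that a single reader hands every message to one queue per channel, and one worker per channel drains that queue, so channels run concurrently while each channel sees its messages in receive order.

```go
package main

import (
	"fmt"
	"sync"
)

// ChannelID and Message are hypothetical stand-ins, not lnd's real types.
type ChannelID uint64

type Message struct {
	ChanID ChannelID
	Seq    int    // position in which the message arrived on the connection
	Kind   string // e.g. "update_add_htlc", "shutdown"
}

// Dispatcher routes every message for a channel to that channel's single
// worker, regardless of message type.
type Dispatcher struct {
	mu     sync.Mutex
	queues map[ChannelID]chan Message
	wg     sync.WaitGroup
	handle func(Message)
}

func NewDispatcher(handle func(Message)) *Dispatcher {
	return &Dispatcher{
		queues: make(map[ChannelID]chan Message),
		handle: handle,
	}
}

// Dispatch enqueues msg on its channel's queue, starting the worker on first
// use. It is assumed to be called from one reader, so enqueue order equals
// receive order.
func (d *Dispatcher) Dispatch(msg Message) {
	d.mu.Lock()
	q, ok := d.queues[msg.ChanID]
	if !ok {
		q = make(chan Message, 16)
		d.queues[msg.ChanID] = q
		d.wg.Add(1)
		go func() {
			defer d.wg.Done()
			// One worker per channel: messages are handled strictly in
			// the order they were enqueued.
			for m := range q {
				d.handle(m)
			}
		}()
	}
	d.mu.Unlock()
	q <- msg
}

// Close stops accepting messages and waits for all workers to drain.
func (d *Dispatcher) Close() {
	d.mu.Lock()
	for _, q := range d.queues {
		close(q)
	}
	d.mu.Unlock()
	d.wg.Wait()
}

// processAll dispatches msgs and records, per channel, the order in which
// the handler observed them.
func processAll(msgs []Message) map[ChannelID][]int {
	var mu sync.Mutex
	seen := make(map[ChannelID][]int)
	d := NewDispatcher(func(m Message) {
		mu.Lock()
		defer mu.Unlock()
		seen[m.ChanID] = append(seen[m.ChanID], m.Seq)
	})
	for _, m := range msgs {
		d.Dispatch(m)
	}
	d.Close()
	return seen
}

func main() {
	msgs := []Message{
		{ChanID: 1, Seq: 0, Kind: "update_add_htlc"},
		{ChanID: 2, Seq: 1, Kind: "update_add_htlc"},
		{ChanID: 1, Seq: 2, Kind: "commitment_signed"},
		{ChanID: 1, Seq: 3, Kind: "shutdown"},
		{ChanID: 2, Seq: 4, Kind: "commitment_signed"},
	}
	fmt.Println(processAll(msgs))
}
```

The key design choice in this sketch is that routing is keyed on the channel id alone. Routing on message type, as described above, is what allows a `shutdown` to overtake an earlier `update_add_htlc` for the same channel.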
Formally speaking, this isn't a bug we have encountered, but it is one that we have deduced can occur from review. In the worst case, it results in a force closure. We did not address it in #8167 because the status quo prior to that PR was a force closure every time someone attempted to coop-close with in-flight HTLCs. Moving forward without this change still allows us to strictly reduce force-closure scenarios. However, it remains important to address this core architectural issue because the protocol assumes, at the spec level, that messages are received (and therefore processed) in-order on a per-connection (and therefore per-channel) basis.
While order is not an important guarantee between channels, as they are considered isolated state machines, there are consistency requirements that would be violated by out-of-order message processing for a given channel.