Handle concurrent reads and concurrent writes on MsQuicStream #67329
Tagging subscribers to this area: @dotnet/ncl
Issue Details
This PR consolidates detection of concurrent reads and concurrent writes on MsQuicStream and throws InvalidOperationException with a proper message. Fixes #52627.
|
We discussed in #67037 that, after that change, we can no longer rely on the state, because some things are done outside the lock, so Read operations can now start to overlap. It's true that it works at the moment, because the second operation doesn't touch anything the first one might need, but it is fragile: something new might be added in the future (used by both operations), break that unspoken contract, and blow up unexpectedly at a moment that will be super hard to catch...
I am not sure I follow, the operations are still being serialized by the I originally wanted to add explicit guards (like it is done in |
|
One thing to add is that you also need to take into account that you might get RECEIVE_EVENT from msquic in parallel, which might change the
Please correct me here if I'm missing some clever trick that prevents what I've just described from happening. Anyway, I have no idea what happens when we return the same resettable completion source twice. And if this doesn't blow up, what happens when both waiters get signaled?
No, this is actually legit and can conceivably happen, good catch! This instance seems to have an easy fix: just move I see that we usually explicitly complete the completion source outside of the lock (I guess to minimize the critical section), so similar races may be elsewhere, I will try to look into it. |
|
So it seems that the write path does not suffer from the same problem because the SendState is reset on the reading thread only after the completion source is awaited and the cancellation/abortive paths transition to a final state so there shouldn't be similar races there. |
| } | ||
| } | ||
|
|
||
| [Fact] |
This test is redundant, the stream conformance tests already test this.
|
/azp run runtime-libraries stress-http |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run runtime-libraries-coreclr outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
ManickaP
left a comment
If making sure there are no concurrent reads/writes is proving impossible to achieve with just the existing Read/SendState, should we revisit the option of introducing an extra flag? We could pack it within the Read/SendState to save space, but that might produce quite unreadable code.
src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicStream.cs
| state.ReceiveResettableCompletionSource.Complete(readLength); | ||
| // We're completing a pending read. This statement must be still inside a lock, otherwise another thread | ||
| // could see ReadState.None and access the still incomplete completion source | ||
| if (shouldComplete) |
I was trying to figure out why we complete all the completion sources outside of the lock and could only find this historical comment: d028d30#r353958420
Can we move the task completion to outside of the lock?
So I cannot tell whether moving this inside the lock will cause any deadlocks (we use CompleteAsynchronously, so it shouldn't) or have some other negative impact.
Also, if we need to move this one inside the lock, are there any other places we should investigate as well?
Lastly, I'd love @CarnaViire to give this a look as well; she's the one with the most knowledge about reads.
I think we don't need to touch the other completions, since they all happen on terminal SendStates or, in the case of writes, they are protected by an intermediate state.
From offline discussion: let's try to keep the completion outside of the lock and introduce an extra ReadState to imitate what is happening on the Write path.
In the end I think we can do without adding additional ReadState, we just keep the PendingRead state a little longer.
I also wanted to avoid making ReadAsync into a state machine, so I reset the ReadState still on the callback thread by taking the lock again after completing the completion source.
Scratch that, an additional state was needed in the end, so I added PendingReadFinished. I also had to make ReadAsync into a state machine.
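For readers following the thread, here is a minimal sketch of the hand-off described above, not the code from this PR. It assumes a plain TaskCompletionSource<int> in place of the resettable completion source, and the class ReadHandoffSketch and the method OnReceive are hypothetical names. The callback moves PendingRead to PendingReadFinished under the lock and completes the source outside of it; the reader resets the state to None only after its await has resumed, so an overlapping ReadAsync never observes None while the first read is still in flight.

```csharp
using System;
using System.Threading.Tasks;

// Simplified stand-in for the read states discussed in this thread.
enum ReadState { None, PendingRead, PendingReadFinished }

// Hypothetical sketch; the real MsQuicStream tracks more states and buffers data.
sealed class ReadHandoffSketch
{
    private readonly object _lock = new object();
    private ReadState _readState = ReadState.None;
    private TaskCompletionSource<int> _pending;

    public async Task<int> ReadAsync()
    {
        TaskCompletionSource<int> tcs;
        lock (_lock)
        {
            // Both PendingRead and PendingReadFinished mean another read is still in flight.
            if (_readState != ReadState.None)
            {
                throw new InvalidOperationException("Only one read is supported at a time.");
            }
            _readState = ReadState.PendingRead;
            tcs = _pending = new TaskCompletionSource<int>(
                TaskCreationOptions.RunContinuationsAsynchronously);
        }

        int bytes = await tcs.Task.ConfigureAwait(false);

        // The reader, not the callback, returns the state to None.
        lock (_lock) { _readState = ReadState.None; }
        return bytes;
    }

    // Stands in for the receive callback (assumption: one call per chunk of data).
    public void OnReceive(int bytes)
    {
        TaskCompletionSource<int> toComplete = null;
        lock (_lock)
        {
            if (_readState == ReadState.PendingRead)
            {
                _readState = ReadState.PendingReadFinished;
                toComplete = _pending;
                _pending = null;
            }
            // In None or PendingReadFinished the real code would buffer the data;
            // this sketch simply ignores it.
        }
        // Completion happens outside of the lock.
        toComplete?.SetResult(bytes);
    }
}
```

Because ReadAsync is an async method in this sketch, the overlap is reported through a faulted task rather than a synchronous throw.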
...libraries/System.Net.Quic/tests/FunctionalTests/QuicStreamConnectedStreamConformanceTests.cs
| switch (state.ReadState) | ||
| { | ||
| case ReadState.None: | ||
| case ReadState.PendingReadFinished: |
I assume there's a race that makes it possible to enter the callback and the lock in this state?
Yes, the callback simply fires twice in a row before the other thread has time to reset the ReadState. Do you think this needs a comment in the code?
Yes, especially with the current comment only explaining the "None" state 😄
Are you sure the reading flow is fine in that case?
And, just on a side note, oh god, I have a bad feeling about that 😢 The states are becoming a bit blurred and still not fully helping us serialize stuff the way we want to.
| { | ||
| case ReadState.PendingRead: | ||
| ex = new InvalidOperationException("Only one read is supported at a time."); | ||
| case ReadState.PendingReadFinished: |
We also need to change one thing above (closer to the start of ReadAsync):
runtime/src/libraries/System.Net.Quic/src/System/Net/Quic/Implementations/MsQuic/MsQuicStream.cs
Line 451 in 42044c6
| if (initialReadState != ReadState.PendingRead && cancellationToken.IsCancellationRequested) |
it now should be
if (initialReadState != ReadState.PendingRead && initialReadState != ReadState.PendingReadFinished && cancellationToken.IsCancellationRequested) because they both get this special treatment here
Good catch!
CarnaViire
left a comment
LGTM, thanks!
This PR consolidates detection of concurrent reads and concurrent writes on MsQuicStream and throws InvalidOperationException with a proper message.
Fixes #52627.
Analysis - Reads (ReadAsync): No changes were needed
When there is an outstanding read (i.e. a Task asynchronously waiting for incoming data), then _state.ReadState is set to ReadState.PendingRead. Any concurrent read will then read the state (under a lock), and then throw outside of the lock (without doing any changes to the stream state).
Concurrent reads which finish synchronously (or a synchronous read followed by an asynchronous read) are serialized correctly and don't lead to an exception (I suppose we are fine with this, as it does not corrupt the inner state).
Analysis - Writes (WriteAsync overloads)
All writes need to go through HandleWriteStartState, which sets _state.SendState to SendState.Pending or SendState.Finished (if it was a 0-size write). We need to check just before the assignment (still inside the lock), and if _state.SendState is already SendState.Pending or SendState.Finished, we have clashed with a previous operation (either a synchronous or an asynchronous one).
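The check described above can be sketched roughly as follows. This is a simplified illustration, not the actual HandleWriteStartState: the class WriteGuardSketch, its methods, the reduced SendState enum, and the exception message for writes are all assumptions made for the example.

```csharp
using System;

// Simplified stand-in for the send-side states mentioned above.
enum SendState { None, Pending, Finished }

// Hypothetical sketch of the "check just before the assignment" idea.
sealed class WriteGuardSketch
{
    private readonly object _lock = new object();
    private SendState _sendState = SendState.None;

    // The check and the assignment share one critical section, so two
    // overlapping writes cannot both observe SendState.None.
    public void StartWrite(bool isEmptyWrite)
    {
        lock (_lock)
        {
            if (_sendState == SendState.Pending || _sendState == SendState.Finished)
            {
                // Assumed message, modeled on the read-side message in this PR.
                throw new InvalidOperationException("Only one write is supported at a time.");
            }
            _sendState = isEmptyWrite ? SendState.Finished : SendState.Pending;
        }
    }

    // Stands in for whatever resets the state once the write has been awaited.
    public void CompleteWrite()
    {
        lock (_lock) { _sendState = SendState.None; }
    }
}
```

A second StartWrite issued before CompleteWrite throws, while a write started after the previous one has completed is accepted.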