Skip to content

Conversation

@wfurt
Copy link
Member

@wfurt wfurt commented Aug 24, 2022

This is mostly test problem. I saw occasional failures also on HTTP/1.1


System.Net.Http.Functional.Tests.SocketsHttpHandler_HttpClientHandler_MaxResponseHeadersLength_Http11.ThresholdExceeded_ThrowsException

      System.IO.IOException : Unable to write data to the transport connection: Broken pipe.
      ---- System.Net.Sockets.SocketException : Broken pipe
      Stack Trace:
        /home/furt/github/wfurt-runtime2/src/libraries/Common/tests/System/Net/Http/LoopbackServer.cs(739,0): at System.Net.Test.Common.LoopbackServer.Connection.SendResponseAsync(Byte[] response)
        /home/furt/github/wfurt-runtime2/src/libraries/Common/tests/System/Net/Http/LoopbackServer.cs(983,0): at System.Net.Test.Common.LoopbackServer.Connection.SendResponseBodyAsync(Byte[] content, Boolean isFinal)
        /home/furt/github/wfurt-runtime2/src/libraries/Common/tests/System/Net/Http/GenericLoopbackServer.cs(161,0): at System.Net.Test.Common.GenericLoopbackConnection.SendResponseBodyAsync(String content, Boolean isFinal)

now, there are some significant differences and reasons why HTTP/3 fails often:

public async Task SendFrameAsync(long frameType, ReadOnlyMemory<byte> framePayload)
{
await SendFrameHeaderAsync(frameType, framePayload.Length).ConfigureAwait(false);
await _stream.WriteAsync(framePayload).ConfigureAwait(false);
}

and then on receiving side something like

        (frameType, payloadLength) = await ReadFrameEnvelopeAsync(cancellationToken).ConfigureAwait(false);

        private async ValueTask ReadHeadersAsync(long headersLength, CancellationToken cancellationToken)
        {
            // TODO: this header budget is sent as SETTINGS_MAX_HEADER_LIST_SIZE, so it should not use frame payload but rather 32 bytes + uncompressed size per entry.
            // https://tools.ietf.org/html/draft-ietf-quic-http-24#section-4.1.1
            if (headersLength > _headerBudgetRemaining)
            {
                _stream.Abort(QuicAbortDirection.Read, (long)Http3ErrorCode.ExcessiveLoad);
                throw new HttpRequestException(SR.Format(SR.net_http_response_headers_exceeded_length, _connection.Pool.Settings.MaxResponseHeadersByteLength));
            }

we get header length from the frame header and we can fail and Abort without even waiting for the headers (and sending task to complete)

The fix for this is to simply swallow transport errors.

Now looking at it, it seems like we are aborting wrong side. It somewhat does not matter as we will fail and abort both sides immediately but that delivers wrong application code to the peer.

Lastly I was trying to use tracing to figure out what is happening I finally succeeded but I had to improve it somewhat.
The current log is way to crude and it does not include useful information. So I split the generic log to each event handler so we can log also some specific parts. With that following fragments was creation for solving this:

Sender:
        01:06:54.7569379[Info] thisOrContextObject: QuicStream#64628464, memberName: WriteAsync, message: [strm][0x7F3464004F80] Stream writing memory of '3' bytes while not completing writes.
        01:06:54.7569734[Info] thisOrContextObject: QuicStream#64628464, memberName: NativeCallback, message: [strm][0x7F3464004F80] Received event SEND_COMPLETE
        01:06:54.7569903[Info] thisOrContextObject: QuicStream#64628464, memberName: WriteAsync, message: [strm][0x7F3464004F80] Stream writing memory of '15400' bytes while not completing writes.
        01:06:54.7584017[Info] thisOrContextObject: QuicStream#64628464, memberName: NativeCallback, message: [strm][0x7F3464004F80] Received event IDEAL_SEND_BUFFER_SIZE
        01:06:54.7584457[Info] thisOrContextObject: QuicStream#64628464, memberName: NativeCallback, message: [strm][0x7F3464004F80] Received event PEER_RECEIVE_ABORTED
        01:06:54.7594386[Info] thisOrContextObject: QuicStream#64628464, memberName: NativeCallback, message: [strm][0x7F3464004F80] Received event SEND_COMPLETE

Receiver:
        01:40:10.3530872[Info] thisOrContextObject: QuicStream#41727345, memberName: NativeCallback, message: [strm][0x7FA554013E40] Received event RECEIVE
        01:40:10.3531404[Info] thisOrContextObject: QuicStream#41727345, memberName: HandleEventReceive, message: [strm][0x7FA554013E40] Received 1 buffers, Total count of 3. totalCopied=3
        01:40:10.3532103[HandlerMessage] poolId: 34868631, workerId: 49584532, requestId: 0, memberName: ReadFrameEnvelopeAsync, message: Received frame 1 of length 15400.
        01:40:10.3532604[Info] thisOrContextObject: QuicStream#41727345, memberName: Abort, message: [strm][0x7FA554013E40] Aborting Read with 263.
        01:40:10.3546351[Info] thisOrContextObject: QuicStream#41727345, memberName: Abort, message: [strm][0x7FA554013E40] Aborting Write with 258.
        01:40:10.3547886[Info] thisOrContextObject: QuicStream#41727345, memberName: Abort, message: [strm][0x7FA554013E40] Aborting Read with 268.
        01:40:10.3588480[Info] thisOrContextObject: QuicStream#41727345, memberName: NativeCallback, message: [strm][0x7FA554013E40] Received event SHUTDOWN_COMPLETE

so it is clear we received 3 bytes out of promised 15400 and aborted immediately. Peer (e.g. LoopBack server) will get PEER_RECEIVE_ABORTED before completing write.

fixes #73930

@wfurt wfurt requested review from ManickaP, MihaZupan and rzikm August 24, 2022 06:12
@wfurt wfurt self-assigned this Aug 24, 2022
@ghost
Copy link

ghost commented Aug 24, 2022

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

This is mostly test problem. I saw occasional failures also on HTTP/1.1


System.Net.Http.Functional.Tests.SocketsHttpHandler_HttpClientHandler_MaxResponseHeadersLength_Http11.ThresholdExceeded_ThrowsException

      System.IO.IOException : Unable to write data to the transport connection: Broken pipe.
      ---- System.Net.Sockets.SocketException : Broken pipe
      Stack Trace:
        /home/furt/github/wfurt-runtime2/src/libraries/Common/tests/System/Net/Http/LoopbackServer.cs(739,0): at System.Net.Test.Common.LoopbackServer.Connection.SendResponseAsync(Byte[] response)
        /home/furt/github/wfurt-runtime2/src/libraries/Common/tests/System/Net/Http/LoopbackServer.cs(983,0): at System.Net.Test.Common.LoopbackServer.Connection.SendResponseBodyAsync(Byte[] content, Boolean isFinal)
        /home/furt/github/wfurt-runtime2/src/libraries/Common/tests/System/Net/Http/GenericLoopbackServer.cs(161,0): at System.Net.Test.Common.GenericLoopbackConnection.SendResponseBodyAsync(String content, Boolean isFinal)

now, there are some significant differences and reasons why HTTP/3 fails often:

public async Task SendFrameAsync(long frameType, ReadOnlyMemory<byte> framePayload)
{
await SendFrameHeaderAsync(frameType, framePayload.Length).ConfigureAwait(false);
await _stream.WriteAsync(framePayload).ConfigureAwait(false);
}

and then on receiving side something like

        (frameType, payloadLength) = await ReadFrameEnvelopeAsync(cancellationToken).ConfigureAwait(false);

        private async ValueTask ReadHeadersAsync(long headersLength, CancellationToken cancellationToken)
        {
            // TODO: this header budget is sent as SETTINGS_MAX_HEADER_LIST_SIZE, so it should not use frame payload but rather 32 bytes + uncompressed size per entry.
            // https://tools.ietf.org/html/draft-ietf-quic-http-24#section-4.1.1
            if (headersLength > _headerBudgetRemaining)
            {
                _stream.Abort(QuicAbortDirection.Read, (long)Http3ErrorCode.ExcessiveLoad);
                throw new HttpRequestException(SR.Format(SR.net_http_response_headers_exceeded_length, _connection.Pool.Settings.MaxResponseHeadersByteLength));
            }

we get header length from the frame header and we can fail and Abort without even waiting for the headers (and sending task to complete)

The fix for this is to simply swallow transport errors.

Now looking at it, it seems like we are aborting wrong side. It somewhat does not matter as we will fail and abort both sides immediately but that delivers wrong application code to the peer.

Lastly I was trying to use tracing to figure out what is happening I finally succeeded but I had to improve it somewhat.
The current log is way to crude and it does not include useful information. So I split the generic log to each event handler so we can log also some specific parts. With that following fragments was creation for solving this:

Sender:
        01:06:54.7569379[Info] thisOrContextObject: QuicStream#64628464, memberName: WriteAsync, message: [strm][0x7F3464004F80] Stream writing memory of '3' bytes while not completing writes.
        01:06:54.7569734[Info] thisOrContextObject: QuicStream#64628464, memberName: NativeCallback, message: [strm][0x7F3464004F80] Received event SEND_COMPLETE
        01:06:54.7569903[Info] thisOrContextObject: QuicStream#64628464, memberName: WriteAsync, message: [strm][0x7F3464004F80] Stream writing memory of '15400' bytes while not completing writes.
        01:06:54.7584017[Info] thisOrContextObject: QuicStream#64628464, memberName: NativeCallback, message: [strm][0x7F3464004F80] Received event IDEAL_SEND_BUFFER_SIZE
        01:06:54.7584457[Info] thisOrContextObject: QuicStream#64628464, memberName: NativeCallback, message: [strm][0x7F3464004F80] Received event PEER_RECEIVE_ABORTED
        01:06:54.7594386[Info] thisOrContextObject: QuicStream#64628464, memberName: NativeCallback, message: [strm][0x7F3464004F80] Received event SEND_COMPLETE

Receiver:
        01:40:10.3530872[Info] thisOrContextObject: QuicStream#41727345, memberName: NativeCallback, message: [strm][0x7FA554013E40] Received event RECEIVE
        01:40:10.3531404[Info] thisOrContextObject: QuicStream#41727345, memberName: HandleEventReceive, message: [strm][0x7FA554013E40] Received 1 buffers, Total count of 3. totalCopied=3
        01:40:10.3532103[HandlerMessage] poolId: 34868631, workerId: 49584532, requestId: 0, memberName: ReadFrameEnvelopeAsync, message: Received frame 1 of length 15400.
        01:40:10.3532604[Info] thisOrContextObject: QuicStream#41727345, memberName: Abort, message: [strm][0x7FA554013E40] Aborting Read with 263.
        01:40:10.3546351[Info] thisOrContextObject: QuicStream#41727345, memberName: Abort, message: [strm][0x7FA554013E40] Aborting Write with 258.
        01:40:10.3547886[Info] thisOrContextObject: QuicStream#41727345, memberName: Abort, message: [strm][0x7FA554013E40] Aborting Read with 268.
        01:40:10.3588480[Info] thisOrContextObject: QuicStream#41727345, memberName: NativeCallback, message: [strm][0x7FA554013E40] Received event SHUTDOWN_COMPLETE

so it is clear we received 3 bytes out of promised 15400 and aborted immediately. Peer (e.g. LoopBack server) will get PEER_RECEIVE_ABORTED before completing write.

fixes #73930

Author: wfurt
Assignees: wfurt
Labels:

area-System.Net.Http

Milestone: -

Copy link
Member

@rzikm rzikm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, small nits

Copy link
Member

@MihaZupan MihaZupan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching the H3 issue

if (headersLength > _headerBudgetRemaining)
{
_stream.Abort(QuicAbortDirection.Write, (long)Http3ErrorCode.ExcessiveLoad);
_stream.Abort(QuicAbortDirection.Read, (long)Http3ErrorCode.ExcessiveLoad);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the previous behavior of closing the write direction, were things still eventually being cleaned up correctly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you can sort of see it ion the posted logs. Frirst comes from the ReadHeadersAsync above.

        01:40:10.3532604[Info] thisOrContextObject: QuicStream#41727345, memberName: Abort, message: [strm][0x7FA554013E40] Aborting Read with 263.

        01:40:10.3546351[Info] thisOrContextObject: QuicStream#41727345, memberName: Abort, message: [strm][0x7FA554013E40] Aborting Write with 258.
        01:40:10.3547886[Info] thisOrContextObject: QuicStream#41727345, memberName: Abort, message: [strm][0x7FA554013E40] Aborting Read with 268.

I think this comes from HTTP3 error processing. This is easy to reproduce so I can look where this duplicate reset is coming from.

Copy link
Member

@ManickaP ManickaP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add ToStrings to the event data structures instead of adding logging into each individual event handler?

Also this change should be done in QuicConnection and QuicListener as well, not just in QuicStream. Or we end up with inconsistent messy logging as we had before.


private unsafe int HandleEventStartComplete(ref START_COMPLETE data)
{
if (NetEventSource.Log.IsEnabled())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also cover QuicConnection and QuicListener for a consistency.
Also could you log the varying arguments in a similar manner, like:

NetEventSource.Info(this, $"{this} Received event START_COMPLETE with {nameof(Status)}={data.Status}, {nameof(PeerAccepted)}={data.PeerAccepted}");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QuicListener did not have anything interesting IMHO so I left is as is. I updated QuicConnection with same pattern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least it has a new connection. And this is also about consistency.

data.TotalBufferLength = totalCopied;
return (_receiveBuffers.HasCapacity() && Interlocked.CompareExchange(ref _receivedNeedsEnable, 0, 1) == 1) ? QUIC_STATUS_CONTINUE : QUIC_STATUS_SUCCESS;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? methods should be separated by line IMHO. So as empty line should follow closing brace.
I think we should do separate style pass on the file. @rzikm seems to have best editor setting....

QUIC_STREAM_EVENT_TYPE.SHUTDOWN_COMPLETE => HandleEventShutdownComplete(ref streamEvent.SHUTDOWN_COMPLETE),
QUIC_STREAM_EVENT_TYPE.PEER_ACCEPTED => HandleEventPeerAccepted(),
_ => QUIC_STATUS_SUCCESS
_ => HandleOtherEvent(streamEvent.Type)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QuicConnection and QuicListener.


try
{
// Process the event.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, why don't we add ToString to the event data structure and log it from one place?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That change would need to go to the generated interop stubs from MsQuic then, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily, all the structs are partial, so we can add it to a different file. Or just define it as extension/static method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that would be beneficial. It may contains many field (we don't even use) and dumping everything is going to make logs hard to read IMHO. It will incur performance penalty and bring no value.
If we feel strongly about this we can do it at at the Verbose level - we use it in Sockets for example to dump part of the actual data.
I think using nameof() on interesting fields and consistent formatting may be the best

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be us coding and thus controlling what goes in.

@ghost ghost added needs-author-action An issue or pull request that requires more info or actions from the author. and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Aug 24, 2022
Copy link
Member

@ManickaP ManickaP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo comments.

I'm still not 100% convinced we should put the logging into every method instead of having ToString or some common way to do it, but I'll not hold this PR on it.

if (NetEventSource.Log.IsEnabled())
{
NetEventSource.Info(this, $"{this} Connection connected {LocalEndPoint} -> {RemoteEndPoint}");
NetEventSource.Error(this, $"{this} Received event CONNECTED {LocalEndPoint} -> {RemoteEndPoint}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not an Error

{
if (NetEventSource.Log.IsEnabled())
{
NetEventSource.Info(this, $"{this} Received event {type}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NetEventSource.Info(this, $"{this} Received event {type}");
NetEventSource.Info(this, $"{this} Received event {type}");


private unsafe int HandleEventStartComplete(ref START_COMPLETE data)
{
if (NetEventSource.Log.IsEnabled())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least it has a new connection. And this is also about consistency.

_connectedTcs.TrySetResult();
return QUIC_STATUS_SUCCESS;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you insist on putting empty line in between every HandleXYZ then it should be done everywhere. I was intentional with this, I don't mind the change but it should be the same everywhere.

{
NetEventSource.Info(this, $"{this} Received event SHUTDOWN_INITIATED_BY_PEER_DATA with {nameof(data.ErrorCode)}={data.ErrorCode}");
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

{
if (NetEventSource.Log.IsEnabled())
{
NetEventSource.Info(this, $"{this} Received event {type}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NetEventSource.Info(this, $"{this} Received event {type}");
NetEventSource.Info(this, $"{this} Received event {type}");

@wfurt wfurt merged commit ecb3038 into dotnet:main Aug 26, 2022
@wfurt wfurt deleted the maxHeaders branch August 26, 2022 19:35
@karelz karelz added this to the 8.0.0 milestone Aug 26, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Sep 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Failures in SocketsHttpHandler_HttpClientHandler_MaxResponseHeadersLength_Http3

5 participants