Harden connection state machine: cap header block, monitor owner, propagate send errors#2
Merged
Merged
Conversation
…pagate send errors
Three correctness fixes in h2_connection:
CONTINUATION flood (Critical, RFC 9113 §10.5 / §6.5.2). Track raw bytes
of the in-flight HEADERS+CONTINUATION block on the stream; bail with
GOAWAY(ENHANCE_YOUR_CALM) past 256 KB so a peer cannot OOM the node
before max_header_list_size (which acts post-HPACK) can fire.
controlling_process/2 owner liveness (Critical). The original swap
only updated state.owner and left the new owner untracked: the new
owner's death produced no signal and the connection orphaned the
socket. Install a monitor on the new owner and demonitor the previous
one; the initial owner still uses the start_link bidirectional link.
send_frame/2 error propagation (High). Previously always returned ok
even after Transport:send returned {error, closed}, so send_request /
send_response / send_data / send_trailers / send_request_headers
replied ok to the caller after the wire died. Now returns the real
result; API reply paths stop with {shutdown, {send_failed, Reason}}
and propagate {error, Reason} to the in-flight caller. Internal
async sites (window updates, settings) stay best-effort since the
imminent tcp_closed/ssl_closed will surface the failure.
Adds CT regression tests for each.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three correctness fixes in
h2_connection, surfaced by a security/concurrency audit of the codebase:handle_continuation/4appended every CONTINUATION payload to a per-stream iolist with no size cap, andmax_header_list_sizeonly runs after HPACK decode. A peer streaming CONTINUATIONs could OOM the node before the limit fires. Track raw bytes on the stream and bail withGOAWAY(ENHANCE_YOUR_CALM)past 256 KB.controlling_process/2owner liveness (Critical) — the swap updatedstate.ownerbut left the new owner untracked; the new owner's death produced no signal and the connection orphaned the socket. Install a monitor on the new owner, demonitor the previous one. The initial owner still uses thestart_linkbidirectional link.send_frame/2error propagation (High) — always returnedokeven afterTransport:sendreturned{error, closed}.send_request/send_response/send_data/send_trailers/send_request_headersrepliedokto the caller after the wire died. Nowsend_frame/2returns the real result; API reply paths stop with{shutdown, {send_failed, Reason}}and propagate{error, Reason}to the in-flight caller. Internal async sites (window updates, settings) stay best-effort since the imminenttcp_closed/ssl_closedsurfaces the failure anyway.Tests
Three new CT regression tests in
h2_compliance_SUITE:continuation_flood_triggers_enhance_your_calm_test— 320 KB of CONTINUATION traffic must yield GOAWAY/ENHANCE_YOUR_CALM.controlling_process_monitors_new_owner_test— killing the new owner after transfer terminates the connection.send_returns_error_on_closed_socket_test— closing the underlying socket then callingsend_datayields{error, _}, notok.All 92 tests pass.