
tls: add read deadline to containers/image registry connections #777

Open

rphillips wants to merge 2 commits into containers:main from rphillips:fix_read_timeout_for_tls

Conversation

@rphillips (Contributor) commented Apr 20, 2026

  • Add a deadlineConn wrapper that sets a read deadline (via SetReadDeadline) before every Read() call on the underlying registry connection, preventing indefinite stalls in tls.Conn.Read; a sketch follows this list
  • Add a DockerReadTimeout field to SystemContext so callers can configure the per-read deadline. When zero (the default), no deadline is enforced
  • Handle timeout errors in bodyReader.Read() as a reconnectable condition (alongside ECONNRESET and ErrUnexpectedEOF), triggering the existing Range-based resume logic
  • Add ResponseHeaderTimeout = 2m to the HTTP transport
  • Add unit tests for isRetryableNetworkError and deadlineConn
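
A minimal sketch of the wrapper described in the first bullet, reconstructed from this description rather than copied from the diff (the actual patch may differ):

```go
package docker

import (
	"net"
	"time"
)

// deadlineConn wraps a net.Conn and arms a fresh deadline immediately
// before each Read, so a stalled TLS read fails with a net.Error whose
// Timeout() method reports true, instead of blocking forever. A zero
// timeout leaves reads unbounded, matching the default behavior.
type deadlineConn struct {
	net.Conn
	readTimeout time.Duration
}

func (c *deadlineConn) Read(p []byte) (int, error) {
	if c.readTimeout > 0 {
		if err := c.SetReadDeadline(time.Now().Add(c.readTimeout)); err != nil {
			return 0, err
		}
	}
	return c.Conn.Read(p)
}
```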

Fixes a class of image-pull hangs where a registry TLS connection stalls mid-transfer and never returns data. The pull goroutine blocks forever in crypto/tls.(*Conn).Read (called from docker.(*bodyReader).Read), leaving pods stuck in ContainerCreating indefinitely. Context cancellation alone cannot interrupt a blocked TLS read syscall.

With this change, callers set SystemContext.DockerReadTimeout (e.g. 5 * time.Minute) to enable stall detection. When a read exceeds the timeout, it returns a net.Error with Timeout() == true, the body is closed, and bodyReader reconnects with a Range: bytes=N- header to resume the download from where it left off.
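
A sketch of the error classification this relies on, again reconstructed from the description (the helper name isRetryableNetworkError comes from the test list above; its body here is an assumption):

```go
package docker

import (
	"errors"
	"io"
	"net"
	"syscall"
)

// isRetryableNetworkError (sketch): a read-deadline expiry surfaces as a
// net.Error with Timeout() == true and is treated as reconnectable,
// alongside the existing ECONNRESET and ErrUnexpectedEOF cases.
func isRetryableNetworkError(err error) bool {
	var ne net.Error
	if errors.As(err, &ne) && ne.Timeout() {
		return true
	}
	return errors.Is(err, syscall.ECONNRESET) || errors.Is(err, io.ErrUnexpectedEOF)
}
```

On such an error, bodyReader closes the body and reissues the request with a Range: bytes=N- header, where N is the number of bytes already received, reusing the existing resume path.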

Generated by Claude.
Reviewed by @rphillips

@github-actions (bot) added the image (Related to "image" package) label on Apr 20, 2026
@rphillips force-pushed the fix_read_timeout_for_tls branch from 5662998 to 43f29f7 on April 20, 2026 17:36
@mtrmac (Contributor) left a comment

Thanks,

I’m not too happy about adding all this extra infrastructure in, essentially, application-level software (and about hard-coding timeouts; not that making them tunable would make it any better for end users).

What is the theory of the network under which this helps? If this is for pulls, presumably the sender is going to be sending packets and automatically retrying unless it receives ACKs.

And the receiver already has KeepAlive: 30 set. So the connection should only stay alive if the two endpoints are fully live, it’s just that the sender is choosing not to send anything.
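
For context, the KeepAlive: 30 referenced here is the conventional net.Dialer keepalive on the client transport; a minimal sketch of that setup (not this repository's actual code):

```go
package docker

import (
	"net"
	"net/http"
	"time"
)

// newTransport sketches the conventional dialer keepalive configuration.
// TCP keepalive probes detect a dead peer, but they do not fire when both
// endpoints are live and the sender simply chooses to send nothing.
func newTransport() *http.Transport {
	dialer := &net.Dialer{
		Timeout:   30 * time.Second,
		KeepAlive: 30 * time.Second,
	}
	return &http.Transport{DialContext: dialer.DialContext}
}
```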


We also have reports of registries stalling at/around(?) EOF, reportedly because some security scan is running. A hard-coded timeout does not scale to large images for such operations.

Comment thread on image/docker/body_reader_test.go (Outdated)

```go
		return c.sys.DockerProxy(request.URL)
	}
}
tr.ResponseHeaderTimeout = 2 * time.Minute
```
Contributor

This was not documented :/

@rphillips (Contributor, Author) commented Apr 20, 2026

Correct, this is on client pulls.

Claude evaluated the Node logs within a job and says there is a network "hiccup" right before the long-running pulls start.

I suspect this generalizes to a client-side issue: if the network bounces for some reason, the client's socket read stalls.

The read timeout is long (5m) and less than desirable, but I do not see a way to plumb the timeout all the way down.

Updated the PR with a config option.

@rphillips force-pushed the fix_read_timeout_for_tls branch from 43f29f7 to ddd4b53 on April 20, 2026 17:58
A stalled TLS connection to a container registry (e.g. quay.io) can
cause image pulls to hang indefinitely. The HTTP response body read
blocks forever in tls.Conn.Read with no timeout, starving the entire
pull pipeline and leaving pods stuck in ContainerCreating for hours.

Wrap the HTTP transport dialer with a deadlineConn that enforces a
5-minute read deadline via SetReadDeadline on every Read call. When
triggered, bodyReader treats the timeout the same as ECONNRESET and
attempts a Range-based reconnect to resume the download. Also add a
2-minute ResponseHeaderTimeout to the transport.

Ref: https://redhat.atlassian.net/browse/OCPBUGS-79544
Signed-off-by: Ryan Phillips <rphillips@redhat.com>

add
Signed-off-by: Ryan Phillips <rphillips@redhat.com>
@rphillips force-pushed the fix_read_timeout_for_tls branch from ddd4b53 to 9333c62 on April 20, 2026 18:20
@rphillips (Contributor, Author)

@mtrmac I added a DockerReadTimeout to the SystemContext, which still defaults to unlimited. It should allow cri-o to configure a max read timeout.
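
A sketch of the caller-side configuration this enables (DockerReadTimeout is the field added by this PR, assumed here to be a time.Duration; the surrounding setup is illustrative):

```go
package main

import (
	"time"

	"github.com/containers/image/v5/types"
)

func main() {
	// Zero (the default) keeps today's behavior of unbounded reads;
	// a nonzero value arms the per-read deadline on registry connections.
	sys := &types.SystemContext{
		DockerReadTimeout: 5 * time.Minute,
	}
	_ = sys
}
```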

@mtrmac (Contributor) commented Apr 20, 2026

> Claude evaluated the Node logs within a job and says there is a network "hiccup" right before the long-running pulls start.

I don’t know what a “hiccup” means.

Why did the existing Keepalive option on the TCP socket not catch this?

@rphillips (Contributor, Author)

Great question. I'm not sure whether this is a server-side issue (the registry not sending data) or a client-side issue with the keepalive. Either way, the client should be able to tell the TCP socket not to block forever.

This could be a Quay registry issue; I am not sure.

@mtrmac (Contributor) left a comment

We have too much on our plate to take on the long-term cost of maintaining a feature+option to handle an unknown situation (which implies no way to reliably reproduce it), with code that shouldn't be necessary and can break some users.

@rphillips (Contributor, Author)

We had a discussion today on the Node Team. We can defer this PR, perhaps to the next release of OpenShift. I do not necessarily agree that this is an unknown situation: socket reads block forever without a deadline attached, and the client code currently assumes the server is doing the right thing and sending data.
