Skip to content

A persistent timeout occurs when the network changes (static IP changed or PPPoE reconnected). #189

@malash

Description

@malash

What happened

When using https-dns-proxy on OpenWrt, I encountered a troubling issue: every time I reconnect the WAN interface using the PPPoE protocol, the DNS often becomes unresponsive. Specifically, both dig @router and dig @127.0.0.1 -p 5053 (with 5053 being the listening port of https-dns-proxy) result in a timeout.

I’ve investigated this issue in depth and found that it can be easily reproduced on Ubuntu 24.04 as well.

How to reproduce

  1. Install Ubuntu 24.04 (I used the Desktop version).
  2. Configure the network to use a static IP address. E.g. 192.168.9.100
  3. Open two terminal windows and run the following commands in each:
./https_dns_proxy -r https://dns.alidns.com/dns-query -p 5053 -4 -m 10 -vvv
while true; do dig 1.2.3.4.sslip.io @127.0.0.1 -p 5053 ; sleep 1;  done
  1. Change the static IP address, for example, to 192.168.9.101.

Image

  1. You can observe the output like this (please use this PR to fix logging):
[D] 1749047614.457423 https_client.c:219 E50A: * Operation timed out after 5004 milliseconds with 0 bytes received
[D] 1749047614.457494 https_client.c:219 E50A: * Connection #0 to host dns.alidns.com left intact
[D] 1749047614.457508 https_client.c:620 Released used io event: 0x7ffe22471f90
[W] 1749047614.457520 https_client.c:360 E50A: curl request failed with 28: Timeout was reached
[W] 1749047614.457537 https_client.c:362 E50A: curl error message: Operation timed out after 5004 milliseconds with 0 bytes received
[W] 1749047614.457548 https_client.c:376 E50A: Connecting and sending request to resolver was successful, but no response was sent back
[D] 1749047614.457555 https_client.c:446 E50A: CURLINFO_NUM_CONNECTS: 0
[D] 1749047614.457559 https_client.c:458 E50A: CURLINFO_EFFECTIVE_URL: https://dns.alidns.com/dns-query
[D] 1749047614.457568 https_client.c:495 E50A: Times: 0.000033, 0.000000, 0.000000, 0.000143, 0.000000, 5.004893
[I] 1749047614.457583 https_client.c:517 E50A: Response was faulty, skipping DNS reply
[D] 1749047614.457588 main.c:83 Received response for id: E50A, len: 0
[D] 1749047614.457627 main.c:112 Received request for id: E50A, len: 57
[D] 1749047614.457649 https_client.c:261 E50A: Requesting HTTP/2
[D] 1749047614.457678 https_client.c:219 E50A: * RESOLVE dns.alidns.com:443 - old addresses discarded
[D] 1749047614.457698 https_client.c:219 E50A: * Added dns.alidns.com:443:223.5.5.5,223.6.6.6 to DNS cache
[D] 1749047614.457747 https_client.c:219 E50A: * Found bundle for host: 0x559ad0c37f30 [can multiplex]
[D] 1749047614.457762 https_client.c:219 E50A: * Re-using existing connection with host dns.alidns.com
[D] 1749047614.457806 https_client.c:219 E50A: * [HTTP/2] [27] OPENED stream for https://dns.alidns.com/dns-query
[D] 1749047614.457816 https_client.c:219 E50A: * [HTTP/2] [27] [:method: POST]
[D] 1749047614.457823 https_client.c:219 E50A: * [HTTP/2] [27] [:scheme: https]
[D] 1749047614.457828 https_client.c:219 E50A: * [HTTP/2] [27] [:authority: dns.alidns.com]
[D] 1749047614.457833 https_client.c:219 E50A: * [HTTP/2] [27] [:path: /dns-query]
[D] 1749047614.457854 https_client.c:219 E50A: * [HTTP/2] [27] [user-agent: https_dns_proxy/0.3]
[D] 1749047614.457860 https_client.c:219 E50A: * [HTTP/2] [27] [accept: application/dns-message]
[D] 1749047614.457866 https_client.c:219 E50A: * [HTTP/2] [27] [content-type: application/dns-message]
[D] 1749047614.457872 https_client.c:219 E50A: * [HTTP/2] [27] [content-length: 57]
[D] 1749047614.457961 https_client.c:219 E50A: > POST /dns-query HTTP/2
[D] 1749047614.457975 https_client.c:219 E50A: > Host: dns.alidns.com
[D] 1749047614.457980 https_client.c:219 E50A: > User-Agent: https_dns_proxy/0.3
[D] 1749047614.457984 https_client.c:219 E50A: > Accept: application/dns-message
[D] 1749047614.458045 https_client.c:219 E50A: > Content-Type: application/dns-message
[D] 1749047614.458057 https_client.c:219 E50A: > Content-Length: 57
[D] 1749047614.458067 https_client.c:170 E50A: > 0000: e5 0a 01 20 00 01 00 00 00 00 00 01 01 31 01 32 ... .........1.2
[D] 1749047614.458077 https_client.c:170 E50A: > 0010: 01 33 01 34 05 73 73 6c 69 70 02 69 6f 00 00 01 .3.4.sslip.io...
[D] 1749047614.458087 https_client.c:170 E50A: > 0020: 00 01 00 00 29 04 d0 00 00 00 00 00 0c 00 0a 00 ....)...........
[D] 1749047614.458102 https_client.c:170 E50A: > 0030: 08 57 d6 2c 4d ff 74 cb 32                      .W.,M.t.2
[D] 1749047614.458122 https_client.c:633 Reserved new io event: 0x7ffe22471f90

...about 5s later...

[D] 1749047619.461386 https_client.c:219 E50A: * Operation timed out after 5003 milliseconds with 0 bytes received
[D] 1749047619.461450 https_client.c:219 E50A: * Connection #0 to host dns.alidns.com left intact
[D] 1749047619.461461 https_client.c:620 Released used io event: 0x7ffe22471f90
[W] 1749047619.461471 https_client.c:360 E50A: curl request failed with 28: Timeout was reached
[W] 1749047619.461476 https_client.c:362 E50A: curl error message: Operation timed out after 5003 milliseconds with 0 bytes received
[W] 1749047619.461483 https_client.c:376 E50A: Connecting and sending request to resolver was successful, but no response was sent back
[D] 1749047619.461488 https_client.c:446 E50A: CURLINFO_NUM_CONNECTS: 0
[D] 1749047619.461491 https_client.c:458 E50A: CURLINFO_EFFECTIVE_URL: https://dns.alidns.com/dns-query
[D] 1749047619.461498 https_client.c:495 E50A: Times: 0.000058, 0.000000, 0.000000, 0.000400, 0.000000, 5.003270
[I] 1749047619.461509 https_client.c:517 E50A: Response was faulty, skipping DNS reply
[D] 1749047619.461512 main.c:83 Received response for id: E50A, len: 0
[D] 1749047619.462845 main.c:112 Received request for id: E50A, len: 57
[D] 1749047619.462857 https_client.c:261 E50A: Requesting HTTP/2
[D] 1749047619.462870 https_client.c:219 E50A: * RESOLVE dns.alidns.com:443 - old addresses discarded
[D] 1749047619.462873 https_client.c:219 E50A: * Added dns.alidns.com:443:223.5.5.5,223.6.6.6 to DNS cache
[D] 1749047619.462890 https_client.c:219 E50A: * Found bundle for host: 0x559ad0c37f30 [can multiplex]
[D] 1749047619.462894 https_client.c:219 E50A: * Re-using existing connection with host dns.alidns.com
[D] 1749047619.462907 https_client.c:219 E50A: * [HTTP/2] [29] OPENED stream for https://dns.alidns.com/dns-query
[D] 1749047619.462911 https_client.c:219 E50A: * [HTTP/2] [29] [:method: POST]
[D] 1749047619.462912 https_client.c:219 E50A: * [HTTP/2] [29] [:scheme: https]
[D] 1749047619.462913 https_client.c:219 E50A: * [HTTP/2] [29] [:authority: dns.alidns.com]
[D] 1749047619.462915 https_client.c:219 E50A: * [HTTP/2] [29] [:path: /dns-query]
[D] 1749047619.462916 https_client.c:219 E50A: * [HTTP/2] [29] [user-agent: https_dns_proxy/0.3]
[D] 1749047619.462921 https_client.c:219 E50A: * [HTTP/2] [29] [accept: application/dns-message]
[D] 1749047619.462922 https_client.c:219 E50A: * [HTTP/2] [29] [content-type: application/dns-message]
[D] 1749047619.462924 https_client.c:219 E50A: * [HTTP/2] [29] [content-length: 57]
[D] 1749047619.462954 https_client.c:219 E50A: > POST /dns-query HTTP/2
[D] 1749047619.462957 https_client.c:219 E50A: > Host: dns.alidns.com
[D] 1749047619.462958 https_client.c:219 E50A: > User-Agent: https_dns_proxy/0.3
[D] 1749047619.462959 https_client.c:219 E50A: > Accept: application/dns-message
[D] 1749047619.462960 https_client.c:219 E50A: > Content-Type: application/dns-message
[D] 1749047619.462961 https_client.c:219 E50A: > Content-Length: 57
[D] 1749047619.462963 https_client.c:170 E50A: > 0000: e5 0a 01 20 00 01 00 00 00 00 00 01 01 31 01 32 ... .........1.2
[D] 1749047619.462965 https_client.c:170 E50A: > 0010: 01 33 01 34 05 73 73 6c 69 70 02 69 6f 00 00 01 .3.4.sslip.io...
[D] 1749047619.462967 https_client.c:170 E50A: > 0020: 00 01 00 00 29 04 d0 00 00 00 00 00 0c 00 0a 00 ....)...........
[D] 1749047619.462969 https_client.c:170 E50A: > 0030: 08 57 d6 2c 4d ff 74 cb 32                      .W.,M.t.2
[D] 1749047619.462974 https_client.c:633 Reserved new io event: 0x7ffe22471f90

Analysis

It is clear that using the -m parameter (CURLOPT_MAXAGE_CONN) does not alleviate the problem. As long as the dig continues running, the timeout can persist for a long time (I tested for 10 minutes and it was still timing out).

This issue is likely related to HTTP/2 TCP connection reuse. When forcing HTTP/1.1 with the -x option, the timeout only lasts for a short period.

I suspect this issue may be related to TCP half-open connections. In the reproduction steps, changing to a different IP address rather than simply toggling the ethernet interface is necessary.

Possible Solutions

Replacing CURLOPT_MAXAGE_CONN with CURLOPT_MAXLIFETIME_CONN could resolve this issue.

According to this function, CURLOPT_MAXAGE_CONN compares conn->created with the current time, which helps avoid the issue where conn->lastused gets updated by "send-only" activity.

Related issue

#106

This issue is similar to the problem at hand and provides a temporary workaround by restarting the service on OpenWrt. However, I believe this solution is not ideal.

curl/curl#3132

This issue includes an in-depth discussion of the problem and suggests that using HTTP/2 PING frames might be a potential solution—though. But I haven't seen a concrete implementation provided.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions