-
Notifications
You must be signed in to change notification settings - Fork 128
Description
What happened
When using https-dns-proxy on OpenWrt, I encountered a troubling issue: every time I reconnect the WAN interface using the PPPoE protocol, the DNS often becomes unresponsive. Specifically, both dig @router and dig @127.0.0.1 -p 5053 (with 5053 being the listening port of https-dns-proxy) result in a timeout.
I’ve investigated this issue in depth and found that it can be easily reproduced on Ubuntu 24.04 as well.
How to reproduce
- Install Ubuntu 24.04 (I used the Desktop version).
- Configure the network to use a static IP address. E.g.
192.168.9.100 - Open two terminal windows and run the following commands in each:
./https_dns_proxy -r https://dns.alidns.com/dns-query -p 5053 -4 -m 10 -vvvwhile true; do dig 1.2.3.4.sslip.io @127.0.0.1 -p 5053 ; sleep 1; done- Change the static IP address, for example, to
192.168.9.101.
- You can observe the output like this (please use this PR to fix logging):
[D] 1749047614.457423 https_client.c:219 E50A: * Operation timed out after 5004 milliseconds with 0 bytes received
[D] 1749047614.457494 https_client.c:219 E50A: * Connection #0 to host dns.alidns.com left intact
[D] 1749047614.457508 https_client.c:620 Released used io event: 0x7ffe22471f90
[W] 1749047614.457520 https_client.c:360 E50A: curl request failed with 28: Timeout was reached
[W] 1749047614.457537 https_client.c:362 E50A: curl error message: Operation timed out after 5004 milliseconds with 0 bytes received
[W] 1749047614.457548 https_client.c:376 E50A: Connecting and sending request to resolver was successful, but no response was sent back
[D] 1749047614.457555 https_client.c:446 E50A: CURLINFO_NUM_CONNECTS: 0
[D] 1749047614.457559 https_client.c:458 E50A: CURLINFO_EFFECTIVE_URL: https://dns.alidns.com/dns-query
[D] 1749047614.457568 https_client.c:495 E50A: Times: 0.000033, 0.000000, 0.000000, 0.000143, 0.000000, 5.004893
[I] 1749047614.457583 https_client.c:517 E50A: Response was faulty, skipping DNS reply
[D] 1749047614.457588 main.c:83 Received response for id: E50A, len: 0
[D] 1749047614.457627 main.c:112 Received request for id: E50A, len: 57
[D] 1749047614.457649 https_client.c:261 E50A: Requesting HTTP/2
[D] 1749047614.457678 https_client.c:219 E50A: * RESOLVE dns.alidns.com:443 - old addresses discarded
[D] 1749047614.457698 https_client.c:219 E50A: * Added dns.alidns.com:443:223.5.5.5,223.6.6.6 to DNS cache
[D] 1749047614.457747 https_client.c:219 E50A: * Found bundle for host: 0x559ad0c37f30 [can multiplex]
[D] 1749047614.457762 https_client.c:219 E50A: * Re-using existing connection with host dns.alidns.com
[D] 1749047614.457806 https_client.c:219 E50A: * [HTTP/2] [27] OPENED stream for https://dns.alidns.com/dns-query
[D] 1749047614.457816 https_client.c:219 E50A: * [HTTP/2] [27] [:method: POST]
[D] 1749047614.457823 https_client.c:219 E50A: * [HTTP/2] [27] [:scheme: https]
[D] 1749047614.457828 https_client.c:219 E50A: * [HTTP/2] [27] [:authority: dns.alidns.com]
[D] 1749047614.457833 https_client.c:219 E50A: * [HTTP/2] [27] [:path: /dns-query]
[D] 1749047614.457854 https_client.c:219 E50A: * [HTTP/2] [27] [user-agent: https_dns_proxy/0.3]
[D] 1749047614.457860 https_client.c:219 E50A: * [HTTP/2] [27] [accept: application/dns-message]
[D] 1749047614.457866 https_client.c:219 E50A: * [HTTP/2] [27] [content-type: application/dns-message]
[D] 1749047614.457872 https_client.c:219 E50A: * [HTTP/2] [27] [content-length: 57]
[D] 1749047614.457961 https_client.c:219 E50A: > POST /dns-query HTTP/2
[D] 1749047614.457975 https_client.c:219 E50A: > Host: dns.alidns.com
[D] 1749047614.457980 https_client.c:219 E50A: > User-Agent: https_dns_proxy/0.3
[D] 1749047614.457984 https_client.c:219 E50A: > Accept: application/dns-message
[D] 1749047614.458045 https_client.c:219 E50A: > Content-Type: application/dns-message
[D] 1749047614.458057 https_client.c:219 E50A: > Content-Length: 57
[D] 1749047614.458067 https_client.c:170 E50A: > 0000: e5 0a 01 20 00 01 00 00 00 00 00 01 01 31 01 32 ... .........1.2
[D] 1749047614.458077 https_client.c:170 E50A: > 0010: 01 33 01 34 05 73 73 6c 69 70 02 69 6f 00 00 01 .3.4.sslip.io...
[D] 1749047614.458087 https_client.c:170 E50A: > 0020: 00 01 00 00 29 04 d0 00 00 00 00 00 0c 00 0a 00 ....)...........
[D] 1749047614.458102 https_client.c:170 E50A: > 0030: 08 57 d6 2c 4d ff 74 cb 32 .W.,M.t.2
[D] 1749047614.458122 https_client.c:633 Reserved new io event: 0x7ffe22471f90
...about 5s later...
[D] 1749047619.461386 https_client.c:219 E50A: * Operation timed out after 5003 milliseconds with 0 bytes received
[D] 1749047619.461450 https_client.c:219 E50A: * Connection #0 to host dns.alidns.com left intact
[D] 1749047619.461461 https_client.c:620 Released used io event: 0x7ffe22471f90
[W] 1749047619.461471 https_client.c:360 E50A: curl request failed with 28: Timeout was reached
[W] 1749047619.461476 https_client.c:362 E50A: curl error message: Operation timed out after 5003 milliseconds with 0 bytes received
[W] 1749047619.461483 https_client.c:376 E50A: Connecting and sending request to resolver was successful, but no response was sent back
[D] 1749047619.461488 https_client.c:446 E50A: CURLINFO_NUM_CONNECTS: 0
[D] 1749047619.461491 https_client.c:458 E50A: CURLINFO_EFFECTIVE_URL: https://dns.alidns.com/dns-query
[D] 1749047619.461498 https_client.c:495 E50A: Times: 0.000058, 0.000000, 0.000000, 0.000400, 0.000000, 5.003270
[I] 1749047619.461509 https_client.c:517 E50A: Response was faulty, skipping DNS reply
[D] 1749047619.461512 main.c:83 Received response for id: E50A, len: 0
[D] 1749047619.462845 main.c:112 Received request for id: E50A, len: 57
[D] 1749047619.462857 https_client.c:261 E50A: Requesting HTTP/2
[D] 1749047619.462870 https_client.c:219 E50A: * RESOLVE dns.alidns.com:443 - old addresses discarded
[D] 1749047619.462873 https_client.c:219 E50A: * Added dns.alidns.com:443:223.5.5.5,223.6.6.6 to DNS cache
[D] 1749047619.462890 https_client.c:219 E50A: * Found bundle for host: 0x559ad0c37f30 [can multiplex]
[D] 1749047619.462894 https_client.c:219 E50A: * Re-using existing connection with host dns.alidns.com
[D] 1749047619.462907 https_client.c:219 E50A: * [HTTP/2] [29] OPENED stream for https://dns.alidns.com/dns-query
[D] 1749047619.462911 https_client.c:219 E50A: * [HTTP/2] [29] [:method: POST]
[D] 1749047619.462912 https_client.c:219 E50A: * [HTTP/2] [29] [:scheme: https]
[D] 1749047619.462913 https_client.c:219 E50A: * [HTTP/2] [29] [:authority: dns.alidns.com]
[D] 1749047619.462915 https_client.c:219 E50A: * [HTTP/2] [29] [:path: /dns-query]
[D] 1749047619.462916 https_client.c:219 E50A: * [HTTP/2] [29] [user-agent: https_dns_proxy/0.3]
[D] 1749047619.462921 https_client.c:219 E50A: * [HTTP/2] [29] [accept: application/dns-message]
[D] 1749047619.462922 https_client.c:219 E50A: * [HTTP/2] [29] [content-type: application/dns-message]
[D] 1749047619.462924 https_client.c:219 E50A: * [HTTP/2] [29] [content-length: 57]
[D] 1749047619.462954 https_client.c:219 E50A: > POST /dns-query HTTP/2
[D] 1749047619.462957 https_client.c:219 E50A: > Host: dns.alidns.com
[D] 1749047619.462958 https_client.c:219 E50A: > User-Agent: https_dns_proxy/0.3
[D] 1749047619.462959 https_client.c:219 E50A: > Accept: application/dns-message
[D] 1749047619.462960 https_client.c:219 E50A: > Content-Type: application/dns-message
[D] 1749047619.462961 https_client.c:219 E50A: > Content-Length: 57
[D] 1749047619.462963 https_client.c:170 E50A: > 0000: e5 0a 01 20 00 01 00 00 00 00 00 01 01 31 01 32 ... .........1.2
[D] 1749047619.462965 https_client.c:170 E50A: > 0010: 01 33 01 34 05 73 73 6c 69 70 02 69 6f 00 00 01 .3.4.sslip.io...
[D] 1749047619.462967 https_client.c:170 E50A: > 0020: 00 01 00 00 29 04 d0 00 00 00 00 00 0c 00 0a 00 ....)...........
[D] 1749047619.462969 https_client.c:170 E50A: > 0030: 08 57 d6 2c 4d ff 74 cb 32 .W.,M.t.2
[D] 1749047619.462974 https_client.c:633 Reserved new io event: 0x7ffe22471f90
Analysis
It is clear that using the -m parameter (CURLOPT_MAXAGE_CONN) does not alleviate the problem. As long as the dig continues running, the timeout can persist for a long time (I tested for 10 minutes and it was still timing out).
This issue is likely related to HTTP/2 TCP connection reuse. When forcing HTTP/1.1 with the -x option, the timeout only lasts for a short period.
I suspect this issue may be related to TCP half-open connections. In the reproduction steps, changing to a different IP address rather than simply toggling the ethernet interface is necessary.
Possible Solutions
Replacing CURLOPT_MAXAGE_CONN with CURLOPT_MAXLIFETIME_CONN could resolve this issue.
According to this function, CURLOPT_MAXAGE_CONN compares conn->created with the current time, which helps avoid the issue where conn->lastused gets updated by "send-only" activity.
Related issue
#106
This issue is similar to the problem at hand and provides a temporary workaround by restarting the service on OpenWrt. However, I believe this solution is not ideal.
curl/curl#3132
This issue includes an in-depth discussion of the problem and suggests that using HTTP/2 PING frames might be a potential solution—though. But I haven't seen a concrete implementation provided.
