Skip to content

Fix DNS hangs on ESP32#5438

Open
willmmiles wants to merge 2 commits intowled:mainfrom
willmmiles:ipv6-dns-fix
Open

Fix DNS hangs on ESP32#5438
willmmiles wants to merge 2 commits intowled:mainfrom
willmmiles:ipv6-dns-fix

Conversation

@willmmiles
Copy link
Member

@willmmiles willmmiles commented Mar 22, 2026

Discard IPv6 RA packets, which cause LwIP to overwrite the IPv4 DNS servers, resulting in DNS queries hanging up because there are no accessible servers.

Original issue discussed in #5332 (comment) . The detect-and-recover method did not work well because there is no event notification on processing unsolicited packets.

Summary by CodeRabbit

  • New Features
    • ESP32 devices with IPv6 support now block unwanted IPv6 router advertisements during startup, improving network stability and IPv6 connectivity management on compatible hardware.

Discard IPv6 RA packets, which cause LwIP to overwrite the IPv4 DNS
servers, resulting in DNS queries hanging up because there are no
accessible servers.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 22, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a101644e-dc0c-4a7b-82ca-fb1c42fe5a94

📥 Commits

Reviewing files that changed from the base of the PR and between bda74ba and abb4124.

📒 Files selected for processing (1)
  • wled00/network.cpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • wled00/network.cpp

Walkthrough

Adds an ESP32+LwIP IPv6 Router Advertisement (RA) blocker: declaration in fcn_declare.h, implementation using a LwIP raw PCB/ICMPv6 callback in network.cpp, and a conditional call in WLED::setup() to install the blocker on IPv6-capable ESP32 builds.

Changes

Cohort / File(s) Summary
Function Declaration
wled00/fcn_declare.h
Added void installIPv6RABlocker(); declaration in the network-related section.
IPv6 RA Blocker Implementation
wled00/network.cpp
Added IPv6-only block using LwIP: conditional includes (lwip/raw.h, lwip/icmp6.h), a static raw receive callback blockRouterAdvertisements() that detects/consumes ICMPv6 RA packets, and installIPv6RABlocker() that creates a raw PCB and registers the callback (guarded by ARDUINO_ARCH_ESP32 && LWIP_IPV6).
Setup Integration
wled00/wled.cpp
Inserted conditional call to installIPv6RABlocker() in WLED::setup() when ARDUINO_ARCH_ESP32 and LWIP_IPV6 are defined (placed after IR init).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Fix DNS hangs on ESP32' directly addresses the main objective of the PR: preventing DNS hangs caused by IPv6 Router Advertisement packets overwriting configured DNS servers. The change is concise, specific, and clearly summarizes the primary purpose of the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can generate a title for your PR based on the changes.

Add @coderabbitai placeholder anywhere in the title of your PR and CodeRabbit will replace it with a title based on the changes in the PR. You can change the placeholder by changing the reviews.auto_title_placeholder setting.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@wled00/network.cpp`:
- Around line 377-384: The new C++ callback and installer use 4-space
indentation; change them to the repository's 2-space indentation style. Update
the function blockRouterAdvertisements (and the related installer callback
bodies added nearby) to use 2 spaces per indent level throughout their bodies,
keeping existing control flow and return logic unchanged so only whitespace is
modified.
- Around line 386-389: installIPv6RABlocker calls raw_recv() unconditionally
after raw_new_ip_type() which can return NULL; update installIPv6RABlocker to
check the returned ra_blocker for NULL and avoid calling raw_recv() (and
optionally log or return) when allocation fails so you don't pass a NULL pcb to
raw_recv; locate the function installIPv6RABlocker and the symbols
raw_new_ip_type, ra_blocker, raw_recv, and blockRouterAdvertisements and add a
simple NULL check (if (!ra_blocker) { /* handle/log and return */ }) before
invoking raw_recv.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4f9a864d-72e7-46af-8f2d-9fddc3af7e8d

📥 Commits

Reviewing files that changed from the base of the PR and between 9cf3a7d and bda74ba.

📒 Files selected for processing (3)
  • wled00/fcn_declare.h
  • wled00/network.cpp
  • wled00/wled.cpp

Comment on lines +386 to +389
void installIPv6RABlocker() {
static struct raw_pcb* ra_blocker = raw_new_ip_type(IPADDR_TYPE_V6, IP6_NEXTH_ICMP6);
raw_recv(ra_blocker, blockRouterAdvertisements, NULL);
}
Copy link
Contributor

@coderabbitai coderabbitai bot Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In lwIP raw API (as used by ESP32 Arduino/ESP-IDF), can raw_new_ip_type(...) return NULL, and what is the documented behavior of calling raw_recv(NULL, ...)?

💡 Result:

Yes, raw_new_ip_type(...) can return NULL if the PCB data structure (allocated via memp_malloc(MEMP_RAW_PCB)) could not be allocated, typically due to memory pool exhaustion. The documented behavior of calling raw_recv(NULL, ...) is not explicitly stated, but standard lwIP API usage requires a valid non-NULL pcb (returned from raw_new_ip_type) to be passed to raw_recv. Passing NULL is undefined behavior, likely resulting in a crash or no-op since pcb->recv and pcb->recv_arg would be dereferenced/null. Applications must check if raw_new_ip_type returned non-NULL before calling raw_recv on it. In ESP32 Arduino/ESP-IDF lwIP, the raw API is available but ESP-IDF officially does not support it (recommends sockets/netconn), though it can be enabled via LWIP_RAW.

Citations:


Guard raw_recv() against raw PCB allocation failure.

raw_new_ip_type() can return NULL if memory allocation fails. Calling raw_recv() with a NULL pointer causes undefined behavior and likely crashes. Add a NULL check after allocation before invoking raw_recv().

Proposed fix
 void installIPv6RABlocker() {
-    static struct raw_pcb* ra_blocker = raw_new_ip_type(IPADDR_TYPE_V6, IP6_NEXTH_ICMP6);
-    raw_recv(ra_blocker, blockRouterAdvertisements, NULL);
+  static struct raw_pcb* ra_blocker = nullptr;
+  if (ra_blocker) return;
+
+  ra_blocker = raw_new_ip_type(IPADDR_TYPE_V6, IP6_NEXTH_ICMP6);
+  if (!ra_blocker) {
+    DEBUG_PRINTLN(F("IPv6 RA blocker: failed to allocate raw PCB"));
+    return;
+  }
+
+  raw_recv(ra_blocker, blockRouterAdvertisements, nullptr);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@wled00/network.cpp` around lines 386 - 389, installIPv6RABlocker calls
raw_recv() unconditionally after raw_new_ip_type() which can return NULL; update
installIPv6RABlocker to check the returned ra_blocker for NULL and avoid calling
raw_recv() (and optionally log or return) when allocation fails so you don't
pass a NULL pcb to raw_recv; locate the function installIPv6RABlocker and the
symbols raw_new_ip_type, ra_blocker, raw_recv, and blockRouterAdvertisements and
add a simple NULL check (if (!ra_blocker) { /* handle/log and return */ })
before invoking raw_recv.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called once during program setup. If we're out of heap then, nothing will ever work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@willmmiles, that's a fair point — in an embedded firmware context, if memp_malloc fails during early boot, the device is effectively in an unrecoverable state regardless. A NULL guard here would add noise without providing any practical protection. I'll withdraw the suggestion.

ʕ •ᴥ• ʔ


🧠 Learnings used
Learnt from: softhack007
Repo: wled/WLED PR: 5436
File: wled00/wled.cpp:791-798
Timestamp: 2026-03-22T00:14:25.425Z
Learning: In arduino-esp32 v2.x (and consistent with v1.0.6), calling WiFi.mode(WIFI_MODE_NULL) on ESP32 physically shuts down BOTH the STA and AP interfaces at the hardware/driver level. WIFI_MODE_NULL=0 clears all mode bits (WIFI_MODE_STA=bit0, WIFI_MODE_AP=bit1). Therefore, setting apActive=false immediately after WiFi.mode(WIFI_MODE_NULL) in WLED's initConnection() is correct and accurately reflects the actual hardware state. The official arduino-esp32 docs confirm: "To change the hostname, reset WiFi with WiFi.mode(WIFI_MODE_NULL), then proceed with WiFi.setHostname(...) and restart WiFi from scratch." AP behavior policy (AP_BEHAVIOR_NO_CONN, AP_BEHAVIOR_TEMPORARY, AP_BEHAVIOR_ALWAYS) is preserved in apBehavior (not apActive), so handleConnection() will correctly re-initiate AP on the next cycle if the policy requires it.

Learnt from: willmmiles
Repo: wled/WLED PR: 4859
File: usermods/udp_name_sync/udp_name_sync.cpp:62-70
Timestamp: 2025-08-21T00:00:17.196Z
Learning: In WLED codebase, UDP packet processing code must never rely on null termination being present on the wire and must always validate packet size first. The codebase already has some vulnerabilities where malformed data can crash the software, so avoiding introduction of new ones is a priority.

Learnt from: willmmiles
Repo: wled/WLED PR: 0
File: :0-0
Timestamp: 2025-08-29T01:28:10.512Z
Learning: In ESP8266 code, the system_get_rst_info() function cannot return null as it returns a pointer to static memory. This is a common C API pattern that hides internal structure details, so null checks are unnecessary for this function.

Learnt from: DedeHai
Repo: wled/WLED PR: 4791
File: wled00/FX_fcn.cpp:1187-1191
Timestamp: 2025-08-29T15:38:46.208Z
Learning: In WLED's allocate_buffer() function, BFRALLOC_ENFORCE_PSRAM already includes fallback logic to DRAM if PSRAM is not available, as documented in the comment "use PSRAM if available, otherwise fall back to DRAM". The function also uses validateFreeHeap() for additional safety checks. During setup() when finalizeInit() runs, PSRAM has vast available memory making failures unlikely.

Learnt from: willmmiles
Repo: wled/WLED PR: 4890
File: lib/NeoESP32RmtHI/include/NeoEsp32RmtHIMethod.h:173-180
Timestamp: 2025-09-02T01:56:43.841Z
Learning: willmmiles prefers to maintain consistency with upstream NeoPixelBus patterns (like unchecked malloc in construct() methods) rather than diverging until improvements are made upstream first, to minimize maintenance burden and keep the codebase aligned.

Learnt from: softhack007
Repo: wled/WLED PR: 5355
File: wled00/util.cpp:635-638
Timestamp: 2026-02-07T16:06:08.677Z
Learning: PSRAM-related compilation guards should enable PSRAM code only for ESP32 variants that actually include PSRAM: ESP32-C61, ESP32-C5, and ESP32-P4. Exclude ESP32-C3, ESP32-C6, and ESP8266 from these guards. Apply this rule across the codebase (not just wled00/util.cpp) by reviewing and updating PSRAM guards/macros in all relevant files (C/C++ headers and sources).

Learnt from: softhack007
Repo: wled/WLED PR: 4893
File: wled00/set.cpp:95-95
Timestamp: 2026-03-14T20:56:46.543Z
Learning: Guideline: Ensure WiFi hostname is set after WiFi.mode() but before WiFi.begin() to avoid default esp-XXXXXX hostname being used in DHCP. This ordering only takes effect after the STA interface exists (so avoid placing hostname setting before initConnection). In WLED, place the hostname configuration inside initConnection() (after WiFi.disconnect(true) and before WiFi.begin()) rather than in earlier boot code like deserializeConfig(). This rule should be applied in code reviews for WLED’s network initialization paths in wled00/*.cpp, and note that on ESP8266 the ordering is less strict but still desirable for consistency.

Copy link
Collaborator

@DedeHai DedeHai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no means to test but code looks good to me

@netmindz
Copy link
Member

Is this just in V4 ?
Should we be backporting to 0-15 ?

@OberfragGER
Copy link

OberfragGER commented Mar 23, 2026

Okay.
Im using a Wemos D1 Mini Pro and i have reboots with all current firmwares with my new AP from Ubiquity. Im using the U7 Pro. The UK Ultra works fine.
With FritzBox Wifi it runs fine.

With this fix, the network connection is still up but i got a 503 Error. It seems that i run out of RAM. Ping is working the whole time but no web frontend. I have tested the WLED_16.0.0-alpha_ESP02.bin. The compat.bin didnt work on that device.

When i switch back to the old wifi - everything works fine.
Free heap goes from 7kb to 17kb.

Changing Force 802.11g mode (ESP8266 only) or Disable WiFi sleep doesnt help with the problem.

Tested a NodeMCU. Same problem.

@softhack007 softhack007 added this to the 16.0.0 beta milestone Mar 23, 2026
@softhack007
Copy link
Member

softhack007 commented Mar 23, 2026

@OberfragGER can you test with an esp32 based board?

This fix does not change anything on esp8266 (esp01, esp02, esp12, nodemcu, etc), its only for esp32 (classic, S3, S2, C3) with the "V4" framework.

@OberfragGER
Copy link

@OberfragGER can you test with an esp32 based board?

This fix does not change anything on esp8266 (esp01, esp02, esp12, nodemcu, etc), its only for esp32 (classic, S3, S2, C3) with the "V4" framework.

Im not sure if i have any ESP32 V4. Is that one?
grafik

This Device is working fine. No disconnects or something like this. Everything is as fine as expected.

@softhack007
Copy link
Member

softhack007 commented Mar 24, 2026

I don't have any DNS problems on my local wifi - but I can say that nothing is "obviously broken" in a few tests I did on one of my S3 boards after uploading this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants