diff --git a/docker/docker-compose.yml b/docker/docker-compose.yml
index d3d29d45f..b511141d0 100644
--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@@ -9,7 +9,18 @@ services:
     ports:
       - "3000:3000"   # REST API
       - "3001:3001"   # WebSocket
-      - "5005:5005/udp" # ESP32 UDP
+      # ESP32 UDP. On Linux/macOS this works with multiple ESP32 nodes out of
+      # the box. On Docker Desktop for Windows, multi-source UDP is collapsed
+      # to one source IP at the WSL/Hyper-V boundary, so frames from all but
+      # one node are silently dropped (issues #374, #386).
+      #
+      # Windows workaround: change this to "5006:5005/udp" and run the host
+      # relay so every datagram arrives from the same loopback source:
+      #
+      #   python scripts/udp-relay.py --listen-port 5005 --forward-port 5006
+      #
+      # See docs/TROUBLESHOOTING.md §9 for details.
+      - "5005:5005/udp"
     environment:
       - RUST_LOG=info
       # CSI_SOURCE controls the data source for the sensing server.
diff --git a/docs/TROUBLESHOOTING.md b/docs/TROUBLESHOOTING.md
index bea536cce..90ba94b20 100644
--- a/docs/TROUBLESHOOTING.md
+++ b/docs/TROUBLESHOOTING.md
@@ -109,3 +109,75 @@ ssh thyhack@100.90.238.87
 **Symptom:** Plugging into the right USB-C port (when facing the board with USB-C toward you) shows no serial device on the host.
 
 **Fix:** Use the left USB-C port. On most ESP32-S3-DevKitC boards, the left port is the USB-to-UART bridge (CP2102/CH340) used for flashing and serial monitor. The right port is the native USB (USB-JTAG) which requires different drivers and isn't used by the RuView firmware.
+
+---
+
+## 9. Docker Desktop on Windows drops UDP from multiple ESP32 nodes
+
+**Symptom:** Two or more ESP32 nodes are flashed, provisioned, and visibly transmit on the network — `tcpdump`/Wireshark on the Windows host shows datagrams from every node — but inside the Docker container only one source IP arrives. `/api/v1/sensing/latest` shows a single node and the live UI freezes or tracks only one body. Reported in #374 (4-node bench) and reproduced in #386 (6-node demo, RuView v0.7.0).
+
+**Root cause:** Docker Desktop on Windows runs the engine inside a WSL2 / Hyper-V VM. Inbound UDP from the host LAN is forwarded through `vpnkit` / `vEthernet`, and datagrams from multiple source IPs are multiplexed onto a single virtual socket. The first source IP "wins"; subsequent unique sources are silently dropped at the VM boundary. This is a Docker Desktop limitation, not a sensing-server bug — `host.docker.internal` and `--network host` do not help (host networking is not implemented for the Linux engine on Windows).
+
+**Fix:** Run the bundled UDP relay on the host so every forwarded datagram arrives from the same loopback source IP, which Docker passes through unchanged.
+
+```powershell
+# 1. Start the relay (PowerShell or any terminal)
+python scripts/udp-relay.py --listen-port 5005 --forward-port 5006
+
+# 2. Edit docker/docker-compose.yml — change the ESP32 UDP mapping from
+#      - "5005:5005/udp"
+#    to
+#      - "5006:5005/udp"
+
+# 3. Bring the stack up
+docker compose -f docker/docker-compose.yml up
+```
+
+ESP32 nodes still target the host on `--target-ip :5005` — no firmware re-provisioning is needed. The relay is `scripts/udp-relay.py` (stdlib only, no extra deps). Verify with `--verbose` that each node's source IP appears at least once before forwarding stabilises on a single ephemeral relay port.
+
+**Prevention:** Linux and macOS hosts are unaffected; the relay only needs to run on Docker Desktop for Windows. If Docker Desktop ships per-source UDP forwarding (tracked at [docker/for-win#1144](https://github.com/docker/for-win/issues/1144) and related), this workaround can be retired.
+
+**Prior art:** PR #413 (`txhno`) proposed a docs-only writeup of the same workaround; this entry supersedes it.
+
+---
+
+## 10. `404` on the visualization page when running sensing-server
+
+**Symptom:** `sensing-server` starts cleanly, logs `HTTP server listening on http://localhost:3000`, but loading `http://localhost:3000/` (or `/ui/index.html`) returns `404 Not Found`. Reported in #188.
+
+**Root cause:** The default `--ui-path ../../ui` is resolved relative to the process's *current working directory*, not the binary location. When the binary is launched from anywhere other than `crates/wifi-densepose-sensing-server/`, the relative path doesn't reach the UI assets and Axum's static file handler returns 404.
+
+**Fix:** Pass an absolute UI path, run the binary from the crate directory, or use the Docker image (which bundles the UI under `/app/ui`).
+
+```bash
+# Option A — absolute path (recommended for production)
+sensing-server --source esp32 --udp-port 5005 --http-port 3000 \
+    --ws-port 3001 --ui-path /absolute/path/to/ui
+
+# Option B — run from the crate dir (works for local dev / cargo run)
+cd v2/crates/wifi-densepose-sensing-server
+cargo run -- --source esp32
+
+# Option C — Docker (no path config needed)
+docker compose -f docker/docker-compose.yml up sensing-server
+```
+
+**Prevention:** Track future work in #188 to fall back to a path resolved relative to the executable when the cwd-relative path doesn't exist, so the binary works regardless of where it's launched.
+
+---
+
+## 11. Boot loop on `--edge-tier 1` or `--edge-tier 2`
+
+**Symptom:** ESP32-S3 boots normally with `--edge-tier 0`, but flashing the same firmware with `--edge-tier 1` or `2` produces a boot loop. Serial output reaches `cpu_start` and `heap_init`, then resets repeatedly. Reported in #438 against firmware `v0.4.3.1-esp32-3-g66e2fa083-dir`.
+
+**Root cause:** Edge tiers 1 and 2 enable the on-device DSP pipeline on Core 1. In the affected build, the `edge_dsp` task ran a tight per-frame loop without yielding, so the FreeRTOS task watchdog tripped on Core 1 and panicked. Tier 0 is passthrough only and doesn't activate the pipeline, so the watchdog never fires there.
+
+**Fix:** Flash the [v0.4.3.1-esp32](https://github.com/ruvnet/RuView/releases/tag/v0.4.3.1-esp32) release or later — the DSP task yield fixes have shipped on `main` since the build in the report.
+
+```bash
+# Verify what version you're on (look for "App version" in serial output on boot)
+python -m serial.tools.miniterm COM7 115200
+# Expect: "App version: v0.4.3.1-esp32" or higher
+```
+
+If the boot loop persists on a release build, capture a full serial trace including the watchdog backtrace and reopen #438 with the new build hash.
diff --git a/scripts/udp-relay.py b/scripts/udp-relay.py
new file mode 100644
index 000000000..223d87493
--- /dev/null
+++ b/scripts/udp-relay.py
@@ -0,0 +1,103 @@
+#!/usr/bin/env python3
+"""
+UDP relay for Docker Desktop on Windows (issue #374, #386).
+
+Docker Desktop on Windows multiplexes inbound UDP from multiple source IPs to
+a single source IP inside the container, which causes packets from all but one
+ESP32 node to be silently dropped at the WSL/Hyper-V boundary.
+
+This relay listens on the host, then re-emits each datagram from its own
+single socket back to a localhost port that Docker forwards into the
+container. Because every forwarded datagram now has the same source IP/port
+(the relay's loopback socket), Docker passes them all through.
+
+Usage:
+    # Default: listen on host:5005, forward to 127.0.0.1:5006
+    # Container should be started with -p 5006:5005/udp.
+    python scripts/udp-relay.py
+
+    # Custom ports
+    python scripts/udp-relay.py --listen-port 5005 --forward-port 5006
+
+    # Verbose (one line per packet)
+    python scripts/udp-relay.py --verbose
+"""
+
+import argparse
+import socket
+import sys
+import time
+
+
+def run_relay(listen_host: str, listen_port: int, forward_host: str,
+              forward_port: int, stats_interval: float, verbose: bool) -> int:
+    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+    rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+    try:
+        rx.bind((listen_host, listen_port))
+    except OSError as e:
+        print(f"udp-relay: failed to bind {listen_host}:{listen_port}: {e}",
+              file=sys.stderr)
+        return 1
+
+    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+    forward_addr = (forward_host, forward_port)
+
+    print(f"udp-relay: listening on {listen_host}:{listen_port} "
+          f"-> forwarding to {forward_host}:{forward_port}")
+    print("udp-relay: collapses multi-source UDP to a single loopback source "
+          "so Docker Desktop on Windows forwards every packet (issue #374).")
+
+    sources: dict[tuple[str, int], int] = {}
+    total = 0
+    last_stats = time.monotonic()
+
+    try:
+        while True:
+            data, src = rx.recvfrom(65535)
+            tx.sendto(data, forward_addr)
+            total += 1
+            sources[src] = sources.get(src, 0) + 1
+
+            if verbose:
+                print(f"udp-relay: {src[0]}:{src[1]} -> "
+                      f"{forward_host}:{forward_port} ({len(data)}B)")
+
+            now = time.monotonic()
+            if now - last_stats >= stats_interval:
+                print(f"udp-relay: forwarded {total} pkts from "
+                      f"{len(sources)} sources in last {now - last_stats:.0f}s")
+                sources.clear()
+                total = 0
+                last_stats = now
+    except KeyboardInterrupt:
+        print("udp-relay: stopping")
+        return 0
+    finally:
+        rx.close()
+        tx.close()
+
+
+def main() -> int:
+    p = argparse.ArgumentParser(description=__doc__,
+                                formatter_class=argparse.RawDescriptionHelpFormatter)
+    p.add_argument("--listen-host", default="0.0.0.0",
+                   help="Host interface to bind (default: 0.0.0.0)")
+    p.add_argument("--listen-port", type=int, default=5005,
+                   help="Port the ESP32 nodes send to (default: 5005)")
+    p.add_argument("--forward-host", default="127.0.0.1",
+                   help="Where to forward packets (default: 127.0.0.1)")
+    p.add_argument("--forward-port", type=int, default=5006,
+                   help="Port Docker maps into the container (default: 5006)")
+    p.add_argument("--stats-interval", type=float, default=10.0,
+                   help="Seconds between stats lines (default: 10)")
+    p.add_argument("--verbose", action="store_true",
+                   help="Log every forwarded packet")
+    args = p.parse_args()
+
+    return run_relay(args.listen_host, args.listen_port, args.forward_host,
+                     args.forward_port, args.stats_interval, args.verbose)
+
+
+if __name__ == "__main__":
+    sys.exit(main())