fix(setup): persist firewall rules across reboot + report errors instead of swallowing #29

Closed
NeritonDias wants to merge 1 commit into evolution-foundation:develop from NeritonDias:fix/firewall-persistence

Conversation

@NeritonDias
Contributor

Summary

Reproduced on Oracle Cloud (Ubuntu 24.04 cloud image): the wizard prints "Firewall ports opened (80, 443)" but the dashboard is unreachable from outside, and after a reboot the iptables rules vanish entirely. Cloudflare returns 523 "Origin Unreachable" for the duration.

Repro

sudo git clone --branch develop https://github.com/EvolutionAPI/evo-nexus.git /root/evonexus
cd /root/evonexus && make setup   # answer "Domain with SSL"
# wizard finishes, prints "✓ Portas do firewall abertas (80, 443)"
sudo iptables-save | grep -E 'dport (80|443)'
# → empty. The "✓" was a lie.
sudo reboot
# … even if the rules WERE in memory, they're gone now

Root cause

Three bugs in the original one-liner:

os.system("ufw allow 80/tcp 2>/dev/null; ufw allow 443/tcp 2>/dev/null; ...")
os.system("iptables -I INPUT -p tcp --dport 80 -j ACCEPT 2>/dev/null; ...")
print("✓ Firewall ports opened (80, 443)")  # always prints
  1. 2>/dev/null swallows every error. OCI/Ubuntu cloud images don't ship ufw — every ufw line silently fails. The iptables fallback often runs, but if it errors (permission, nf_tables backend rejection, missing CAP_NET_ADMIN inside a container) you'd never know.
  2. Nothing calls netfilter-persistent save (or writes to /etc/iptables/rules.v4). Even when iptables -I succeeds, the next reboot reloads the persistent ruleset which doesn't include 80/443 → dashboard offline until the operator manually re-runs setup.
  3. Re-running the wizard adds duplicate ACCEPT rules each time (no -C check before -I).
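The first bug comes down to running shell strings with stderr discarded. A minimal sketch of the alternative, assuming a hypothetical helper named `run_checked` (not the code in this PR), runs the command without a shell and hands the error text back to the caller:

```python
import subprocess
import sys


def run_checked(cmd):
    """Run cmd (a list of arguments) without a shell; return (ok, error_text)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except OSError as exc:
        # Covers a missing binary (e.g. ufw on a minimal cloud image)
        # as well as a binary that cannot be executed.
        return False, f"{cmd[0]}: {exc.strerror or exc}"
    if proc.returncode != 0:
        # Keep stderr so the wizard can print the real reason.
        return False, proc.stderr.strip() or f"exit code {proc.returncode}"
    return True, ""


# Harmless demonstration: a child process that fails with a message on stderr.
ok, err = run_checked(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
)
print(ok, err)
```

With a helper of this shape, the success message would be printed only when `ok` is true, and `err` would be shown otherwise.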

Fix

New helper _open_firewall_ports(ports):

  • Prefers ufw when present (handles persistence itself).
  • Falls back to iptables, with -C idempotency check before -I (re-runs on the same machine don't pile up duplicate rules).
  • Persists via netfilter-persistent save — auto-installs iptables-persistent non-interactively on Debian/Ubuntu if missing. Last-resort fallback writes /etc/iptables/rules.v4 directly.
  • Surfaces actual errors instead of silencing them. Reports which backend was used and which persistence path succeeded.
  • Best-effort cloud-provider detection (OCI, AWS, GCP, Azure, DigitalOcean, Hetzner) via /sys/class/dmi/id/* — prints a hint that host-level firewall changes alone may not be enough; the operator likely also needs to open the port in the cloud Security List/Group/NSG. No host-level command can fix the cloud network firewall — but a clear hint saves hours of debugging "523 Origin Unreachable" from Cloudflare.
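The `-C` before `-I` idempotency check from the list above can be sketched as follows. The function name `ensure_accept_rule`, its return values, and the injectable `run` parameter are assumptions made for illustration; the real helper in the PR may be structured differently.

```python
import subprocess


def _run(cmd):
    """Return the exit code of cmd, or 127 if it cannot be executed."""
    try:
        return subprocess.run(cmd, capture_output=True).returncode
    except OSError:
        return 127


def ensure_accept_rule(port, run=_run):
    """Insert an INPUT ACCEPT rule for a TCP port unless one already exists.

    Returns "present", "added", or "failed".
    """
    rule = ["INPUT", "-p", "tcp", "--dport", str(port), "-j", "ACCEPT"]
    # `iptables -C` exits 0 when a matching rule is already in the chain.
    if run(["iptables", "-C", *rule]) == 0:
        return "present"
    # No match (or the check itself failed): try to insert the rule.
    if run(["iptables", "-I", *rule]) == 0:
        return "added"
    return "failed"
```

Passing the runner as a parameter keeps the logic testable without root privileges: a fake `run` can record the calls and simulate a rule that already exists.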

7 new translation keys, mirrored across en-US / pt-BR / es. Bundles remain at exact key parity (160 each).
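A parity check of this kind can be sketched in a few lines. The bundle contents below are made-up placeholders, and `parity_report` and `placeholders` are hypothetical names, not functions from the repository:

```python
import string


def parity_report(bundles):
    """Map each locale to the keys it lacks relative to the union of all keys."""
    all_keys = set()
    for bundle in bundles.values():
        all_keys |= set(bundle)
    return {loc: sorted(all_keys - set(b)) for loc, b in bundles.items()}


def placeholders(template):
    """Return the named fields used by a str.format template."""
    return {name for _, name, _, _ in string.Formatter().parse(template) if name}


# Hypothetical bundles: "es" is missing one key.
bundles = {
    "en-US": {"fw_failed": "{tool} failed: {err}", "fw_ok": "Ports opened"},
    "pt-BR": {"fw_failed": "{tool} falhou: {err}", "fw_ok": "Portas abertas"},
    "es": {"fw_failed": "{tool} falló: {err}"},
}
print(parity_report(bundles))
print(placeholders(bundles["en-US"]["fw_failed"]))
```

An empty list for every locale means the bundles are at key parity; comparing `placeholders` across locales for the same key catches a translation that dropped or renamed a format field.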

Test plan

  • python -c "import ast; ast.parse(open('setup.py',encoding='utf-8').read())" — clean parse
  • Translation parity — all 3 bundles at 160 keys, set-diff empty, every new format string survives .format(tool=, err=, provider=)
  • Oracle Cloud Ubuntu 24.04: rules go in via iptables, persist via netfilter-persistent, survive reboot. Hint about OCI Security List shown.
  • Ubuntu desktop with ufw: rules go in via ufw, persist automatically, no extra hint shown.
  • Re-running wizard: idempotent (no duplicate INPUT rules).
  • Operator validation on OCI: clean install → reboot → iptables-save | grep dport shows 80 + 443 → curl -I https://<domain>/ from a different machine returns 200.

Breaking changes

None.

  • The behavior is strictly a superset: when ufw was working before, it still works now (and was already auto-persistent).
  • When iptables was the active backend, rules are now persisted instead of vanishing — that's the bug fix.
  • The cloud-provider hint is informational; if no provider is detected (bare metal, unknown hypervisor), nothing extra is printed.
  • No new mandatory dependencies; iptables-persistent is auto-installed only when iptables is the active backend AND apt-get is available.
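For readers unfamiliar with DMI-based detection, the informational hint could be driven by something like the sketch below. The marker substrings and the list of DMI fields are illustrative assumptions, since the PR text does not say which values it matches; note that a vendor string such as "Microsoft" also appears on on-premises Hyper-V guests, which is one reason the detection can only be best-effort.

```python
from pathlib import Path

# Illustrative substrings only; the PR does not list the values it matches.
PROVIDER_MARKERS = {
    "OracleCloud": "OCI",
    "Amazon": "AWS",
    "Google": "GCP",
    "Microsoft": "Azure",
    "DigitalOcean": "DigitalOcean",
    "Hetzner": "Hetzner",
}
DMI_FIELDS = ("sys_vendor", "product_name", "chassis_asset_tag", "bios_vendor")


def detect_cloud_provider(dmi_dir="/sys/class/dmi/id"):
    """Return a provider name, or None when nothing recognisable is found."""
    for field in DMI_FIELDS:
        try:
            value = (Path(dmi_dir) / field).read_text(errors="replace")
        except OSError:
            continue  # field missing or unreadable: skip it
        for marker, provider in PROVIDER_MARKERS.items():
            if marker.lower() in value.lower():
                return provider
    return None
```

Returning `None` on bare metal or an unknown hypervisor matches the behavior described above: no provider, no extra output.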

…f swallowing them

Reproduced on Oracle Cloud (Ubuntu 24.04 cloud image): wizard prints
"Firewall ports opened (80, 443)" but the dashboard is unreachable
from outside, and after a reboot the iptables rules vanish entirely.

Three bugs in the original one-liner:

    os.system("ufw allow 80/tcp 2>/dev/null; ufw allow 443/tcp 2>/dev/null; ...")
    os.system("iptables -I INPUT -p tcp --dport 80 -j ACCEPT 2>/dev/null; ...")
    print("Firewall ports opened")  # always prints, regardless

  1. ``2>/dev/null`` swallows every error. On OCI/Ubuntu cloud images
     ``ufw`` isn't installed — the ufw lines all fail silently. The
     iptables fallback often runs, but if it errors (permission,
     nf_tables backend rejection, missing CAP_NET_ADMIN) you'd never
     know.
  2. Nothing calls ``netfilter-persistent save`` (or saves to
     ``/etc/iptables/rules.v4``). Even when iptables -I succeeds,
     the next reboot reloads the persistent ruleset which doesn't
     include 80/443 → dashboard offline until the operator manually
     re-runs setup.
  3. Re-running the wizard adds duplicate ACCEPT rules each time
     (no -C check before -I).

Refactor:

  * New helper ``_open_firewall_ports(ports)`` that prefers ufw when
    present (it persists itself), falls back to iptables with -C
    idempotency check, and PERSISTS via netfilter-persistent —
    auto-installing iptables-persistent on Debian/Ubuntu if missing.
    Falls back further to ``iptables-save > /etc/iptables/rules.v4``.
  * Surfaces actual errors instead of silencing. Reports which
    backend was used and which persistence path succeeded.
  * Best-effort cloud-provider detection (OCI, AWS, GCP, Azure,
    DigitalOcean, Hetzner) via /sys/class/dmi/id/* — prints a hint
    that host-level firewall changes alone may not be enough; the
    operator likely also needs to open the port in the cloud
    Security List/Group/NSG. (No host-level command can fix the
    cloud network firewall — but a clear hint saves hours of
    debugging "523 Origin Unreachable" from Cloudflare.)

Translation keys: 7 new, mirrored across en-US / pt-BR / es. Bundles
remain at exact key parity (160 each).

Verified locally:
  * Oracle Cloud Ubuntu 24.04: rules go in via iptables, persist via
    netfilter-persistent, survive reboot. Hint about OCI Security
    List shown.
  * Ubuntu desktop with ufw: rules go in via ufw, persist
    automatically, no extra hint shown.
  * Re-running wizard: idempotent (no duplicate INPUT rules).

@sourcery-ai sourcery-ai Bot left a comment

Sorry @NeritonDias, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@NeritonDias
Contributor Author

Superseded by #28 — unified all three fixes (start-services preservation, scheduler PID dir, firewall persistence) into a single PR. They form one coherent end-to-end fix for fresh VPS installs surviving the first reboot, and bundling reduces the risk of a partial squash-merge (cf. what happened with #27). The exact firewall changes from this PR are commit bb2030a on the unified branch.

@NeritonDias NeritonDias deleted the fix/firewall-persistence branch April 24, 2026 06:09