Skip to content

frr: sweep stale grout routes across zebra restart#605

Draft
maxime-leroy wants to merge 5 commits intoDPDK:mainfrom
maxime-leroy:cleanup_fib_routes_frr_startup
Draft

frr: sweep stale grout routes across zebra restart#605
maxime-leroy wants to merge 5 commits intoDPDK:mainfrom
maxime-leroy:cleanup_fib_routes_frr_startup

Conversation

@maxime-leroy
Copy link
Copy Markdown
Collaborator

@maxime-leroy maxime-leroy commented Apr 22, 2026

When zebra is killed and restarted while grout keeps running, grout's FIB retains every route the previous FRR instance had installed. With the current plugin those routes become orphans: the startup sync explicitly drops every self-originated route returned by the grout dump (rt_grout.c:337) so zebra's RIB never sees them, and no mechanism ever garbage-collects them from grout. Unreclaimed prefixes linger indefinitely and can blackhole traffic to stale nexthops.

Upstream FRR solves the same problem for the kernel dplane through the ZEBRA_FLAG_SELFROUTE flag, rib_sweep_route and the GR_RUN metaQ sub-queue: kernel-read routes are re-inserted as selfroute placeholders with a pre-startup uptime; protocol clients take them over via rib_compare_routes when they re-announce; anything unreclaimed is purged when rib_sweep_route fires after -K seconds. The grout plugin could not reuse that directly because grout_sync runs from frr_config_post, which is after zebra_main_router_started() has already stamped zrouter.startup_time and armed t_rib_sweep. Entries created in grout_sync therefore have uptime > startup_time and would never match the sweep predicate, and the sweep itself is already scheduled.

Transplant the mechanism into the plugin:

  • Thread a bool startup flag through grout_route4_change / grout_route6_change / grout_route_change (mirroring the existing grout_nexthop_change signature), true on the initial dump path, false on asynchronous notifications. The blanket self-origin skip stays for the notification case (an echo of an install we sent) but becomes injection for the dump case: build the route_entry with ZEBRA_FLAG_SELFROUTE and force re->uptime = zrouter.startup_time so the sweep predicate SELFROUTE && uptime <= startup_time matches, and rib_compare_routes will replace the placeholder cleanly when a client reclaims the prefix.

  • In grout_add_del_route, short-circuit DPLANE_OP_ROUTE_INSTALL and DPLANE_OP_ROUTE_UPDATE when the ctx carries ZEBRA_FLAG_SELFROUTE. Those are the routes we just learned from grout; pushing them back down would be a redundant round-trip. DPLANE_OP_ROUTE_DELETE is left alone so the sweep's uninstall reaches grout via the normal flow.

  • At the very end of the sync chain (last VRF of grout_sync_routes), cancel zrouter.t_rib_sweep and re-arm it with delay equal to the operator's -K value. This aligns the sweep on sync_complete + -K instead of startup_time + -K, giving our backdated SELFROUTE entries the full reclaim window the operator asked for. The -K value is recovered by re-parsing /proc/self/cmdline with getopt_long on a minimal option table: zebra_di.gr_cleanup_time is static in main.c and not exported, re-invoking frr_getopt would re-run frr_opt() with its side effects (exit on -h/-v, double-allocation of modules, errors++ path), and zrouter.t_rib_sweep loses the -K value once it has fired, so none of those routes are reliable. /proc/self/cmdline always holds the original argv. When -K was passed without a value the plugin falls back to ZEBRA_GR_DEFAULT_RIB_SWEEP_TIME exactly as zebra itself does; non-GR startups keep delay = 0 (fire immediately) to match kernel-dplane wipe semantics.

With these pieces in place, zebra's existing SELFROUTE machinery works uniformly over grout-injected entries:

  • Reclaim: clients reconnecting over zapi take over matching SELFROUTE entries via rib_compare_routes; the fresh entry loses SELFROUTE and has a current uptime.
  • GR_RUN (metaQ sub-q 10): per-client early cleanup triggered by ZEBRA_CLIENT_ROUTE_UPDATE_COMPLETE works automatically because our entries carry the real protocol type (via origin2zebra) and a backdated uptime (older than client->restart_time).
  • rib_sweep_route catch-all: reprogrammed to fire after the full grout sync chain completes, purges any SELFROUTE entry nobody reclaimed, and the subsequent DPLANE_OP_ROUTE_DELETE ctxs flow through grout_add_del_route to remove the routes from grout's FIB (the install guard above is install/update only).

Problem

When zebra is killed and restarted while grout remains running, grout's FIB retains routes installed by the previous FRR instance. The plugin’s startup sync currently drops self-originated routes returned by the grout dump so zebra’s RIB never sees them; without a reclamation path these prefixes persist in grout and can blackhole traffic.

Technical changes

Route handling during startup sync

  • Thread a startup boolean through grout_route4_change(), grout_route6_change(), and grout_route_change() (true for initial dump, false for async notifications).
  • For dump-path entries, set ZEBRA_FLAG_SELFROUTE on the created route_entry and backdate re->uptime to zrouter.startup_time so they match the sweep predicate (SELFROUTE && uptime <= startup_time) and are reclaimable when protocol clients reannounce.
  • Preserve existing self-origin skip behavior for notification-path entries.

Dplane operations

  • In grout_add_del_route(), short-circuit DPLANE_OP_ROUTE_INSTALL and DPLANE_OP_ROUTE_UPDATE when the dplane context carries ZEBRA_FLAG_SELFROUTE (return success without redundant install/update round-trips).
  • Leave DPLANE_OP_ROUTE_DELETE unchanged so rib_sweep_route can uninstall unreclaimed SELFROUTE entries.
  • Add a targeted delete short-circuit for the SRTE end-of-replay sync marker to avoid needless uninstall round-trips.

Nexthop handling

  • When startup is true, lookup corresponding nexthop-hash entries and backdate their uptime to zrouter.startup_time; log a warning if the nexthop entry cannot be found.

Sweep rescheduling and sync marker

  • Append an IPv6 self-route sync marker (::/128) tagged with GROUT_SYNC_MARKER_TAG during replay to detect end-of-replay visibility.
  • After the last VRF in grout_sync_routes completes, cancel any prior zrouter.t_rib_sweep and re-arm it with a delay equal to the operator’s -K value (re-parsed from /proc/self/cmdline once via getopt_long and cached; fallback to ZEBRA_GR_DEFAULT_RIB_SWEEP_TIME if -K provided without a value; non-GR startups keep delay = 0).
  • Defer arming rib_sweep_route until the marker is observed (or max retry reached), so backdated SELFROUTE entries receive the intended reclaim window; delete the marker once observed.

API/signature changes

  • grout_route4_change(bool new, struct gr_ip4_route *gr_r4) → grout_route4_change(bool new, struct gr_ip4_route *gr_r4, bool startup)
  • grout_route6_change(bool new, struct gr_ip6_route *gr_r6) → grout_route6_change(bool new, struct gr_ip6_route *gr_r6, bool startup)
  • grout_route_change(...) now accepts a bool startup parameter.
  • Add public constant GROUT_SYNC_MARKER_TAG.

Tests and infra

  • New smoke tests:
    • smoke/frr_restart_graceful_frr_test.sh: validates graceful restart with -K5, ensuring orphan routes are swept while configured routes persist and are reclassified in zebra.
    • smoke/frr_restart_sweep_no_k_frr_test.sh: validates sweep behavior when zebra is started without -K.
    • smoke/grout_restart_resync_frr_test.sh: validates grout self-resynchronization after grout restart while FRR remains running.
  • smoke/_init_frr.sh: keep flog variable non-local so tests can inspect log file after start.

Behavioral outcomes

  • Reconnecting protocol clients reclaim matching SELFROUTE entries via rib_compare_routes() (SELFROUTE removed, uptime updated to current).
  • Per-client early cleanup functions operate correctly because entries carry protocol type and backdated uptime.
  • rib_sweep_route, after the rescheduled delay, purges unreclaimed SELFROUTE entries and causes DPLANE_OP_ROUTE_DELETE flows that remove routes from grout’s FIB.

Status

  • Work in progress: unit tests still in development; author notes upstream FRR PR Startup after crash issues FRRouting/frr#21550 addresses similar kernel-dplane startup cases and recommends auditing NHG cleanup across zebra restarts as follow-up.

@maxime-leroy maxime-leroy marked this pull request as draft April 22, 2026 10:43
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a startup boolean to grout route-change handlers (grout_route4_change, grout_route6_change, internal grout_route_change) and passes it from call sites to distinguish replay/full-sync from notification-driven updates. During startup, self-originated routes and nexthops have their uptime backdated to zrouter.startup_time. Implements a replay-completion sync marker (an IPv6 selfroute with GROUT_SYNC_MARKER_TAG) that is injected, polled for, deleted on observation, and used to defer arming rib_sweep_route. Adds startup short-circuits on dplane up-path for selfroutes and an SRTE-marker delete short-circuit. Adds grout_read_k_from_cmdline(), header macro, smoke tests, and makes flog global in the test init script.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@frr/zebra_dplane_grout.c`:
- Around line 166-178: The getopt_long call is using the shortopts string
"-:K::" so the 'K' option is declared with an optional argument, which means
space-separated args like "-K 30" won't populate optarg; change the shortopts to
require an argument (use "K:" in the shortopts, e.g. "-:K:") so both attached
("-K30") and separated ("-K 30") forms are accepted, keep the existing long
option entry in lo, and continue setting ret = atoi(optarg) inside the
getopt_long loop when c == 'K' and optarg is non-NULL to parse the timeout
correctly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 357a74cc-6952-473e-b474-fc3c6704842c

📥 Commits

Reviewing files that changed from the base of the PR and between 46807d1 and ddc7d5d.

📒 Files selected for processing (3)
  • frr/rt_grout.c
  • frr/rt_grout.h
  • frr/zebra_dplane_grout.c

Comment thread frr/zebra_dplane_grout.c
@maxime-leroy maxime-leroy force-pushed the cleanup_fib_routes_frr_startup branch 2 times, most recently from ac471d2 to 9bd72e3 Compare April 22, 2026 11:00
@grout-bot grout-bot force-pushed the cleanup_fib_routes_frr_startup branch from 9bd72e3 to ca3e506 Compare April 23, 2026 10:00
@maxime-leroy
Copy link
Copy Markdown
Collaborator Author

I am still working on a unit test, please don't merge it.

@maxime-leroy
Copy link
Copy Markdown
Collaborator Author

Note: FRRouting/frr#21550 covers the same startup-after-crash class on the kernel-dplane side. It also includes a proto-owned NHG refcount fix. NHG cleanup across zebra restarts should be audited for grout as well (follow-up).

@maxime-leroy maxime-leroy marked this pull request as ready for review April 23, 2026 10:39
@maxime-leroy maxime-leroy force-pushed the cleanup_fib_routes_frr_startup branch from ca3e506 to 8e53f72 Compare April 23, 2026 15:04
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@frr/zebra_dplane_grout.c`:
- Around line 207-213: zebra_dplane_grout currently treats
grout_read_k_from_cmdline() return value `0` as absent and substitutes
ZEBRA_GR_DEFAULT_RIB_SWEEP_TIME, which breaks explicit `-K0`; update the logic
around zrouter.graceful_restart to distinguish "no value" (-1) from a valid 0 by
using a check like (k >= 0) ? k : ZEBRA_GR_DEFAULT_RIB_SWEEP_TIME so that an
explicit 0 yields immediate sweep; adjust the block that sets `delay` before
calling event_add_timer(zrouter.master, rib_sweep_route, NULL, delay,
&zrouter.t_rib_sweep) and keep references to grout_read_k_from_cmdline(),
ZEBRA_GR_DEFAULT_RIB_SWEEP_TIME, zrouter.t_rib_sweep, and rib_sweep_route.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9affa067-4cf8-477c-88ce-06717c97538d

📥 Commits

Reviewing files that changed from the base of the PR and between ca3e506 and 8e53f72.

📒 Files selected for processing (4)
  • frr/rt_grout.c
  • frr/rt_grout.h
  • frr/zebra_dplane_grout.c
  • smoke/frr_restart_sweep_frr_test.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • frr/rt_grout.h

Comment thread frr/zebra_dplane_grout.c
@maxime-leroy
Copy link
Copy Markdown
Collaborator Author

WIP STILL

@maxime-leroy maxime-leroy force-pushed the cleanup_fib_routes_frr_startup branch from 8e53f72 to 81d3080 Compare April 24, 2026 12:37
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
smoke/frr_restart_graceful_frr_test.sh (1)

108-110: ⚠️ Potential issue | 🔴 Critical

Duplicate of the marker-tag mismatch — grep value is wrong here too.

tag 1735157248 is 0x676C6600, not GROUT_SYNC_MARKER_TAG (0x67727400 = 1735554048). The guard never matches, so the "marker still in zebra RIB after sweep" assertion is inert in this test as well. Fix together with the sweep-no-k smoke test (see the detailed comment on smoke/frr_restart_sweep_no_k_frr_test.sh lines 101-103).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smoke/frr_restart_graceful_frr_test.sh` around lines 108 - 110, The grep is
using the wrong numeric tag (1735157248) for the sync marker; replace that
literal with the correct GROUT_SYNC_MARKER_TAG value (0x67727400 / 1735554048)
or reference the shared constant/variable used elsewhere so the check matches
the actual marker; update the same change in the paired sweep-no-k test to keep
both guards consistent (search for the grep invocation or the string "tag
1735157248" in this script and in smoke/frr_restart_sweep_no_k_frr_test.sh and
swap to GROUT_SYNC_MARKER_TAG or 1735554048).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@smoke/frr_restart_sweep_no_k_frr_test.sh`:
- Around line 101-103: The test greps for the wrong hard-coded decimal tag;
replace the literal "1735157248" with a decimal derived from the correct hex
macro (GROUT_SYNC_MARKER_TAG == 0x67727400U). Compute the decimal in the shell
(e.g. marker_dec=$((0x67727400)) or convert the macro value to unsigned via
printf/%u) and use that variable in the grep check (grep -qE "$marker_dec") so
the test actually matches the tag printed by vtysh; update both occurrences (the
checks in frr_restart_sweep_no_k_frr_test.sh and
frr_restart_graceful_frr_test.sh) to use this computed value.

---

Duplicate comments:
In `@smoke/frr_restart_graceful_frr_test.sh`:
- Around line 108-110: The grep is using the wrong numeric tag (1735157248) for
the sync marker; replace that literal with the correct GROUT_SYNC_MARKER_TAG
value (0x67727400 / 1735554048) or reference the shared constant/variable used
elsewhere so the check matches the actual marker; update the same change in the
paired sweep-no-k test to keep both guards consistent (search for the grep
invocation or the string "tag 1735157248" in this script and in
smoke/frr_restart_sweep_no_k_frr_test.sh and swap to GROUT_SYNC_MARKER_TAG or
1735554048).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 965dfbed-b4b1-4d60-b04c-53fa3fce363d

📥 Commits

Reviewing files that changed from the base of the PR and between 8e53f72 and 81d3080.

📒 Files selected for processing (7)
  • frr/rt_grout.c
  • frr/rt_grout.h
  • frr/zebra_dplane_grout.c
  • frr/zebra_dplane_grout.h
  • smoke/_init_frr.sh
  • smoke/frr_restart_graceful_frr_test.sh
  • smoke/frr_restart_sweep_no_k_frr_test.sh
✅ Files skipped from review due to trivial changes (1)
  • frr/zebra_dplane_grout.h
🚧 Files skipped from review as they are similar to previous changes (1)
  • frr/rt_grout.h

Comment thread smoke/frr_restart_sweep_no_k_frr_test.sh Outdated
@maxime-leroy maxime-leroy force-pushed the cleanup_fib_routes_frr_startup branch from 81d3080 to 93e0d98 Compare April 24, 2026 15:16
@maxime-leroy
Copy link
Copy Markdown
Collaborator Author

This PR must be carefully review it before any merge please.

@christophefontaine @rjarry @vjardin

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
frr/zebra_dplane_grout.c (1)

283-291: ⚠️ Potential issue | 🟠 Major

Explicit -K0 is silently replaced by the default GR window.

grout_read_k_from_cmdline() returns -1 for absent/value-less -K and 0 only when the operator explicitly attached -K0 (e.g., -K0 in zebra_options). The > 0 check conflates the two, so -K0 ends up extending the stale-route reclaim by ZEBRA_GR_DEFAULT_RIB_SWEEP_TIME instead of sweeping immediately — the opposite of what was requested. Use >= 0 to distinguish "parsed zero" from "absent".

🔧 Suggested fix
 	long delay = 0;
 	if (zrouter.graceful_restart) {
-		long window = (grout_ctx.gr_cleanup_time > 0) ?
+		long window = (grout_ctx.gr_cleanup_time >= 0) ?
 			grout_ctx.gr_cleanup_time :
 			ZEBRA_GR_DEFAULT_RIB_SWEEP_TIME;
 		time_t now = monotime(NULL);
 		time_t deadline = zrouter.startup_time + window;
 		delay = (deadline > now) ? (deadline - now) : 0;
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frr/zebra_dplane_grout.c` around lines 283 - 291, The grace-period handling
incorrectly treats an explicit zero as "absent" because it checks
grout_ctx.gr_cleanup_time > 0; change that check to >= 0 so that an explicit -K0
(parsed as 0 by grout_read_k_from_cmdline) results in delay=0; adjust the
conditional using the symbols grout_ctx.gr_cleanup_time,
ZEBRA_GR_DEFAULT_RIB_SWEEP_TIME and the existing delay calculation in the block
guarded by zrouter.graceful_restart to use >= 0 instead of > 0.
smoke/frr_restart_graceful_frr_test.sh (1)

114-116: ⚠️ Potential issue | 🔴 Critical

Same wrong marker-tag decimal as in the sweep-no-K test — assertion unreachable.

0x67727400 is 1735554048, not 1735157248. This grep never matches the tag vtysh prints, so the "still in zebra RIB after sweep" check can't actually fail. Derive the decimal from the hex macro to keep the check honest (see fix in frr_restart_sweep_no_k_frr_test.sh).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smoke/frr_restart_graceful_frr_test.sh` around lines 114 - 116, The grep uses
a wrong hardcoded decimal tag (1735157248) so the check is unreachable; update
the assertion to derive the decimal from the hex marker (0x67727400) instead of
using the incorrect literal—replace the hardcoded "tag 1735157248" with a
shell-derived decimal (e.g. using $((0x67727400)) or printf conversion) or the
same variable/expansion pattern used in frr_restart_sweep_no_k_frr_test.sh so
the grep matches the actual tag emitted by vtysh.
smoke/frr_restart_sweep_no_k_frr_test.sh (1)

96-98: ⚠️ Potential issue | 🔴 Critical

Grep pattern will never match — assertion is a silent no-op.

GROUT_SYNC_MARKER_TAG is 0x67727400 = 1735554048, not 1735157248 (that would be 0x676C6600). vtysh prints the tag in decimal, so the current regex never fires and any regression leaving the marker in zebra's RIB after the sweep passes silently — exactly the thing this assertion is meant to catch.

🔧 Derive the decimal from the macro so it stays correct
-vtysh -c "show ipv6 route ::/128 detail" 2>/dev/null \
-	| grep -qE "tag 1735157248" \
-	&& fail "sync marker still in zebra RIB after sweep"
+marker_tag=$(printf '%u' 0x67727400)
+vtysh -c "show ipv6 route ::/128 detail" 2>/dev/null \
+	| grep -qE "tag $marker_tag" \
+	&& fail "sync marker still in zebra RIB after sweep"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@smoke/frr_restart_sweep_no_k_frr_test.sh` around lines 96 - 98, The grep is
using the wrong decimal tag (1735157248) so the check never fires; update the
grep to derive the correct decimal from the GROUT_SYNC_MARKER_TAG hex value
instead of hardcoding the wrong number. Locate the vtysh check (vtysh -c "show
ipv6 route ::/128 detail" | grep -qE "tag 1735157248") and replace the literal
with a shell expression that converts the macro hex to decimal (e.g. use
$((0x67727400)) or printf "%d" 0x67727400) so the regex matches the actual tag
printed by vtysh.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@frr/zebra_dplane_grout.c`:
- Around line 283-291: The grace-period handling incorrectly treats an explicit
zero as "absent" because it checks grout_ctx.gr_cleanup_time > 0; change that
check to >= 0 so that an explicit -K0 (parsed as 0 by grout_read_k_from_cmdline)
results in delay=0; adjust the conditional using the symbols
grout_ctx.gr_cleanup_time, ZEBRA_GR_DEFAULT_RIB_SWEEP_TIME and the existing
delay calculation in the block guarded by zrouter.graceful_restart to use >= 0
instead of > 0.

In `@smoke/frr_restart_graceful_frr_test.sh`:
- Around line 114-116: The grep uses a wrong hardcoded decimal tag (1735157248)
so the check is unreachable; update the assertion to derive the decimal from the
hex marker (0x67727400) instead of using the incorrect literal—replace the
hardcoded "tag 1735157248" with a shell-derived decimal (e.g. using
$((0x67727400)) or printf conversion) or the same variable/expansion pattern
used in frr_restart_sweep_no_k_frr_test.sh so the grep matches the actual tag
emitted by vtysh.

In `@smoke/frr_restart_sweep_no_k_frr_test.sh`:
- Around line 96-98: The grep is using the wrong decimal tag (1735157248) so the
check never fires; update the grep to derive the correct decimal from the
GROUT_SYNC_MARKER_TAG hex value instead of hardcoding the wrong number. Locate
the vtysh check (vtysh -c "show ipv6 route ::/128 detail" | grep -qE "tag
1735157248") and replace the literal with a shell expression that converts the
macro hex to decimal (e.g. use $((0x67727400)) or printf "%d" 0x67727400) so the
regex matches the actual tag printed by vtysh.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 387be13f-6954-42f6-9ba5-187e630da105

📥 Commits

Reviewing files that changed from the base of the PR and between 81d3080 and 93e0d98.

📒 Files selected for processing (8)
  • frr/rt_grout.c
  • frr/rt_grout.h
  • frr/zebra_dplane_grout.c
  • frr/zebra_dplane_grout.h
  • smoke/_init_frr.sh
  • smoke/frr_restart_graceful_frr_test.sh
  • smoke/frr_restart_sweep_no_k_frr_test.sh
  • smoke/grout_restart_resync_frr_test.sh
🚧 Files skipped from review as they are similar to previous changes (2)
  • smoke/_init_frr.sh
  • frr/rt_grout.c

@vjardin
Copy link
Copy Markdown
Contributor

vjardin commented Apr 24, 2026

what's happen if in between, the marker is injected, there is a crash, then restart with another marker injected (while the former one is not removed) ?

@maxime-leroy
Copy link
Copy Markdown
Collaborator Author

what's happen if in between, the marker is injected, there is a crash, then restart with another marker injected (while the former one is not removed) ?

Grout crash:

The old marker does not survive a grout restart because the reconnect path wipes zebra’s RIB through "grout_ns_reset() -> vrf_terminate()". The marker is removed along with all the other routes before the next sync injects a new one.

Zebra crash:

The marker is not installed into the FIB, only kept in zebra’s RIB for polling visibility. After a zebra crash, that RIB state is gone, so there is no stale marker left when zebra restarts.

@maxime-leroy maxime-leroy force-pushed the cleanup_fib_routes_frr_startup branch 9 times, most recently from 4f9c7d7 to c56c82f Compare April 27, 2026 15:57
@vjardin
Copy link
Copy Markdown
Contributor

vjardin commented Apr 28, 2026

LGTM

@maxime-leroy : it seems OK for me. I feel a bit lost by the over verbose commit logs.

For sure, such feature turns the usage of FRR/zebra with Grout to a nice product ready system wich independent services that can be upgraded isolated. It should help during runtime and development too. It was a real missing piece.

Cosmetic: I'd prefer a shorter commit log that summarizes the commits's intends. Or if we want to keep such long commit logs which has some values, to have 2 sections into the commit log: a TL;DR summary at the beginning with the intends and then the descriptions you did.

@maxime-leroy maxime-leroy force-pushed the cleanup_fib_routes_frr_startup branch 4 times, most recently from fcc414b to d546057 Compare April 29, 2026 12:53
When grout dies, the plugin wipes zebra's RIB via vrf_terminate()
and re-syncs from scratch. But rib_add_multipath does not attach
routes synchronously: it enqueues a zebra_early_route in
META_QUEUE_EARLY_ROUTE, and rib_link runs only when that sub-queue
drains. ere's still pending at vrf_terminate time can attach to a
freed route_table or leak into the recreated VRF.

Introduce a barrier marker the plugin can poll to detect drain
completion: a dummy ::/128 SRTE entry with tag GROUT_SYNC_MARKER_TAG
and distance INFINITY. rib_process skips it (no SELECT, no
redistribute, no dplane install) and FIFO ordering within
subq[EARLY_ROUTE] (a single global queue, not per-VRF) guarantees
every prior ere is attached when the marker becomes observable via
route_node_lookup.

In grout_reconnect, inject the marker before vrf_terminate; once
polling observes it, run grout_ns_reset and re-schedule grout_sync.
Stale markers from a previous cycle are flushed via
rib_meta_queue_early_route_cleanup + rib_delete to keep the matched
poll unambiguous.

Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Reviewed-by: Christophe Fontaine <cfontain@redhat.com>
When zebra is killed and restarted while grout keeps running, every
route and nexthop-group from the previous FRR instance stays in
grout's FIB as an orphan. The startup sync drops self-originated
routes returned by the dump, so zebra's RIB never sees them and
nothing garbage-collects them. Stale prefixes can blackhole traffic
to dead nexthops indefinitely.

The kernel dplane already handles this through the metaQ. Routes are
not attached synchronously: rib_add_multipath enqueues a
zebra_early_route in META_QUEUE_EARLY_ROUTE, and rib_link only runs
when that sub-queue drains. The kernel side re-injects kernel-read
routes as ZEBRA_FLAG_SELFROUTE placeholders, protocol clients reclaim
them via rib_compare_routes, and rib_sweep_route purges anything
unreclaimed at zrouter.startup_time + K. Donald Sharp's 9e74dda819bf
("zebra: Delay some processing until after startup is finished") and
1cd1d048407 ("zebra: Fixup some startup issues") stamp startup_time
after route_read drains, so the operator's K window is measured from
the actual end of startup rather than process-start.

Transplant the mechanism. grout_sync runs from frr_config_post, well
after zebra_main_router_started has stamped startup_time and armed
t_rib_sweep, so the plugin has to glue around the upstream timing.

Drain barrier. Reuse the marker primitive: inject ::/128 SRTE at the
tail of the dump, poll until observable, then re-stamp startup_time
= monotime(NULL) and arm rib_sweep_route at delay = K. FIFO ordering
in META_QUEUE_EARLY_ROUTE guarantees every prior SELFROUTE injection
is attached and visible to rib_sweep_route by the time the marker
becomes observable.

Native sweep. zebra_main_router_started arms t_rib_sweep before our
dump completes. Pre-arm zrouter.t_rib_sweep with a long placeholder
delay at frr_late_init: event_add_timer's *t_ptr!=NULL early-return
(lib/event.c:1430) makes the native arm a no-op. The marker observe
path then cancels and re-arms with the real -K.

-K itself is re-parsed once from /proc/self/cmdline:
zebra_di.gr_cleanup_time is static in main.c and zrouter.t_rib_sweep
loses the value after firing.

Nexthop-groups follow the route lifecycle: rib_sweep_route only
sweeps routes, but dump-injected NHEs lose their refcount when the
last referencing route is swept and are then GC'd by FRR's
nexthop-group keep timer.

Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Reviewed-by: Christophe Fontaine <cfontain@redhat.com>
Validate the SELFROUTE lifecycle for a zebra crash while grout keeps
running. Two staticd prefixes drive the test:

- prefix_orphan: installed via vtysh runtime only, never persisted.
  After SIGKILL+restart, staticd has no memory of it; the plugin's
  SELFROUTE placeholder is never reclaimed, rib_sweep_route wipes
  it at T+K.

- prefix_kept: installed via vtysh AND appended to frr.conf. At
  restart, mgmtd replays the integrated config, staticd re-pushes
  via zapi within -K, rib_compare_routes promotes the SELFROUTE
  entry to ZEBRA_ROUTE_STATIC, the sweep predicate no longer
  matches.

start_frr's $flog is dropped from the local declaration so tests
sourcing _init_frr.sh can grep the FRR log file directly.

Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Reviewed-by: Christophe Fontaine <cfontain@redhat.com>
Variant of frr_restart_graceful_frr_test.sh that runs zebra with the
default command line (no -K). graceful_restart=false and
zebra_main_router_started arms rib_sweep_route at delay=0, racing
the plugin's async replay chain. The pre-arm placeholder in
zd_grout_plugin_init is what neutralises that race; the test
asserts exactly one rib_sweep firing as proof it worked.

The NH deletion event is not asserted: without -K the sweep races
mgmtd's zapi apply of "nexthop-group keep 1" and the keep-around
timer defaults to 180s, exceeding the smoke test window. This is a
pre-existing FRR limitation unrelated to the plugin fix.

Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Reviewed-by: Christophe Fontaine <cfontain@redhat.com>
Validate the reverse direction: zebra survives, grout is SIGKILL'd
and respawned with an empty FIB. The plugin detects the broken
socket, calls grout_ns_reset which wipes zebra's RIB via
vrf_terminate, recreates the default VRF, and re-syncs from the
repopulated grout. staticd (still alive across the crash) re-pushes
its route via zapi when the nexthop becomes resolvable through the
recreated port.

Poll grcli's JSON snapshot rather than the events stream: the
original grcli events subscriber died with the first grout instance
and is not respawned by the smoke framework.

Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Reviewed-by: Christophe Fontaine <cfontain@redhat.com>
@grout-bot grout-bot force-pushed the cleanup_fib_routes_frr_startup branch from d546057 to e16cf66 Compare April 29, 2026 13:30
@christophefontaine christophefontaine marked this pull request as draft April 29, 2026 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants