[Bug]: orphaned channel recovery overwrites active ChannelManager with stale RN backup, causing crash on every launch #479

@piotr-iohk

What happened?

After updating from v2.0.6 to v2.1.0 via TestFlight, the app entered a crash loop on first launch. The new fetchOrphanedChannelMonitorsIfNeeded code (introduced in v2.1.0 via commits d90d7d2e and 1345585d) ran for the first time because the rnChannelRecoveryChecked UserDefaults key did not exist in v2.0.6.

The recovery code fetched a stale channel manager and channel monitor from the old RN remote backup (channel 5e89998d... at update_id 5, commitment numbers 3/2). This data was passed as a ChannelDataMigration to lightningService.setup(), which fed it to ldk-node's builder. The builder unconditionally wrote the stale data to VSS, overwriting the current ChannelManager that tracked two active channels (d87c2879... and f32667f2...).

On peer connect, LDK detected the node had "fallen behind" for channel 5e89998d... (peer proved knowledge of commitment states far beyond update_id 5) and called log_and_panic!("We have fallen behind..."). With panic = "abort" in the release profile, this killed the process instantly — no crash trace in the Swift logger.

The app crashed on every manual relaunch (13 attempts between 07:54–08:12 UTC). Each time, LDK loaded the stale ChannelManager from VSS, connected to the peer, and panicked again. The last attempt was made offline, which prevented the peer connection and thus the panic, but the app will crash again as soon as it goes back online.

Chain of events in code

  1. WalletViewModel.start() line 141: channelMigration == nil, falls through to fetchOrphanedChannelMonitorsIfNeeded
  2. fetchOrphanedChannelMonitorsIfNeeded line 259: isChannelRecoveryChecked == false (never set in v2.0.6), guard passes
  3. MigrationsService.fetchRNRemoteLdkData(): fetches stale channel_manager + 1 monitor from RN backup server
  4. Back in start() lines 143-147: stale data packaged as ChannelDataMigration, passed to lightningService.setup(channelMigration:)
  5. ldk-node builder.rs lines 1646-1654: unconditionally writes stale monitor to VSS KVStore — no freshness check
  6. LDK loads stale ChannelManager, only knows channel 5e89998d... at update_id 5
  7. Peer sends channel_reestablish proving it knows commitment states far beyond update_id 5, so LDK panics ("fallen behind")
  8. isChannelRecoveryChecked was set to true on line 151, so migration doesn't re-run, but VSS already has stale data

Expected behavior

The orphaned channel recovery check should not apply stale RN backup data when the native LDK node already has an active, up-to-date ChannelManager with its own channels. The recovery was intended for users who migrated from RN and had channels that were never picked up — not for users whose native node has been running successfully for weeks.

Steps to Reproduce

  1. Run Bitkit iOS v2.0.6 with an active Lightning channel (migrated from RN previously, with the RN backup still on the remote server)
  2. Update to v2.1.0 via TestFlight
  3. Launch the app
  4. App crashes shortly after startup (during LDK peer connect)
  5. Every subsequent launch crashes the same way

Logs / Screenshots / Recordings

bitkit_logs_2026-03-06_08-12-04.zip

Key log lines from first crash session (bitkit_foreground_2026-03-06_07-54-20.log):

[07:54:28.320] Prepared 1/1 channel monitors for migration - Migration
[07:54:28.323] Found 1 monitors on RN backup for pre-startup recovery - WalletViewModel
[07:54:31.299] Migrating channel monitor: e1a200a95611d0490b2de78328ed706bee6619258283d6fadf8458378d99895e_0
[07:54:31.371] Applied channel migration: 1 monitors
[07:54:37.998] Successfully loaded channel 5e89998d... at update_id 5
[07:54:38.000] Queueing monitor update to ensure missing channel d87c2879... is force closed
[07:54:38.000] Queueing monitor update to ensure missing channel f32667f2... is force closed

Then the log just stops — the Rust panic with abort kills the process before Swift can log anything.

Contrast with the last healthy session from v2.0.6 (bitkit_foreground_2026-03-05_11-52-58.log):

[11:53:07.649] Reconnected channel d87c2879... with no loss

Bitkit Version

v2.1.0 (build 181) — crashed on update from v2.0.6

Device / OS

iPhone (physical device, TestFlight)

Reproducibility

Always

This will happen to any user who:

  • Migrated from RN to native iOS
  • Has the old RN backup still on the remote server
  • Had active channels on the native app
  • Updates from v2.0.6 (or earlier) to v2.1.0

Additional context

  • Migrated from RN, had been running native iOS since at least Feb 17
  • Lightning channels were healthy and active for weeks before the update
  • The fetchOrphanedChannelMonitorsIfNeeded feature was introduced in v2.1.0 (commit d90d7d2e, refactored in 1345585d)
  • Issue #459 ("recover force-closed channel funds lost during RN migration") is the original request that led to this recovery code
  • Issue #428 ("UI feedback for channel_monitors migration failures") is related
  • Two bugs compound here:
    1. Bitkit iOS: No guard to skip orphaned channel recovery when the native node already has active channels
    2. ldk-node: builder.rs migration code writes unconditionally to VSS without checking if newer data exists (no update_id comparison)
  • User impact: App crashes on every launch when online. The two previously-active channels (d87c2879..., f32667f2...) are now being force-closed. Channel 5e89998d... is irrecoverably stale and will need peer-side force-close for fund recovery.

Recommended fixes

  1. Immediate (iOS): Skip orphaned channel recovery if the native LDK node already has a valid ChannelManager in VSS storage. Alternatively, set isChannelRecoveryChecked = true during app migration completion so it's never false for already-migrated users.
  2. Defense (ldk-node): Add update_id / freshness comparison in builder.rs before overwriting existing channel monitors.
  3. Recovery (this user): Stale ChannelManager in VSS needs to be cleared. Peer should force-close 5e89998d... for on-chain fund recovery.
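
A minimal sketch of the kind of freshness guard that fix 2 describes, written in Rust to match ldk-node. The function names and the idea of passing in the update_id already stored are assumptions for illustration only; this is not ldk-node's actual builder API, and the real code would have to read the existing values from the KVStore first. It covers two checks: never replace a ChannelManager that already exists in the store, and only write a migrated monitor when it is strictly newer than the stored one.

```rust
/// Hypothetical guard: a migrated ChannelManager may only be written when the
/// store does not already hold one. An existing manager belongs to a running
/// native node and must never be replaced by backup data.
pub fn should_write_manager(existing_manager_present: bool) -> bool {
    !existing_manager_present
}

/// Hypothetical guard: a migrated channel monitor may only be written when no
/// monitor is stored for that channel, or when the incoming update_id is
/// strictly greater than the stored one.
pub fn should_write_monitor(existing_update_id: Option<u64>, incoming_update_id: u64) -> bool {
    match existing_update_id {
        // Nothing stored for this channel yet: the migration is the only copy.
        None => true,
        // Something is stored: overwrite only with strictly newer state.
        Some(current) => incoming_update_id > current,
    }
}
```

In the scenario reported here, the store already held a ChannelManager for the two active channels, so should_write_manager(true) would return false and the stale RN manager would have been rejected before reaching VSS. The monitor check alone would not have been enough, since the native store may have had no monitor for 5e89998d... to compare against.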
