Description
What happened?
After updating from v2.0.6 to v2.1.0 via TestFlight, the app entered a crash loop on first launch. The new fetchOrphanedChannelMonitorsIfNeeded code (introduced in v2.1.0 via commits d90d7d2e and 1345585d) ran for the first time because the rnChannelRecoveryChecked UserDefaults key did not exist in v2.0.6.
The recovery code fetched a stale channel manager and channel monitor from the old RN remote backup (channel 5e89998d... at update_id 5, commitment numbers 3/2). This data was passed as a ChannelDataMigration to lightningService.setup(), which fed it to ldk-node's builder. The builder unconditionally wrote the stale data to VSS, overwriting the current ChannelManager that tracked two active channels (d87c2879... and f32667f2...).
On peer connect, LDK detected the node had "fallen behind" for channel 5e89998d... (peer proved knowledge of commitment states far beyond update_id 5) and called log_and_panic!("We have fallen behind..."). With panic = "abort" in the release profile, this killed the process instantly — no crash trace in the Swift logger.
The app crashed on every manual relaunch (13 attempts between 07:54–08:12 UTC). Each time, LDK loaded the stale ChannelManager from VSS, connected to peer, and panicked again. The last attempt was offline, which prevented the peer connect and thus the panic — but the app remains broken whenever it goes back online.
Chain of events in code
- WalletViewModel.start() line 141: channelMigration == nil, falls through to fetchOrphanedChannelMonitorsIfNeeded
- fetchOrphanedChannelMonitorsIfNeeded line 259: isChannelRecoveryChecked == false (never set in v2.0.6), guard passes
- MigrationsService.fetchRNRemoteLdkData(): fetches stale channel_manager + 1 monitor from RN backup server
- Back in start() lines 143-147: stale data packaged as ChannelDataMigration, passed to lightningService.setup(channelMigration:)
- ldk-node builder.rs lines 1646-1654: unconditionally writes stale monitor to VSS KVStore, with no freshness check
- LDK loads stale ChannelManager, only knows channel 5e89998d... at update_id 5
- Peer sends channel_reestablish proving it knows commitment states far beyond → LDK panics ("fallen behind")
- isChannelRecoveryChecked was set to true on line 151, so migration doesn't re-run, but VSS already has stale data
Expected behavior
The orphaned channel recovery check should not apply stale RN backup data when the native LDK node already has an active, up-to-date ChannelManager with its own channels. The recovery was intended for users who migrated from RN and had channels that were never picked up — not for users whose native node has been running successfully for weeks.
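The expected behavior can be read as a small decision rule. The sketch below is only an illustration of that rule, not the app's code: the real guard would live in the Swift WalletViewModel, and every name here (`RecoveryInputs`, `decide_recovery`, the individual fields) is hypothetical. It assumes the app can cheaply tell, before starting the node, whether native storage already holds a ChannelManager.

```rust
/// Hypothetical snapshot of what the app knows before starting the node.
#[derive(Debug, Clone, Copy)]
struct RecoveryInputs {
    /// Mirrors the isChannelRecoveryChecked flag described in the report.
    recovery_checked: bool,
    /// Assumption: the app can see whether VSS already holds a ChannelManager.
    native_channel_manager_exists: bool,
    /// Assumption: number of channels the native node currently tracks.
    native_channel_count: usize,
}

#[derive(Debug, PartialEq, Eq)]
enum RecoveryDecision {
    /// Do not touch the RN backup at all.
    Skip,
    /// Fetch the RN backup and consider it for migration.
    FetchRnBackup,
}

/// Decide whether the orphaned-channel recovery may run.
fn decide_recovery(inputs: RecoveryInputs) -> RecoveryDecision {
    // The check already ran once on this install.
    if inputs.recovery_checked {
        return RecoveryDecision::Skip;
    }
    // A native node with its own state must never be overwritten by RN data.
    if inputs.native_channel_manager_exists || inputs.native_channel_count > 0 {
        return RecoveryDecision::Skip;
    }
    RecoveryDecision::FetchRnBackup
}
```

Under this rule, the reporter's state (flag unset, native ChannelManager present, two active channels) resolves to `Skip`, while a freshly migrated user with no native state still gets the recovery.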
Steps to Reproduce
- Run Bitkit iOS v2.0.6 with an active Lightning channel (migrated from RN previously, with the RN backup still on the remote server)
- Update to v2.1.0 via TestFlight
- Launch the app
- App crashes shortly after startup (during LDK peer connect)
- Every subsequent launch crashes the same way
Logs / Screenshots / Recordings
bitkit_logs_2026-03-06_08-12-04.zip
Key log lines from first crash session (bitkit_foreground_2026-03-06_07-54-20.log):
[07:54:28.320] Prepared 1/1 channel monitors for migration - Migration
[07:54:28.323] Found 1 monitors on RN backup for pre-startup recovery - WalletViewModel
[07:54:31.299] Migrating channel monitor: e1a200a95611d0490b2de78328ed706bee6619258283d6fadf8458378d99895e_0
[07:54:31.371] Applied channel migration: 1 monitors
[07:54:37.998] Successfully loaded channel 5e89998d... at update_id 5
[07:54:38.000] Queueing monitor update to ensure missing channel d87c2879... is force closed
[07:54:38.000] Queueing monitor update to ensure missing channel f32667f2... is force closed
Then the log just stops — the Rust panic with abort kills the process before Swift can log anything.
Contrast with the last healthy session from v2.0.6 (bitkit_foreground_2026-03-05_11-52-58.log):
[11:53:07.649] Reconnected channel d87c2879... with no loss
Bitkit Version
v2.1.0 (build 181) — crashed on update from v2.0.6
Device / OS
iPhone (physical device, TestFlight)
Reproducibility
Always
This will happen to any user who:
- Migrated from RN to native iOS
- Has the old RN backup still on the remote server
- Had active channels on the native app
- Updates from v2.0.6 (or earlier) to v2.1.0
Additional context
- Migrated from RN, had been running native iOS since at least Feb 17
- Lightning channels were healthy and active for weeks before the update
- The fetchOrphanedChannelMonitorsIfNeeded feature was introduced in v2.1.0 (commit d90d7d2e, refactored in 1345585d)
- Issue #459 (recover force-closed channel funds lost during RN migration) is the original request that led to this recovery code
- Issue #428 (UI feedback for channel_monitors migration failures) is related
- Two bugs compound here:
  - Bitkit iOS: No guard to skip orphaned channel recovery when the native node already has active channels
  - ldk-node: builder.rs migration code writes unconditionally to VSS without checking if newer data exists (no update_id comparison)
- User impact: App crashes on every launch when online. The two previously-active channels (d87c2879..., f32667f2...) are now being force-closed. Channel 5e89998d... is irrecoverably stale and will need peer-side force-close for fund recovery.
Recommended fixes
- Immediate (iOS): Skip orphaned channel recovery if the native LDK node already has a valid ChannelManager in VSS storage. Alternatively, set isChannelRecoveryChecked = true during app migration completion so it's never false for already-migrated users.
- Defense (ldk-node): Add update_id / freshness comparison in builder.rs before overwriting existing channel monitors.
- Recovery (this user): Stale ChannelManager in VSS needs to be cleared. Peer should force-close 5e89998d... for on-chain fund recovery.
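The ldk-node defense can be sketched as a compare-before-write step. This is a minimal illustration under stated assumptions, not the actual builder.rs code: the store is a plain `HashMap` standing in for the VSS KVStore, and `StoredMonitor`, `MigrationOutcome`, and `migrate_monitor` are hypothetical names. It also assumes the `update_id` of both the existing and the incoming monitor can be read before writing.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a persisted channel monitor.
#[derive(Debug, Clone, PartialEq, Eq)]
struct StoredMonitor {
    update_id: u64,
    bytes: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
enum MigrationOutcome {
    /// The incoming monitor was persisted.
    Written,
    /// The store already held equal or newer state, so nothing was written.
    SkippedStale { existing: u64, incoming: u64 },
}

/// Write `incoming` under `key` only if the store has no entry for that key
/// or the existing entry has a strictly lower update_id.
fn migrate_monitor(
    store: &mut HashMap<String, StoredMonitor>,
    key: &str,
    incoming: StoredMonitor,
) -> MigrationOutcome {
    if let Some(existing) = store.get(key) {
        if existing.update_id >= incoming.update_id {
            return MigrationOutcome::SkippedStale {
                existing: existing.update_id,
                incoming: incoming.update_id,
            };
        }
    }
    store.insert(key.to_string(), incoming);
    MigrationOutcome::Written
}
```

A per-monitor comparison like this only protects monitors that already exist under the same key. In the reported case the damage came from the stale ChannelManager replacing the current one, so a complete fix would presumably also refuse to overwrite an existing ChannelManager, as the iOS-side recommendation above suggests.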