RTC: Tag persisted-doc init with dedicated origin to prevent false dirty state on load #77503

Closed
Gustavo-Hilario wants to merge 1 commit into WordPress:trunk from Gustavo-Hilario:fix/rtc-persisted-doc-init-origin

@Gustavo-Hilario Gustavo-Hilario commented Apr 20, 2026

What?

See #76402 — related to RTC's sticky unsaved-changes behavior.

Adds a dedicated transaction origin (PERSISTED_DOC_INIT_ORIGIN) for the initial Y.applyUpdateV2 call that hydrates a post's live Y.Doc from its persisted _crdt_document post_meta. The record observer (onRecordUpdate) now skips editRecord dispatch for transactions carrying that origin, so hydration of the persisted doc no longer masquerades as a peer edit. The reconciliation path that heals any drift between the persisted doc and the REST record is unchanged.
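As a rough illustration of the observer-side check described above, the decision can be reduced to a predicate over the transaction. This is a minimal sketch rather than the code in this PR: `TransactionLike`, `UndoManagerLike`, `shouldDispatchEditRecord`, and the string value of the constant are hypothetical stand-ins for the Yjs transaction, `Y.UndoManager`, and the real observer logic.

```typescript
// Hypothetical stand-ins for illustration; the real types come from Yjs and @wordpress/sync.
const PERSISTED_DOC_INIT_ORIGIN = 'persisted-doc-init'; // assumed value, not taken from the PR

// Stand-in for Y.UndoManager, used only for the instanceof check.
class UndoManagerLike {}

interface TransactionLike {
	origin: unknown;
	local: boolean;
}

function shouldDispatchEditRecord( transaction: TransactionLike ): boolean {
	// Hydration from the persisted doc is not a peer edit, so it never dirties the editor.
	if ( transaction.origin === PERSISTED_DOC_INIT_ORIGIN ) {
		return false;
	}
	// Local transactions are skipped unless they come from undo/redo.
	if ( transaction.local && ! ( transaction.origin instanceof UndoManagerLike ) ) {
		return false;
	}
	// Everything else (for example, a provider-tagged remote update) is dispatched.
	return true;
}
```

Keeping the origin check first means the suppression does not depend on whether the apply is reported as local or remote.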

Why?

With RTC enabled, opening a post whose persisted CRDT document has drifted from the REST record (e.g. after publish, when the server mutates status / slug / link, or after an out-of-band edit like WP-CLI) falsely puts the editor into a dirty state:

  • Save button is active on load with no user edits.
  • The browser's beforeunload warning fires when navigating away.
  • On drafts, the primary action flips from Publish to Save, requiring users to "save" before they can publish.

This has been surfacing via customer reports on WordPress.com Atomic sites running Gutenberg ≥ 22.8 (when RTC became enabled by default in #75739).

Root cause: _applyPersistedCrdtDoc calls Y.applyUpdateV2(targetDoc, update) without an origin. The record observer then treats the resulting transaction as a peer update and dispatches editRecord with whatever fields differ between persisted CRDT and REST record. Those diffs land in the editor's non-transient edit bucket (since status, slug, link are not marked transient), and hasEditsForEntityRecord flips to true.
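To make the drift concrete, here is a simplified sketch of how a stale persisted snapshot turns into non-empty edits. The helper `getDriftedFields` and the sample records are hypothetical; the real comparison lives in the CRDT utilities and handles many more field types.

```typescript
type RecordFields = Record< string, string >;

// Return the fields whose values in the persisted CRDT snapshot differ from the REST record.
function getDriftedFields( persisted: RecordFields, rest: RecordFields ): RecordFields {
	const drift: RecordFields = {};
	for ( const [ key, value ] of Object.entries( persisted ) ) {
		if ( rest[ key ] !== value ) {
			drift[ key ] = value;
		}
	}
	return drift;
}

// Hypothetical data: the CRDT doc was persisted before the server finalized the publish.
const persisted: RecordFields = { status: 'draft', slug: '', title: 'Hello' };
const rest: RecordFields = { status: 'publish', slug: 'hello', title: 'Hello' };

// These stale values are what would be handed to editRecord as if a peer had typed them.
const edits = getDriftedFields( persisted, rest );

// Any non-empty, non-transient edit makes the entity count as dirty.
const hasEdits = Object.keys( edits ).length > 0;
console.log( edits, hasEdits );
```

In this sample only `status` and `slug` differ, which is enough to enable the Save button with no user input.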

#75975 closed an adjacent staleness path (pending Y.Doc updates at save time) but did not cover server-side mutations that arrive after the CRDT doc is persisted — that's the window this PR addresses.

How?

Three changes in @wordpress/sync, one additive test:

  1. packages/sync/src/config.ts — export a new PERSISTED_DOC_INIT_ORIGIN string constant, documented as internal sync-manager use only.

  2. packages/sync/src/manager.ts

    • Import the new constant.
    • In _applyPersistedCrdtDoc, pass the origin as the third argument to Y.applyUpdateV2. Leaves the existing warning comment about not wrapping in a local-origin transact() intact; the third-arg origin doesn't advance the state vector as a local client (the polling provider already uses the same pattern for its own origin tagging).
    • In onRecordUpdate, add an early return when transaction.origin === PERSISTED_DOC_INIT_ORIGIN, placed before the existing transaction.local && !(origin instanceof Y.UndoManager) guard. An inline comment explains why the explicit guard is kept even though the subsequent one would also catch it — the intent needs to be clear and the suppression must survive any future change to Yjs locality semantics on the apply.
  3. packages/sync/src/test/manager.ts — new regression test in the "persisted CRDT doc behavior" suite: sets up a drifted persisted doc, loads the entity, flushes the microtask queue, asserts editRecord was not called, and asserts positively that the reconciliation path still ran (getChangesFromCRDTDoc / applyChangesToCRDTDoc / persistCRDTDoc each invoked once with the expected drift-healing payload). The positive assertions are there specifically to prevent the test from passing vacuously if a future refactor short-circuits _applyPersistedCrdtDoc.
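The shape of the regression test (a negative assertion on `editRecord` plus a positive assertion that reconciliation ran) can be sketched with a toy model. Everything here is hypothetical scaffolding: `FakeDoc` stands in for a Y.Doc, `loadEntity` for the sync manager's load path, and the counters for the mocked handlers.

```typescript
const PERSISTED_DOC_INIT_ORIGIN = 'persisted-doc-init'; // assumed value, not taken from the PR

type Observer = ( origin: unknown ) => void;

// Tiny stand-in for a Y.Doc: applying an update notifies observers with the origin.
class FakeDoc {
	private observers: Observer[] = [];

	observe( observer: Observer ): void {
		this.observers.push( observer );
	}

	applyUpdate( origin?: unknown ): void {
		this.observers.forEach( ( observer ) => observer( origin ) );
	}
}

// Hypothetical load path: hydrate from the persisted doc, then reconcile.
function loadEntity( doc: FakeDoc, tagOrigin: boolean ) {
	const calls = { editRecord: 0, persistCRDTDoc: 0 };

	doc.observe( ( origin ) => {
		if ( origin === PERSISTED_DOC_INIT_ORIGIN ) {
			return; // Hydration, not a peer edit.
		}
		calls.editRecord++;
	} );

	// Hydrate, with or without the dedicated origin.
	doc.applyUpdate( tagOrigin ? PERSISTED_DOC_INIT_ORIGIN : undefined );

	// Reconciliation runs regardless of the origin tag.
	calls.persistCRDTDoc++;

	return calls;
}

const untagged = loadEntity( new FakeDoc(), false ); // editRecord fires once
const tagged = loadEntity( new FakeDoc(), true ); // editRecord is suppressed
```

Asserting on both counters is what keeps the check from passing vacuously: a load path that returned early would leave `persistCRDTDoc` at zero.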

Peer-originated updates from providers continue to dispatch editRecord normally — provider transactions carry their own origins, not this one — so real-time collaborative typing is unaffected.

Testing Instructions

Prerequisite: RTC enabled (Settings → Writing → "Enable real-time collaboration" is on).

False dirty state on load

  • Open any previously-published post in the block editor.
  • Save button is disabled on load.
  • Closing the tab does not trigger a beforeunload warning.

Post-publish state

  • Create a new draft, add a title and a paragraph block.
  • Click Publish and confirm.
  • Save button stays disabled immediately after the publish completes (no reload needed).
  • Save button remains disabled after reloading the page.

Publish button on drafts

  • Create a new draft with content.
  • The top-right primary action reads Publish (not Save).

Collaboration still works (regression)

  • Open the same post in two browsers signed in as different users.
  • Type in browser A.
  • Browser B receives the updates.
  • Browser B's Save button becomes active (correct — there are real unsaved edits from the peer).

Reconciliation still heals drift (regression)

  • With RTC off, publish a post (creates divergence between its _crdt_document and the row).
  • Re-enable RTC and reload the editor.
  • No false dirty state appears.
  • The next save/load cycle brings _crdt_document back in sync with the REST record.

Testing Instructions for Keyboard

No UI-visible changes; existing keyboard interactions are unchanged. Keyboard testing is not required.

Screenshots or screencast

Before: Loading or reloading a published post with no user changes still shows the Save button as active, and the browser warning appears when trying to reload or exit the editor. (Screenshot: Save button active and beforeunload warning firing on reload with no changes.)

After: Loading or reloading the same post shows the Save button as inactive. No spurious beforeunload warning. (Screenshot: Save button disabled on reload with no changes.)

Use of AI Tools

This PR was developed with Anthropic's Claude as a pair-programming assistant, under my review. The root-cause analysis, code changes, and regression test were reviewed manually and through an additional automated code review pass (Claude + OpenAI Codex) before committing. All code has been read, understood, and tested by me.

…y state

When RTC is enabled, opening a post whose persisted `_crdt_document` has drifted
from the REST record (after publish, or any server-side mutation that the CRDT
doc predates) caused the editor to falsely enter a dirty state: Save button
active on load, beforeunload warning firing, and Publish button replaced by Save
on drafts.

Root cause: in `_applyPersistedCrdtDoc`, `Y.applyUpdateV2(targetDoc, update)` was
called without an origin. The record observer (`onRecordUpdate`) therefore
treated the apply as a peer-originated change and dispatched `editRecord` with
whatever fields differed between the persisted CRDT state and the REST record,
writing those diffs into the editor's edit bucket and flipping
`hasEditsForEntityRecord` to true.

Fix: introduce `PERSISTED_DOC_INIT_ORIGIN` in `@wordpress/sync/config` and pass
it as the third argument to `Y.applyUpdateV2` at the initial apply. The record
observer now skips `editRecord` dispatch for transactions carrying that origin.
The reconciliation path below the apply still runs and heals any persisted-doc
drift via `persistCRDTDoc`, so self-healing behavior is preserved.

Peer-originated updates (from providers) retain their own origins and continue
to dirty the editor correctly, so real collaborative typing still works.

A regression test was added to the sync manager test suite.

github-actions Bot commented Apr 20, 2026

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: Gustavo-Hilario <gustavohappyeng@git.wordpress.org>
Co-authored-by: alecgeatches <alecgeatches@git.wordpress.org>
Co-authored-by: mmtr <mmtr86@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@github-actions github-actions Bot added [Package] Sync First-time Contributor Pull request opened by a first-time contributor to Gutenberg repository labels Apr 20, 2026
@github-actions

👋 Thanks for your first Pull Request and for helping build the future of Gutenberg and WordPress, @Gustavo-Hilario! In case you missed it, we'd love to have you join us in our Slack community.

If you want to learn more about WordPress development in general, check out the Core Handbook full of helpful information.

@Gustavo-Hilario
Author

@chriszarate @ellatrix — tagging you for review since you own the RTC subsystem and Core Trac #64622. This targets a specific post-publish dirty-state scenario that #75975 didn't close. Happy to iterate on approach, naming, or test shape.

@alecgeatches
Contributor

@Gustavo-Hilario Thank you for taking a look at this issue! I think your fix here makes sense, but I'd like your help in scoping down the "Testing Instructions" section a bit. I first tried reproducing some of these scenarios on Gutenberg trunk and most didn't show an existing bug:

False dirty state on load

  • Open any previously-published post in the block editor.
  • Save button is disabled on load.
  • Closing the tab does not trigger a beforeunload warning.
1-false-dirty-state.mov

I didn't see a dirty state on load or beforeunload warning in trunk.


Post-publish state

  • Create a new draft, add a title and a paragraph block.
  • Click Publish and confirm.
  • Save button stays disabled immediately after the publish completes (no reload needed).
  • Save button remains disabled after reloading the page.
2-post-publish-shows-bug.mov

After step 3 the Save button is dirty, which appears to be a bug. After reloading, the save button is correctly disabled, so step 4 doesn't appear to be part of the reproduction.


Publish button on drafts

  • Create a new draft with content.
  • The top-right primary action reads Publish (not Save).
3-publish-button-drafts.mov

I'm seeing the same behavior on trunk.


Collaboration still works (regression)

  • Open the same post in two browsers signed in as different users.
  • Type in browser A.
  • Browser B receives the updates.
  • Browser B's Save button becomes active (correct — there are real unsaved edits from the peer).
4-collaboration-works-shows-bug.mov

This appears to show a real bug. After typing in one browser, the other browser's "Save draft" button does not become active. However, I also see this same bug in the current branch (fix/rtc-persisted-doc-init-origin):

4-collaboration-works-shows-bug.mov

Is this bug addressed here, or maybe I'm missing something?


Reconciliation still heals drift (regression)

  • With RTC off, publish a post (creates divergence between its _crdt_document and the row).
  • Re-enable RTC and reload the editor.
  • No false dirty state appears.
  • The next save/load cycle brings _crdt_document back in sync with the REST record.
5-reconciliation-drift.mov

In trunk I'm seeing the described behavior as well.


Overall, can you help me locate a specific reproduction scenario that this PR addresses for testing? For some of the related tests here it's difficult to verify what they're showing, and they're adding noise to the testing instructions. Thank you!

@Mamaduka Mamaduka added [Type] Bug An existing feature does not function as intended [Feature] Real-time Collaboration Phase 3 of the Gutenberg roadmap around real-time collaboration labels Apr 21, 2026
@mmtr
Contributor

mmtr commented Apr 21, 2026

Finally managed to reproduce the issue on a vanilla WP site running latest Gutenberg. The problem is that a meta registration change can create orphaned data in existing CRDT documents:

  • Enable RTC in Settings -> Writing
  • Add this to a mu-plugin:
<?php
// Register a boolean post meta field for all post types and expose it over REST.
add_action( 'init', function () {
  register_post_meta( '', 'my_custom_meta', array(
    'show_in_rest'  => true,
    'single'        => true,
    'type'          => 'boolean',
    'default'       => false,
    'auth_callback' => '__return_true',
  ) );
} );
  • Go to Posts -> Add New
  • Add a title and a paragraph block
  • Convert the paragraph block to a reusable pattern
  • Save the post as draft
  • Remove the mu-plugin
  • Open the draft in the editor
  • ❌ Publish button is missing

@mmtr
Contributor

mmtr commented Apr 21, 2026

This PR fixes the above scenario for me, but according to the Claude Code session I've been running to debug this, it seems that a better approach would be to just ignore meta entries undefined in the REST API:

PR #77503 would only partially mask this — it fixes the persisted doc init path, but the polling provider could still trigger the same editRecord call.

The real fix is either cleaning up orphaned meta from CRDT docs, or making the meta comparison in getPostChangesFromCRDTDoc skip fields that are undefined in the REST record.

diff --git a/packages/core-data/src/utils/crdt.ts b/packages/core-data/src/utils/crdt.ts
index 6b674623c43..f1a2b3c4e5a 100644
--- a/packages/core-data/src/utils/crdt.ts
+++ b/packages/core-data/src/utils/crdt.ts
@@ -342,7 +342,17 @@ export function getPostChangesFromCRDTDoc(
                              case 'meta': {
                                      allowedMetaChanges = Object.fromEntries(
                                              Object.entries( newValue ?? {} ).filter(
-                                                     ( [ metaKey ] ) =>
-                                                             ! disallowedPostMetaKeys.has( metaKey )
+                                                     ( [ metaKey ] ) => {
+                                                             if ( disallowedPostMetaKeys.has( metaKey ) ) {
+                                                                     return false;
+                                                             }
+                                                             // Skip meta keys not present in the REST record.
+                                                             // These are orphaned entries from previously registered
+                                                             // meta fields that are no longer valid for this post type.
+                                                             if ( currentValue == null || ! ( metaKey in currentValue ) ) {
+                                                                     return false;
+                                                             }
+                                                             return true;
+                                                     }
                                              )
                                      );
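Pulled out of the diff, the proposed filter can be read as a small standalone function. This is a sketch under assumptions: `filterMetaChanges` is a hypothetical name, the parameters mirror `newValue`, `currentValue`, and `disallowedPostMetaKeys` from the patch, and the sample keys other than `my_custom_meta` are made up for illustration.

```typescript
type Meta = Record< string, unknown >;

// Keep only meta entries that are allowed and still present in the REST record.
function filterMetaChanges(
	newValue: Meta | null | undefined,
	currentValue: Meta | null | undefined,
	disallowedKeys: Set< string >
): Meta {
	return Object.fromEntries(
		Object.entries( newValue ?? {} ).filter( ( [ metaKey ] ) => {
			if ( disallowedKeys.has( metaKey ) ) {
				return false;
			}
			// Orphaned entry: the key exists in the CRDT doc but not in the REST record.
			if ( currentValue == null || ! ( metaKey in currentValue ) ) {
				return false;
			}
			return true;
		} )
	);
}

// Hypothetical data: my_custom_meta was unregistered after the CRDT doc was persisted.
const crdtMeta: Meta = { my_custom_meta: true, footnotes: '[]', _private: 'x' };
const restMeta: Meta = { footnotes: '' };

const result = filterMetaChanges( crdtMeta, restMeta, new Set( [ '_private' ] ) );
console.log( result );
```

With this sample data only `footnotes` survives, so the orphaned `my_custom_meta` entry can no longer surface as an edit regardless of which path applied the update.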

@mmtr
Contributor

mmtr commented Apr 22, 2026

I think we can close this one in favor of #77529. Thanks @Gustavo-Hilario for your work here! It really helped identify the root issue.

@mmtr mmtr closed this Apr 22, 2026
