RTC: Tag persisted-doc init with dedicated origin to prevent false dirty state on load#77503
Conversation
…y state When RTC is enabled, opening a post whose persisted `_crdt_document` has drifted from the REST record (after publish, or any server-side mutation that the CRDT doc predates) caused the editor to falsely enter a dirty state: Save button active on load, beforeunload warning firing, and Publish button replaced by Save on drafts. Root cause: in `_applyPersistedCrdtDoc`, `Y.applyUpdateV2(targetDoc, update)` was called without an origin. The record observer (`onRecordUpdate`) therefore treated the apply as a peer-originated change and dispatched `editRecord` with whatever fields differed between the persisted CRDT state and the REST record, writing those diffs into the editor's edit bucket and flipping `hasEditsForEntityRecord` to true. Fix: introduce `PERSISTED_DOC_INIT_ORIGIN` in `@wordpress/sync/config` and pass it as the third argument to `Y.applyUpdateV2` at the initial apply. The record observer now skips `editRecord` dispatch for transactions carrying that origin. The reconciliation path below the apply still runs and heals any persisted-doc drift via `persistCRDTDoc`, so self-healing behavior is preserved. Peer-originated updates (from providers) retain their own origins and continue to dirty the editor correctly, so real collaborative typing still works. A regression test was added to the sync manager test suite.
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
|
👋 Thanks for your first Pull Request and for helping build the future of Gutenberg and WordPress, @Gustavo-Hilario! In case you missed it, we'd love to have you join us in our Slack community. If you want to learn more about WordPress development in general, check out the Core Handbook full of helpful information. |
|
@chriszarate @ellatrix — tagging you for review since you own the RTC subsystem and Core Trac #64622. This targets a specific post-publish dirty-state scenario that #75975 didn't close. Happy to iterate on approach, naming, or test shape. |
|
@Gustavo-Hilario Thank you for taking a look at this issue! I think your fix here makes sense, but I'd like your help in scoping down the "Testing Instructions" section a bit. I first tried reproducing some of these scenarios on Gutenberg
1-false-dirty-state.movI didn't see a dirty state on load or
2-post-publish-shows-bug.movAfter step 3 the Save button is dirty, which appears to be a bug. After reloading, the save button is correctly disabled, so step 4 doesn't appear to be part of the reproduction.
3-publish-button-drafts.movI'm seeing the same behavior on
4-collaboration-works-shows-bug.movThis appears to show a real bug. After typing in one browser, the other browser's "Save draft" button does not become active. However, I also see this same bug in the current branch ( 4-collaboration-works-shows-bug.movIs this bug addressed here, or maybe I'm missing something?
5-reconciliation-drift.movIn Overall, can you help me locate a specific reproduction scenario that this PR addresses for testing? There are some related tests here that are difficult to verify what they're showing, and they're adding noise to the testing instructions. Thank you! |
|
Finally managed to reproduce the issue on a vanilla WP site running latest Gutenberg. The problem is that a meta registration change can create orphaned data in existing CRDT documents:
|
|
This PR fixes the above scenario for me, but according to the Claude Code session I've been running to debug this, it seems that a better approach would be to just ignore meta entries undefined in the REST API:
diff --git a/packages/core-data/src/utils/crdt.ts b/packages/core-data/src/utils/crdt.ts
index 6b674623c43..f1a2b3c4e5a 100644
--- a/packages/core-data/src/utils/crdt.ts
+++ b/packages/core-data/src/utils/crdt.ts
@@ -342,7 +342,12 @@ export function getPostChangesFromCRDTDoc(
case 'meta': {
allowedMetaChanges = Object.fromEntries(
Object.entries( newValue ?? {} ).filter(
- ( [ metaKey ] ) =>
- ! disallowedPostMetaKeys.has( metaKey )
+ ( [ metaKey ] ) => {
+ if ( disallowedPostMetaKeys.has( metaKey ) ) {
+ return false;
+ }
+ // Skip meta keys not present in the REST record.
+ // These are orphaned entries from previously registered
+ // meta fields that are no longer valid for this post type.
+ if ( currentValue == null || ! ( metaKey in currentValue ) ) {
+ return false;
+ }
+ return true;
+ }
)
); |
|
I think we can close this one in favor of #77529. Thanks @Gustavo-Hilario for your work here! It really helped identifying the root issue. |
What?
See #76402 — related to RTC's sticky unsaved-changes behavior.
Adds a dedicated transaction origin (
PERSISTED_DOC_INIT_ORIGIN) for the initialY.applyUpdateV2call that hydrates a post's live Y.Doc from its persisted_crdt_documentpost_meta. The record observer (onRecordUpdate) now skipseditRecorddispatch for transactions carrying that origin, so hydration of the persisted doc no longer masquerades as a peer edit. The reconciliation path that heals any drift between the persisted doc and the REST record is unchanged.Why?
With RTC enabled, opening a post whose persisted CRDT document has drifted from the REST record (e.g. after publish, when the server mutates
status/slug/link, or after an out-of-band edit like WP-CLI) falsely puts the editor into a dirty state:beforeunloadwarning fires when navigating away.This has been surfacing via customer reports on WordPress.com Atomic sites running Gutenberg ≥ 22.8 (when RTC became enabled by default in #75739).
Root cause:
_applyPersistedCrdtDoccallsY.applyUpdateV2(targetDoc, update)without an origin. The record observer then treats the resulting transaction as a peer update and dispatcheseditRecordwith whatever fields differ between persisted CRDT and REST record. Those diffs land in the editor's non-transient edit bucket (sincestatus,slug,linkare not marked transient), andhasEditsForEntityRecordflips totrue.#75975 closed an adjacent staleness path (pending Y.Doc updates at save time) but did not cover server-side mutations that arrive after the CRDT doc is persisted — that's the window this PR addresses.
How?
Three changes in
@wordpress/sync, one additive test:packages/sync/src/config.ts— export a newPERSISTED_DOC_INIT_ORIGINstring constant, documented as internal sync-manager use only.packages/sync/src/manager.ts—_applyPersistedCrdtDoc, pass the origin as the third argument toY.applyUpdateV2. Leaves the existing warning comment about not wrapping in a local-origintransact()intact; the third-arg origin doesn't advance the state vector as a local client (the polling provider already uses the same pattern for its own origin tagging).onRecordUpdate, add an early return whentransaction.origin === PERSISTED_DOC_INIT_ORIGIN, placed before the existingtransaction.local && !(origin instanceof Y.UndoManager)guard. An inline comment explains why the explicit guard is kept even though the subsequent one would also catch it — the intent needs to be clear and the suppression must survive any future change to Yjs locality semantics on the apply.packages/sync/src/test/manager.ts— new regression test in the "persisted CRDT doc behavior" suite: sets up a drifted persisted doc, loads the entity, flushes the microtask queue, assertseditRecordwas not called, and asserts positively that the reconciliation path still ran (getChangesFromCRDTDoc/applyChangesToCRDTDoc/persistCRDTDoceach invoked once with the expected drift-healing payload). The positive assertions are there specifically to prevent the test from passing vacuously if a future refactor short-circuits_applyPersistedCrdtDoc.Peer-originated updates from providers continue to dispatch
editRecordnormally — provider transactions carry their own origins, not this one — so real-time collaborative typing is unaffected.Testing Instructions
Prerequisite: RTC enabled (Settings → Writing → "Enable real-time collaboration" is on).
False dirty state on load
beforeunloadwarning.Post-publish state
Publish button on drafts
Collaboration still works (regression)
Reconciliation still heals drift (regression)
_crdt_documentand the row)._crdt_documentback in sync with the REST record.Testing Instructions for Keyboard
No UI-visible changes; existing keyboard interactions are unchanged. Keyboard testing is not required.
Screenshots or screencast
Use of AI Tools
This PR was developed with Anthropic's Claude as a pair-programming assistant, under my review. The root-cause analysis, code changes, and regression test were reviewed manually and through an additional automated code review pass (Claude + OpenAI Codex) before committing. All code has been read, understood, and tested by me.