This repository was archived by the owner on Apr 26, 2024. It is now read-only.
Remove _get_events_cache check optimisation from _have_seen_events_dict#14161
Merged
Conversation
Checking this cache is currently an invalid assumption, as the _get_event_cache is not correctly invalidated when purging events from a room. Remove this optimisation for now as its causing more harm than good. We can re-add it after fixing _get_event_cache.
Member
Author
|
I wasn't really sure how to test this automatically... thus I did so manually. The steps I took were:
Before this change, Bob was met with a broken room. Here is how it looked in the database: and if we explore the room state in Element: There's only 2 events total! Bob's invite and his subsequent join. All other events were in the After this PR, we instead see (most) of the events we were expecting. All state events are present, but messages are missing. This is because backfilling events also checks the event cache. This case is fixed in #14164. |
erikjohnston
approved these changes
Oct 17, 2022
Member
Author
|
I successfully tested the following on my homeserver on develop (2c63cdc) + #14164 + #14161 (this PR):
Worked as expected 🎉 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Fixes #11521, credit to @richvdh for the idea!
When we join a room in Synapse (
/send_join), we receive the auth chain and the current set of state from the room. Those events are passed throughEventWorkerStore.have_seen_events()(viaFederationEventHandler._auth_and_persist_outliers), and any events that we think we've already seen, we drop. Seems sensible.As an optimisation,
_have_seen_events_dict(called fromhave_seen_events) checks the_get_events_cachebefore checking the database:synapse/synapse/storage/databases/main/events_worker.py
Lines 1498 to 1510 in 29269d9
Unfortunately, due to #13476, the
_get_event_cachedoes not have entries for events invalidated in case of a room purge. What that means is that you'll:_get_event_cache_get_event_cachehave_seen_eventsthinks those events are already persistedm.room.createevent!).Ideally we'd fix things so that entries in
_get_event_cacheare correctly invalidated when a room is purged. There is a WIP plan to do so, but it's a big job. For now, we can just remove this optimisation as a quick win, as it's causing more harm than good. (The optimisation was originally added in #9601).(If you look closely, you'll notice that
_have_seen_events_dicthas a cache as well. Not to worry, that cache is correctly cleared when a room is purged.)Note that when you backfill (for non-state events) from a remote homeserver, those also go through
_get_event_cacheand will still be dropped on the floor. #14164 is the fix for that part.