Fix resync remote devices on receive PDU in worker mode.#7815
Conversation
The replication client requires that arguments are given as keyword arguments, which was not done in this case. We also pull out the logic so that we can catch and handle any exceptions raised, rather than leaving them unhandled.
| """ | ||
|
|
||
| try: | ||
| await self.store.mark_remote_user_device_cache_as_stale(sender) |
There was a problem hiding this comment.
FWIW this function will cause the next iteration of the resync retry loop (introduced in #7453) to retry this device list, which means that we can have 2 resync happen at the same time for a given user. This may be fine, but at least we should be aware of it.
There was a problem hiding this comment.
Oh, interesting. I think its still correct to go and fire off a resync, as a) we don't want to wait for the loop to come round necessarily, b) its safe and c) I don't really want to assume that the loop will come round quickly or even retry this particular user (it might have backoff logic etc)
There was a problem hiding this comment.
Yeah it makes sense, again I think it's fine as long as we're aware of it :)
. o O ( we could avoid this issue by storing and checking an in-memory list of the users we're currently resyncing (i.e. update it in ignore me, that still wouldn't work, and it's not that big a deal anyway.user_device_resync) but that's probably out of scope here )
richvdh
left a comment
There was a problem hiding this comment.
could you give us some clues about the symptoms of this bug? was it throwing exceptions?
|
|
||
| if resync: | ||
| await self.store.mark_remote_user_device_cache_as_stale(event.sender) | ||
| run_in_background(self._resync_device, event.sender) |
There was a problem hiding this comment.
while you're here, can you fix this to be run_as_background_process please.
| return run_in_background( | ||
| self._device_list_updater.user_device_resync, event.sender | ||
| ) | ||
| async def _resync_device(self, sender: str): |
There was a problem hiding this comment.
| async def _resync_device(self, sender: str): | |
| async def _resync_device(self, sender: str) -> None: |
|
This produces the glorious stack trace of the following, I don't think it had any other observable behaviour. |
…x_worker_fderation_device_resync
* commit 'f1245dc3c': Fix resync remote devices on receive PDU in worker mode. (#7815)
The replication client requires that arguments are given as keyword
arguments, which was not done in this case. We also pull out the logic
so that we can catch and handle any exceptions raised, rather than
leaving them unhandled.